ExcelBanter

ExcelBanter (https://www.excelbanter.com/)
-   Excel Programming (https://www.excelbanter.com/excel-programming/)
-   -   Identifying complex duplicates (https://www.excelbanter.com/excel-programming/350483-identifying-complex-duplicates.html)

[email protected]

Identifying complex duplicates
 
Hi

I have a spreadsheet with 40,000 rows. An abbreviated example:
Date RecNum Code Amt Secondary
1/1/2005 453654 44324 550.54 no
12/24/2004 546676 54161 65.66 no
12/24/2004 546676 44970 900.44 no
12/24/2004 546676 43830 no
7/2/2003 54161 54640 773.21 no


The problem:
I need to identify dates where the RecNum occurs more than once. All
but one occurrence on the same date should have the secondary column as
'yes'. (In other words, if one RecNum (=person) has several codes on
the same day, one of the codes is primary, and all the others are
secondary. Ideally, I would look at amount to decide which is the
primary code and which are the secondary (biggest amount = primary).
Sometimes, no amount is given (ie, the data is crappy).

I assume I need to create a new column of date + RecNum (?), then check
for duplicates, and use an if..then conditional to compare amounts...

Any guidance would be appreciated....

TIA


MartynGriffin[_2_]

Identifying complex duplicates
 

Without supplying an exact solution, consider the following approach a
a set of macros:

1) Sort entire table as descending amount
2) turn on autofilters for all columns
3) start with the first data row, moving down towards the last row o
each iteration
4) extract the date value on the current row.
5) turn on a filter for that date only
6) set the first result row as yes, the others as no
7) undo the autofilter selection by date
8) increment your selection row, go back to 4)

Of course, add in needed error checking etc to deal with faulty data

--
MartynGriffi
-----------------------------------------------------------------------
MartynGriffin's Profile: http://www.excelforum.com/member.php...fo&userid=3049
View this thread: http://www.excelforum.com/showthread.php?threadid=50150


Toppers

Identifying complex duplicates
 
Hi,

This assumes data (columns A to E) is sorted by DATE (Ascending),
RECNUM(Ascending) and AMT (Descending). I have done limited testing!

HTH

Sub FindDuplicates()

Dim DateRng As Range
Dim lastrow As Long, r As Long
Dim nrec As Long, nrecnum As Long, ndate As Long

lastrow = Cells(Rows.Count, 1).End(xlUp).Row
Set DateRng = Range("a2:a" & lastrow)

r = 2
Compdate = Range("a" & r).Value
Do
ndate = Application.CountIf(DateRng, Compdate)
If ndate 1 Then
nrec = 0
Do
nrecnum = Application.CountIf(Range(Cells(r, 2), Cells(ndate + r - 1,
2)), Cells(r, 2))
Cells(r, 5) = "Yes"
If nrecnum 1 Then
Cells(r + 1, 5).Resize(nrecnum - 1, 1) = "No"
End If
r = r + nrecnum
nrec = nrec + nrecnum
Loop While nrec < ndate
Else
Cells(r, 5) = "Yes"
r = r + 1
End If


Compdate = Range("a" & r).Value

Loop While r <= lastrow
End Sub

" wrote:

Hi

I have a spreadsheet with 40,000 rows. An abbreviated example:
Date RecNum Code Amt Secondary
1/1/2005 453654 44324 550.54 no
12/24/2004 546676 54161 65.66 no
12/24/2004 546676 44970 900.44 no
12/24/2004 546676 43830 no
7/2/2003 54161 54640 773.21 no


The problem:
I need to identify dates where the RecNum occurs more than once. All
but one occurrence on the same date should have the secondary column as
'yes'. (In other words, if one RecNum (=person) has several codes on
the same day, one of the codes is primary, and all the others are
secondary. Ideally, I would look at amount to decide which is the
primary code and which are the secondary (biggest amount = primary).
Sometimes, no amount is given (ie, the data is crappy).

I assume I need to create a new column of date + RecNum (?), then check
for duplicates, and use an if..then conditional to compare amounts...

Any guidance would be appreciated....

TIA



Bob Phillips[_6_]

Identifying complex duplicates
 
Does this do what you want?

In row 2

=IF(SUMPRODUCT(--($A$2:A2=A2),--($B$2:B2=B2))1,"no","yes")

and copy down

--

HTH

RP
(remove nothere from the email address if mailing direct)


wrote in message
oups.com...
Hi

I have a spreadsheet with 40,000 rows. An abbreviated example:
Date RecNum Code Amt Secondary
1/1/2005 453654 44324 550.54 no
12/24/2004 546676 54161 65.66 no
12/24/2004 546676 44970 900.44 no
12/24/2004 546676 43830 no
7/2/2003 54161 54640 773.21 no


The problem:
I need to identify dates where the RecNum occurs more than once. All
but one occurrence on the same date should have the secondary column as
'yes'. (In other words, if one RecNum (=person) has several codes on
the same day, one of the codes is primary, and all the others are
secondary. Ideally, I would look at amount to decide which is the
primary code and which are the secondary (biggest amount = primary).
Sometimes, no amount is given (ie, the data is crappy).

I assume I need to create a new column of date + RecNum (?), then check
for duplicates, and use an if..then conditional to compare amounts...

Any guidance would be appreciated....

TIA





All times are GMT +1. The time now is 03:20 AM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
ExcelBanter.com