Posted to microsoft.public.excel.programming
p45cal[_172_]
finding duplicates and deleting based on another column


tpeter;536940 Wrote:
I have a spreadsheet of data that I have compiled from 6 different
workbooks. I have used a TRUE/FALSE statement to identify the
duplicates, and now I need to delete them based on which spreadsheet
they came from. The consolidated spreadsheet currently has 28,000
records. I am deleting them manually, but this will take me until I am
100 to go through. Any help on this would be great.

In column A I have numbers:
56088769
57499354
60175071
60175071
60175071
5608437X
5608437X
5608437X

As you can see, there could be 2 to 6 duplicate numbers. I need to find
the duplicates in column A, then evaluate column J to see where the
data came from. The choices are:


Raw 02-06
PG &E Composite Data
PG&E Data 08

This is also the order of preference: if there are 4 duplicates and
Raw 02-06 is an option, then delete the rest of the duplicates, leaving
only this one. If there is a duplicate and Raw 02-06 isn't available,
then pick option 2, and so on.

Thank you for your help, it is greatly appreciated.

Tim Peter


If this is a one-off exercise, then because the data source names don't
sort naturally into your order of preference, I would do a find and
replace 3 times on column J to put a numeral in front, to get:
1Raw 02-06
2PG &E Composite Data
3PG&E Data 08
(you'll reverse that later)
then sort your consolidated sheet, primarily on column A and
secondarily on column J (the source names), both ascending. Now, within
each block of duplicates, the 1Raw 02-06 one(s) should be at the top.
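
If you'd rather not do the three replaces and the sort by hand, something along these lines should do both in one go. Treat it as a sketch only: it assumes the data is on the active sheet, starts at A1 with a single header row, and forms one contiguous block. The macro name PrefixAndSort is made up, and the sort uses column J as the secondary key because that is where the prefixed source names live.

```vba
Sub PrefixAndSort()
    ' Assumptions: active sheet, header in row 1, contiguous data from A1.
    Dim names As Variant, i As Long, n As Long
    ' Source names in order of preference, spelled exactly as in column J.
    names = Array("Raw 02-06", "PG &E Composite Data", "PG&E Data 08")
    For i = LBound(names) To UBound(names)
        n = i - LBound(names) + 1          ' 1, 2, 3 whatever Option Base is
        ' xlWhole: only cells that are exactly the name get the prefix.
        Columns("J").Replace What:=names(i), Replacement:=CStr(n) & names(i), _
            LookAt:=xlWhole, MatchCase:=False
    Next i
    ' Sort on column A first, then column J, both ascending.
    Range("A1").CurrentRegion.Sort Key1:=Range("A1"), Order1:=xlAscending, _
        Key2:=Range("J1"), Order2:=xlAscending, Header:=xlYes
End Sub
```

If your sheet has no header row, change Header:=xlYes to Header:=xlNo. Try it on a copy of the workbook first.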

Now select the entire block of data row-wise, but only in column A, and
run this macro, which deletes all the lower duplicates in each block:

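Sub blah()
    ' Select the sorted data in column A (all rows) before running.
    Dim toprow As Long, bottomrow As Long, i As Long
    toprow = Selection.Row
    bottomrow = Selection.Rows.Count + toprow - 1
    ' Work from the bottom up so a deletion never shifts rows still to be checked.
    For i = bottomrow To toprow + 1 Step -1
        ' A row with the same column A value as the row above is a lower duplicate.
        If Cells(i, "A").Value = Cells(i - 1, "A").Value Then Rows(i).Delete
    Next i
End Sub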
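
With 28,000 rows, selecting the block by hand can be fiddly, so here is a variant that works out the last used row in column A for itself. It is only a sketch under a couple of assumptions: a single header row in row 1, and a sheet already sorted so that the preferred source sits at the top of each block of duplicates. The name DeleteLowerDuplicates is made up.

```vba
Sub DeleteLowerDuplicates()
    ' Assumptions: active sheet, header in row 1, data already sorted.
    Dim lastRow As Long, i As Long
    lastRow = Cells(Rows.Count, "A").End(xlUp).Row
    Application.ScreenUpdating = False     ' avoids repainting on every delete
    ' Stop at row 3 so row 2 (first data row) is never compared with the header.
    For i = lastRow To 3 Step -1
        If Cells(i, "A").Value = Cells(i - 1, "A").Value Then Rows(i).Delete
    Next i
    Application.ScreenUpdating = True
End Sub
```

It does the same bottom-up comparison as the macro above. Turning off screen updating is just there to make the row deletions less sluggish on a large sheet.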

Now do a find and replace (well 3 actually) on column J to restore the
original data source names.
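
The restoring step can be scripted in the same way. Again a sketch, not gospel: it assumes the prefixed names in column J are exactly the numeral followed by the original name, and RestoreSourceNames is a made-up macro name.

```vba
Sub RestoreSourceNames()
    ' Assumption: column J holds values such as "1Raw 02-06" on the active sheet.
    Dim names As Variant, i As Long, n As Long
    names = Array("Raw 02-06", "PG &E Composite Data", "PG&E Data 08")
    For i = LBound(names) To UBound(names)
        n = i - LBound(names) + 1
        ' Swap the prefixed name back to the original, whole cells only.
        Columns("J").Replace What:=CStr(n) & names(i), Replacement:=names(i), _
            LookAt:=xlWhole, MatchCase:=False
    Next i
End Sub
```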


--
p45cal
