View Single Post
  #5   Report Post  
Posted to microsoft.public.excel.misc
Pete
 
Posts: n/a
Default Deleting Duplicates, All records unique

Here's another fairly quick way. I assume your data is not sorted by
filename and I presume you want to keep the sequence you have at the
moment. Assume your four fields occupy columns A to D, and that the
data starts in row 2 (after the headings) and goes down to row 5000.

Add the heading "seq" in column E and in E2 enter 1. Highlight cells E2
to E5000 then Edit | Fill | Series and check Linear with a step value
of 1. Click OK - this will fill a sequence down this column to enable
you to get the data back into the same order.

Highlight A1 to E5000 and sort the data using filename (column B). Add
the heading "Check" in column F, and in cell F2 enter the following
formula:

=IF(OR(B2=B1,B2=B3),"duplicate","unique")

Copy this down to F5000 (double-click the fill handle). Select Data |
Filter | Autofilter (on). Filter column F for "duplicate". Highlight
all visible rows between Row 1 and Row 5001, and Edit | Delete Row. Use
the filter pull-down on column F to select "All", then Data | Filter |
Autofilter (off).

Re-sort the remaining data using column E (seq) for sort order.
Finally, delete columns E and F.

Hope this helps.

Pete