View Single Post
  #5   Report Post  
Posted to microsoft.public.excel.newusers
Roger Govier Roger Govier is offline
external usenet poster
 
Posts: 2,886
Default removing dupliates from 2 lists with similar content

Hi Marc

In XL2007 there is a feature to Remove Duplicates under the Data tab.

--
Regards

Roger Govier


"Marc the confused" wrote in
message ...
Thanks Dave.

You have answered my question perfectly and I really appreciate it. I
always suspected there had to be a way to do this built into Excel,
but I
never found it. De-duping lists like these has always been a task of
heroic
proportions for me from within Excel, which is why I always ended up
breaking
the list out and using external tools/ Shell scripts. Your advice
will be a
real boon when I need to stay inside Office.

Sorry to ask so many questions. Is there a way to do this via
formulae?
Is my best course of action to do up a simple macro if I need to
create a
sheet that others can just drop their lists into?

Thanks again.

Marc

"Dave Peterson" wrote:

Build a giant list made up of all the other lists.
Select the key column
Data|Filter|advanced filter|Check unique records only

Select your range and paste into a new sheet.

The visible cells should be the only ones pasted onto that new sheet.

excel-lookuper wrote:

I have many large lists. Each contains unique entries, as well as
duplicate
entires. I need to combine all lists into one long list with no
duplicates.
Using one field as a primary key value, I want to remove all
duplicate
entries. Where the data is different but the key entry is the
same, I only
need to keep the first occurrence.

e.g.
List 1
key field | name | address | city
abcd | A bunch of data, some empty fields.....
efgh
zzww
qnbc

List 2
key field | name | address | City| Phone Number
kdkd | A bunch of data, some empty fields.....
abcd
werf
asdf
asdd
zzww
poio

----------------------
Note that entries abcd and zzww occur in both lists.

Aligning the columns and putting all lists into one big list is
simple.
Please help me with a method to identify the duplicates and keep
only the
first occurence of them, based on the key field value.

Note that these lists could have many thousands of entries and some
entries
occur many times, so the simplest method would be most appreciated
for the
sake of speed. For now, I am pulling the data into a full power
text editor
and that includes sort and remove duplicate functions. I think I
can figure
out how to do this in Excel with multiple sorts and 'if this cell
is the same
as one above it' things, but I am finding it is taking many many
steps to
accomplish this, specially if there are more than 2 duplicates. Is
there a
simple function like the POSIX uniq command to do this, preferably
in one
step?

thanks


--

Dave Peterson