Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.misc
external usenet poster
 
Posts: 28
Default Finding Similar Data in worksheets

I have a list of contacts and would like to compare it to another list from a
different department to find duplicates.

The issue is that some of the information on both lists are recorded with
slight variations (one includes Inc. while other doesn't, etc...). Normally,
I could mark duplicates using conditional formatting but it only considers
exact duplicates rather than looking for similarity.

Does anyone know of any good plugins to accomplish this or a way to create a
find duplicate function that has a bit more tolerance for variation?

Thanks.
  #2   Report Post  
Posted to microsoft.public.excel.misc
external usenet poster
 
Posts: 8
Default Finding Similar Data in worksheets

Why not try scrubbing the data first, using find & replace to take out Inc,
LLC, or other unwanted characters in both tables.

Then in one of the tables add an additional column that uses a vlookup
formula to compare the Company Name. If it gets a match the Company Name
should pop up in the column, indicating a match. ...If you have other
columns of data such as phone number, url, or email address you might not
have to scrub the data first?



"MEAD5432" wrote:

I have a list of contacts and would like to compare it to another list from a
different department to find duplicates.

The issue is that some of the information on both lists are recorded with
slight variations (one includes Inc. while other doesn't, etc...). Normally,
I could mark duplicates using conditional formatting but it only considers
exact duplicates rather than looking for similarity.

Does anyone know of any good plugins to accomplish this or a way to create a
find duplicate function that has a bit more tolerance for variation?

Thanks.

  #3   Report Post  
Posted to microsoft.public.excel.misc
external usenet poster
 
Posts: 28
Default Finding Similar Data in worksheets

Thanks for the suggestion.

Removing certain parts of the name would probably improve my ability to
check both lists and I am definitely going to give that a shot.

There is, however, the pesky problem of catching names that are spelled
slightly different; things like transposed letters or misplaced hyphens.

This is where is gets problematic. Maybe this is a problem that simply
requires human logic that can't easily be programmed...

Thanks again.

"Tfrup12" wrote:

Why not try scrubbing the data first, using find & replace to take out Inc,
LLC, or other unwanted characters in both tables.

Then in one of the tables add an additional column that uses a vlookup
formula to compare the Company Name. If it gets a match the Company Name
should pop up in the column, indicating a match. ...If you have other
columns of data such as phone number, url, or email address you might not
have to scrub the data first?



"MEAD5432" wrote:

I have a list of contacts and would like to compare it to another list from a
different department to find duplicates.

The issue is that some of the information on both lists are recorded with
slight variations (one includes Inc. while other doesn't, etc...). Normally,
I could mark duplicates using conditional formatting but it only considers
exact duplicates rather than looking for similarity.

Does anyone know of any good plugins to accomplish this or a way to create a
find duplicate function that has a bit more tolerance for variation?

Thanks.

  #4   Report Post  
Posted to microsoft.public.excel.misc
external usenet poster
 
Posts: 8
Default Finding Similar Data in worksheets

Perhaps once you have identified the ones that do not have an exact match you
can sort them out, create two new tables out of the remainder, and then use
'true' in your vlookup formula, (vs. false which yields only exact matches.)

=VLOOKUP(D12,B$12:B$18,1,TRUE)

Make sure you sort your table array 'B$12:B$18' into alphabetical order, or
else you will get miss-matches...which you might get regardless, but its the
only step that I am aware of inside of excel that lies between exact matches
and 'human logic'.

Good Luck!


"MEAD5432" wrote:

Thanks for the suggestion.

Removing certain parts of the name would probably improve my ability to
check both lists and I am definitely going to give that a shot.

There is, however, the pesky problem of catching names that are spelled
slightly different; things like transposed letters or misplaced hyphens.

This is where is gets problematic. Maybe this is a problem that simply
requires human logic that can't easily be programmed...

Thanks again.

"Tfrup12" wrote:

Why not try scrubbing the data first, using find & replace to take out Inc,
LLC, or other unwanted characters in both tables.

Then in one of the tables add an additional column that uses a vlookup
formula to compare the Company Name. If it gets a match the Company Name
should pop up in the column, indicating a match. ...If you have other
columns of data such as phone number, url, or email address you might not
have to scrub the data first?



"MEAD5432" wrote:

I have a list of contacts and would like to compare it to another list from a
different department to find duplicates.

The issue is that some of the information on both lists are recorded with
slight variations (one includes Inc. while other doesn't, etc...). Normally,
I could mark duplicates using conditional formatting but it only considers
exact duplicates rather than looking for similarity.

Does anyone know of any good plugins to accomplish this or a way to create a
find duplicate function that has a bit more tolerance for variation?

Thanks.

  #5   Report Post  
Posted to microsoft.public.excel.misc
external usenet poster
 
Posts: 8
Default Finding Similar Data in worksheets

I can't emphasize enough how careful you have to be when using 'True' in a
vlookup formula for this purpose. It will attempt to match everything, at
which point you will need to visually check your results. But if you have
legitimate similarities only off by a letter or two it will do a pretty good
job of pointing out the potential relationship. However, it will go too far
in most cases.

"MEAD5432" wrote:

Thanks for the suggestion.

Removing certain parts of the name would probably improve my ability to
check both lists and I am definitely going to give that a shot.

There is, however, the pesky problem of catching names that are spelled
slightly different; things like transposed letters or misplaced hyphens.

This is where is gets problematic. Maybe this is a problem that simply
requires human logic that can't easily be programmed...

Thanks again.

"Tfrup12" wrote:

Why not try scrubbing the data first, using find & replace to take out Inc,
LLC, or other unwanted characters in both tables.

Then in one of the tables add an additional column that uses a vlookup
formula to compare the Company Name. If it gets a match the Company Name
should pop up in the column, indicating a match. ...If you have other
columns of data such as phone number, url, or email address you might not
have to scrub the data first?



"MEAD5432" wrote:

I have a list of contacts and would like to compare it to another list from a
different department to find duplicates.

The issue is that some of the information on both lists are recorded with
slight variations (one includes Inc. while other doesn't, etc...). Normally,
I could mark duplicates using conditional formatting but it only considers
exact duplicates rather than looking for similarity.

Does anyone know of any good plugins to accomplish this or a way to create a
find duplicate function that has a bit more tolerance for variation?

Thanks.

Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Finding similar data in a spreadsheet 101guy Excel Discussion (Misc queries) 1 March 16th 07 10:12 PM
Two Worksheets similar data Muddypaws Excel Discussion (Misc queries) 4 September 14th 06 02:22 PM
How do I merge similar data across different excel worksheets? MK Excel Discussion (Misc queries) 1 August 11th 05 12:17 AM
Finding similar data or numbers in two columns sayemasof New Users to Excel 1 June 1st 05 08:41 PM
Comparing data in two similar worksheets HiRllr21 Excel Discussion (Misc queries) 0 February 2nd 05 05:01 PM


All times are GMT +1. The time now is 12:57 PM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
Copyright ©2004-2025 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"