Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 2
Default not exact duplicates...

ok here's the deal - i have a list of companies with contact information.
some companies have multiple contacts and some contacts are duplicated.
sounds simple..but - some of the company names are spelled differently or
have ',' '.' or some other difference. remove duplicates does not work
because they aren't EXACT duplicates...how can i either delete or mark them
somehow using a variable column? here's a sample:

abc solutions marco polo 123 main
abc solutions, inc marco polo 123 main
abc solutions inc. mark jon 123 main
abc solutions inc mark jon 123 main
abc solutions inc mark jon 123 main
abc sol. Inc. ted frank 123 main
abc sol. Inc ted frank 123 main

this list should be 3 rows afterwards...thanks in advance - Tad
  #2   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 144
Default not exact duplicates...

try using SUBSTITUTE or REPLACE to remove the characters you don't want.

HTH
--
Your feedback is very much appreciate, pls click on the Yes button below if
this posting is helpful.

Thank You

cheers, francis









"Tadder" wrote:

ok here's the deal - i have a list of companies with contact information.
some companies have multiple contacts and some contacts are duplicated.
sounds simple..but - some of the company names are spelled differently or
have ',' '.' or some other difference. remove duplicates does not work
because they aren't EXACT duplicates...how can i either delete or mark them
somehow using a variable column? here's a sample:

abc solutions marco polo 123 main
abc solutions, inc marco polo 123 main
abc solutions inc. mark jon 123 main
abc solutions inc mark jon 123 main
abc solutions inc mark jon 123 main
abc sol. Inc. ted frank 123 main
abc sol. Inc ted frank 123 main

this list should be 3 rows afterwards...thanks in advance - Tad

  #3   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 2
Default not exact duplicates...

right, but there are 10,000 rows and over 9900 company names...

"xlmate" wrote:

try using SUBSTITUTE or REPLACE to remove the characters you don't want.

HTH
--
Your feedback is very much appreciate, pls click on the Yes button below if
this posting is helpful.

Thank You

cheers, francis









"Tadder" wrote:

ok here's the deal - i have a list of companies with contact information.
some companies have multiple contacts and some contacts are duplicated.
sounds simple..but - some of the company names are spelled differently or
have ',' '.' or some other difference. remove duplicates does not work
because they aren't EXACT duplicates...how can i either delete or mark them
somehow using a variable column? here's a sample:

abc solutions marco polo 123 main
abc solutions, inc marco polo 123 main
abc solutions inc. mark jon 123 main
abc solutions inc mark jon 123 main
abc solutions inc mark jon 123 main
abc sol. Inc. ted frank 123 main
abc sol. Inc ted frank 123 main

this list should be 3 rows afterwards...thanks in advance - Tad

  #4   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 4,393
Default not exact duplicates...

Let the first item be in A1
In B1 enter =SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1,".",""),","," ")," inc","")
The removes " inc", commas and periods
Copy this down the column
In C1 enter =COUNTIF(B1:$B$7,B1) (extent the range to cover all items)
Copy this down the column
Select all the C entries, use Copy followed by Edit | Paste Special with
Values specified to change formulas to values
Select all the data and sort on column C; delete all entries with C values
greater than 1.

best wishes
--
Bernard V Liengme
Microsoft Excel MVP
http://people.stfx.ca/bliengme
remove caps from email

"Tadder" wrote in message
...
ok here's the deal - i have a list of companies with contact information.
some companies have multiple contacts and some contacts are duplicated.
sounds simple..but - some of the company names are spelled differently or
have ',' '.' or some other difference. remove duplicates does not work
because they aren't EXACT duplicates...how can i either delete or mark
them
somehow using a variable column? here's a sample:

abc solutions marco polo 123 main
abc solutions, inc marco polo 123 main
abc solutions inc. mark jon 123 main
abc solutions inc mark jon 123 main
abc solutions inc mark jon 123 main
abc sol. Inc. ted frank 123 main
abc sol. Inc ted frank 123 main

this list should be 3 rows afterwards...thanks in advance - Tad



Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Condensing a list with duplicates to a list with non-duplicates Nuclear Excel Worksheet Functions 2 July 29th 08 08:03 PM
Routine to find exact Row matches in Col1 Col2 Col3 but exact offsetting numbers in Col4 [email protected] Excel Discussion (Misc queries) 0 June 11th 08 11:30 PM
Duplicates in excel that aren't 100% DUPLICATES ... [email protected] Excel Discussion (Misc queries) 4 May 2nd 08 06:43 PM
== INSTEAD OF EXACT? FARAZ QURESHI Excel Discussion (Misc queries) 10 February 8th 08 03:45 PM
Need to do an EXACT LOOKUP archsmooth Excel Worksheet Functions 1 June 4th 07 06:16 PM


All times are GMT +1. The time now is 06:20 PM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"