ExcelBanter

ExcelBanter (https://www.excelbanter.com/)
-   Excel Worksheet Functions (https://www.excelbanter.com/excel-worksheet-functions/)
-   -   not exact duplicates... (https://www.excelbanter.com/excel-worksheet-functions/217028-not-exact-duplicates.html)

Tadder

not exact duplicates...
 
ok here's the deal - i have a list of companies with contact information.
some companies have multiple contacts and some contacts are duplicated.
sounds simple..but - some of the company names are spelled differently or
have ',' '.' or some other difference. remove duplicates does not work
because they aren't EXACT duplicates...how can i either delete or mark them
somehow using a variable column? here's a sample:

abc solutions marco polo 123 main
abc solutions, inc marco polo 123 main
abc solutions inc. mark jon 123 main
abc solutions inc mark jon 123 main
abc solutions inc mark jon 123 main
abc sol. Inc. ted frank 123 main
abc sol. Inc ted frank 123 main

this list should be 3 rows afterwards...thanks in advance - Tad

xlmate

not exact duplicates...
 
try using SUBSTITUTE or REPLACE to remove the characters you don't want.

HTH
--
Your feedback is very much appreciate, pls click on the Yes button below if
this posting is helpful.

Thank You

cheers, francis









"Tadder" wrote:

ok here's the deal - i have a list of companies with contact information.
some companies have multiple contacts and some contacts are duplicated.
sounds simple..but - some of the company names are spelled differently or
have ',' '.' or some other difference. remove duplicates does not work
because they aren't EXACT duplicates...how can i either delete or mark them
somehow using a variable column? here's a sample:

abc solutions marco polo 123 main
abc solutions, inc marco polo 123 main
abc solutions inc. mark jon 123 main
abc solutions inc mark jon 123 main
abc solutions inc mark jon 123 main
abc sol. Inc. ted frank 123 main
abc sol. Inc ted frank 123 main

this list should be 3 rows afterwards...thanks in advance - Tad


Tadder

not exact duplicates...
 
right, but there are 10,000 rows and over 9900 company names...

"xlmate" wrote:

try using SUBSTITUTE or REPLACE to remove the characters you don't want.

HTH
--
Your feedback is very much appreciate, pls click on the Yes button below if
this posting is helpful.

Thank You

cheers, francis









"Tadder" wrote:

ok here's the deal - i have a list of companies with contact information.
some companies have multiple contacts and some contacts are duplicated.
sounds simple..but - some of the company names are spelled differently or
have ',' '.' or some other difference. remove duplicates does not work
because they aren't EXACT duplicates...how can i either delete or mark them
somehow using a variable column? here's a sample:

abc solutions marco polo 123 main
abc solutions, inc marco polo 123 main
abc solutions inc. mark jon 123 main
abc solutions inc mark jon 123 main
abc solutions inc mark jon 123 main
abc sol. Inc. ted frank 123 main
abc sol. Inc ted frank 123 main

this list should be 3 rows afterwards...thanks in advance - Tad


Bernard Liengme

not exact duplicates...
 
Let the first item be in A1
In B1 enter =SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1,".",""),","," ")," inc","")
The removes " inc", commas and periods
Copy this down the column
In C1 enter =COUNTIF(B1:$B$7,B1) (extent the range to cover all items)
Copy this down the column
Select all the C entries, use Copy followed by Edit | Paste Special with
Values specified to change formulas to values
Select all the data and sort on column C; delete all entries with C values
greater than 1.

best wishes
--
Bernard V Liengme
Microsoft Excel MVP
http://people.stfx.ca/bliengme
remove caps from email

"Tadder" wrote in message
...
ok here's the deal - i have a list of companies with contact information.
some companies have multiple contacts and some contacts are duplicated.
sounds simple..but - some of the company names are spelled differently or
have ',' '.' or some other difference. remove duplicates does not work
because they aren't EXACT duplicates...how can i either delete or mark
them
somehow using a variable column? here's a sample:

abc solutions marco polo 123 main
abc solutions, inc marco polo 123 main
abc solutions inc. mark jon 123 main
abc solutions inc mark jon 123 main
abc solutions inc mark jon 123 main
abc sol. Inc. ted frank 123 main
abc sol. Inc ted frank 123 main

this list should be 3 rows afterwards...thanks in advance - Tad





All times are GMT +1. The time now is 12:15 PM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
ExcelBanter.com