ExcelBanter

ExcelBanter (https://www.excelbanter.com/)
-   Excel Discussion (Misc queries) (https://www.excelbanter.com/excel-discussion-misc-queries/)
-   -   Compare data in two columns to find duplicates. (https://www.excelbanter.com/excel-discussion-misc-queries/48594-compare-data-two-columns-find-duplicates.html)

Dan

Compare data in two columns to find duplicates.
 
I have two columns of siganl names. Column A is "From" data. Column B is "To"
data. In some cases the dulpicates are the same signal path in reverse. An
example would be Col A: Pin_1 Col B: GND. Further down the list would be Col
A: GND Col B: Pin_1. I have 60,000+ rows of signal names. It would be great
to find the duplicates and eliminate doing twice the work.

Thanks
Dan


david

Hm.

I inserted a Column C, D, E, & F.

C contains a copy of A.
D contains a unique ID for each row (just auto-numbered 1-whatever)
E and F contain formulas that do comparisons on the To and From fields
respectively.

I named columns A-D checkTo and named columns B-D checkFrom.

For E (the To comparison), each cell has this formula:

=IF(VLOOKUP(A2,checkFrom,2,0)=B2,VLOOKUP(A2,checkF rom,3,0),0)

This says: IF a lookup of the To value in the From list yields a match AND
the respective To value of the From match = the From value THEN display the
ID for that record. IF the lookup succeeds but the respective To value
doesn't match the From value, display a 0. IF the lookup fails, excel simply
displays #N/A.

The F column does the reverse, doing lookups on the From value:

=IF(VLOOKUP(B2,checkTo,2,0)=C2,VLOOKUP(B2,checkTo, 4,0),0)

IF a lookup of the From value in the To list yields a match AND the
respective From value of the To match = the To value THEN display the ID for
that record. Otherwise, as with the other formula, display a 0 or #N/A.

Just remember that both matches will reference each other's ID, so be
careful not to delete both duplicates...

"Dan" wrote:

I have two columns of siganl names. Column A is "From" data. Column B is "To"
data. In some cases the dulpicates are the same signal path in reverse. An
example would be Col A: Pin_1 Col B: GND. Further down the list would be Col
A: GND Col B: Pin_1. I have 60,000+ rows of signal names. It would be great
to find the duplicates and eliminate doing twice the work.

Thanks
Dan


Dan

David,

Thanks, that worked great! This will save alot of time.

"david" wrote:

Hm.

I inserted a Column C, D, E, & F.

C contains a copy of A.
D contains a unique ID for each row (just auto-numbered 1-whatever)
E and F contain formulas that do comparisons on the To and From fields
respectively.

I named columns A-D checkTo and named columns B-D checkFrom.

For E (the To comparison), each cell has this formula:

=IF(VLOOKUP(A2,checkFrom,2,0)=B2,VLOOKUP(A2,checkF rom,3,0),0)

This says: IF a lookup of the To value in the From list yields a match AND
the respective To value of the From match = the From value THEN display the
ID for that record. IF the lookup succeeds but the respective To value
doesn't match the From value, display a 0. IF the lookup fails, excel simply
displays #N/A.

The F column does the reverse, doing lookups on the From value:

=IF(VLOOKUP(B2,checkTo,2,0)=C2,VLOOKUP(B2,checkTo, 4,0),0)

IF a lookup of the From value in the To list yields a match AND the
respective From value of the To match = the To value THEN display the ID for
that record. Otherwise, as with the other formula, display a 0 or #N/A.

Just remember that both matches will reference each other's ID, so be
careful not to delete both duplicates...

"Dan" wrote:

I have two columns of siganl names. Column A is "From" data. Column B is "To"
data. In some cases the dulpicates are the same signal path in reverse. An
example would be Col A: Pin_1 Col B: GND. Further down the list would be Col
A: GND Col B: Pin_1. I have 60,000+ rows of signal names. It would be great
to find the duplicates and eliminate doing twice the work.

Thanks
Dan



All times are GMT +1. The time now is 02:07 AM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
ExcelBanter.com