ExcelBanter - Finding Duplicate Entries

ExcelBanter (https://www.excelbanter.com/)

- Excel Discussion (Misc queries) (https://www.excelbanter.com/excel-discussion-misc-queries/)

- - Finding Duplicate Entries (https://www.excelbanter.com/excel-discussion-misc-queries/113739-finding-duplicate-entries.html)

Finding Duplicate Entries

I have a list of formulae for about 2000 products. In the table the Product
Name repeats and in the rows to the right of the name are the ingredients. I
can group the data by product, leaving "x" number of blank rows to separate
one column from another.

In that 2000 products, I have maybe 8 that use the same ingredient twice;
once near the beginning, once near the end. I want to be able to identify
those 8 products.

I need some way to consider each product as a discreet group and then
evaluate that group, looking for repeats in the ingredients column. Then I
need someway to highlight or otherwise bring products of this type to my
attention.

Just in case the above was too murky, here's and example of what I'm
looking for:

Product A Step 1 Ingredient-1000
Product A Step 2 Ingredient-8000
Product A Step 3 Ingredient-9000
Product A Step 4 Ingredient-1000
Product A Step 5 Ingredient-2000

I want to spot the two additions of ingredient-1000 (in steps 1 and 4). Can
this be done? Can it be done forcing the macro (or logical argument) to
consider a single grouping of ingredients, a single product, at a time?

Thanks

Finding Duplicate Entries

I should add two comments: 1) it's not always the same ingredient that will
repeat within any given formula (the ingredient will vary from formula to
formula) and 2) the ingredient that repeats in a single formula may be used
in making 100+ products in which its use is not repeated. In other words a
simple "Find", isn't applicable.

I picture sorting by ingredients, introducing a break there, then an
argument, =IF (A2=A1, "FLAG", ""); wherein A1 through A"x" are product names
(alphebetized before inserting the break).

I hope someone's got a better idea. :)

"ConfusedNHouston" wrote:

I have a list of formulae for about 2000 products. In the table the Product
Name repeats and in the rows to the right of the name are the ingredients. I
can group the data by product, leaving "x" number of blank rows to separate
one column from another.

In that 2000 products, I have maybe 8 that use the same ingredient twice;
once near the beginning, once near the end. I want to be able to identify
those 8 products.

I need some way to consider each product as a discreet group and then
evaluate that group, looking for repeats in the ingredients column. Then I
need someway to highlight or otherwise bring products of this type to my
attention.

Just in case the above was too murky, here's and example of what I'm
looking for:

Product A Step 1 Ingredient-1000
Product A Step 2 Ingredient-8000
Product A Step 3 Ingredient-9000
Product A Step 4 Ingredient-1000
Product A Step 5 Ingredient-2000

I want to spot the two additions of ingredient-1000 (in steps 1 and 4). Can
this be done? Can it be done forcing the macro (or logical argument) to
consider a single grouping of ingredients, a single product, at a time?

Thanks

Finding Duplicate Entries

Venturing some thoughts ..

Assuming source data is in cols A to C, from row2 down

Product A Step 1 Ingredient-1000
Product A Step 2 Ingredient-8000
Product A Step 3 Ingredient-9000
Product A Step 4 Ingredient-1000
Product A Step 5 Ingredient-2000
etc

and you
.. want to spot the two additions of ingredient-1000 (in steps 1 and 4)

Place in D2:
=IF(SUMPRODUCT(($A$2:A2=A2)*($C$2:C2=C2)*($A$2:A2< "")*($C$2:C2<""))=2,"Dup","")

Copy D2 down to cover the max expected extent of source data. Col D will
flag duplicates based on the data in cols A and C, ie the "Product" &
"Ingredient" cols, if any, as "Dup". We could then select col D, and do a
Data Filter Autofilter to filter out all lines with "Dup" for closer
inspection ..
--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---
"ConfusedNHouston" wrote:
I should add two comments: 1) it's not always the same ingredient that will
repeat within any given formula (the ingredient will vary from formula to
formula) and 2) the ingredient that repeats in a single formula may be used
in making 100+ products in which its use is not repeated. In other words a
simple "Find", isn't applicable.

I picture sorting by ingredients, introducing a break there, then an
argument, =IF (A2=A1, "FLAG", ""); wherein A1 through A"x" are product names
(alphebetized before inserting the break).

I hope someone's got a better idea. :)

"ConfusedNHouston" wrote:

I have a list of formulae for about 2000 products. In the table the Product
Name repeats and in the rows to the right of the name are the ingredients. I
can group the data by product, leaving "x" number of blank rows to separate
one column from another.

In that 2000 products, I have maybe 8 that use the same ingredient twice;
once near the beginning, once near the end. I want to be able to identify
those 8 products.

I need some way to consider each product as a discreet group and then
evaluate that group, looking for repeats in the ingredients column. Then I
need someway to highlight or otherwise bring products of this type to my
attention.

Just in case the above was too murky, here's and example of what I'm
looking for:

Product A Step 1 Ingredient-1000
Product A Step 2 Ingredient-8000
Product A Step 3 Ingredient-9000
Product A Step 4 Ingredient-1000
Product A Step 5 Ingredient-2000

I want to spot the two additions of ingredient-1000 (in steps 1 and 4). Can
this be done? Can it be done forcing the macro (or logical argument) to
consider a single grouping of ingredients, a single product, at a time?

Thanks