ExcelBanter

ExcelBanter (https://www.excelbanter.com/)
-   Excel Worksheet Functions (https://www.excelbanter.com/excel-worksheet-functions/)
-   -   Find duplicate rows then deleting them (https://www.excelbanter.com/excel-worksheet-functions/6094-find-duplicate-rows-then-deleting-them.html)

SSHO_99

Find duplicate rows then deleting them
 
I have 12 separate months of data, each in their own spreadsheet. I want to
simply copy/paste all of the data from the 12 separate spreadsheets into one
master worksheet, then find all of the duplicates and somehow identity them
so I can manually remove the duplicates (or automate a way to 'eliminate'
them). To slightly complicate this, let's say the data has 6 columns. The
first 4 columns can have duplicates while the date in columns 5 and 6 may
vary. A simple way to describe this would be to "Find duplicates in column
"A" and after the first unique value, turn all other duplicate rows yellow".

So far I have copied/pasted the data into one master spreadsheet, then I
have sorted the data which made the duplicates one on top of the other. I
can see the duplicates and elimiate them one by one, however I have thousands
of rows, so I want to automate it somehow. Here is an example:


A B D E F
1 111111 Lamaya 1319.00 359.00 354.60
2 222222 John 2755.81 286.06 0.00
3 333333 Steve 2873.12 0.00 85.00
4 333333 Steve 2873.12 44.20 0.00
5 333333 Steve 2873.12 0.00 368.30
6 444444 Gail 2450.00 0.00 23.98
7 555555 Joe 1086.57 887.87 226.30
8 555555 Joe 1086.57 665.21 0.00
9 666666 Bob 96.40 0.00 201.30
10 777777 Jenn 2075.00 5531.00 101.20
11 777777 Jenn 2075.00 2040.00 20.30
12 777777 Jenn 2075.00 1020.00 512.30
13 777777 Jenn 2075.00 119.00 71.00
14 888888 Peter 391.30 0.00 1.99
15 888888 Peter 391.30 0.00 35.03
16 999999 Tony 3077.00 110.12 0.00

In this example I want to 'eliminate (or somehow idenity) rows 4, 5, 8, 11,
12, 13 and 15 as they are the duplicates.

I think I could do a conditional format, but I don't know if that will do
what I need. Possibly a macro? I'm ok with macro's and not afraid to play
with them, I'm just not a programmer and have hit a dead end. Possibly have
another worksheet that picks each unique date in column A and ignores the
duplicates?

Thanks,

Steve

N Harkawat

1) Put the data of all 12 sheets in one single sheet
2) On this new sheet on column G add this formula
=b2&c2&d2&e2
assuming that the column B contains "111111", Column C contains"Lamaya" etc.
If not then sadjust the formula to select those 4 column that you need
compared
3) Copy this formula down to the last row.
4) Sort the data based on Column G
5) On ColumnH type the formula
=G2=G3
All duplicates will show True.
6) Select Column G and Column H then copy then edit-Paste Special-- Values
7) Sort the data again based on col H . That way all true comes together and
delete those rows



"SSHO_99" wrote in message
...
I have 12 separate months of data, each in their own spreadsheet. I want

to
simply copy/paste all of the data from the 12 separate spreadsheets into

one
master worksheet, then find all of the duplicates and somehow identity

them
so I can manually remove the duplicates (or automate a way to 'eliminate'
them). To slightly complicate this, let's say the data has 6 columns.

The
first 4 columns can have duplicates while the date in columns 5 and 6 may
vary. A simple way to describe this would be to "Find duplicates in

column
"A" and after the first unique value, turn all other duplicate rows

yellow".

So far I have copied/pasted the data into one master spreadsheet, then I
have sorted the data which made the duplicates one on top of the other. I
can see the duplicates and elimiate them one by one, however I have

thousands
of rows, so I want to automate it somehow. Here is an example:


A B D E F
1 111111 Lamaya 1319.00 359.00 354.60
2 222222 John 2755.81 286.06 0.00
3 333333 Steve 2873.12 0.00 85.00
4 333333 Steve 2873.12 44.20 0.00
5 333333 Steve 2873.12 0.00 368.30
6 444444 Gail 2450.00 0.00 23.98
7 555555 Joe 1086.57 887.87 226.30
8 555555 Joe 1086.57 665.21 0.00
9 666666 Bob 96.40 0.00 201.30
10 777777 Jenn 2075.00 5531.00 101.20
11 777777 Jenn 2075.00 2040.00 20.30
12 777777 Jenn 2075.00 1020.00 512.30
13 777777 Jenn 2075.00 119.00 71.00
14 888888 Peter 391.30 0.00 1.99
15 888888 Peter 391.30 0.00 35.03
16 999999 Tony 3077.00 110.12 0.00

In this example I want to 'eliminate (or somehow idenity) rows 4, 5, 8,

11,
12, 13 and 15 as they are the duplicates.

I think I could do a conditional format, but I don't know if that will do
what I need. Possibly a macro? I'm ok with macro's and not afraid to

play
with them, I'm just not a programmer and have hit a dead end. Possibly

have
another worksheet that picks each unique date in column A and ignores the
duplicates?

Thanks,

Steve




Frank Kabel

Hi
as a starting point:
http://www.cpearson.com/excel/deleti...eDuplicateRows

--
Regards
Frank Kabel
Frankfurt, Germany

"SSHO_99" schrieb im Newsbeitrag
...
I have 12 separate months of data, each in their own spreadsheet. I

want to
simply copy/paste all of the data from the 12 separate spreadsheets

into one
master worksheet, then find all of the duplicates and somehow

identity them
so I can manually remove the duplicates (or automate a way to

'eliminate'
them). To slightly complicate this, let's say the data has 6

columns. The
first 4 columns can have duplicates while the date in columns 5 and 6

may
vary. A simple way to describe this would be to "Find duplicates in

column
"A" and after the first unique value, turn all other duplicate rows

yellow".

So far I have copied/pasted the data into one master spreadsheet,

then I
have sorted the data which made the duplicates one on top of the

other. I
can see the duplicates and elimiate them one by one, however I have

thousands
of rows, so I want to automate it somehow. Here is an example:


A B D E F
1 111111 Lamaya 1319.00 359.00 354.60
2 222222 John 2755.81 286.06 0.00
3 333333 Steve 2873.12 0.00 85.00
4 333333 Steve 2873.12 44.20 0.00
5 333333 Steve 2873.12 0.00 368.30
6 444444 Gail 2450.00 0.00 23.98
7 555555 Joe 1086.57 887.87 226.30
8 555555 Joe 1086.57 665.21 0.00
9 666666 Bob 96.40 0.00 201.30
10 777777 Jenn 2075.00 5531.00 101.20
11 777777 Jenn 2075.00 2040.00 20.30
12 777777 Jenn 2075.00 1020.00 512.30
13 777777 Jenn 2075.00 119.00 71.00
14 888888 Peter 391.30 0.00 1.99
15 888888 Peter 391.30 0.00 35.03
16 999999 Tony 3077.00 110.12 0.00

In this example I want to 'eliminate (or somehow idenity) rows 4, 5,

8, 11,
12, 13 and 15 as they are the duplicates.

I think I could do a conditional format, but I don't know if that

will do
what I need. Possibly a macro? I'm ok with macro's and not afraid

to play
with them, I'm just not a programmer and have hit a dead end.

Possibly have
another worksheet that picks each unique date in column A and ignores

the
duplicates?

Thanks,

Steve



SSHO_99

Thank you both N Harkawat and Frank Kabel for your answers. I'm all set now
:-)

"SSHO_99" wrote:

I have 12 separate months of data, each in their own spreadsheet. I want to
simply copy/paste all of the data from the 12 separate spreadsheets into one
master worksheet, then find all of the duplicates and somehow identity them
so I can manually remove the duplicates (or automate a way to 'eliminate'
them). To slightly complicate this, let's say the data has 6 columns. The
first 4 columns can have duplicates while the date in columns 5 and 6 may
vary. A simple way to describe this would be to "Find duplicates in column
"A" and after the first unique value, turn all other duplicate rows yellow".

So far I have copied/pasted the data into one master spreadsheet, then I
have sorted the data which made the duplicates one on top of the other. I
can see the duplicates and elimiate them one by one, however I have thousands
of rows, so I want to automate it somehow. Here is an example:


A B D E F
1 111111 Lamaya 1319.00 359.00 354.60
2 222222 John 2755.81 286.06 0.00
3 333333 Steve 2873.12 0.00 85.00
4 333333 Steve 2873.12 44.20 0.00
5 333333 Steve 2873.12 0.00 368.30
6 444444 Gail 2450.00 0.00 23.98
7 555555 Joe 1086.57 887.87 226.30
8 555555 Joe 1086.57 665.21 0.00
9 666666 Bob 96.40 0.00 201.30
10 777777 Jenn 2075.00 5531.00 101.20
11 777777 Jenn 2075.00 2040.00 20.30
12 777777 Jenn 2075.00 1020.00 512.30
13 777777 Jenn 2075.00 119.00 71.00
14 888888 Peter 391.30 0.00 1.99
15 888888 Peter 391.30 0.00 35.03
16 999999 Tony 3077.00 110.12 0.00

In this example I want to 'eliminate (or somehow idenity) rows 4, 5, 8, 11,
12, 13 and 15 as they are the duplicates.

I think I could do a conditional format, but I don't know if that will do
what I need. Possibly a macro? I'm ok with macro's and not afraid to play
with them, I'm just not a programmer and have hit a dead end. Possibly have
another worksheet that picks each unique date in column A and ignores the
duplicates?

Thanks,

Steve


Rayvaz

Find duplicate rows then deleting them
 
Did you try subtotalling ?
I have used it to eliminate duplicate telephone numbers in a database.
Sort the column you need to identify
Subtotals - (Data - subtotals) for every change use function 'count'.
collapse the group so you only have subtotals displayed
then sort the column in decending order. Bingo! all your duplicates will now
be easy to eliminate.
You have to delete manually though.
Any other suggestions are most welcome as I am looking at a similar problem.

--
Rayvaz


"SSHO_99" wrote:

I have 12 separate months of data, each in their own spreadsheet. I want to
simply copy/paste all of the data from the 12 separate spreadsheets into one
master worksheet, then find all of the duplicates and somehow identity them
so I can manually remove the duplicates (or automate a way to 'eliminate'
them). To slightly complicate this, let's say the data has 6 columns. The
first 4 columns can have duplicates while the date in columns 5 and 6 may
vary. A simple way to describe this would be to "Find duplicates in column
"A" and after the first unique value, turn all other duplicate rows yellow".

So far I have copied/pasted the data into one master spreadsheet, then I
have sorted the data which made the duplicates one on top of the other. I
can see the duplicates and elimiate them one by one, however I have thousands
of rows, so I want to automate it somehow. Here is an example:


A B D E F
1 111111 Lamaya 1319.00 359.00 354.60
2 222222 John 2755.81 286.06 0.00
3 333333 Steve 2873.12 0.00 85.00
4 333333 Steve 2873.12 44.20 0.00
5 333333 Steve 2873.12 0.00 368.30
6 444444 Gail 2450.00 0.00 23.98
7 555555 Joe 1086.57 887.87 226.30
8 555555 Joe 1086.57 665.21 0.00
9 666666 Bob 96.40 0.00 201.30
10 777777 Jenn 2075.00 5531.00 101.20
11 777777 Jenn 2075.00 2040.00 20.30
12 777777 Jenn 2075.00 1020.00 512.30
13 777777 Jenn 2075.00 119.00 71.00
14 888888 Peter 391.30 0.00 1.99
15 888888 Peter 391.30 0.00 35.03
16 999999 Tony 3077.00 110.12 0.00

In this example I want to 'eliminate (or somehow idenity) rows 4, 5, 8, 11,
12, 13 and 15 as they are the duplicates.

I think I could do a conditional format, but I don't know if that will do
what I need. Possibly a macro? I'm ok with macro's and not afraid to play
with them, I'm just not a programmer and have hit a dead end. Possibly have
another worksheet that picks each unique date in column A and ignores the
duplicates?

Thanks,

Steve



All times are GMT +1. The time now is 04:56 PM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
ExcelBanter.com