ExcelBanter

ExcelBanter (https://www.excelbanter.com/)
-   Excel Programming (https://www.excelbanter.com/excel-programming/)
-   -   finding duplicates and deleting based on another column (https://www.excelbanter.com/excel-programming/435367-finding-duplicates-deleting-based-another-column.html)

tpeter

finding duplicates and deleting based on another column
 
I have a spreadsheet of data that I have compiled from 6 different workbooks.
I have used a true false statement to identify the duplicated and now I need
to delete them based on which spreadsheet they came from. The consolidated
spreadsheet I have currently has 28,000 records, I am currently deletleting
them manually but this will take me until I am 100 to go thru. Any help on
this would be great.

In column A I have numbers:
56088769
57499354
60175071
60175071
60175071
5608437X
5608437X
5608437X

As you can see there could be 2 to 6 duplicate numbers. I need to find the
duplicates in column A, then evaluted column J to see where the source of
the data came from. The choices a


Raw 02-06
PG &E Composite Data
PG&E Data 08

This is also the order of choice, if there are 4 duplicates and Raw 02-06 is
an option then delete the rest of the duplicates leaving only this one. If
there is a duplicate and raw isn't available then pick option 2 and so on.

Thank you for your help, it is greatly appreciated.

Tim Peter


p45cal[_172_]

finding duplicates and deleting based on another column
 

tpeter;536940 Wrote:
I have a spreadsheet of data that I have compiled from 6 different
workbooks.
I have used a true false statement to identify the duplicated and now I
need
to delete them based on which spreadsheet they came from. The
consolidated
spreadsheet I have currently has 28,000 records, I am currently
deletleting
them manually but this will take me until I am 100 to go thru. Any help
on
this would be great.

In column A I have numbers:
56088769
57499354
60175071
60175071
60175071
5608437X
5608437X
5608437X

As you can see there could be 2 to 6 duplicate numbers. I need to find
the
duplicates in column A, then evaluted column J to see where the source
of
the data came from. The choices a


Raw 02-06
PG &E Composite Data
PG&E Data 08

This is also the order of choice, if there are 4 duplicates and Raw
02-06 is
an option then delete the rest of the duplicates leaving only this one.
If
there is a duplicate and raw isn't available then pick option 2 and so
on.

Thank you for your help, it is greatly appreciated.

Tim Peter


If this is a one-off exercise, then beacuse the data source names don't
sort naturally into your order of preference I would do find and replace
3 times on column J to put a numeral in front to get:
1Raw 02-06
2PG &E Composite Data
3PG&E Data 08
(you'll reverse that later)
then sort your consolidated sheet primarily sorting on column A, but
secondarily on column B ascending. Now for each block of duplicates, the
1Raw 02-06 ones(s) should be at the top.

Now it's just a case of running this macro, which deletes all the lower
duplicates, after selecting the entire block rows-wise, but only column
A:

Sub blah()
toprow = Selection.Row
bottomrow = Selection.Rows.Count + toprow - 1
For i = bottomrow To toprow + 1 Step -1
If Cells(i, "A").Value = Cells(i - 1, "A").Value Then Rows(i).Delete
Next i
End Sub

Now do a find and replace (well 3 actually) on column J to restore the
original data source names.


--
p45cal

*p45cal*
------------------------------------------------------------------------
p45cal's Profile: http://www.thecodecage.com/forumz/member.php?userid=558
View this thread: http://www.thecodecage.com/forumz/sh...d.php?t=147576


tpeter

finding duplicates and deleting based on another column
 
I have renamed the 3 criteria and sorted in assending order by column A and
secondary assending by column B. When I run the macro it says variable not
defined.
a b c d
e f
25123670 AC250 LIVERMORE 0.85 4/27/2004
25151368 AC250 BENICIA 0.62 6/7/2004
25168891 AC250 SAN FRANCISCO 0.05 12/23/2002
g h i j
k
8/17/2004 American Test 5-20-07 FALSE
3/29/2007 9/29/2004 American Test 5-20-07 FALSE
8/6/2003 American Test 5-20-07 FALSE

"tpeter" wrote:

I have a spreadsheet of data that I have compiled from 6 different workbooks.
I have used a true false statement to identify the duplicated and now I need
to delete them based on which spreadsheet they came from. The consolidated
spreadsheet I have currently has 28,000 records, I am currently deletleting
them manually but this will take me until I am 100 to go thru. Any help on
this would be great.

In column A I have numbers:
56088769
57499354
60175071
60175071
60175071
5608437X
5608437X
5608437X

As you can see there could be 2 to 6 duplicate numbers. I need to find the
duplicates in column A, then evaluted column J to see where the source of
the data came from. The choices a


Raw 02-06
PG &E Composite Data
PG&E Data 08

This is also the order of choice, if there are 4 duplicates and Raw 02-06 is
an option then delete the rest of the duplicates leaving only this one. If
there is a duplicate and raw isn't available then pick option 2 and so on.

Thank you for your help, it is greatly appreciated.

Tim Peter


tpeter

finding duplicates and deleting based on another column
 
I have added numbers to the other 3 options and sorted it by column A and J
but still get varible is not defined on the macro stoping as soon as it
starts.

"tpeter" wrote:

I have renamed the 3 criteria and sorted in assending order by column A and
secondary assending by column B. When I run the macro it says variable not
defined.
a b c d
e f
25123670 AC250 LIVERMORE 0.85 4/27/2004
25151368 AC250 BENICIA 0.62 6/7/2004
25168891 AC250 SAN FRANCISCO 0.05 12/23/2002
g h i j
k
8/17/2004 American Test 5-20-07 FALSE
3/29/2007 9/29/2004 American Test 5-20-07 FALSE
8/6/2003 American Test 5-20-07 FALSE

"tpeter" wrote:

I have a spreadsheet of data that I have compiled from 6 different workbooks.
I have used a true false statement to identify the duplicated and now I need
to delete them based on which spreadsheet they came from. The consolidated
spreadsheet I have currently has 28,000 records, I am currently deletleting
them manually but this will take me until I am 100 to go thru. Any help on
this would be great.

In column A I have numbers:
56088769
57499354
60175071
60175071
60175071
5608437X
5608437X
5608437X

As you can see there could be 2 to 6 duplicate numbers. I need to find the
duplicates in column A, then evaluted column J to see where the source of
the data came from. The choices a


Raw 02-06
PG &E Composite Data
PG&E Data 08

This is also the order of choice, if there are 4 duplicates and Raw 02-06 is
an option then delete the rest of the duplicates leaving only this one. If
there is a duplicate and raw isn't available then pick option 2 and so on.

Thank you for your help, it is greatly appreciated.

Tim Peter


p45cal[_173_]

finding duplicates and deleting based on another column
 

EITHER[/b] REMOVE
*OPTION EXPLICIT
from the top of the module where you have the macro
[b]or* add the following line at the top of the macro:

Dim Toprow, bottomrow, i


--
p45cal

*p45cal*
------------------------------------------------------------------------
p45cal's Profile: http://www.thecodecage.com/forumz/member.php?userid=558
View this thread: http://www.thecodecage.com/forumz/sh...d.php?t=147576


tpeter

finding duplicates and deleting based on another column
 
Thanks for all your help. I sorted by column A and J. I was not seleting
everything I wanted sorted so it was breaking. Went from 23,650 records to
12,303 when I ran by check for duplicates =countif(a:a,a1)1 everything came
back false. Thanks again for all of your help it saved me a lot of sanity.
Another interesting note is I brought this file home to work on and I have
2007 and there is a remove duplicates button that gave the exact same results.

Tim Peter

"p45cal" wrote:


EITHER[/b] REMOVE
*OPTION EXPLICIT
from the top of the module where you have the macro
[b]or* add the following line at the top of the macro:

Dim Toprow, bottomrow, i


--
p45cal

*p45cal*
------------------------------------------------------------------------
p45cal's Profile: http://www.thecodecage.com/forumz/member.php?userid=558
View this thread: http://www.thecodecage.com/forumz/sh...d.php?t=147576

.



All times are GMT +1. The time now is 12:45 PM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
ExcelBanter.com