ExcelBanter

ExcelBanter (https://www.excelbanter.com/)
-   Excel Programming (https://www.excelbanter.com/excel-programming/)
-   -   Duplicate Row Removal Solution (https://www.excelbanter.com/excel-programming/306191-duplicate-row-removal-solution.html)

lists[_2_]

Duplicate Row Removal Solution
 
I have an issue where I'm trying to remove duplicate rows from a table
(leaving one so that it is unique) and then another issue where I'm
trying to remove all duplicate rows in a table such that there are no
rows containing that data left.

The spread sheet consists of rows like this:

Col 1 Col 2 Col 3 Col 4
Stuff Stuff Something 1
Stuff Stuff Something 1
Stuff Stuff Stuff 0
Stuff Stuff Nada 1
Stuff Stuff Nada 1
Stuff Stuff Nada 1
Stuff Stuff Stuff 0
Stuff Stuff Stew 0

In this case Column 3 is the one that needs testing to see if they are
dups, all other columns are irrelavent. Basically what I was thinking
was, in terms of an algorithm:

Mark all duplicates with 1 whether they are above or below (thus
needing the OR statement):
=(if below == above, 1 OR if above == below, 1) Then delete all rows
with 1 in that Column 4.

The standard if that marks all but last duplicate could be:
"=if(A2=A1, 1, 0)" After this things really start getting fuzzy as
I'm not familiar enough with programmatically working with Excel.

I'm at a loss as to how to do this in Excel or if there would be an
even better way of doing this. I would appreciate it much if anyone
can demonstrate how this would be accomplished.

Amedee Van Gasse[_3_]

Duplicate Row Removal Solution
 
lists wrote:

I have an issue where I'm trying to remove duplicate rows from a table
(leaving one so that it is unique) and then another issue where I'm
trying to remove all duplicate rows in a table such that there are no
rows containing that data left.


*snip* long text

Would this help?
http://www.cpearson.com/excel/deleti...eDuplicateRows

--
Amedee Van Gasse using XanaNews 1.16.3.1
If it has an "X" in the name, it must be Linux?
Please don't thank me in advance. Thank me afterwards if it works or
hit me in the face if it doesn't. ;-)

patrick molloy

Duplicate Row Removal Solution
 
method is easy enough. you need to sort by the column
that you test first. maybe you'll need to reset to the
original order following your cleanup.

1) add a column and number each row
2) sort by the column to be testes
3) remove duplicates
4) restore the order

so add a standard code module and copy this:

Option Explicit
Sub Test()
Remove_Dupes 3
End Sub
private Sub Remove_Dupes(testcol As Long)
Dim col As Long
Dim lastrow As Long
Dim thisrow As Long

' get the last column, then
' add the row numbers
col = Range("A1").End(xlToRight).Column + 1
' get the last row
lastrow = Range("A1").End(xlDown).Row

' add a column fro the original row order
With Range(Cells(1, col), Cells(lastrow, col))
.Formula = "=Row()"
.Value = .Value
End With

' sort the table by the test column
With Range(Cells(1, 1), Cells(lastrow, col))

.Sort Cells(1, testcol)
' remove duplicate
For thisrow = lastrow To 2 Step -1

If Cells(thisrow, testcol).Value = _
Cells(thisrow - 1, testcol).Value Then
Rows(thisrow).Delete
End If
Next

'restore whats left to the original order
.Sort Cells(1, col)

End With
End Sub

HTH
Patrick Molloy
Microsoft Excel MVP


-----Original Message-----
I have an issue where I'm trying to remove duplicate

rows from a table
(leaving one so that it is unique) and then another

issue where I'm
trying to remove all duplicate rows in a table such that

there are no
rows containing that data left.

The spread sheet consists of rows like this:

Col 1 Col 2 Col 3 Col 4
Stuff Stuff Something 1
Stuff Stuff Something 1
Stuff Stuff Stuff 0
Stuff Stuff Nada 1
Stuff Stuff Nada 1
Stuff Stuff Nada 1
Stuff Stuff Stuff 0
Stuff Stuff Stew 0

In this case Column 3 is the one that needs testing to

see if they are
dups, all other columns are irrelavent. Basically what

I was thinking
was, in terms of an algorithm:

Mark all duplicates with 1 whether they are above or

below (thus
needing the OR statement):
=(if below == above, 1 OR if above == below, 1) Then

delete all rows
with 1 in that Column 4.

The standard if that marks all but last duplicate could

be:
"=if(A2=A1, 1, 0)" After this things really start

getting fuzzy as
I'm not familiar enough with programmatically working

with Excel.

I'm at a loss as to how to do this in Excel or if there

would be an
even better way of doing this. I would appreciate it

much if anyone
can demonstrate how this would be accomplished.
.


lists[_2_]

Duplicate Row Removal Solution
 
Excellent help. Thank you so much. That works like a charm. Now, I
do apologize for my lack of knowledge in VB and Excel in this vein,
but I would like to carry this a little further. Duplication removal
is perfect, however what would I do if I need to not only remove the
duplicates, but the originals as well such that the original and all
duplicates (of that data column since we are not comparing entire row
contents but just that one cell) of that cell in all rows would be
removed?

Thanks again!!!

"Patrick Molloy" wrote in message ...
method is easy enough. you need to sort by the column
that you test first. maybe you'll need to reset to the
original order following your cleanup.

1) add a column and number each row
2) sort by the column to be testes
3) remove duplicates
4) restore the order

so add a standard code module and copy this:

Option Explicit
Sub Test()
Remove_Dupes 3
End Sub
private Sub Remove_Dupes(testcol As Long)
Dim col As Long
Dim lastrow As Long
Dim thisrow As Long

' get the last column, then
' add the row numbers
col = Range("A1").End(xlToRight).Column + 1
' get the last row
lastrow = Range("A1").End(xlDown).Row

' add a column fro the original row order
With Range(Cells(1, col), Cells(lastrow, col))
.Formula = "=Row()"
.Value = .Value
End With

' sort the table by the test column
With Range(Cells(1, 1), Cells(lastrow, col))

.Sort Cells(1, testcol)
' remove duplicate
For thisrow = lastrow To 2 Step -1

If Cells(thisrow, testcol).Value = _
Cells(thisrow - 1, testcol).Value Then
Rows(thisrow).Delete
End If
Next

'restore whats left to the original order
.Sort Cells(1, col)

End With
End Sub

HTH
Patrick Molloy
Microsoft Excel MVP


-----Original Message-----
I have an issue where I'm trying to remove duplicate

rows from a table
(leaving one so that it is unique) and then another

issue where I'm
trying to remove all duplicate rows in a table such that

there are no
rows containing that data left.

The spread sheet consists of rows like this:

Col 1 Col 2 Col 3 Col 4
Stuff Stuff Something 1
Stuff Stuff Something 1
Stuff Stuff Stuff 0
Stuff Stuff Nada 1
Stuff Stuff Nada 1
Stuff Stuff Nada 1
Stuff Stuff Stuff 0
Stuff Stuff Stew 0

In this case Column 3 is the one that needs testing to

see if they are
dups, all other columns are irrelavent. Basically what

I was thinking
was, in terms of an algorithm:

Mark all duplicates with 1 whether they are above or

below (thus
needing the OR statement):
=(if below == above, 1 OR if above == below, 1) Then

delete all rows
with 1 in that Column 4.

The standard if that marks all but last duplicate could

be:
"=if(A2=A1, 1, 0)" After this things really start

getting fuzzy as
I'm not familiar enough with programmatically working

with Excel.

I'm at a loss as to how to do this in Excel or if there

would be an
even better way of doing this. I would appreciate it

much if anyone
can demonstrate how this would be accomplished.
.


Inabus[_5_]

Duplicate Row Removal Solution
 
Got a question,
I have imported this code myself and it works great however it doesn
do "exactly" what I want.

I have the 3 columns Date, IP and Hostname. Your macro will only remov
a duplicate if the Date, IP and Hostname are all duplicated.

The problem I face, however, is that I want to remove the duplicate ro
based on the Hostname column not all 3.

Is that possible using your code??

--
Message posted from http://www.ExcelForum.com


...Patrick[_5_]

Duplicate Row Removal Solution
 
This maybe ...
try on a sample file

Sub remDoublons()
Dim Mcell, cell, i
Dim Lastr
Lastr = Range("C65000").End(xlUp).Row
For i = Lastr To 2 Step -1
Mcell = Cells(i, 3).Value
If Mcell = Cells(i - 1, 3) Then
Cells(i, 3).EntireRow.Delete
End If
Next
End Sub


"Inabus " a écrit dans le message de
...
Got a question,
I have imported this code myself and it works great however it doesnt
do "exactly" what I want.

I have the 3 columns Date, IP and Hostname. Your macro will only remove
a duplicate if the Date, IP and Hostname are all duplicated.

The problem I face, however, is that I want to remove the duplicate row
based on the Hostname column not all 3.

Is that possible using your code???


---
Message posted from http://www.ExcelForum.com/




Inabus[_6_]

Duplicate Row Removal Solution
 
Found another macro after a bit of searching that does exactly as
want. So therefore ignore this!

Ta,
Pau

--
Message posted from http://www.ExcelForum.com



All times are GMT +1. The time now is 08:05 AM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
ExcelBanter.com