Thread: Compare columns
View Single Post
  #2   Report Post  
Dave Peterson
 
Posts: n/a
Default

Try this against a copy of the workbook.

It inserts 3 columns (just row numbers, if the row matches the next row, and a
formula to get the group "matchedness").

It sorts a few times (so it could be slow), does its work and then resorts to
put it in the original order.

It does assume that you have headers in Row 1 and your data is sorted by column
A.

Option Explicit
Sub testme()

Dim wks As Worksheet
Dim LastRow As Long

Set wks = Worksheets("sheet1")

With wks
LastRow = .Cells(.Rows.Count, "A").End(xlUp).Row
.Range("a1").Resize(1, 3).EntireColumn.Insert
.Range("A1").Resize(1, 3).Value = Array("row#", "ok", "GroupOk")
With .Range("a2:a" & LastRow)
.Formula = "=row()-1"
.Value = .Value
End With

With .Range("b2:B" & LastRow)
.Formula _
= "=IF(D2<D3,""zmatch"",IF(E2=E3,""zmatch"",""amism atch""))"
.Value = .Value
End With

With .Range("a1:E" & LastRow)
.Sort key1:=.Columns(4), order1:=xlAscending, _
key2:=.Columns(2), order2:=xlAscending, _
header:=xlYes
End With

With .Range("c2:c" & LastRow)
.Formula = "=INDEX(B:B,MATCH(D2,D:D,0))"
.Value = .Value
End With

With .Range("a1:E" & LastRow)
.Sort key1:=.Columns(3), order1:=xlAscending, _
header:=xlYes
End With

.Range("C:C").AutoFilter field:=1, Criteria1:="zmatch"

With .AutoFilter.Range
If .Rows.Count 1 Then
'found some
.Resize(.Rows.Count - 1, 1).Offset(1, 0) _
.Cells.SpecialCells(xlCellTypeVisible).EntireRow.D elete
End If
End With

With .Range("a1:E" & LastRow)
.Sort key1:=.Columns(1), order1:=xlAscending, _
header:=xlYes
End With

.Range("a:c").EntireColumn.Delete
End With

End Sub

=========
If it's too slow, maybe you could break your data into a couple/few worksheets.

You may want to even try it against a smaller subset of your data--just to see
if it does what you want.

If you're new to macros, you may want to read David McRitchie's intro at:
http://www.mvps.org/dmcritchie/excel/getstarted.htm


PiedmontJohn wrote:

I have 60,000 rows from an ODBC strip that has UPC codes in column A and SKU
numbers in column B (sorted by column A). Each row is a location record. I
am looking for UPC codes that 'point' to different SKU numbers and would like
to delete the rows that are correct. So .... I want to look in column A -
and for as many rows as this value is the same (may be 1 to 20 rows) .... I
want to look at column B - and if all column B entries for the same rows are
also equal I want to delete that block of rows. If there are different SKU
values in column B I want to keep that entire block of rows. Then on to the
next UPC .... etc., etc.

I know the logic I'm looking for ... just don't know how to explain it to my
machine!


--

Dave Peterson