Home |
Search |
Today's Posts |
|
#1
Posted to microsoft.public.excel.programming
|
|||
|
|||
Delete matching cells
on 1/16/2012, Jim Cone supposed :
Garry, It may turn out to be one of those days, especially if the snow sticks. I plugged in a new set of sample data into xl2010 this morning. It appears that your code is returning mismatched items... items in col A that are not in Col B. But it is not eliminating duplicates. Column A has 360,000 random 6 digit numbers. Column B has 240,000 random 6 digit numbers. Ron's code returns 231,414 unique entries. Your code returns 279,200 entries: 231,514 unique and 47,686 duplicates. (i ran my own unique counter on your returned data) It's too early in the day for me to try to figure out why. <g '--- Regards, Jim Cone "GS" wrote in message ... Ron Rosenfeld wrote : Thanks, Jim. Can you run my final version on your sample data and report back. I'd be curious to know the results. I'm running XP SP3 and did the test in xl2007. Thanks in advance... -- Garry Free usenet access at http://www.eternal-september.org ClassicVB Users Regroup! comp.lang.basic.visual.misc "GS" wrote in message ... Jim Cone explained on 1/15/2012 : Garry, More FWIW... I ran Ron's code on my XP, i3 machine (xl2007) against 600,000 6 digit random numbers. 360,000 in col A and 240,000 in col B. It took about 7 1/2 seconds. It returned ~87000 numbers not in col B. '--- Jim Cone Jim, Ron explains the what/why of my version of the task fairly well. What I'm more interested in is how long it took on your machine to process the same amount of data as when you ran Ron's version. -- Garry Free usenet access at http://www.eternal-september.org ClassicVB Users Regroup! comp.lang.basic.visual.misc |
#2
Posted to microsoft.public.excel.programming
|
|||
|
|||
Delete matching cells
Garry,
Ran your code and Ron's twice each. Ron: 14.0 seconds Garry: 8.2 seconds Times were identical for tests on each. Xl2010 on WindowsXP - 360,000 nums in col A, 240,000 nums in col B. Changes from yesterday: xl2010 vs. xl2007 and more data overlap between columns. I'm thinking that the xl2010 vba Rnd function may be different. '--- Jim Cone Portland, Oregon USA http://blog.contextures.com/archives...ith-excel-vba/ (workbook with "universal" Last Row function code - free) "GS" wrote in message ... Jim, Ron explains the what/why of my version of the task fairly well. What I'm more interested in is how long it took on your machine to process the same amount of data as when you ran Ron's version. -- Garry |
#3
Posted to microsoft.public.excel.programming
|
|||
|
|||
Delete matching cells
Jim Cone wrote :
Garry, Ran your code and Ron's twice each. Ron: 14.0 seconds Garry: 8.2 seconds Times were identical for tests on each. Xl2010 on WindowsXP - 360,000 nums in col A, 240,000 nums in col B. Changes from yesterday: xl2010 vs. xl2007 and more data overlap between columns. I'm thinking that the xl2010 vba Rnd function may be different. '--- Jim Cone Portland, Oregon USA http://blog.contextures.com/archives...ith-excel-vba/ (workbook with "universal" Last Row function code - free) "GS" wrote in message ... Jim, Ron explains the what/why of my version of the task fairly well. What I'm more interested in is how long it took on your machine to process the same amount of data as when you ran Ron's version. -- Garry Thanks, Jim. I guess I was expecting a slower time as compared to Ron's (approx RonsTime * 0.75), but I'm very happy that you report it was better by almost half. Obviously, the Scripting Dictionary is the better way to compare columns of data. What I find interesting is how slow doing VB comparison using arrays is. I've learned something valuable here..! My thanks to you and Ron for your efforts... -- Garry Free usenet access at http://www.eternal-september.org ClassicVB Users Regroup! comp.lang.basic.visual.misc |
#4
Posted to microsoft.public.excel.programming
|
|||
|
|||
Delete matching cells
On Mon, 16 Jan 2012 13:57:59 -0500, GS wrote:
Jim, Ron explains the what/why of my version of the task fairly well. What I'm more interested in is how long it took on your machine to process the same amount of data as when you ran Ron's version. -- Garry Garry, I tried it on a dataset similar to Jim's. 360,000 entries in column A; 240,000 entries in Column B. They were 13 digit text strings with values from 1 to 10^6. Removing duplicates by using two dictionaries: 28 seconds Not removing duplicates using code similar to yours: 16.1 seconds Ignore the timer stuff. It depends on a class installed in my personal .xlam add-in ===================== Option Explicit Sub PreserveDups() Dim oTimer As RonsLibrary.CHiResTimer Set oTimer = RonsLibrary.New_CHiResTimer oTimer.StartTimer 'Requires setting reference (tools/references) to ' Microsoft Scripting Runtime Dim ws As Worksheet Dim rColA As Range, rColB As Range Dim vColA As Variant, vColB As Variant Dim vResults As Variant Dim dColA As Dictionary, dColB As Dictionary Dim i As Long Dim lBlanks As Long Dim d As Variant Dim rDest As Range Set dColA = New Dictionary Set dColB = New Dictionary Set ws = ActiveSheet With ws Set rColA = Range(.Cells(1, "A"), .Cells(.Rows.Count, "A").End(xlUp)) Set rColB = Range(.Cells(1, "B"), .Cells(.Rows.Count, "B").End(xlUp)) Set rDest = .Cells(1, 7) End With vColB = rColB vColA = rColA For i = LBound(vColB, 1) To UBound(vColB, 1) With dColB If Not .Exists(Key:=vColB(i, 1)) Then .Add Key:=vColB(i, 1), Item:=vColB(i, 1) End With Next i For i = LBound(vColA, 1) To UBound(vColA, 1) If dColB.Exists(Key:=vColA(i, 1)) Then vColA(i, 1) = "" lBlanks = lBlanks + 1 End If Next i ReDim vResults(1 To UBound(vColA) - lBlanks, 1 To 1) i = 0 For Each d In vColA If d < "" Then i = i + 1 vResults(i, 1) = d End If Next d rDest.EntireColumn.ClearContents rDest.EntireColumn.NumberFormat = "@" Set rDest = rDest.Resize(rowsize:=UBound(vResults, 1)) rDest = vResults oTimer.StopTimer Debug.Print oTimer.Elapsed End Sub =========================== |
#5
Posted to microsoft.public.excel.programming
|
|||
|
|||
Delete matching cells
Thanks, Ron. I really appreciate your efforts!
I was thinking to now create a function that returns a boolean on success, and accepts "Optional AllowDupes As Boolean = True" as its arg so the faster code runs unless the user wants to remove dupes in ColA. Does that make sense? -- Garry Free usenet access at http://www.eternal-september.org ClassicVB Users Regroup! comp.lang.basic.visual.misc |
#6
Posted to microsoft.public.excel.programming
|
|||
|
|||
Delete matching cells
Something to play with...
Function StripDupes(Optional AllowDupes As Boolean = True) As Boolean ' Compares colA to colB and removes colA matches found in colB. ' Args In: AllowDupes: True by default ' Keeps duplicate values found in colA that are not found in colB; ' If False, duplicate values in colA not found in colB are removed. ' ' Returns: True if matches are found --AND-- no error occurs; ' False if matches are NOT found --OR-- error occurs. ' ' Sources: Ron Rosenfeld, Jim Cone, Garry Sansom Dim i&, j&, lRows1&, lRows2&, lMatchesFound& 'as long Dim vRngA, vRngB, vRngOut() 'as variant Dim dRngB As New Dictionary On Error GoTo ErrExit lRows1 = Cells(Rows.Count, "A").End(xlUp).Row lRows2 = Cells(Rows.Count, "B").End(xlUp).Row vRngA = Range("A1:A" & lRows1): vRngB = Range("B1:B" & lRows2) For j = LBound(vRngB) To UBound(vRngB) With dRngB If Not .Exists(Key:=vRngB(j, 1)) Then _ .Add Key:=vRngB(j, 1), Item:=vRngB(j, 1) End With Next 'j If AllowDupes Then '//fastest For i = LBound(vRngA) To UBound(vRngA) If dRngB.Exists(Key:=vRngA(i, 1)) Then _ vRngA(i, 1) = "": lMatchesFound = lMatchesFound + 1 Next 'i j = 0: ReDim vRngOut(UBound(vRngA) - lMatchesFound, 0) For i = LBound(vRngA) To UBound(vRngA) If Not vRngA(i, 1) = "" Then _ vRngOut(j, 0) = vRngA(i, 1): j = j + 1 Next 'i Else '//slowest Dim dRngA As New Dictionary For i = LBound(vRngA) To UBound(vRngA) If Not dRngB.Exists(vRngA(i, 1)) Then With dRngA If Not .Exists(Key:=vRngA(i, 1)) Then _ .Add Key:=vRngA(i, 1), Item:=vRngA(i, 1) End With 'dRngA End If 'Not dRngB.Exists(vRngA(i, 1)) Next 'i Dim v As Variant j = 0: ReDim vRngOut(dRngA.Count, 0) For Each v In dRngA vRngOut(j, 0) = dRngA(v): j = j + 1 Next 'v End If 'AllowDupes Range("A1:A" & lRows1).ClearContents Range("A1").Resize(UBound(vRngOut), 1) = vRngOut ErrExit: StripDupes = (Err = 0) End Function 'StripDupes() -- Garry Free usenet access at http://www.eternal-september.org ClassicVB Users Regroup! comp.lang.basic.visual.misc |
#7
Posted to microsoft.public.excel.programming
|
|||
|
|||
Delete matching cells
We can shave off even more time if we eliminate the checks when adding
items to the dictionary because the dictionary won't allow dupes... Function StripDupes(Optional AllowDupes As Boolean = True) As Boolean ' Compares colA to colB and removes colA matches found in colB. ' Args In: AllowDupes: True by default. Keeps duplicate values ' found in colA that are not found in colB. If False, ' duplicate values in colA not found in colB are removed. ' ' Returns: True if matches found and no error occurs; ' False if matches not found --OR-- error occurs. ' ' Sources: Ron Rosenfeld, Jim Cone, Garry Sansom Dim i&, j&, lRows1&, lRows2&, lMatchesFound& 'as long Dim vRngA, vRngB, vRngOut() 'as variant Dim dRngB As New Dictionary On Error GoTo ErrExit lRows1 = Cells(Rows.Count, "A").End(xlUp).Row lRows2 = Cells(Rows.Count, "B").End(xlUp).Row vRngA = Range("A1:A" & lRows1): vRngB = Range("B1:B" & lRows2) On Error Resume Next For j = LBound(vRngB) To UBound(vRngB) dRngB.Add Key:=vRngB(j, 1), Item:=vRngB(j, 1) Next 'j On Error GoTo 0 If AllowDupes Then '//fastest For i = LBound(vRngA) To UBound(vRngA) If dRngB.Exists(Key:=vRngA(i, 1)) Then _ vRngA(i, 1) = "": lMatchesFound = lMatchesFound + 1 Next 'i j = 0: ReDim vRngOut(UBound(vRngA) - lMatchesFound, 0) For i = LBound(vRngA) To UBound(vRngA) If Not vRngA(i, 1) = "" Then _ vRngOut(j, 0) = vRngA(i, 1): j = j + 1 Next 'i Else '//slowest Dim dRngA As New Dictionary On Error Resume Next For i = LBound(vRngA) To UBound(vRngA) If Not dRngB.Exists(vRngA(i, 1)) Then _ dRngA.Add Key:=vRngA(i, 1), Item:=vRngA(i, 1) Next 'i On Error GoTo 0 Dim v As Variant j = 0: ReDim vRngOut(dRngA.Count, 0) For Each v In dRngA vRngOut(j, 0) = dRngA(v): j = j + 1 Next 'v End If 'AllowDupes Range("A1:A" & lRows1).ClearContents Range("A1").Resize(UBound(vRngOut), 1) = vRngOut ErrExit: StripDupes = (Err = 0) End Function 'StripDupes() -- Garry Free usenet access at http://www.eternal-september.org ClassicVB Users Regroup! comp.lang.basic.visual.misc |
#8
Posted to microsoft.public.excel.programming
|
|||
|
|||
Delete matching cells
On Mon, 16 Jan 2012 18:50:24 -0500, GS wrote:
We can shave off even more time if we eliminate the checks when adding items to the dictionary because the dictionary won't allow dupes... Your efforts prompted me to study whether dictionary or collection would work faster. And it turns out that my "prune" routine, which eliminates duplicates in Col A, when rewritten using Collections, runs in about 1/6 the time! My last effort, using dictionaries for col a and col b, on the Jim style database (240,000 entries colA; 360,000 entries col b) took about 29 sec to run. The following process that same data base in 5.5 seconds!! ============================= Option Explicit Sub PruneColA2() Dim ws As Worksheet Dim rColA As Range, rColB As Range Dim vColA As Variant, vColB As Variant Dim vResults As Variant Dim cColB As Collection Dim i As Long Dim lBlanks As Long Dim v As Variant Dim rDest As Range Set cColB = New Collection Set ws = ActiveSheet With ws Set rColA = Range(.Cells(1, "A"), .Cells(.Rows.Count, "A").End(xlUp)) Set rColB = Range(.Cells(1, "B"), .Cells(.Rows.Count, "B").End(xlUp)) Set rDest = .Cells(1, 10) End With vColB = rColB vColA = rColA On Error Resume Next For i = LBound(vColB, 1) To UBound(vColB, 1) With cColB .Add Key:=vColB(i, 1), Item:=vColB(i, 1) End With Next i On Error GoTo 0 On Error GoTo NotUniqueItem For i = LBound(vColA, 1) To UBound(vColA, 1) cColB.Add Item:=vColA(i, 1), Key:=vColA(i, 1) Next i ReDim vResults(1 To UBound(vColA) - lBlanks, 1 To 1) i = 0 For Each v In vColA If v < "" Then i = i + 1 vResults(i, 1) = v End If Next v rDest.EntireColumn.ClearContents rDest.EntireColumn.NumberFormat = "@" Set rDest = rDest.Resize(rowsize:=UBound(vResults, 1)) rDest = vResults Exit Sub NotUniqueItem: vColA(i, 1) = "" lBlanks = lBlanks + 1 Resume Next End Sub ================================= |
#9
Posted to microsoft.public.excel.programming
|
|||
|
|||
Delete matching cells
I was thinking the same thing! I'll report back in a new thread as Jim
suggested... -- Garry Free usenet access at http://www.eternal-september.org ClassicVB Users Regroup! comp.lang.basic.visual.misc |
#10
Posted to microsoft.public.excel.programming
|
|||
|
|||
Delete matching cells
I like how you were able to eliminate the final If...Then construct.<g
-- Garry Free usenet access at http://www.eternal-september.org ClassicVB Users Regroup! comp.lang.basic.visual.misc |
#11
Posted to microsoft.public.excel.programming
|
|||
|
|||
Delete matching cells
Uh.., I'm seeing that this approach raises a 'Type Mismatch' error in
the first loop where it loads colB into the collection. Also, this approach empties colA and raises an error trying to write the output array because the UBound(vColA) and lBlanks are the same value. What am I missing? -- Garry Free usenet access at http://www.eternal-september.org ClassicVB Users Regroup! comp.lang.basic.visual.misc |
#12
Posted to microsoft.public.excel.programming
|
|||
|
|||
Delete matching cells
Garry,
I wonder if gary (the original poster) is still around? He would have his moneys worth by now. <g Its time for further posts, if any, on this subject to go into a brand new post. It has gotten a little unwieldy. '--- Jim Cone "GS" wrote in message ... We can shave off even more time if we eliminate the checks when adding items to the dictionary because the dictionary won't allow dupes... Function StripDupes(Optional AllowDupes As Boolean = True) As Boolean ' Compares colA to colB and removes colA matches found in colB. ' Args In: AllowDupes: True by default. Keeps duplicate values ' found in colA that are not found in colB. If False, ' duplicate values in colA not found in colB are removed. ' ' Returns: True if matches found and no error occurs; ' False if matches not found --OR-- error occurs. ' ' Sources: Ron Rosenfeld, Jim Cone, Garry Sansom Dim i&, j&, lRows1&, lRows2&, lMatchesFound& 'as long Dim vRngA, vRngB, vRngOut() 'as variant Dim dRngB As New Dictionary On Error GoTo ErrExit lRows1 = Cells(Rows.Count, "A").End(xlUp).Row lRows2 = Cells(Rows.Count, "B").End(xlUp).Row vRngA = Range("A1:A" & lRows1): vRngB = Range("B1:B" & lRows2) On Error Resume Next For j = LBound(vRngB) To UBound(vRngB) dRngB.Add Key:=vRngB(j, 1), Item:=vRngB(j, 1) Next 'j On Error GoTo 0 If AllowDupes Then '//fastest For i = LBound(vRngA) To UBound(vRngA) If dRngB.Exists(Key:=vRngA(i, 1)) Then _ vRngA(i, 1) = "": lMatchesFound = lMatchesFound + 1 Next 'i j = 0: ReDim vRngOut(UBound(vRngA) - lMatchesFound, 0) For i = LBound(vRngA) To UBound(vRngA) If Not vRngA(i, 1) = "" Then _ vRngOut(j, 0) = vRngA(i, 1): j = j + 1 Next 'i Else '//slowest Dim dRngA As New Dictionary On Error Resume Next For i = LBound(vRngA) To UBound(vRngA) If Not dRngB.Exists(vRngA(i, 1)) Then _ dRngA.Add Key:=vRngA(i, 1), Item:=vRngA(i, 1) Next 'i On Error GoTo 0 Dim v As Variant j = 0: ReDim vRngOut(dRngA.Count, 0) For Each v In dRngA vRngOut(j, 0) = dRngA(v): j = j + 1 Next 'v End If 'AllowDupes Range("A1:A" & lRows1).ClearContents Range("A1").Resize(UBound(vRngOut), 1) = vRngOut ErrExit: StripDupes = (Err = 0) End Function 'StripDupes() -- Garry Free usenet access at http://www.eternal-september.org ClassicVB Users Regroup! comp.lang.basic.visual.misc |
#13
Posted to microsoft.public.excel.programming
|
|||
|
|||
Delete matching cells
On Jan 16, 5:54*pm, "Jim Cone" wrote:
Garry, I wonder if gary (the original poster) is still around? He would have his moneys worth by now. <g Its time for further posts, if any, on this subject to go into a brand new post. It has gotten a little unwieldy. '--- Jim Cone "GS" wrote in ... We can shave off even more time if we eliminate the checks when adding items to the dictionary because the dictionary won't allow dupes... Function StripDupes(Optional AllowDupes As Boolean = True) As Boolean ' Compares colA to colB and removes colA matches found in colB. ' Args In: * *AllowDupes: True by default. *Keeps duplicate values ' * * * * * * found in colA that are not found in colB. If False, ' * * * * * * duplicate values in colA not found in colB are removed. ' ' Returns: * *True if matches found and no error occurs; ' * * * * * * False if matches not found --OR-- error occurs. ' ' Sources: * *Ron Rosenfeld, Jim Cone, Garry Sansom *Dim i&, j&, lRows1&, lRows2&, lMatchesFound& 'as long *Dim vRngA, vRngB, vRngOut() 'as variant *Dim dRngB As New Dictionary *On Error GoTo ErrExit *lRows1 = Cells(Rows.Count, "A").End(xlUp).Row *lRows2 = Cells(Rows.Count, "B").End(xlUp).Row *vRngA = Range("A1:A" & lRows1): vRngB = Range("B1:B" & lRows2) *On Error Resume Next * *For j = LBound(vRngB) To UBound(vRngB) * * *dRngB.Add Key:=vRngB(j, 1), Item:=vRngB(j, 1) * *Next 'j *On Error GoTo 0 *If AllowDupes Then '//fastest * *For i = LBound(vRngA) To UBound(vRngA) * * *If dRngB.Exists(Key:=vRngA(i, 1)) Then _ * * * *vRngA(i, 1) = "": lMatchesFound = lMatchesFound + 1 * *Next 'i * *j = 0: ReDim vRngOut(UBound(vRngA) - lMatchesFound, 0) * *For i = LBound(vRngA) To UBound(vRngA) * * *If Not vRngA(i, 1) = "" Then _ * * * *vRngOut(j, 0) = vRngA(i, 1): j = j + 1 * *Next 'i *Else '//slowest * *Dim dRngA As New Dictionary * *On Error Resume Next * * *For i = LBound(vRngA) To UBound(vRngA) * * * *If Not dRngB.Exists(vRngA(i, 1)) Then _ * * * * *dRngA.Add Key:=vRngA(i, 1), Item:=vRngA(i, 1) * * *Next 'i * *On Error GoTo 0 * *Dim v As Variant * *j = 0: ReDim vRngOut(dRngA.Count, 0) * *For Each v In dRngA * * *vRngOut(j, 0) = dRngA(v): j = j + 1 * *Next 'v *End If 'AllowDupes *Range("A1:A" & lRows1).ClearContents *Range("A1").Resize(UBound(vRngOut), 1) = vRngOut ErrExit: *StripDupes = (Err = 0) End Function 'StripDupes() -- Garry Free usenet access athttp://www.eternal-september.org ClassicVB Users Regroup! comp.lang.basic.visual.misc- Hide quoted text - - Show quoted text - Yes, Gary (the OP) is still around. Ron's macro gave me the results I was looking for! Thanks for all the contributions! Gary |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
Delete all cells in range matching certain values | Excel Programming | |||
Help with Matching Text Fields - Then Moving the Matching Cells Side by Side | Excel Discussion (Misc queries) | |||
DELETE ROW 3 MATCHING CRITERIA | Excel Programming | |||
delete all matching rows | Excel Discussion (Misc queries) | |||
Perform Lookup and delete non matching rows? | Excel Programming |