![]() |
Sub to extract uniques from 200k data in xl03
Using xl 2003, subject to 65k row limit. I've got over 200k ids listed in
cols A to D from row1 down. Looking for a sub which can extract the unique ids from amongst the 200k into a new sheet, listing these ids into as many cols as may be required, ie down col A, then down col B, etc. Thanks for insights. -- Max Singapore http://savefile.com/projects/236895 xdemechanik --- |
Sub to extract uniques from 200k data in xl03
This article by John Walkenbach should give you the guts of what you need:
http://www.j-walk.com/ss/excel/tips/tip47.htm Instead of loading a listbox, write the unique items to another sheet. If you can't do it, post back. (I have no idea what your skill set is) -- Regards, Tom Ogilvy "Max" wrote: Using xl 2003, subject to 65k row limit. I've got over 200k ids listed in cols A to D from row1 down. Looking for a sub which can extract the unique ids from amongst the 200k into a new sheet, listing these ids into as many cols as may be required, ie down col A, then down col B, etc. Thanks for insights. -- Max Singapore http://savefile.com/projects/236895 xdemechanik --- |
Sub to extract uniques from 200k data in xl03
Max, this is surely a simple-minded approach to the problem, but it should
work. It will take a long time to run. I assumed that the data is in Sheet1 with no column headings and that Sheet2 is an unused sheet. HTH, James Sub Uniques4Columns() Dim FromRow As Long, FromCol As Integer Dim ToRow As Long, ToCol As Integer Dim This As Variant, c As Range Worksheets(2).Activate Cells.Clear ToCol = 1: ToRow = 1 With Worksheets(1) For FromCol = 1 To 4 For FromRow = 1 To .Cells(65536, FromCol).End(xlUp).Row This = .Cells(FromRow, FromCol) Set c = Cells.Find(This, LookIn:=xlValues, lookat:=xlWhole) If c Is Nothing Then Cells(ToRow, ToCol) = This ToRow = ToRow + 1 If ToRow 65535 Then ToRow = 1 ToCol = ToCol + 1 End If End If Next FromRow Next FromCol End With End Sub "Max" wrote in message ... Using xl 2003, subject to 65k row limit. I've got over 200k ids listed in cols A to D from row1 down. Looking for a sub which can extract the unique ids from amongst the 200k into a new sheet, listing these ids into as many cols as may be required, ie down col A, then down col B, etc. Thanks for insights. -- Max Singapore http://savefile.com/projects/236895 xdemechanik --- |
Sub to extract uniques from 200k data in xl03
Max wrote:
Using xl 2003, subject to 65k row limit. I've got over 200k ids listed in cols A to D from row1 down. Looking for a sub which can extract the unique ids from amongst the 200k into a new sheet, listing these ids into as many cols as may be required, ie down col A, then down col B, etc. Thanks for insights. I have a solution that takes 15 to 20 seconds to run. Are you interested? Alan Beban |
Sub to extract uniques from 200k data in xl03
Tom, good point. I probably should have used some variation of rows.count.
How would you do this, otherwise using the same code? James "Tom Ogilvy" wrote in message ... Just note that if the OP has data down to rows 65536 in columns 1 to 3 (as is probable based on the description), you will miss examining 65535 names in each of those columns. -- Regards, Tom Ogilvy "Zone" wrote: Max, this is surely a simple-minded approach to the problem, but it should work. It will take a long time to run. I assumed that the data is in Sheet1 with no column headings and that Sheet2 is an unused sheet. HTH, James Sub Uniques4Columns() Dim FromRow As Long, FromCol As Integer Dim ToRow As Long, ToCol As Integer Dim This As Variant, c As Range Worksheets(2).Activate Cells.Clear ToCol = 1: ToRow = 1 With Worksheets(1) For FromCol = 1 To 4 For FromRow = 1 To .Cells(65536, FromCol).End(xlUp).Row This = .Cells(FromRow, FromCol) Set c = Cells.Find(This, LookIn:=xlValues, lookat:=xlWhole) If c Is Nothing Then Cells(ToRow, ToCol) = This ToRow = ToRow + 1 If ToRow 65535 Then ToRow = 1 ToCol = ToCol + 1 End If End If Next FromRow Next FromCol End With End Sub "Max" wrote in message ... Using xl 2003, subject to 65k row limit. I've got over 200k ids listed in cols A to D from row1 down. Looking for a sub which can extract the unique ids from amongst the 200k into a new sheet, listing these ids into as many cols as may be required, ie down col A, then down col B, etc. Thanks for insights. -- Max Singapore http://savefile.com/projects/236895 xdemechanik --- |
Sub to extract uniques from 200k data in xl03
Of course. I'm game.
-- Max Singapore http://savefile.com/projects/236895 xdemechanik --- "Alan Beban" <unavailable wrote I have a solution that takes 15 to 20 seconds to run. Are you interested? |
Sub to extract uniques from 200k data in xl03
Thanks, Tom. I'll check it out.
-- Max Singapore http://savefile.com/projects/236895 xdemechanik --- |
Sub to extract uniques from 200k data in xl03
James, thanks for your Sub Uniques4Columns().
I've noted the preceding discussions between you and Tom. I'll try running your sub over lunchtime today. -- Max Singapore http://savefile.com/projects/236895 xdemechanik --- |
Sub to extract uniques from 200k data in xl03
Instead of loading a listbox, write the unique items to another sheet.
I don't know enough vba to get this done, Tom. Could you assist? -- Max Singapore http://savefile.com/projects/236895 xdemechanik --- |
Sub to extract uniques from 200k data in xl03
Max wrote:
Of course. I'm game. Copy and paste the following three procedures into a general module in your workbook; watch for wordwrap, though I think they're clean. The first returns an array of the unique elements of the input array; by default it is a case-sensitive, 1-based, vertical array. The second transposes an array without some of the limitations of the Excel TRANSPOSE function. It's necessary in this case because the first function needs to convert the collection of unique elements, which is a horizontal array in Excel, to a vertical array to match your data. It's much more general than is required by your inquiry, but it's from my library so that's what you get. Then, in the VBEditor, select Tools|References and check Microsoft Scripting Runtime; I believe this step is also necessary in John Walkenbach's code cited by Tom Ogilvy, though neither John nor Tom mentioned it. Then, assuming that your data is on Sheets(1), and Sheets(2) is available for the output, run the third SubProcedure, abtest1. Post to let us know how it comes out. Function ArrayUniques(InputArray, _ Optional MatchCase As Boolean = True, _ Optional Base_Orient As String = "1vert", _ Optional OmitBlanks As Boolean = True) 'THIS PROCEDURE REQUIRES A PROJECT REFERENCE 'TO "MICROSCOPIC SCRIPTING RUNTIME". 'The function returns an array of unique 'values from an array or range. By default 'it returns a 1-based vertical array; for 'other results enter "0horiz", "1horiz" or '"0vert" as the third argument. By default, 'the function is case-sensitive; i.e., e.g., '"red" and "Red" are treated as two separate 'unique values; to avoid case-sensitivity, 'enter False as the second argument. 'Declare the variables Dim arr, arr2 Dim i As Long, p As Object, q As String Dim Elem, x As Dictionary Dim CalledDirectFromWorksheet As Boolean 'For later use in selecting cells for worksheet output CalledDirectFromWorksheet = False If TypeOf Application.Caller Is Range Then Set p = Application.Caller q = p.Address iRows = Range(q).Rows.Count iCols = Range(q).Columns.Count If InStr(1, p.FormulaArray, "ArrayUniques") = 2 _ Or InStr(1, p.FormulaArray, "arrayuniques") = 2 _ Or InStr(1, p.FormulaArray, "ARRAYUNIQUES") = 2 Then CalledDirectFromWorksheet = True End If End If 'Convert an input range to a VBA array arr = InputArray 'Load the unique elements into a Dictionary Object Set x = New Dictionary x.CompareMode = Abs(Not MatchCase) '<--Case-sensitivity On Error Resume Next For Each Elem In arr x.Add Item:=Elem, key:=CStr(Elem) Next If OmitBlanks Then x.Remove ("") On Error GoTo 0 'Load a 0-based horizontal array with the unique 'elements from the Dictionary Object arr2 = x.Items 'This provides appropriate base and orientation 'of the output array Select Case Base_Orient Case "0horiz" arr2 = arr2 Case "1horiz" ReDim Preserve arr2(1 To UBound(arr2) + 1) Case "0vert" arr2 = ArrayTranspose(arr2) Case "1vert" ReDim Preserve arr2(1 To UBound(arr2) + 1) arr2 = ArrayTranspose(arr2) End Select 'Assure that enough cells are selected to accommodate output If CalledDirectFromWorksheet Then If Range(Application.Caller.Address).Count < x.Count Then ArrayUniques = "Select a range of at least " & _ x.Count & " cells" Exit Function End If End If ArrayUniques = arr2 End Function Function ArrayTranspose(InputArray) 'This function returns the transpose of 'the input array or range; it is designed 'to avoid the limitation on the number of 'array elements and type of array that the 'worksheet TRANSPOSE Function has. 'Declare the variables Dim outputArrayTranspose As Variant, arr As Variant, p As Integer Dim i As Long, j As Long 'Check to confirm that the input array 'is an array or multicell range If IsArray(InputArray) Then 'If so, convert an input range to a 'true array arr = InputArray 'Load the number of dimensions of 'the input array to a variable On Error Resume Next 'Loop until an error occurs i = 1 Do z = UBound(arr, i) i = i + 1 Loop While Err = 0 'Reset the error value for use with other procedures Err = 0 'Return the number of dimensions p = i - 2 End If If Not IsArray(InputArray) Or p 2 Then Msg = "#ERROR! The function accepts only multi-cell ranges " & _ "and 1D or 2D arrays." If TypeOf Application.Caller Is Range Then ArrayTranspose = Msg Else MsgBox Msg, 16 End If Exit Function End If 'Load the output array from a one- 'dimensional input array If p = 1 Then Select Case TypeName(arr) Case "Object()" ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _ LBound(arr) To LBound(arr)) As Object For i = LBound(outputArrayTranspose) To _ UBound(outputArrayTranspose) Set outputArrayTranspose(i, _ LBound(outputArrayTranspose)) = _ arr(i) Next Case "Boolean()" ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _ LBound(arr) To LBound(arr)) As Boolean Case "Byte()" ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _ LBound(arr) To LBound(arr)) As Byte Case "Currency()" ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _ LBound(arr) To LBound(arr)) As Currency Case "Date()" ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _ LBound(arr) To LBound(arr)) As Date Case "Double()" ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _ LBound(arr) To LBound(arr)) As Double Case "Integer()" ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _ LBound(arr) To LBound(arr)) As Integer Case "Long()" ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _ LBound(arr) To LBound(arr)) As Long Case "Single()" ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _ LBound(arr) To LBound(arr)) As Single Case "String()" ReDim outputArrayTranspose(LBound(arr, 2) To _ UBound(arr, 2), _ LBound(arr) To UBound(arr)) As String Case "Variant()" ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _ LBound(arr) To LBound(arr)) As Variant Case Else Msg = "#ERROR! Only built-in types of arrays " & _ "are supported." If TypeOf Application.Caller Is Range Then ArrayTranspose = Msg Else MsgBox Msg, 16 End If Exit Function End Select If TypeName(arr) < "Object()" Then For i = LBound(outputArrayTranspose) To _ UBound(outputArrayTranspose) outputArrayTranspose(i, _ LBound(outputArrayTranspose)) = arr(i) Next End If 'Or load the output array from a two- 'dimensional input array or range ElseIf p = 2 Then Select Case TypeName(arr) Case "Object()" ReDim outputArrayTranspose(LBound(arr, 2) To _ UBound(arr, 2), _ LBound(arr) To UBound(arr)) As Object For i = LBound(outputArrayTranspose) To _ UBound(outputArrayTranspose) For j = LBound(outputArrayTranspose, 2) To _ UBound(outputArrayTranspose, 2) Set outputArrayTranspose(i, j) = arr(j, i) Next Next Case "Boolean()" ReDim outputArrayTranspose(LBound(arr, 2) To _ UBound(arr, 2), _ LBound(arr) To UBound(arr)) As Boolean Case "Byte()" ReDim outputArrayTranspose(LBound(arr, 2) To _ UBound(arr, 2), _ LBound(arr) To UBound(arr)) As Byte Case "Currency()" ReDim outputArrayTranspose(LBound(arr, 2) To _ UBound(arr, 2), _ LBound(arr) To UBound(arr)) As Currency Case "Date()" ReDim outputArrayTranspose(LBound(arr, 2) To _ UBound(arr, 2), _ LBound(arr) To UBound(arr)) As Date Case "Double()" ReDim outputArrayTranspose(LBound(arr, 2) To _ UBound(arr, 2), _ LBound(arr) To UBound(arr)) As Double Case "Integer()" ReDim outputArrayTranspose(LBound(arr, 2) To _ UBound(arr, 2), _ LBound(arr) To UBound(arr)) As Integer Case "Long()" ReDim outputArrayTranspose(LBound(arr, 2) To _ UBound(arr, 2), _ LBound(arr) To UBound(arr)) As Long Case "Single()" ReDim outputArrayTranspose(LBound(arr, 2) To _ UBound(arr, 2), _ LBound(arr) To UBound(arr)) As Single Case "String()" ReDim outputArrayTranspose(LBound(arr, 2) To _ UBound(arr, 2), _ LBound(arr) To UBound(arr)) As String Case "Variant()" ReDim outputArrayTranspose(LBound(arr, 2) To _ UBound(arr, 2), _ LBound(arr) To UBound(arr)) As Variant Case Else Msg = "#ERROR! Only built-in types of arrays " & _ "are supported." If TypeOf Application.Caller Is Range Then ArrayTranspose = Msg Else MsgBox Msg, 16 End If Exit Function End Select If TypeName(arr) < "Object()" Then For i = LBound(outputArrayTranspose) To _ UBound(outputArrayTranspose) For j = LBound(outputArrayTranspose, 2) To _ UBound(outputArrayTranspose, 2) outputArrayTranspose(i, j) = arr(j, i) Next Next End If End If 'Return the transposed array ArrayTranspose = outputArrayTranspose End Function Sub abtest1() Dim arr1(), arr2(), arrA(), arrB(), arrC(), arrD() Dim rng As Range Dim rngA As Range, rngB As Range, rngC As Range, rngD As Range Set rng = Sheets(1).Range("A:D") Set rngA = Sheets(2).Range("A:A") Set rngB = Sheets(2).Range("B:B") Set rngC = Sheets(2).Range("C:C") Set rngD = Sheets(2).Range("D:D") arr1 = rng arr2 = ArrayUniques(arr1) x = ArrayCount(arr2) z = 65536 y = x - (x \ z) * z Select Case x \ z Case 0 Sheets(2).Range("A1:A" & y).Value = arr2 Case 1 ReDim arrA(1 To 65536, 1 To 1) ReDim arrB(1 To y, 1 To 1) For i = 1 To z arrA(i, 1) = arr2(i, 1) Next For i = 1 To y arrB(i, 1) = arr2(i + z, 1) Next Sheets(2).Range("A1:A" & z).Value = arrA Sheets(2).Range("B1:B" & y).Value = arrB Case 2 ReDim arrA(1 To z, 1 To 1) ReDim arrB(1 To z, 1 To 1) ReDim arrC(1 To y, 1 To 1) For i = 1 To z arrA(i, 1) = arr2(i, 1) arrB(i, 1) = arr2(i + z, 1) Next For i = 1 To y arrC(i, 1) = arr2(i + 2 * z, 1) Next Sheets(2).Range("A1:A" & z).Value = arrA Sheets(2).Range("B1:B" & z).Value = arrB Sheets(2).Range("C1:C" & y).Value = arrC Case 3 ReDim arrA(1 To z, 1 To 1) ReDim arrB(1 To z, 1 To 1) ReDim arrC(1 To z, 1 To 1) ReDim arrD(1 To y, 1 To 1) For i = 1 To z arrA(i, 1) = arr2(i, 1) arrB(i, 1) = arr2(i + z, 1) arrC(i, 1) = arr2(i + 2 * z, 1) Next For i = 1 To y arrD(i, 1) = arr2(i + 3 * z, 1) Next Sheets(2).Range("A1:A" & z).Value = arrA Sheets(2).Range("B1:B" & z).Value = arrB Sheets(2).Range("C1:C" & z).Value = arrC Sheets(2).Range("A1:D" & y).Value = arrD Case 4 Sheets(2).Range("A:D").Value = Sheets(1).Range("A:D").Value End Select End Sub Alan Beban |
Sub to extract uniques from 200k data in xl03
Alan, many thanks.
will try it out and post back here -- Max Singapore http://savefile.com/projects/236895 xdemechanik --- |
Sub to extract uniques from 200k data in xl03
Alan,
I hit a "sub or function not defined" at this line: x = ArrayCount(arr2) when i ran Sub abtest1() Pl advise. -- Max Singapore http://savefile.com/projects/236895 xdemechanik --- |
Sub to extract uniques from 200k data in xl03
Max wrote:
Alan, I hit a "sub or function not defined" at this line: x = ArrayCount(arr2) when i ran Sub abtest1() Pl advise. Drat! I didn't notice that the procedure was using another Function from the Library; I'll post a fix. Alan Beban |
Sub to extract uniques from 200k data in xl03
Max wrote:
Alan, I hit a "sub or function not defined" at this line: x = ArrayCount(arr2) when i ran Sub abtest1() Pl advise. Copy and paste the following into the same module as th other three. Function ArrayCount(InputArray) 'This function counts NOT the number of 'non-blank values in the array, but the 'number of available slots for values, 'whether the slots contain anything or not. 'It's similar to the Count Property [e.g., 'Range("a1:c3").Count] Dim j As Long, k As Long 'Convert range to array 'InputArray = InputArray If IsArray(InputArray) Then If Not TypeOf InputArray Is Range Then j = 1: k = 1 On Error Resume Next Do k = k * (UBound(InputArray, j) - _ LBound(InputArray, j) + 1) j = j + 1 Loop While Err.Number = 0 ArrayCount = k Else If TypeOf Application.Caller Is Range Then ArrayCount = "#ERROR! This function accepts only arrays." Else MsgBox "#ERROR! The ArrayCount function " & _ "accepts only arrays.", 16 End If End If Else If TypeOf Application.Caller Is Range Then ArrayCount = "#ERROR! This function accepts only arrays." Else MsgBox "#ERROR! The ArrayCount function " & _ "accepts only arrays.", 16 End If End If End Function Sorry, Alan Beban |
Sub to extract uniques from 200k data in xl03
Alan, thanks.
Inserted that. Ran the sub, it halted at the line: "End Select" within "... Case 4 Sheets(2).Range("A:D").Value = Sheets(1).Range("A:D").Value End Select ... " -- Max Singapore http://savefile.com/projects/236895 xdemechanik --- |
Sub to extract uniques from 200k data in xl03
Re-ran the sub again. The run completes, but Sheet2 is blank (no results)
-- Max Singapore http://savefile.com/projects/236895 xdemechanik --- |
Sub to extract uniques from 200k data in xl03
Max wrote:
Re-ran the sub again. The run completes, but Sheet2 is blank (no results) I reran it; it worked as it should. You mention Sheet2; is that in fact the name of your Sheets(2)? The test file I range it on had A:D on Sheets(1) filled like this Set rng = Sheets(1).Range("A:D") For i = 1 to 65536:For j = 1 to 4 rng(i,j).value = i * j Next:Next and arr2 had 158,378 elements that filled A1:C27306 on Sheets(2). One thing to do is in the abtest1 SubProcedure, after the line y = x - (x \ z) * z insert Stop and when the code stops, mouse over x to see that it is 158738, and mouse over y to see that it is 27306. Let me know. Alan Beban |
Sub to extract uniques from 200k data in xl03
Alan Beban wrote:
Max wrote: Re-ran the sub again. The run completes, but Sheet2 is blank (no results) My last email assumed your data was the same as my test file. In your file, when you mouse over x it should be equal to the number of unique elements in your data; and y should be the integral part of x/65536, i.e., 0,1,2,3 or 4. Alan Beban |
Sub to extract uniques from 200k data in xl03
I reran it; it worked as it should ...
Alan, many thanks. I re-booted the PC,, and re-ran it again, and this time round, it *worked*. And yes, runtime took only around 15** sec, simply amazing !! **for an estimated 200k worth of items .. You mention Sheet2; is that in fact the name of your Sheets(2)? Does Sheets(2) mean the 2nd tab from the left? If so, yes, I have "Sheet2" -- tabname -- as the 2nd tab from the left In VBE under Microsoft Excel Objects, the tab appears as: Sheet2 (Sheet2) What is the first instance of Sheet2 in the above ? I gather that the 2nd instance -- within parens -- is the tabname, which can be changed by user. How could we make use of the 1st instance which apparently cannot be changed and is more robust? Thanks. -- Max Singapore http://savefile.com/projects/236895 xdemechanik --- |
Sub to extract uniques from 200k data in xl03
Alan, sorry, I just posted a response to your earlier reply which crossed
this. It **worked** and took only 15 sec to spit out the results !! Superb. -- Max Singapore http://savefile.com/projects/236895 xdemechanik --- |
Sub to extract uniques from 200k data in xl03
James, thanks. Tried running your sub on smaller test set of data and it
works fine. For my needs here I'll be going with Alan's offering in the other branch which also works as well and is very fast. Nonetheless, I do appreciate your response/sub. -- Max Singapore http://savefile.com/projects/236895 xdemechanik --- |
Sub to extract uniques from 200k data in xl03
Alan Beban wrote:
Alan Beban wrote: Max wrote: Re-ran the sub again. The run completes, but Sheet2 is blank (no results) My last email assumed your data was the same as my test file. In your file, when you mouse over x it should be equal to the number of unique elements in your data; and y should be the integral part of x/65536, i.e., 0,1,2,3 or 4. Alan Beban Whoops! Must have been too close to my bedtime when I wrote the above. y is, of course, the number of elements in the rightmost column of the unique output. Alan Beban |
Sub to extract uniques from 200k data in xl03
It wasn't mentioned because it isn't necessary.
It isn't necessary to check scripting runtime because it uses the builtin collection rather than starting up another DLL. == Regards, Tom Ogilvy "Alan Beban" wrote: Max wrote: Of course. I'm game. Copy and paste the following three procedures into a general module in your workbook; watch for wordwrap, though I think they're clean. The first returns an array of the unique elements of the input array; by default it is a case-sensitive, 1-based, vertical array. The second transposes an array without some of the limitations of the Excel TRANSPOSE function. It's necessary in this case because the first function needs to convert the collection of unique elements, which is a horizontal array in Excel, to a vertical array to match your data. It's much more general than is required by your inquiry, but it's from my library so that's what you get. Then, in the VBEditor, select Tools|References and check Microsoft Scripting Runtime; I believe this step is also necessary in John Walkenbach's code cited by Tom Ogilvy, though neither John nor Tom mentioned it. Then, assuming that your data is on Sheets(1), and Sheets(2) is available for the output, run the third SubProcedure, abtest1. Post to let us know how it comes out. Function ArrayUniques(InputArray, _ Optional MatchCase As Boolean = True, _ Optional Base_Orient As String = "1vert", _ Optional OmitBlanks As Boolean = True) 'THIS PROCEDURE REQUIRES A PROJECT REFERENCE 'TO "MICROSCOPIC SCRIPTING RUNTIME". 'The function returns an array of unique 'values from an array or range. By default 'it returns a 1-based vertical array; for 'other results enter "0horiz", "1horiz" or '"0vert" as the third argument. By default, 'the function is case-sensitive; i.e., e.g., '"red" and "Red" are treated as two separate 'unique values; to avoid case-sensitivity, 'enter False as the second argument. 'Declare the variables Dim arr, arr2 Dim i As Long, p As Object, q As String Dim Elem, x As Dictionary Dim CalledDirectFromWorksheet As Boolean 'For later use in selecting cells for worksheet output CalledDirectFromWorksheet = False If TypeOf Application.Caller Is Range Then Set p = Application.Caller q = p.Address iRows = Range(q).Rows.Count iCols = Range(q).Columns.Count If InStr(1, p.FormulaArray, "ArrayUniques") = 2 _ Or InStr(1, p.FormulaArray, "arrayuniques") = 2 _ Or InStr(1, p.FormulaArray, "ARRAYUNIQUES") = 2 Then CalledDirectFromWorksheet = True End If End If 'Convert an input range to a VBA array arr = InputArray 'Load the unique elements into a Dictionary Object Set x = New Dictionary x.CompareMode = Abs(Not MatchCase) '<--Case-sensitivity On Error Resume Next For Each Elem In arr x.Add Item:=Elem, key:=CStr(Elem) Next If OmitBlanks Then x.Remove ("") On Error GoTo 0 'Load a 0-based horizontal array with the unique 'elements from the Dictionary Object arr2 = x.Items 'This provides appropriate base and orientation 'of the output array Select Case Base_Orient Case "0horiz" arr2 = arr2 Case "1horiz" ReDim Preserve arr2(1 To UBound(arr2) + 1) Case "0vert" arr2 = ArrayTranspose(arr2) Case "1vert" ReDim Preserve arr2(1 To UBound(arr2) + 1) arr2 = ArrayTranspose(arr2) End Select 'Assure that enough cells are selected to accommodate output If CalledDirectFromWorksheet Then If Range(Application.Caller.Address).Count < x.Count Then ArrayUniques = "Select a range of at least " & _ x.Count & " cells" Exit Function End If End If ArrayUniques = arr2 End Function Function ArrayTranspose(InputArray) 'This function returns the transpose of 'the input array or range; it is designed 'to avoid the limitation on the number of 'array elements and type of array that the 'worksheet TRANSPOSE Function has. 'Declare the variables Dim outputArrayTranspose As Variant, arr As Variant, p As Integer Dim i As Long, j As Long 'Check to confirm that the input array 'is an array or multicell range If IsArray(InputArray) Then 'If so, convert an input range to a 'true array arr = InputArray 'Load the number of dimensions of 'the input array to a variable On Error Resume Next 'Loop until an error occurs i = 1 Do z = UBound(arr, i) i = i + 1 Loop While Err = 0 'Reset the error value for use with other procedures Err = 0 'Return the number of dimensions p = i - 2 End If If Not IsArray(InputArray) Or p 2 Then Msg = "#ERROR! The function accepts only multi-cell ranges " & _ "and 1D or 2D arrays." If TypeOf Application.Caller Is Range Then ArrayTranspose = Msg Else MsgBox Msg, 16 End If Exit Function End If 'Load the output array from a one- 'dimensional input array If p = 1 Then Select Case TypeName(arr) Case "Object()" ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _ LBound(arr) To LBound(arr)) As Object For i = LBound(outputArrayTranspose) To _ UBound(outputArrayTranspose) Set outputArrayTranspose(i, _ LBound(outputArrayTranspose)) = _ arr(i) Next Case "Boolean()" ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _ LBound(arr) To LBound(arr)) As Boolean Case "Byte()" ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _ LBound(arr) To LBound(arr)) As Byte Case "Currency()" ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _ LBound(arr) To LBound(arr)) As Currency Case "Date()" ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _ LBound(arr) To LBound(arr)) As Date Case "Double()" ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _ LBound(arr) To LBound(arr)) As Double Case "Integer()" ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _ LBound(arr) To LBound(arr)) As Integer Case "Long()" ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _ LBound(arr) To LBound(arr)) As Long Case "Single()" ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _ LBound(arr) To LBound(arr)) As Single Case "String()" ReDim outputArrayTranspose(LBound(arr, 2) To _ UBound(arr, 2), _ LBound(arr) To UBound(arr)) As String Case "Variant()" ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _ LBound(arr) To LBound(arr)) As Variant Case Else Msg = "#ERROR! Only built-in types of arrays " & _ "are supported." If TypeOf Application.Caller Is Range Then ArrayTranspose = Msg Else MsgBox Msg, 16 End If Exit Function End Select If TypeName(arr) < "Object()" Then For i = LBound(outputArrayTranspose) To _ UBound(outputArrayTranspose) outputArrayTranspose(i, _ LBound(outputArrayTranspose)) = arr(i) Next End If 'Or load the output array from a two- 'dimensional input array or range ElseIf p = 2 Then Select Case TypeName(arr) Case "Object()" ReDim outputArrayTranspose(LBound(arr, 2) To _ UBound(arr, 2), _ LBound(arr) To UBound(arr)) As Object For i = LBound(outputArrayTranspose) To _ UBound(outputArrayTranspose) For j = LBound(outputArrayTranspose, 2) To _ UBound(outputArrayTranspose, 2) Set outputArrayTranspose(i, j) = arr(j, i) Next Next Case "Boolean()" ReDim outputArrayTranspose(LBound(arr, 2) To _ UBound(arr, 2), _ LBound(arr) To UBound(arr)) As Boolean Case "Byte()" ReDim outputArrayTranspose(LBound(arr, 2) To _ UBound(arr, 2), _ LBound(arr) To UBound(arr)) As Byte Case "Currency()" ReDim outputArrayTranspose(LBound(arr, 2) To _ UBound(arr, 2), _ LBound(arr) To UBound(arr)) As Currency Case "Date()" ReDim outputArrayTranspose(LBound(arr, 2) To _ UBound(arr, 2), _ LBound(arr) To UBound(arr)) As Date Case "Double()" ReDim outputArrayTranspose(LBound(arr, 2) To _ UBound(arr, 2), _ LBound(arr) To UBound(arr)) As Double Case "Integer()" ReDim outputArrayTranspose(LBound(arr, 2) To _ UBound(arr, 2), _ LBound(arr) To UBound(arr)) As Integer Case "Long()" ReDim outputArrayTranspose(LBound(arr, 2) To _ UBound(arr, 2), _ LBound(arr) To UBound(arr)) As Long Case "Single()" ReDim outputArrayTranspose(LBound(arr, 2) To _ UBound(arr, 2), _ LBound(arr) To UBound(arr)) As Single Case "String()" ReDim outputArrayTranspose(LBound(arr, 2) To _ UBound(arr, 2), _ LBound(arr) To UBound(arr)) As String Case "Variant()" ReDim outputArrayTranspose(LBound(arr, 2) To _ UBound(arr, 2), _ LBound(arr) To UBound(arr)) As Variant Case Else Msg = "#ERROR! Only built-in types of arrays " & _ "are supported." If TypeOf Application.Caller Is Range Then ArrayTranspose = Msg Else MsgBox Msg, 16 End If Exit Function End Select If TypeName(arr) < "Object()" Then For i = LBound(outputArrayTranspose) To _ UBound(outputArrayTranspose) For j = LBound(outputArrayTranspose, 2) To _ UBound(outputArrayTranspose, 2) outputArrayTranspose(i, j) = arr(j, i) Next Next End If End If 'Return the transposed array ArrayTranspose = outputArrayTranspose End Function Sub abtest1() Dim arr1(), arr2(), arrA(), arrB(), arrC(), arrD() Dim rng As Range Dim rngA As Range, rngB As Range, rngC As Range, rngD As Range |
Sub to extract uniques from 200k data in xl03
Max wrote:
Alan, sorry, I just posted a response to your earlier reply which crossed this. It **worked** and took only 15 sec to spit out the results !! Superb. Just to check on how much the array functions from my library slowed down the process, I coded the following. I didn't time it, but there doesn't seem to be much difference in execution time. Most of the time is taken up by loading the dictionary, whether that's directly or within the ArrayUniques function. And the ArrayTranspose function, which is not required in the code below, does not seem to add appreciable execution time. FWIW, Alan Beban Sub abtest3() Dim arr1(), arr2(), arrA(), arrB(), arrC(), arrD() Dim rng As Range Dim rngA As Range, rngB As Range, rngC As Range, rngD As Range Set rng = Sheets(1).Range("A:D") Set rngA = Sheets(2).Range("A:A") Set rngB = Sheets(2).Range("B:B") Set rngC = Sheets(2).Range("C:C") Set rngD = Sheets(2).Range("D:D") arr1 = rng 'Load the unique elements into a Dictionary Object Set x = New dictionary On Error Resume Next For Each Elem In arr1 x.Add Item:=Elem, key:=CStr(Elem) Next x.Remove ("") On Error GoTo 0 'Load a 0-based horizontal array with the unique 'elements from the Dictionary Object arr2 = x.Items x = UBound(arr2) - LBound(arr2) + 1 z = 65536 y = x - (x \ z) * z Select Case x \ z Case 0 Sheets(2).Range("A1:A" & y).Value = arr2 Case 1 ReDim arrA(1 To 65536, 1 To 1) ReDim arrB(1 To y, 1 To 1) For i = 1 To z arrA(i, 1) = arr2(i - 1) Next For i = 1 To y arrB(i, 1) = arr2(i - 1 + z) Next Sheets(2).Range("A1:A" & z).Value = arrA Sheets(2).Range("B1:B" & y).Value = arrB Case 2 ReDim arrA(1 To z, 1 To 1) ReDim arrB(1 To z, 1 To 1) ReDim arrC(1 To y, 1 To 1) For i = 1 To z arrA(i, 1) = arr2(i - 1) arrB(i, 1) = arr2(i - 1 + z) Next For i = 1 To y arrC(i, 1) = arr2(i - 1 + 2 * z) Next Sheets(2).Range("A1:A" & z).Value = arrA Sheets(2).Range("B1:B" & z).Value = arrB Sheets(2).Range("C1:C" & y).Value = arrC Case 3 ReDim arrA(1 To z, 1 To 1) ReDim arrB(1 To z, 1 To 1) ReDim arrC(1 To z, 1 To 1) ReDim arrD(1 To y, 1 To 1) For i = 1 To z arrA(i, 1) = arr2(i - 1) arrB(i, 1) = arr2(i - 1 + z) arrC(i, 1) = arr2(i + 1 + 2 * z) Next For i = 1 To y arrD(i, 1) = arr2(i + 3 * z) Next Sheets(2).Range("A1:A" & z).Value = arrA Sheets(2).Range("B1:B" & z).Value = arrB Sheets(2).Range("C1:C" & z).Value = arrC Sheets(2).Range("A1:D" & y).Value = arrD Case 4 Sheets(2).Range("A1:D" & z).Value = _ Sheets(1).Range("A1:D" & z).Value End Select End Sub |
Sub to extract uniques from 200k data in xl03
"Alan Beban" wrote: Then, in the VBEditor, select Tools|References and check Microsoft Scripting Runtime; I believe this step is also necessary in John Walkenbach's code cited by Tom Ogilvy, though neither John nor Tom mentioned it. Tom Ogilvy wrote: It wasn't mentioned because it isn't necessary. It isn't necessary to check scripting runtime because it uses the builtin collection rather than starting up another DLL. == Regards, Tom Ogilvy Thanks. I didn't notice the use of the built-in collection in the Developer Tip. It raises an interesting question whether the use of a Dictionary, which could avoid the looping to fill the list box, might be faster because of the NoDupes.Items property of the Dictionary. Alan Beban |
Sub to extract uniques from 200k data in xl03
Max wrote:
Alan, sorry, I just posted a response to your earlier reply which crossed this. It **worked** and took only 15 sec to spit out the results !! Superb. There's a bug in the SubProcedure abtest1 that you are using. If the number of unique elements is an exact multiple of 65536, then in the Select Case section of the code, y will equal 0 and the code will throw an error. You need to modify the code to provide for this case. Alan Beban |
Sub to extract uniques from 200k data in xl03
I won't know how to modify the code, Alan
Could you assist on this? Thanks -- Max Singapore http://savefile.com/projects/236895 xdemechanik --- |
Sub to extract uniques from 200k data in xl03
Max wrote:
I won't know how to modify the code, Alan Could you assist on this? Thanks I tested only for the case when the number of unique elements is 131072 (2 full columns). I leave it to you to arrange the basic data so that it has 65536 unique items and 196608 (i.e., 1 full column and 3 full columns), and test those two cases. Alan Beban Sub abtest1() Dim arr1(), arr2(), arrA(), arrB(), arrC(), arrD() Dim rng As Range Dim rngA As Range, rngB As Range, rngC As Range, rngD As Range Set rng = Sheets(1).Range("A:D") Set rngA = Sheets(2).Range("A:A") Set rngB = Sheets(2).Range("B:B") Set rngC = Sheets(2).Range("C:C") Set rngD = Sheets(2).Range("D:D") arr1 = rng arr2 = ArrayUniques(arr1) q = ArrayCount(arr2) z = 65536 y = q - (q \ z) * z Select Case q \ z Case 0 If y = 0 Then MsgBox rng.Address & "has no data." Else Sheets(2).Range("A1:A" & y).Value = arr2 End If Case 1 If y = 0 Then ReDim arrA(1 To 65536, 1 To 1) For i = 1 To z arrA(i, 1) = arr2(i, 1) Next Sheets(2).Range("A1:A" & z).Value = arrA Else ReDim arrA(1 To 65536, 1 To 1) ReDim arrB(1 To y, 1 To 1) For i = 1 To z arrA(i, 1) = arr2(i, 1) Next For i = 1 To y arrB(i, 1) = arr2(i + z, 1) Next Sheets(2).Range("A1:A" & z).Value = arrA Sheets(2).Range("B1:B" & y).Value = arrB End If Case 2 If y = 0 Then ReDim arrA(1 To z, 1 To 1) ReDim arrB(1 To z, 1 To 1) For i = 1 To z arrA(i, 1) = arr2(i, 1) arrB(i, 1) = arr2(i + z, 1) Next Sheets(2).Range("A1:A" & z).Value = arrA Sheets(2).Range("B1:B" & z).Value = arrB Else ReDim arrA(1 To z, 1 To 1) ReDim arrB(1 To z, 1 To 1) ReDim arrC(1 To y, 1 To 1) For i = 1 To z arrA(i, 1) = arr2(i, 1) arrB(i, 1) = arr2(i + z, 1) Next For i = 1 To y arrC(i, 1) = arr2(i + 2 * z, 1) Next Sheets(2).Range("A1:A" & z).Value = arrA Sheets(2).Range("B1:B" & z).Value = arrB Sheets(2).Range("C1:C" & y).Value = arrC End If Case 3 If y = 0 Then ReDim arrA(1 To z, 1 To 1) ReDim arrB(1 To z, 1 To 1) ReDim arrC(1 To z, 1 To 1) For i = 1 To z arrA(i, 1) = arr2(i, 1) arrB(i, 1) = arr2(i + z, 1) arrC(i, 1) = arr2(i + 2 * z, 1) Next Sheets(2).Range("A1:A" & z).Value = arrA Sheets(2).Range("B1:B" & z).Value = arrB Sheets(2).Range("C1:C" & z).Value = arrC Else ReDim arrA(1 To z, 1 To 1) ReDim arrB(1 To z, 1 To 1) ReDim arrC(1 To z, 1 To 1) ReDim arrD(1 To y, 1 To 1) For i = 1 To z arrA(i, 1) = arr2(i, 1) arrB(i, 1) = arr2(i + z, 1) arrC(i, 1) = arr2(i + 2 * z, 1) Next For i = 1 To y arrD(i, 1) = arr2(i + 3 * z, 1) Next Sheets(2).Range("A1:A" & z).Value = arrA Sheets(2).Range("B1:B" & z).Value = arrB Sheets(2).Range("C1:C" & z).Value = arrC Sheets(2).Range("A1:D" & y).Value = arrD End If Case 4 Sheets(2).Range("A1:D" & z).Value = _ Sheets(1).Range("A1:D" & z).Value End Select End Sub |
Sub to extract uniques from 200k data in xl03
|
All times are GMT +1. The time now is 05:49 AM. |
Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
ExcelBanter.com