ExcelBanter

ExcelBanter (https://www.excelbanter.com/)
-   Excel Programming (https://www.excelbanter.com/excel-programming/)
-   -   Sub to extract uniques from 200k data in xl03 (https://www.excelbanter.com/excel-programming/394634-sub-extract-uniques-200k-data-xl03.html)

Max

Sub to extract uniques from 200k data in xl03
 
Using xl 2003, subject to 65k row limit. I've got over 200k ids listed in
cols A to D from row1 down. Looking for a sub which can extract the unique
ids from amongst the 200k into a new sheet, listing these ids into as many
cols as may be required, ie down col A, then down col B, etc. Thanks for
insights.
--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---



Tom Ogilvy

Sub to extract uniques from 200k data in xl03
 
This article by John Walkenbach should give you the guts of what you need:

http://www.j-walk.com/ss/excel/tips/tip47.htm

Instead of loading a listbox, write the unique items to another sheet.

If you can't do it, post back. (I have no idea what your skill set is)

--
Regards,
Tom Ogilvy


"Max" wrote:

Using xl 2003, subject to 65k row limit. I've got over 200k ids listed in
cols A to D from row1 down. Looking for a sub which can extract the unique
ids from amongst the 200k into a new sheet, listing these ids into as many
cols as may be required, ie down col A, then down col B, etc. Thanks for
insights.
--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---




Zone[_3_]

Sub to extract uniques from 200k data in xl03
 
Max, this is surely a simple-minded approach to the problem, but it should
work. It will take a long time to run. I assumed that the data is in
Sheet1 with no column headings and that Sheet2 is an unused sheet. HTH,
James

Sub Uniques4Columns()
Dim FromRow As Long, FromCol As Integer
Dim ToRow As Long, ToCol As Integer
Dim This As Variant, c As Range
Worksheets(2).Activate
Cells.Clear
ToCol = 1: ToRow = 1
With Worksheets(1)
For FromCol = 1 To 4
For FromRow = 1 To .Cells(65536, FromCol).End(xlUp).Row
This = .Cells(FromRow, FromCol)
Set c = Cells.Find(This, LookIn:=xlValues, lookat:=xlWhole)
If c Is Nothing Then
Cells(ToRow, ToCol) = This
ToRow = ToRow + 1
If ToRow 65535 Then
ToRow = 1
ToCol = ToCol + 1
End If
End If
Next FromRow
Next FromCol
End With
End Sub

"Max" wrote in message
...
Using xl 2003, subject to 65k row limit. I've got over 200k ids listed in
cols A to D from row1 down. Looking for a sub which can extract the unique
ids from amongst the 200k into a new sheet, listing these ids into as many
cols as may be required, ie down col A, then down col B, etc. Thanks for
insights.
--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---




Alan Beban

Sub to extract uniques from 200k data in xl03
 
Max wrote:
Using xl 2003, subject to 65k row limit. I've got over 200k ids listed in
cols A to D from row1 down. Looking for a sub which can extract the unique
ids from amongst the 200k into a new sheet, listing these ids into as many
cols as may be required, ie down col A, then down col B, etc. Thanks for
insights.


I have a solution that takes 15 to 20 seconds to run. Are you interested?

Alan Beban

Zone[_3_]

Sub to extract uniques from 200k data in xl03
 
Tom, good point. I probably should have used some variation of rows.count.
How would you do this, otherwise using the same code? James

"Tom Ogilvy" wrote in message
...
Just note that if the OP has data down to rows 65536 in columns 1 to 3 (as
is
probable based on the description), you will miss examining 65535 names in
each of those columns.

--
Regards,
Tom Ogilvy


"Zone" wrote:

Max, this is surely a simple-minded approach to the problem, but it
should
work. It will take a long time to run. I assumed that the data is in
Sheet1 with no column headings and that Sheet2 is an unused sheet. HTH,
James

Sub Uniques4Columns()
Dim FromRow As Long, FromCol As Integer
Dim ToRow As Long, ToCol As Integer
Dim This As Variant, c As Range
Worksheets(2).Activate
Cells.Clear
ToCol = 1: ToRow = 1
With Worksheets(1)
For FromCol = 1 To 4
For FromRow = 1 To .Cells(65536, FromCol).End(xlUp).Row
This = .Cells(FromRow, FromCol)
Set c = Cells.Find(This, LookIn:=xlValues,
lookat:=xlWhole)
If c Is Nothing Then
Cells(ToRow, ToCol) = This
ToRow = ToRow + 1
If ToRow 65535 Then
ToRow = 1
ToCol = ToCol + 1
End If
End If
Next FromRow
Next FromCol
End With
End Sub

"Max" wrote in message
...
Using xl 2003, subject to 65k row limit. I've got over 200k ids listed
in
cols A to D from row1 down. Looking for a sub which can extract the
unique
ids from amongst the 200k into a new sheet, listing these ids into as
many
cols as may be required, ie down col A, then down col B, etc. Thanks
for
insights.
--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---







Max

Sub to extract uniques from 200k data in xl03
 
Of course. I'm game.
--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---
"Alan Beban" <unavailable wrote
I have a solution that takes 15 to 20 seconds to run. Are you interested?




Max

Sub to extract uniques from 200k data in xl03
 
Thanks, Tom. I'll check it out.
--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---



Max

Sub to extract uniques from 200k data in xl03
 
James, thanks for your Sub Uniques4Columns().

I've noted the preceding discussions between you and Tom. I'll try running
your sub over lunchtime today.

--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---



Max

Sub to extract uniques from 200k data in xl03
 
Instead of loading a listbox, write the unique items to another sheet.

I don't know enough vba to get this done, Tom. Could you assist?
--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---



Alan Beban

Sub to extract uniques from 200k data in xl03
 
Max wrote:
Of course. I'm game.


Copy and paste the following three procedures into a general module in
your workbook; watch for wordwrap, though I think they're clean.

The first returns an array of the unique elements of the input array; by
default it is a case-sensitive, 1-based, vertical array.

The second transposes an array without some of the limitations of the
Excel TRANSPOSE function. It's necessary in this case because the first
function needs to convert the collection of unique elements, which
is a horizontal array in Excel, to a vertical array to match your data.
It's much more general than is required by your inquiry, but it's from
my library so that's what you get.

Then, in the VBEditor, select Tools|References and check Microsoft
Scripting Runtime; I believe this step is also necessary in John
Walkenbach's code cited by Tom Ogilvy, though neither John nor Tom
mentioned it.

Then, assuming that your data is on Sheets(1), and Sheets(2) is
available for the output, run the third SubProcedure, abtest1.

Post to let us know how it comes out.

Function ArrayUniques(InputArray, _
Optional MatchCase As Boolean = True, _
Optional Base_Orient As String = "1vert", _
Optional OmitBlanks As Boolean = True)
'THIS PROCEDURE REQUIRES A PROJECT REFERENCE
'TO "MICROSCOPIC SCRIPTING RUNTIME".
'The function returns an array of unique
'values from an array or range. By default
'it returns a 1-based vertical array; for
'other results enter "0horiz", "1horiz" or
'"0vert" as the third argument. By default,
'the function is case-sensitive; i.e., e.g.,
'"red" and "Red" are treated as two separate
'unique values; to avoid case-sensitivity,
'enter False as the second argument.

'Declare the variables
Dim arr, arr2
Dim i As Long, p As Object, q As String
Dim Elem, x As Dictionary
Dim CalledDirectFromWorksheet As Boolean

'For later use in selecting cells for worksheet output
CalledDirectFromWorksheet = False
If TypeOf Application.Caller Is Range Then
Set p = Application.Caller
q = p.Address
iRows = Range(q).Rows.Count
iCols = Range(q).Columns.Count
If InStr(1, p.FormulaArray, "ArrayUniques") = 2 _
Or InStr(1, p.FormulaArray, "arrayuniques") = 2 _
Or InStr(1, p.FormulaArray, "ARRAYUNIQUES") = 2 Then
CalledDirectFromWorksheet = True
End If
End If

'Convert an input range to a VBA array
arr = InputArray

'Load the unique elements into a Dictionary Object
Set x = New Dictionary
x.CompareMode = Abs(Not MatchCase) '<--Case-sensitivity
On Error Resume Next
For Each Elem In arr
x.Add Item:=Elem, key:=CStr(Elem)
Next
If OmitBlanks Then x.Remove ("")
On Error GoTo 0

'Load a 0-based horizontal array with the unique
'elements from the Dictionary Object
arr2 = x.Items

'This provides appropriate base and orientation
'of the output array
Select Case Base_Orient
Case "0horiz"
arr2 = arr2
Case "1horiz"
ReDim Preserve arr2(1 To UBound(arr2) + 1)
Case "0vert"
arr2 = ArrayTranspose(arr2)
Case "1vert"
ReDim Preserve arr2(1 To UBound(arr2) + 1)
arr2 = ArrayTranspose(arr2)
End Select

'Assure that enough cells are selected to accommodate output
If CalledDirectFromWorksheet Then
If Range(Application.Caller.Address).Count < x.Count Then
ArrayUniques = "Select a range of at least " & _
x.Count & " cells"
Exit Function
End If
End If

ArrayUniques = arr2
End Function

Function ArrayTranspose(InputArray)
'This function returns the transpose of
'the input array or range; it is designed
'to avoid the limitation on the number of
'array elements and type of array that the
'worksheet TRANSPOSE Function has.

'Declare the variables
Dim outputArrayTranspose As Variant, arr As Variant, p As Integer
Dim i As Long, j As Long

'Check to confirm that the input array
'is an array or multicell range
If IsArray(InputArray) Then

'If so, convert an input range to a
'true array
arr = InputArray

'Load the number of dimensions of
'the input array to a variable
On Error Resume Next

'Loop until an error occurs
i = 1
Do
z = UBound(arr, i)
i = i + 1
Loop While Err = 0

'Reset the error value for use with other procedures
Err = 0

'Return the number of dimensions
p = i - 2
End If

If Not IsArray(InputArray) Or p 2 Then
Msg = "#ERROR! The function accepts only multi-cell ranges " & _
"and 1D or 2D arrays."
If TypeOf Application.Caller Is Range Then
ArrayTranspose = Msg
Else
MsgBox Msg, 16
End If
Exit Function
End If

'Load the output array from a one-
'dimensional input array
If p = 1 Then

Select Case TypeName(arr)
Case "Object()"
ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _
LBound(arr) To LBound(arr)) As Object
For i = LBound(outputArrayTranspose) To _
UBound(outputArrayTranspose)
Set outputArrayTranspose(i, _
LBound(outputArrayTranspose)) = _
arr(i)
Next
Case "Boolean()"
ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _
LBound(arr) To LBound(arr)) As Boolean
Case "Byte()"
ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _
LBound(arr) To LBound(arr)) As Byte
Case "Currency()"
ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _
LBound(arr) To LBound(arr)) As Currency
Case "Date()"
ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _
LBound(arr) To LBound(arr)) As Date
Case "Double()"
ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _
LBound(arr) To LBound(arr)) As Double
Case "Integer()"
ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _
LBound(arr) To LBound(arr)) As Integer
Case "Long()"
ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _
LBound(arr) To LBound(arr)) As Long
Case "Single()"
ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _
LBound(arr) To LBound(arr)) As Single
Case "String()"
ReDim outputArrayTranspose(LBound(arr, 2) To _
UBound(arr, 2), _
LBound(arr) To UBound(arr)) As String
Case "Variant()"
ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _
LBound(arr) To LBound(arr)) As Variant
Case Else
Msg = "#ERROR! Only built-in types of arrays " & _
"are supported."
If TypeOf Application.Caller Is Range Then
ArrayTranspose = Msg
Else
MsgBox Msg, 16
End If
Exit Function
End Select
If TypeName(arr) < "Object()" Then
For i = LBound(outputArrayTranspose) To _
UBound(outputArrayTranspose)
outputArrayTranspose(i, _
LBound(outputArrayTranspose)) = arr(i)
Next
End If

'Or load the output array from a two-
'dimensional input array or range
ElseIf p = 2 Then
Select Case TypeName(arr)
Case "Object()"
ReDim outputArrayTranspose(LBound(arr, 2) To _
UBound(arr, 2), _
LBound(arr) To UBound(arr)) As Object
For i = LBound(outputArrayTranspose) To _
UBound(outputArrayTranspose)
For j = LBound(outputArrayTranspose, 2) To _
UBound(outputArrayTranspose, 2)
Set outputArrayTranspose(i, j) = arr(j, i)
Next
Next
Case "Boolean()"
ReDim outputArrayTranspose(LBound(arr, 2) To _
UBound(arr, 2), _
LBound(arr) To UBound(arr)) As Boolean
Case "Byte()"
ReDim outputArrayTranspose(LBound(arr, 2) To _
UBound(arr, 2), _
LBound(arr) To UBound(arr)) As Byte
Case "Currency()"
ReDim outputArrayTranspose(LBound(arr, 2) To _
UBound(arr, 2), _
LBound(arr) To UBound(arr)) As Currency
Case "Date()"
ReDim outputArrayTranspose(LBound(arr, 2) To _
UBound(arr, 2), _
LBound(arr) To UBound(arr)) As Date
Case "Double()"
ReDim outputArrayTranspose(LBound(arr, 2) To _
UBound(arr, 2), _
LBound(arr) To UBound(arr)) As Double
Case "Integer()"
ReDim outputArrayTranspose(LBound(arr, 2) To _
UBound(arr, 2), _
LBound(arr) To UBound(arr)) As Integer
Case "Long()"
ReDim outputArrayTranspose(LBound(arr, 2) To _
UBound(arr, 2), _
LBound(arr) To UBound(arr)) As Long
Case "Single()"
ReDim outputArrayTranspose(LBound(arr, 2) To _
UBound(arr, 2), _
LBound(arr) To UBound(arr)) As Single
Case "String()"
ReDim outputArrayTranspose(LBound(arr, 2) To _
UBound(arr, 2), _
LBound(arr) To UBound(arr)) As String
Case "Variant()"
ReDim outputArrayTranspose(LBound(arr, 2) To _
UBound(arr, 2), _
LBound(arr) To UBound(arr)) As Variant
Case Else
Msg = "#ERROR! Only built-in types of arrays " & _
"are supported."
If TypeOf Application.Caller Is Range Then
ArrayTranspose = Msg
Else
MsgBox Msg, 16
End If
Exit Function
End Select
If TypeName(arr) < "Object()" Then
For i = LBound(outputArrayTranspose) To _
UBound(outputArrayTranspose)
For j = LBound(outputArrayTranspose, 2) To _
UBound(outputArrayTranspose, 2)
outputArrayTranspose(i, j) = arr(j, i)
Next
Next
End If
End If

'Return the transposed array
ArrayTranspose = outputArrayTranspose
End Function

Sub abtest1()
Dim arr1(), arr2(), arrA(), arrB(), arrC(), arrD()
Dim rng As Range
Dim rngA As Range, rngB As Range, rngC As Range, rngD As Range
Set rng = Sheets(1).Range("A:D")
Set rngA = Sheets(2).Range("A:A")
Set rngB = Sheets(2).Range("B:B")
Set rngC = Sheets(2).Range("C:C")
Set rngD = Sheets(2).Range("D:D")
arr1 = rng
arr2 = ArrayUniques(arr1)
x = ArrayCount(arr2)
z = 65536
y = x - (x \ z) * z
Select Case x \ z
Case 0
Sheets(2).Range("A1:A" & y).Value = arr2
Case 1
ReDim arrA(1 To 65536, 1 To 1)
ReDim arrB(1 To y, 1 To 1)
For i = 1 To z
arrA(i, 1) = arr2(i, 1)
Next
For i = 1 To y
arrB(i, 1) = arr2(i + z, 1)
Next
Sheets(2).Range("A1:A" & z).Value = arrA
Sheets(2).Range("B1:B" & y).Value = arrB
Case 2
ReDim arrA(1 To z, 1 To 1)
ReDim arrB(1 To z, 1 To 1)
ReDim arrC(1 To y, 1 To 1)
For i = 1 To z
arrA(i, 1) = arr2(i, 1)
arrB(i, 1) = arr2(i + z, 1)
Next
For i = 1 To y
arrC(i, 1) = arr2(i + 2 * z, 1)
Next
Sheets(2).Range("A1:A" & z).Value = arrA
Sheets(2).Range("B1:B" & z).Value = arrB
Sheets(2).Range("C1:C" & y).Value = arrC
Case 3
ReDim arrA(1 To z, 1 To 1)
ReDim arrB(1 To z, 1 To 1)
ReDim arrC(1 To z, 1 To 1)
ReDim arrD(1 To y, 1 To 1)
For i = 1 To z
arrA(i, 1) = arr2(i, 1)
arrB(i, 1) = arr2(i + z, 1)
arrC(i, 1) = arr2(i + 2 * z, 1)
Next
For i = 1 To y
arrD(i, 1) = arr2(i + 3 * z, 1)
Next
Sheets(2).Range("A1:A" & z).Value = arrA
Sheets(2).Range("B1:B" & z).Value = arrB
Sheets(2).Range("C1:C" & z).Value = arrC
Sheets(2).Range("A1:D" & y).Value = arrD
Case 4
Sheets(2).Range("A:D").Value = Sheets(1).Range("A:D").Value
End Select
End Sub

Alan Beban

Max

Sub to extract uniques from 200k data in xl03
 
Alan, many thanks.
will try it out and post back here
--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---



Max

Sub to extract uniques from 200k data in xl03
 
Alan,

I hit a "sub or function not defined" at this line:
x = ArrayCount(arr2)

when i ran Sub abtest1()

Pl advise.
--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---



Alan Beban

Sub to extract uniques from 200k data in xl03
 
Max wrote:
Alan,

I hit a "sub or function not defined" at this line:
x = ArrayCount(arr2)

when i ran Sub abtest1()

Pl advise.

Drat! I didn't notice that the procedure was using another Function from
the Library; I'll post a fix.

Alan Beban

Alan Beban

Sub to extract uniques from 200k data in xl03
 
Max wrote:
Alan,

I hit a "sub or function not defined" at this line:
x = ArrayCount(arr2)

when i ran Sub abtest1()

Pl advise.


Copy and paste the following into the same module as th other three.

Function ArrayCount(InputArray)
'This function counts NOT the number of
'non-blank values in the array, but the
'number of available slots for values,
'whether the slots contain anything or not.
'It's similar to the Count Property [e.g.,
'Range("a1:c3").Count]

Dim j As Long, k As Long

'Convert range to array
'InputArray = InputArray

If IsArray(InputArray) Then
If Not TypeOf InputArray Is Range Then

j = 1: k = 1

On Error Resume Next

Do
k = k * (UBound(InputArray, j) - _
LBound(InputArray, j) + 1)
j = j + 1
Loop While Err.Number = 0

ArrayCount = k
Else
If TypeOf Application.Caller Is Range Then
ArrayCount = "#ERROR! This function accepts only arrays."
Else
MsgBox "#ERROR! The ArrayCount function " & _
"accepts only arrays.", 16
End If
End If

Else
If TypeOf Application.Caller Is Range Then
ArrayCount = "#ERROR! This function accepts only arrays."
Else
MsgBox "#ERROR! The ArrayCount function " & _
"accepts only arrays.", 16
End If
End If

End Function

Sorry,
Alan Beban

Max

Sub to extract uniques from 200k data in xl03
 
Alan, thanks.

Inserted that. Ran the sub,
it halted at the line: "End Select" within

"...
Case 4
Sheets(2).Range("A:D").Value = Sheets(1).Range("A:D").Value
End Select ... "

--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---



Max

Sub to extract uniques from 200k data in xl03
 
Re-ran the sub again. The run completes, but Sheet2 is blank (no results)
--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---



Alan Beban

Sub to extract uniques from 200k data in xl03
 
Max wrote:
Re-ran the sub again. The run completes, but Sheet2 is blank (no results)


I reran it; it worked as it should. You mention Sheet2; is that in fact
the name of your Sheets(2)?

The test file I range it on had A:D on Sheets(1) filled like this

Set rng = Sheets(1).Range("A:D")
For i = 1 to 65536:For j = 1 to 4
rng(i,j).value = i * j
Next:Next

and arr2 had 158,378 elements that filled A1:C27306 on Sheets(2).

One thing to do is in the abtest1 SubProcedure, after the line

y = x - (x \ z) * z

insert Stop and when the code stops, mouse over x to see that it is
158738, and mouse over y to see that it is 27306.

Let me know.

Alan Beban

Alan Beban

Sub to extract uniques from 200k data in xl03
 
Alan Beban wrote:
Max wrote:
Re-ran the sub again. The run completes, but Sheet2 is blank (no results)


My last email assumed your data was the same as my test file. In your
file, when you mouse over x it should be equal to the number of unique
elements in your data; and y should be the integral part of x/65536,
i.e., 0,1,2,3 or 4.

Alan Beban

Max

Sub to extract uniques from 200k data in xl03
 
I reran it; it worked as it should ...

Alan, many thanks. I re-booted the PC,, and
re-ran it again, and this time round, it *worked*.

And yes, runtime took only around 15** sec, simply amazing !!
**for an estimated 200k worth of items

.. You mention Sheet2; is that in fact the name of your Sheets(2)?


Does Sheets(2) mean the 2nd tab from the left?
If so, yes, I have "Sheet2" -- tabname -- as the 2nd tab from the left

In VBE under Microsoft Excel Objects,
the tab appears as: Sheet2 (Sheet2)

What is the first instance of Sheet2 in the above ? I gather that the 2nd
instance -- within parens -- is the tabname, which can be changed by user.
How could we make use of the 1st instance which apparently cannot be changed
and is more robust? Thanks.
--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---



Max

Sub to extract uniques from 200k data in xl03
 
Alan, sorry, I just posted a response to your earlier reply which crossed
this. It **worked** and took only 15 sec to spit out the results !! Superb.
--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---



Max

Sub to extract uniques from 200k data in xl03
 
James, thanks. Tried running your sub on smaller test set of data and it
works fine. For my needs here I'll be going with Alan's offering in the
other branch which also works as well and is very fast. Nonetheless, I do
appreciate your response/sub.
--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---



Alan Beban

Sub to extract uniques from 200k data in xl03
 
Alan Beban wrote:
Alan Beban wrote:
Max wrote:
Re-ran the sub again. The run completes, but Sheet2 is blank (no
results)


My last email assumed your data was the same as my test file. In your
file, when you mouse over x it should be equal to the number of unique
elements in your data; and y should be the integral part of x/65536,
i.e., 0,1,2,3 or 4.

Alan Beban


Whoops! Must have been too close to my bedtime when I wrote the above. y
is, of course, the number of elements in the rightmost column of the
unique output.

Alan Beban

Tom Ogilvy

Sub to extract uniques from 200k data in xl03
 
It wasn't mentioned because it isn't necessary.

It isn't necessary to check scripting runtime because it uses the builtin
collection rather than starting up another DLL.

==
Regards,
Tom Ogilvy


"Alan Beban" wrote:

Max wrote:
Of course. I'm game.


Copy and paste the following three procedures into a general module in
your workbook; watch for wordwrap, though I think they're clean.

The first returns an array of the unique elements of the input array; by
default it is a case-sensitive, 1-based, vertical array.

The second transposes an array without some of the limitations of the
Excel TRANSPOSE function. It's necessary in this case because the first
function needs to convert the collection of unique elements, which
is a horizontal array in Excel, to a vertical array to match your data.
It's much more general than is required by your inquiry, but it's from
my library so that's what you get.

Then, in the VBEditor, select Tools|References and check Microsoft
Scripting Runtime; I believe this step is also necessary in John
Walkenbach's code cited by Tom Ogilvy, though neither John nor Tom
mentioned it.

Then, assuming that your data is on Sheets(1), and Sheets(2) is
available for the output, run the third SubProcedure, abtest1.

Post to let us know how it comes out.

Function ArrayUniques(InputArray, _
Optional MatchCase As Boolean = True, _
Optional Base_Orient As String = "1vert", _
Optional OmitBlanks As Boolean = True)
'THIS PROCEDURE REQUIRES A PROJECT REFERENCE
'TO "MICROSCOPIC SCRIPTING RUNTIME".
'The function returns an array of unique
'values from an array or range. By default
'it returns a 1-based vertical array; for
'other results enter "0horiz", "1horiz" or
'"0vert" as the third argument. By default,
'the function is case-sensitive; i.e., e.g.,
'"red" and "Red" are treated as two separate
'unique values; to avoid case-sensitivity,
'enter False as the second argument.

'Declare the variables
Dim arr, arr2
Dim i As Long, p As Object, q As String
Dim Elem, x As Dictionary
Dim CalledDirectFromWorksheet As Boolean

'For later use in selecting cells for worksheet output
CalledDirectFromWorksheet = False
If TypeOf Application.Caller Is Range Then
Set p = Application.Caller
q = p.Address
iRows = Range(q).Rows.Count
iCols = Range(q).Columns.Count
If InStr(1, p.FormulaArray, "ArrayUniques") = 2 _
Or InStr(1, p.FormulaArray, "arrayuniques") = 2 _
Or InStr(1, p.FormulaArray, "ARRAYUNIQUES") = 2 Then
CalledDirectFromWorksheet = True
End If
End If

'Convert an input range to a VBA array
arr = InputArray

'Load the unique elements into a Dictionary Object
Set x = New Dictionary
x.CompareMode = Abs(Not MatchCase) '<--Case-sensitivity
On Error Resume Next
For Each Elem In arr
x.Add Item:=Elem, key:=CStr(Elem)
Next
If OmitBlanks Then x.Remove ("")
On Error GoTo 0

'Load a 0-based horizontal array with the unique
'elements from the Dictionary Object
arr2 = x.Items

'This provides appropriate base and orientation
'of the output array
Select Case Base_Orient
Case "0horiz"
arr2 = arr2
Case "1horiz"
ReDim Preserve arr2(1 To UBound(arr2) + 1)
Case "0vert"
arr2 = ArrayTranspose(arr2)
Case "1vert"
ReDim Preserve arr2(1 To UBound(arr2) + 1)
arr2 = ArrayTranspose(arr2)
End Select

'Assure that enough cells are selected to accommodate output
If CalledDirectFromWorksheet Then
If Range(Application.Caller.Address).Count < x.Count Then
ArrayUniques = "Select a range of at least " & _
x.Count & " cells"
Exit Function
End If
End If

ArrayUniques = arr2
End Function

Function ArrayTranspose(InputArray)
'This function returns the transpose of
'the input array or range; it is designed
'to avoid the limitation on the number of
'array elements and type of array that the
'worksheet TRANSPOSE Function has.

'Declare the variables
Dim outputArrayTranspose As Variant, arr As Variant, p As Integer
Dim i As Long, j As Long

'Check to confirm that the input array
'is an array or multicell range
If IsArray(InputArray) Then

'If so, convert an input range to a
'true array
arr = InputArray

'Load the number of dimensions of
'the input array to a variable
On Error Resume Next

'Loop until an error occurs
i = 1
Do
z = UBound(arr, i)
i = i + 1
Loop While Err = 0

'Reset the error value for use with other procedures
Err = 0

'Return the number of dimensions
p = i - 2
End If

If Not IsArray(InputArray) Or p 2 Then
Msg = "#ERROR! The function accepts only multi-cell ranges " & _
"and 1D or 2D arrays."
If TypeOf Application.Caller Is Range Then
ArrayTranspose = Msg
Else
MsgBox Msg, 16
End If
Exit Function
End If

'Load the output array from a one-
'dimensional input array
If p = 1 Then

Select Case TypeName(arr)
Case "Object()"
ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _
LBound(arr) To LBound(arr)) As Object
For i = LBound(outputArrayTranspose) To _
UBound(outputArrayTranspose)
Set outputArrayTranspose(i, _
LBound(outputArrayTranspose)) = _
arr(i)
Next
Case "Boolean()"
ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _
LBound(arr) To LBound(arr)) As Boolean
Case "Byte()"
ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _
LBound(arr) To LBound(arr)) As Byte
Case "Currency()"
ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _
LBound(arr) To LBound(arr)) As Currency
Case "Date()"
ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _
LBound(arr) To LBound(arr)) As Date
Case "Double()"
ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _
LBound(arr) To LBound(arr)) As Double
Case "Integer()"
ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _
LBound(arr) To LBound(arr)) As Integer
Case "Long()"
ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _
LBound(arr) To LBound(arr)) As Long
Case "Single()"
ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _
LBound(arr) To LBound(arr)) As Single
Case "String()"
ReDim outputArrayTranspose(LBound(arr, 2) To _
UBound(arr, 2), _
LBound(arr) To UBound(arr)) As String
Case "Variant()"
ReDim outputArrayTranspose(LBound(arr) To UBound(arr), _
LBound(arr) To LBound(arr)) As Variant
Case Else
Msg = "#ERROR! Only built-in types of arrays " & _
"are supported."
If TypeOf Application.Caller Is Range Then
ArrayTranspose = Msg
Else
MsgBox Msg, 16
End If
Exit Function
End Select
If TypeName(arr) < "Object()" Then
For i = LBound(outputArrayTranspose) To _
UBound(outputArrayTranspose)
outputArrayTranspose(i, _
LBound(outputArrayTranspose)) = arr(i)
Next
End If

'Or load the output array from a two-
'dimensional input array or range
ElseIf p = 2 Then
Select Case TypeName(arr)
Case "Object()"
ReDim outputArrayTranspose(LBound(arr, 2) To _
UBound(arr, 2), _
LBound(arr) To UBound(arr)) As Object
For i = LBound(outputArrayTranspose) To _
UBound(outputArrayTranspose)
For j = LBound(outputArrayTranspose, 2) To _
UBound(outputArrayTranspose, 2)
Set outputArrayTranspose(i, j) = arr(j, i)
Next
Next
Case "Boolean()"
ReDim outputArrayTranspose(LBound(arr, 2) To _
UBound(arr, 2), _
LBound(arr) To UBound(arr)) As Boolean
Case "Byte()"
ReDim outputArrayTranspose(LBound(arr, 2) To _
UBound(arr, 2), _
LBound(arr) To UBound(arr)) As Byte
Case "Currency()"
ReDim outputArrayTranspose(LBound(arr, 2) To _
UBound(arr, 2), _
LBound(arr) To UBound(arr)) As Currency
Case "Date()"
ReDim outputArrayTranspose(LBound(arr, 2) To _
UBound(arr, 2), _
LBound(arr) To UBound(arr)) As Date
Case "Double()"
ReDim outputArrayTranspose(LBound(arr, 2) To _
UBound(arr, 2), _
LBound(arr) To UBound(arr)) As Double
Case "Integer()"
ReDim outputArrayTranspose(LBound(arr, 2) To _
UBound(arr, 2), _
LBound(arr) To UBound(arr)) As Integer
Case "Long()"
ReDim outputArrayTranspose(LBound(arr, 2) To _
UBound(arr, 2), _
LBound(arr) To UBound(arr)) As Long
Case "Single()"
ReDim outputArrayTranspose(LBound(arr, 2) To _
UBound(arr, 2), _
LBound(arr) To UBound(arr)) As Single
Case "String()"
ReDim outputArrayTranspose(LBound(arr, 2) To _
UBound(arr, 2), _
LBound(arr) To UBound(arr)) As String
Case "Variant()"
ReDim outputArrayTranspose(LBound(arr, 2) To _
UBound(arr, 2), _
LBound(arr) To UBound(arr)) As Variant
Case Else
Msg = "#ERROR! Only built-in types of arrays " & _
"are supported."
If TypeOf Application.Caller Is Range Then
ArrayTranspose = Msg
Else
MsgBox Msg, 16
End If
Exit Function
End Select
If TypeName(arr) < "Object()" Then
For i = LBound(outputArrayTranspose) To _
UBound(outputArrayTranspose)
For j = LBound(outputArrayTranspose, 2) To _
UBound(outputArrayTranspose, 2)
outputArrayTranspose(i, j) = arr(j, i)
Next
Next
End If
End If

'Return the transposed array
ArrayTranspose = outputArrayTranspose
End Function

Sub abtest1()
Dim arr1(), arr2(), arrA(), arrB(), arrC(), arrD()
Dim rng As Range
Dim rngA As Range, rngB As Range, rngC As Range, rngD As Range


Alan Beban

Sub to extract uniques from 200k data in xl03
 
Max wrote:
Alan, sorry, I just posted a response to your earlier reply which crossed
this. It **worked** and took only 15 sec to spit out the results !! Superb.


Just to check on how much the array functions from my library slowed
down the process, I coded the following. I didn't time it, but there
doesn't seem to be much difference in execution time. Most of the time
is taken up by loading the dictionary, whether that's directly or within
the ArrayUniques function. And the ArrayTranspose function, which is not
required in the code below, does not seem to add appreciable execution time.

FWIW,
Alan Beban

Sub abtest3()
Dim arr1(), arr2(), arrA(), arrB(), arrC(), arrD()
Dim rng As Range
Dim rngA As Range, rngB As Range, rngC As Range, rngD As Range
Set rng = Sheets(1).Range("A:D")
Set rngA = Sheets(2).Range("A:A")
Set rngB = Sheets(2).Range("B:B")
Set rngC = Sheets(2).Range("C:C")
Set rngD = Sheets(2).Range("D:D")
arr1 = rng
'Load the unique elements into a Dictionary Object
Set x = New dictionary
On Error Resume Next
For Each Elem In arr1
x.Add Item:=Elem, key:=CStr(Elem)
Next
x.Remove ("")
On Error GoTo 0

'Load a 0-based horizontal array with the unique
'elements from the Dictionary Object
arr2 = x.Items
x = UBound(arr2) - LBound(arr2) + 1
z = 65536
y = x - (x \ z) * z
Select Case x \ z
Case 0
Sheets(2).Range("A1:A" & y).Value = arr2
Case 1
ReDim arrA(1 To 65536, 1 To 1)
ReDim arrB(1 To y, 1 To 1)
For i = 1 To z
arrA(i, 1) = arr2(i - 1)
Next
For i = 1 To y
arrB(i, 1) = arr2(i - 1 + z)
Next
Sheets(2).Range("A1:A" & z).Value = arrA
Sheets(2).Range("B1:B" & y).Value = arrB
Case 2
ReDim arrA(1 To z, 1 To 1)
ReDim arrB(1 To z, 1 To 1)
ReDim arrC(1 To y, 1 To 1)
For i = 1 To z
arrA(i, 1) = arr2(i - 1)
arrB(i, 1) = arr2(i - 1 + z)
Next
For i = 1 To y
arrC(i, 1) = arr2(i - 1 + 2 * z)
Next
Sheets(2).Range("A1:A" & z).Value = arrA
Sheets(2).Range("B1:B" & z).Value = arrB
Sheets(2).Range("C1:C" & y).Value = arrC
Case 3
ReDim arrA(1 To z, 1 To 1)
ReDim arrB(1 To z, 1 To 1)
ReDim arrC(1 To z, 1 To 1)
ReDim arrD(1 To y, 1 To 1)
For i = 1 To z
arrA(i, 1) = arr2(i - 1)
arrB(i, 1) = arr2(i - 1 + z)
arrC(i, 1) = arr2(i + 1 + 2 * z)
Next
For i = 1 To y
arrD(i, 1) = arr2(i + 3 * z)
Next
Sheets(2).Range("A1:A" & z).Value = arrA
Sheets(2).Range("B1:B" & z).Value = arrB
Sheets(2).Range("C1:C" & z).Value = arrC
Sheets(2).Range("A1:D" & y).Value = arrD
Case 4
Sheets(2).Range("A1:D" & z).Value = _
Sheets(1).Range("A1:D" & z).Value
End Select
End Sub

Alan Beban

Sub to extract uniques from 200k data in xl03
 

"Alan Beban" wrote:

Then, in the VBEditor, select Tools|References and check Microsoft
Scripting Runtime; I believe this step is also necessary in John
Walkenbach's code cited by Tom Ogilvy, though neither John nor Tom
mentioned it.


Tom Ogilvy wrote:
It wasn't mentioned because it isn't necessary.

It isn't necessary to check scripting runtime because it uses the

builtin
collection rather than starting up another DLL.

==
Regards,
Tom Ogilvy


Thanks. I didn't notice the use of the built-in collection in the
Developer Tip. It raises an interesting question whether the use of a
Dictionary, which could avoid the looping to fill the list box, might be
faster because of the NoDupes.Items property of the Dictionary.

Alan Beban

Alan Beban

Sub to extract uniques from 200k data in xl03
 
Max wrote:
Alan, sorry, I just posted a response to your earlier reply which crossed
this. It **worked** and took only 15 sec to spit out the results !! Superb.


There's a bug in the SubProcedure abtest1 that you are using. If the
number of unique elements is an exact multiple of 65536, then in the
Select Case section of the code, y will equal 0 and the code will throw
an error. You need to modify the code to provide for this case.

Alan Beban

Max

Sub to extract uniques from 200k data in xl03
 
I won't know how to modify the code, Alan
Could you assist on this? Thanks
--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---



Alan Beban

Sub to extract uniques from 200k data in xl03
 
Max wrote:
I won't know how to modify the code, Alan
Could you assist on this? Thanks


I tested only for the case when the number of unique elements is 131072
(2 full columns). I leave it to you to arrange the basic data so that it
has 65536 unique items and 196608 (i.e., 1 full column and 3 full
columns), and test those two cases.

Alan Beban

Sub abtest1()
Dim arr1(), arr2(), arrA(), arrB(), arrC(), arrD()
Dim rng As Range
Dim rngA As Range, rngB As Range, rngC As Range, rngD As Range
Set rng = Sheets(1).Range("A:D")
Set rngA = Sheets(2).Range("A:A")
Set rngB = Sheets(2).Range("B:B")
Set rngC = Sheets(2).Range("C:C")
Set rngD = Sheets(2).Range("D:D")
arr1 = rng
arr2 = ArrayUniques(arr1)
q = ArrayCount(arr2)
z = 65536
y = q - (q \ z) * z
Select Case q \ z
Case 0
If y = 0 Then
MsgBox rng.Address & "has no data."
Else
Sheets(2).Range("A1:A" & y).Value = arr2
End If
Case 1
If y = 0 Then
ReDim arrA(1 To 65536, 1 To 1)
For i = 1 To z
arrA(i, 1) = arr2(i, 1)
Next
Sheets(2).Range("A1:A" & z).Value = arrA
Else
ReDim arrA(1 To 65536, 1 To 1)
ReDim arrB(1 To y, 1 To 1)
For i = 1 To z
arrA(i, 1) = arr2(i, 1)
Next
For i = 1 To y
arrB(i, 1) = arr2(i + z, 1)
Next
Sheets(2).Range("A1:A" & z).Value = arrA
Sheets(2).Range("B1:B" & y).Value = arrB
End If
Case 2
If y = 0 Then
ReDim arrA(1 To z, 1 To 1)
ReDim arrB(1 To z, 1 To 1)
For i = 1 To z
arrA(i, 1) = arr2(i, 1)
arrB(i, 1) = arr2(i + z, 1)
Next
Sheets(2).Range("A1:A" & z).Value = arrA
Sheets(2).Range("B1:B" & z).Value = arrB
Else
ReDim arrA(1 To z, 1 To 1)
ReDim arrB(1 To z, 1 To 1)
ReDim arrC(1 To y, 1 To 1)
For i = 1 To z
arrA(i, 1) = arr2(i, 1)
arrB(i, 1) = arr2(i + z, 1)
Next
For i = 1 To y
arrC(i, 1) = arr2(i + 2 * z, 1)
Next
Sheets(2).Range("A1:A" & z).Value = arrA
Sheets(2).Range("B1:B" & z).Value = arrB
Sheets(2).Range("C1:C" & y).Value = arrC
End If
Case 3
If y = 0 Then
ReDim arrA(1 To z, 1 To 1)
ReDim arrB(1 To z, 1 To 1)
ReDim arrC(1 To z, 1 To 1)
For i = 1 To z
arrA(i, 1) = arr2(i, 1)
arrB(i, 1) = arr2(i + z, 1)
arrC(i, 1) = arr2(i + 2 * z, 1)
Next
Sheets(2).Range("A1:A" & z).Value = arrA
Sheets(2).Range("B1:B" & z).Value = arrB
Sheets(2).Range("C1:C" & z).Value = arrC
Else
ReDim arrA(1 To z, 1 To 1)
ReDim arrB(1 To z, 1 To 1)
ReDim arrC(1 To z, 1 To 1)
ReDim arrD(1 To y, 1 To 1)
For i = 1 To z
arrA(i, 1) = arr2(i, 1)
arrB(i, 1) = arr2(i + z, 1)
arrC(i, 1) = arr2(i + 2 * z, 1)
Next
For i = 1 To y
arrD(i, 1) = arr2(i + 3 * z, 1)
Next
Sheets(2).Range("A1:A" & z).Value = arrA
Sheets(2).Range("B1:B" & z).Value = arrB
Sheets(2).Range("C1:C" & z).Value = arrC
Sheets(2).Range("A1:D" & y).Value = arrD
End If
Case 4
Sheets(2).Range("A1:D" & z).Value = _
Sheets(1).Range("A1:D" & z).Value
End Select
End Sub

Max

Sub to extract uniques from 200k data in xl03
 
Many thanks, Alan.
--
Max
Singapore
http://savefile.com/projects/236895
xdemechanik
---




All times are GMT +1. The time now is 05:49 AM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
ExcelBanter.com