Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.programming
|
|||
|
|||
Sorting a string array
I have been reading thru various post on the forum about sorting large arrays
and most of the post seem to be about integer arrays. I have found some links and comments about string arrays, but none of them are exactly what I'm looking for and I'm having some trouble re-writing the code to get them to work. Basically, I'm needing a routine that can sort a string array containing approx 1.5 million elements. As I said, I have had a look around and a quick sort seems to be the best method to use for this size of array. I have the code for a quick sort written by Jim Rech in 1998, but this is for a 2 dimensional array, where mine is just a single dimension. If I can get the array sorted, then this is going to make processing of the input file alot quicker. Hopefully some one can help. Many thanks in advance... -- Cheers... |
#2
Posted to microsoft.public.excel.programming
|
|||
|
|||
Sorting a string array
Function QuickSortString(arrString() As String, _
Optional lLow1 = -1, _ Optional lhigh1 = -1) 'Dimension variables Dim lLow2 As Long Dim lhigh2 As Long Dim strVal1 As String Dim strVal2 As String 'If first time, get the size of the array to sort If lLow1 = -1 Then lLow1 = LBound(arrString, 1) End If If lhigh1 = -1 Then lhigh1 = UBound(arrString, 1) End If 'Set new extremes to old extremes lLow2 = lLow1 lhigh2 = lhigh1 'Get value of array item in middle of new extremes strVal1 = arrString((lLow1 + lhigh1) / 2) 'Loop for all the items in the array between the extremes While (lLow2 <= lhigh2) 'Find the first item that is greater than the mid-point item While (arrString(lLow2) < strVal1 And lLow2 < lhigh1) lLow2 = lLow2 + 1 Wend 'Find the last item that is less than the mid-point item While (arrString(lhigh2) strVal1 And lhigh2 lLow1) lhigh2 = lhigh2 - 1 Wend 'If the new 'greater' item comes before the new 'less' item, swap them If (lLow2 <= lhigh2) Then strVal2 = arrString(lLow2) arrString(lLow2) = arrString(lhigh2) arrString(lhigh2) = strVal2 'Advance the pointers to the next item lLow2 = lLow2 + 1 lhigh2 = lhigh2 - 1 End If Wend 'Iterate to sort the lower half of the extremes If (lhigh2 lLow1) Then Call QuickSortString(arrString, lLow1, lhigh2) 'Iterate to sort the upper half of the extremes If (lLow2 < lhigh1) Then Call QuickSortString(arrString, lLow2, lhigh1) QuickSortString = arrString End Function There is actually a way to do this much faster, but that needs coding in VB6. Will see if I can post it later. RBS "Deke" wrote in message ... I have been reading thru various post on the forum about sorting large arrays and most of the post seem to be about integer arrays. I have found some links and comments about string arrays, but none of them are exactly what I'm looking for and I'm having some trouble re-writing the code to get them to work. Basically, I'm needing a routine that can sort a string array containing approx 1.5 million elements. As I said, I have had a look around and a quick sort seems to be the best method to use for this size of array. I have the code for a quick sort written by Jim Rech in 1998, but this is for a 2 dimensional array, where mine is just a single dimension. If I can get the array sorted, then this is going to make processing of the input file alot quicker. Hopefully some one can help. Many thanks in advance... -- Cheers... |
#3
Posted to microsoft.public.excel.programming
|
|||
|
|||
Sorting a string array
This one is a lot faster, but needs coding in VB6 as it needs all the fast
compiler options. Don't think it is that much faster if you code in VBA. Got this code from Olaf Schmidt. Option Explicit '(native compiled, all options this Sort needs 0.19 sec for 20000 'Random Strings of 200'er length on a PIII 500). Private Type SAFEARRAY1D cDims As Integer fFeatures As Integer cbElements As Long cLocks As Long pvData As Long cElements As Long lLbound As Long End Type Private Declare Sub BindArray Lib "kernel32" Alias "RtlMoveMemory" _ (pArr() As Any, PSrc&, Optional ByVal cb& = 4) Private Declare Sub ReleaseArray Lib "kernel32" Alias "RtlMoveMemory" _ (pArr() As Any, _ Optional PSrc& = 0, _ Optional ByVal cb& = 4) Public Sub QSort1DP(Arr() As String) 'Dim Compare As String Dim i As Long Dim j As Long Dim Lo As Long Dim Hi As Long Dim StPtr As Long Dim V(0) As String Dim pV() As Long Dim sapV As SAFEARRAY1D Dim pArr() As Long Dim sapArr As SAFEARRAY1D Dim p As Long Dim StSize As Long Dim StLo() As Long Dim StHi() As Long StSize = 255 ReDim StLo(StSize) ReDim StHi(StSize) 'init the stack On Error Resume Next 'spans a Long-Array (pArr()) over the StringPointers in Arr() sapArr.cDims = 1 sapArr.cbElements = 4 'Bytes used by each StrPointer sapArr.pvData = VarPtr(Arr(0)) sapArr.cElements = UBound(Arr) - LBound(Arr) + 1 If Err Then Err.Clear Exit Sub 'Arr was not initialized End If On Error GoTo 0 'switch off Err-Handler for speed-reasons BindArray pArr, VarPtr(sapArr) 'another Array, used to hold only one single String, 'respective its pointer for reasons of comparing inside the algo sapV.cDims = 1 sapV.cbElements = 4 sapV.pvData = VarPtr(V(0)) sapV.cElements = 1 BindArray pV, VarPtr(sapV) StPtr = 1 'init the StackPointer StLo(0) = LBound(Arr) StHi(0) = UBound(Arr) Do StPtr = StPtr - 1 Lo = StLo(StPtr) Hi = StHi(StPtr) If Hi - Lo < 12 Then 'MinSort For Lo = Lo To Hi - 1 j = Lo For i = Lo + 1 To Hi If Arr(i) < Arr(j) Then j = i Next i If j < Lo Then p = pArr(j): pArr(j) = pArr(Lo): pArr(Lo) = p End If Next Lo Else 'QSort Do i = Lo: j = Hi pV(0) = pArr((Lo + Hi) \ 2) Do Do While Arr(i) < V(0) i = i + 1 Loop Do While Arr(j) V(0) j = j - 1 Loop If i <= j Then p = pArr(i) pArr(i) = pArr(j) pArr(j) = p i = i + 1 j = j - 1 End If Loop While i <= j If j - Lo < Hi - i Then If i < Hi Then StLo(StPtr) = i StHi(StPtr) = Hi StPtr = StPtr + 1 If StPtr = StSize Then StSize = StSize + StSize ReDim Preserve StLo(StSize) ReDim Preserve StHi(StSize) End If End If Hi = j Else If Lo < j Then StLo(StPtr) = Lo StHi(StPtr) = j StPtr = StPtr + 1 If StPtr = StSize Then StSize = StSize + StSize ReDim Preserve StLo(StSize) ReDim Preserve StHi(StSize) End If End If Lo = i End If Loop While Lo < Hi End If Loop While StPtr pV(0) = 0 'don't dispose the current String-Content of V(0) ReleaseArray pV 'release the Array-Mapping between V() and pV() ReleaseArray pArr 'relase the Array-Mapping between Arr() and pArr() End Sub RBS "Deke" wrote in message ... I have been reading thru various post on the forum about sorting large arrays and most of the post seem to be about integer arrays. I have found some links and comments about string arrays, but none of them are exactly what I'm looking for and I'm having some trouble re-writing the code to get them to work. Basically, I'm needing a routine that can sort a string array containing approx 1.5 million elements. As I said, I have had a look around and a quick sort seems to be the best method to use for this size of array. I have the code for a quick sort written by Jim Rech in 1998, but this is for a 2 dimensional array, where mine is just a single dimension. If I can get the array sorted, then this is going to make processing of the input file alot quicker. Hopefully some one can help. Many thanks in advance... -- Cheers... |
#4
Posted to microsoft.public.excel.programming
|
|||
|
|||
Sorting a string array
It is in fact a lot faster in VBA as well, have just tested.
It needs some adjustments though to work with a 1-based array, which I haven't worked out yet. Fine though with a 0-based array. RBS |
#5
Posted to microsoft.public.excel.programming
|
|||
|
|||
Sorting a string array
Hi,
Thank for the reply. Got both the bits of code to work perfectly first time. The VB script is a lot faster and is exactly what I'm needing, I've got the processing time down from 20-30 min's to seconds. Again, thank's for all your help, it was exactly what I was needing... -- Cheers... " wrote: It is in fact a lot faster in VBA as well, have just tested. It needs some adjustments though to work with a 1-based array, which I haven't worked out yet. Fine though with a 0-based array. RBS |
#6
Posted to microsoft.public.excel.programming
|
|||
|
|||
Sorting a string array
This will work with both an 0-based array and a 1-based array:
Public Sub QSort1DStringArrayP(arrString() As String) 'Dim Compare As String Dim i As Long Dim j As Long Dim Lo As Long Dim Hi As Long Dim StPtr As Long Dim V(0) As String Dim pV() As Long Dim sapV As SAFEARRAY1D Dim pArr() As Long Dim sapArr As SAFEARRAY1D Dim p As Long Dim StSize As Long Dim StLo() As Long Dim StHi() As Long StSize = 255 ReDim StLo(StSize) ReDim StHi(StSize) 'init the stack On Error Resume Next 'spans a Long-Array (pArr()) over the StringPointers in Arr() sapArr.cDims = 1 sapArr.cbElements = 4 'Bytes used by each StrPointer sapArr.lLbound = LBound(arrString) sapArr.pvData = VarPtr(arrString(sapArr.lLbound)) sapArr.cElements = UBound(arrString) - LBound(arrString) + 1 If Err Then Err.Clear Exit Sub 'Arr was not initialized End If On Error GoTo 0 'switch off Err-Handler for speed-reasons BindArray pArr, VarPtr(sapArr) 'another Array, used to hold only one single String, 'respective its pointer for reasons of comparing inside the algo sapV.cDims = 1 sapV.cbElements = 4 sapV.pvData = VarPtr(V(0)) sapV.cElements = 1 BindArray pV, VarPtr(sapV) StPtr = 1 'init the StackPointer StLo(0) = LBound(arrString) StHi(0) = UBound(arrString) Do StPtr = StPtr - 1 Lo = StLo(StPtr) Hi = StHi(StPtr) If Hi - Lo < 12 Then 'MinSort For Lo = Lo To Hi - 1 j = Lo For i = Lo + 1 To Hi If arrString(i) < arrString(j) Then j = i Next i If j < Lo Then p = pArr(j): pArr(j) = pArr(Lo): pArr(Lo) = p End If Next Lo Else 'QSort Do i = Lo: j = Hi pV(0) = pArr((Lo + Hi) \ 2) Do Do While arrString(i) < V(0) i = i + 1 Loop Do While arrString(j) V(0) j = j - 1 Loop If i <= j Then p = pArr(i) pArr(i) = pArr(j) pArr(j) = p i = i + 1 j = j - 1 End If Loop While i <= j If j - Lo < Hi - i Then If i < Hi Then StLo(StPtr) = i StHi(StPtr) = Hi StPtr = StPtr + 1 If StPtr = StSize Then StSize = StSize + StSize ReDim Preserve StLo(StSize) ReDim Preserve StHi(StSize) End If End If Hi = j Else If Lo < j Then StLo(StPtr) = Lo StHi(StPtr) = j StPtr = StPtr + 1 If StPtr = StSize Then StSize = StSize + StSize ReDim Preserve StLo(StSize) ReDim Preserve StHi(StSize) End If End If Lo = i End If Loop While Lo < Hi End If Loop While StPtr pV(0) = 0 'don't dispose the current String-Content of V(0) ReleaseArray pV 'release the Array-Mapping between V() and pV() ReleaseArray pArr 'relase the Array-Mapping between Arr() and pArr() End Sub The declarations etc. are just the same as before. It will be faster if you could code in VB6 (worth buying) as you have all the fast compiler options that are not available in VBA, but even in VBA this is very fast. RBS "Deke" wrote in message ... Hi, Thank for the reply. Got both the bits of code to work perfectly first time. The VB script is a lot faster and is exactly what I'm needing, I've got the processing time down from 20-30 min's to seconds. Again, thank's for all your help, it was exactly what I was needing... -- Cheers... " wrote: It is in fact a lot faster in VBA as well, have just tested. It needs some adjustments though to work with a 1-based array, which I haven't worked out yet. Fine though with a 0-based array. RBS |
#7
Posted to microsoft.public.excel.programming
|
|||
|
|||
Sorting a string array
I was able to re-write my code to use a 0-based array, so the original code
works perfectly. Thanks again for youe help, got the processing time for the file down from 30-ish minutes to 3 minutes, which is alot more reasonable. -- Cheers... "RB Smissaert" wrote: This will work with both an 0-based array and a 1-based array: Public Sub QSort1DStringArrayP(arrString() As String) 'Dim Compare As String Dim i As Long Dim j As Long Dim Lo As Long Dim Hi As Long Dim StPtr As Long Dim V(0) As String Dim pV() As Long Dim sapV As SAFEARRAY1D Dim pArr() As Long Dim sapArr As SAFEARRAY1D Dim p As Long Dim StSize As Long Dim StLo() As Long Dim StHi() As Long StSize = 255 ReDim StLo(StSize) ReDim StHi(StSize) 'init the stack On Error Resume Next 'spans a Long-Array (pArr()) over the StringPointers in Arr() sapArr.cDims = 1 sapArr.cbElements = 4 'Bytes used by each StrPointer sapArr.lLbound = LBound(arrString) sapArr.pvData = VarPtr(arrString(sapArr.lLbound)) sapArr.cElements = UBound(arrString) - LBound(arrString) + 1 If Err Then Err.Clear Exit Sub 'Arr was not initialized End If On Error GoTo 0 'switch off Err-Handler for speed-reasons BindArray pArr, VarPtr(sapArr) 'another Array, used to hold only one single String, 'respective its pointer for reasons of comparing inside the algo sapV.cDims = 1 sapV.cbElements = 4 sapV.pvData = VarPtr(V(0)) sapV.cElements = 1 BindArray pV, VarPtr(sapV) StPtr = 1 'init the StackPointer StLo(0) = LBound(arrString) StHi(0) = UBound(arrString) Do StPtr = StPtr - 1 Lo = StLo(StPtr) Hi = StHi(StPtr) If Hi - Lo < 12 Then 'MinSort For Lo = Lo To Hi - 1 j = Lo For i = Lo + 1 To Hi If arrString(i) < arrString(j) Then j = i Next i If j < Lo Then p = pArr(j): pArr(j) = pArr(Lo): pArr(Lo) = p End If Next Lo Else 'QSort Do i = Lo: j = Hi pV(0) = pArr((Lo + Hi) \ 2) Do Do While arrString(i) < V(0) i = i + 1 Loop Do While arrString(j) V(0) j = j - 1 Loop If i <= j Then p = pArr(i) pArr(i) = pArr(j) pArr(j) = p i = i + 1 j = j - 1 End If Loop While i <= j If j - Lo < Hi - i Then If i < Hi Then StLo(StPtr) = i StHi(StPtr) = Hi StPtr = StPtr + 1 If StPtr = StSize Then StSize = StSize + StSize ReDim Preserve StLo(StSize) ReDim Preserve StHi(StSize) End If End If Hi = j Else If Lo < j Then StLo(StPtr) = Lo StHi(StPtr) = j StPtr = StPtr + 1 If StPtr = StSize Then StSize = StSize + StSize ReDim Preserve StLo(StSize) ReDim Preserve StHi(StSize) End If End If Lo = i End If Loop While Lo < Hi End If Loop While StPtr pV(0) = 0 'don't dispose the current String-Content of V(0) ReleaseArray pV 'release the Array-Mapping between V() and pV() ReleaseArray pArr 'relase the Array-Mapping between Arr() and pArr() End Sub The declarations etc. are just the same as before. It will be faster if you could code in VB6 (worth buying) as you have all the fast compiler options that are not available in VBA, but even in VBA this is very fast. RBS "Deke" wrote in message ... Hi, Thank for the reply. Got both the bits of code to work perfectly first time. The VB script is a lot faster and is exactly what I'm needing, I've got the processing time down from 20-30 min's to seconds. Again, thank's for all your help, it was exactly what I was needing... -- Cheers... " wrote: It is in fact a lot faster in VBA as well, have just tested. It needs some adjustments though to work with a 1-based array, which I haven't worked out yet. Fine though with a 0-based array. RBS |
#8
Posted to microsoft.public.excel.programming
|
|||
|
|||
Sorting a string array
I was able to re-write my code to use a 0-based array, so the original code
works perfectly. Thanks again for youe help, got the processing time for the file down from 30-ish minutes to 3 minutes, which is alot more reasonable. -- Cheers... "RB Smissaert" wrote: This will work with both an 0-based array and a 1-based array: Public Sub QSort1DStringArrayP(arrString() As String) 'Dim Compare As String Dim i As Long Dim j As Long Dim Lo As Long Dim Hi As Long Dim StPtr As Long Dim V(0) As String Dim pV() As Long Dim sapV As SAFEARRAY1D Dim pArr() As Long Dim sapArr As SAFEARRAY1D Dim p As Long Dim StSize As Long Dim StLo() As Long Dim StHi() As Long StSize = 255 ReDim StLo(StSize) ReDim StHi(StSize) 'init the stack On Error Resume Next 'spans a Long-Array (pArr()) over the StringPointers in Arr() sapArr.cDims = 1 sapArr.cbElements = 4 'Bytes used by each StrPointer sapArr.lLbound = LBound(arrString) sapArr.pvData = VarPtr(arrString(sapArr.lLbound)) sapArr.cElements = UBound(arrString) - LBound(arrString) + 1 If Err Then Err.Clear Exit Sub 'Arr was not initialized End If On Error GoTo 0 'switch off Err-Handler for speed-reasons BindArray pArr, VarPtr(sapArr) 'another Array, used to hold only one single String, 'respective its pointer for reasons of comparing inside the algo sapV.cDims = 1 sapV.cbElements = 4 sapV.pvData = VarPtr(V(0)) sapV.cElements = 1 BindArray pV, VarPtr(sapV) StPtr = 1 'init the StackPointer StLo(0) = LBound(arrString) StHi(0) = UBound(arrString) Do StPtr = StPtr - 1 Lo = StLo(StPtr) Hi = StHi(StPtr) If Hi - Lo < 12 Then 'MinSort For Lo = Lo To Hi - 1 j = Lo For i = Lo + 1 To Hi If arrString(i) < arrString(j) Then j = i Next i If j < Lo Then p = pArr(j): pArr(j) = pArr(Lo): pArr(Lo) = p End If Next Lo Else 'QSort Do i = Lo: j = Hi pV(0) = pArr((Lo + Hi) \ 2) Do Do While arrString(i) < V(0) i = i + 1 Loop Do While arrString(j) V(0) j = j - 1 Loop If i <= j Then p = pArr(i) pArr(i) = pArr(j) pArr(j) = p i = i + 1 j = j - 1 End If Loop While i <= j If j - Lo < Hi - i Then If i < Hi Then StLo(StPtr) = i StHi(StPtr) = Hi StPtr = StPtr + 1 If StPtr = StSize Then StSize = StSize + StSize ReDim Preserve StLo(StSize) ReDim Preserve StHi(StSize) End If End If Hi = j Else If Lo < j Then StLo(StPtr) = Lo StHi(StPtr) = j StPtr = StPtr + 1 If StPtr = StSize Then StSize = StSize + StSize ReDim Preserve StLo(StSize) ReDim Preserve StHi(StSize) End If End If Lo = i End If Loop While Lo < Hi End If Loop While StPtr pV(0) = 0 'don't dispose the current String-Content of V(0) ReleaseArray pV 'release the Array-Mapping between V() and pV() ReleaseArray pArr 'relase the Array-Mapping between Arr() and pArr() End Sub The declarations etc. are just the same as before. It will be faster if you could code in VB6 (worth buying) as you have all the fast compiler options that are not available in VBA, but even in VBA this is very fast. RBS "Deke" wrote in message ... Hi, Thank for the reply. Got both the bits of code to work perfectly first time. The VB script is a lot faster and is exactly what I'm needing, I've got the processing time down from 20-30 min's to seconds. Again, thank's for all your help, it was exactly what I was needing... -- Cheers... " wrote: It is in fact a lot faster in VBA as well, have just tested. It needs some adjustments though to work with a 1-based array, which I haven't worked out yet. Fine though with a 0-based array. RBS |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
sorting text in a string | Excel Programming | |||
Sorting By Largest Value In A String | Excel Programming | |||
Passing a String in Array to Range as String | Excel Programming | |||
string sorting problem | Excel Programming | |||
sorting data in string | Excel Programming |