Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.programming
|
|||
|
|||
String Arrays
I need a utility that has to do an enormous number of string comparisons in
arrays based on a text file. Ive written something in VBA that does what I need but it is painfully slow. I think the bulk of the time is spent on the string searching itself. This is the function Im using: pblic Function SequentialSearchStringArray(ByRef sArray() As String, ByVal sFind As String) As Long Dim i As Long Dim iLBound As Long Dim iUBound As Long Dim astrFilter() As String 'Quick check - filter the current array to see if the value matches. astrFilter = Filter(sArray, sFind) 'Unfortuantly this doesn't tell us where the match is so we then have to do a search to find it If UBound(astrFilter) < 0 Then SequentialSearchStringArray = -1 Else iLBound = LBound(sArray) iUBound = UBound(sArray) For i = iLBound To iUBound If sArray(i) = sFind Then SequentialSearchStringArray = i: Exit Function Next i End If Can anyone come up with a way of really speeding this up? Is there a way of treating each string as array of ints or something rather than doing the endless casting. Ideas, code re-writes etc, greatly appreciated. |
#2
Posted to microsoft.public.excel.programming
|
|||
|
|||
String Arrays
Chuckles,
The general idea is to: 1: Read your strings from the text file and write them to an array 2: Do a bubble sort on the array to get them into alphabetical order, if the file was not already sorted 3: Use a binary search algorithm to search your array for the match I have written code that does the above with the exception of step 2 (since I have a sorted file). See the code below, which you can modify as needed (And which you _will_ need to modify - the binary search also returns some flags (B and N), since it is written to look for 'B'eginning partial matches and 'N'o matches, but you'll get the idea.) Some of the variables may not be needed, but I'm too lazy to sort out which ones actually are needed.... HTH, Bernie MS Excel MVP Option Explicit Option Base 1 Public DictArray() As String Public myFCount As Integer Public myFound() As String Public myTemp As String Public Counter As Double Public FoundPath As Boolean Public tmpStr As String Public myRet As Long Public FileLoaded As Boolean Sub LoadFile() 'This subroutine loads the dictionary file into an array Dim FileNum As Integer 'Check to see if the file is already in memory If FileLoaded Then Exit Sub 'First Get Next Available File Handle Number FileNum = FreeFile() 'Open Dictionary Text File For Input Open ThisWorkbook.Path & "\" & Range("DictName").Value For Input As #FileNum 'Set The Counter to 1 Counter = 0 'Loop Until the End Of File Is Reached Do While Seek(FileNum) <= LOF(FileNum) 'Store One Line Of Text From File To array Counter = Counter + 1 ReDim Preserve DictArray(Counter) Line Input #FileNum, DictArray(Counter) 'Increment the Counter By 1 'Start Again At Top Of 'Do While' Statement Loop 'Close The Open Text File Close 'Set the flag to indicate the file is loaded FileLoaded = True End Sub Function BinaryWordMatch(FindVal As Variant, _ ByVal FirstIndex As Long, _ ByVal LastIndex As Long) As Variant 'Uses binary search routine to compare words to dictionary 'Assumes dictionary file as read into array is sorted in ascending order Dim TempVal As String Dim lngIndex As Long Dim lngIndexPrevious As Long 'Check for obvious cases: 'The string is less than the first entry 'of the dictionary If (FindVal < DictArray(FirstIndex)) Then 'If the letters begin the first word, then indicate and continue If Left(DictArray(FirstIndex), Len(FindVal)) = FindVal Then BinaryWordMatch = "B" Exit Function End If 'If the letters don't begin the first word, then 'FindVal is not in the list, so can stop looking BinaryWordMatch = "N" Exit Function End If 'Obvious case #2 'The word is beyond the last word If (FindVal DictArray(LastIndex)) Then 'FindVal is not in the list, so stop looking BinaryWordMatch = "N" Exit Function End If 'Obvious case #3 - the word is the last in the dictionary If FindVal = DictArray(LastIndex) Then BinaryWordMatch = LastIndex Exit Function End If 'Obvious case #4 - the word is the first in the dictionary If FindVal = DictArray(FirstIndex) Then 'Also check to see if it also starts the next entry If Left(DictArray(FirstIndex + 1), Len(FindVal)) = FindVal Then BinaryWordMatch = BinaryWordMatch & " " & "B" Exit Function End If BinaryWordMatch = FirstIndex Exit Function End If 'now more dificult cases! lngIndexPrevious = -1 Do lngIndex = Int((FirstIndex + LastIndex) / 2) 'If lngIndex is the same as the previous time, 'we have converged without finding value 'First check for word beginnings, if none found then exit loop If lngIndex = lngIndexPrevious Then If Left(DictArray(lngIndex - 1), Len(FindVal)) = FindVal Then 'search string begins the previous word BinaryWordMatch = "B" Exit Function End If If Left(DictArray(lngIndex + 1), Len(FindVal)) = FindVal Then 'search string begins the next word BinaryWordMatch = "B" Exit Function End If If Left(DictArray(lngIndex), Len(FindVal)) = FindVal Then 'search string begins the current word BinaryWordMatch = "B" Exit Function End If Exit Do End If lngIndexPrevious = lngIndex TempVal = CStr(DictArray(lngIndex)) 'Has the value been found? If TempVal = FindVal Then BinaryWordMatch = lngIndex 'Check to see if it also starts the next dictionary entry If Left(DictArray(lngIndex + 1), Len(FindVal)) = FindVal Then BinaryWordMatch = BinaryWordMatch & " " & "B" Exit Function End If Exit Function End If 'Determine which half of list to discard? If TempVal < FindVal Then FirstIndex = lngIndex Else LastIndex = lngIndex End If Loop 'Indicate that no match of any type has been found BinaryWordMatch = "N" End Function "chuckles" wrote in message ... I need a utility that has to do an enormous number of string comparisons in arrays based on a text file. I've written something in VBA that does what I need but it is painfully slow. I think the bulk of the time is spent on the string searching itself. This is the function I'm using: pblic Function SequentialSearchStringArray(ByRef sArray() As String, ByVal sFind As String) As Long Dim i As Long Dim iLBound As Long Dim iUBound As Long Dim astrFilter() As String 'Quick check - filter the current array to see if the value matches. astrFilter = Filter(sArray, sFind) 'Unfortuantly this doesn't tell us where the match is so we then have to do a search to find it If UBound(astrFilter) < 0 Then SequentialSearchStringArray = -1 Else iLBound = LBound(sArray) iUBound = UBound(sArray) For i = iLBound To iUBound If sArray(i) = sFind Then SequentialSearchStringArray = i: Exit Function Next i End If Can anyone come up with a way of really speeding this up? Is there a way of treating each string as array of ints or something rather than doing the endless casting. Ideas, code re-writes etc, greatly appreciated. |
#3
Posted to microsoft.public.excel.programming
|
|||
|
|||
String Arrays
Alternatively to Bernie's solution..
use a dictionary object from the Scripting runtime library. iso storing your data in arrays you store them in dictionaries. iso search the array you simply retrieve the data (and position) by their key. I've found Dictionary to be faster than vba's Collection. plus it has advantage of CaseSensitivity, the possibility to read (and change) the index. Also you can fill an array from a dictionary by retrieving either the Keys or Index arrays, unlike Collections where you'd need a loop. -- keepITcool | www.XLsupport.com | keepITcool chello nl | amsterdam chuckles wrote : I need a utility that has to do an enormous number of string comparisons in arrays based on a text file. Ive written something in VBA that does what I need but it is painfully slow. I think the bulk of the time is spent on the string searching itself. This is the function Im using: pblic Function SequentialSearchStringArray(ByRef sArray() As String, ByVal sFind As String) As Long Dim i As Long Dim iLBound As Long Dim iUBound As Long Dim astrFilter() As String 'Quick check - filter the current array to see if the value matches. astrFilter = Filter(sArray, sFind) 'Unfortuantly this doesn't tell us where the match is so we then have to do a search to find it If UBound(astrFilter) < 0 Then SequentialSearchStringArray = -1 Else iLBound = LBound(sArray) iUBound = UBound(sArray) For i = iLBound To iUBound If sArray(i) = sFind Then SequentialSearchStringArray = i: Exit Function Next i End If Can anyone come up with a way of really speeding this up? Is there a way of treating each string as array of ints or something rather than doing the endless casting. Ideas, code re-writes etc, greatly appreciated. |
#4
Posted to microsoft.public.excel.programming
|
|||
|
|||
String Arrays
Many thanks both of you. Will try out these techniques and see how I get on.
Chuckles "keepITcool" wrote: Alternatively to Bernie's solution.. use a dictionary object from the Scripting runtime library. iso storing your data in arrays you store them in dictionaries. iso search the array you simply retrieve the data (and position) by their key. I've found Dictionary to be faster than vba's Collection. plus it has advantage of CaseSensitivity, the possibility to read (and change) the index. Also you can fill an array from a dictionary by retrieving either the Keys or Index arrays, unlike Collections where you'd need a loop. -- keepITcool | www.XLsupport.com | keepITcool chello nl | amsterdam chuckles wrote : I need a utility that has to do an enormous number of string comparisons in arrays based on a text file. Ive written something in VBA that does what I need but it is painfully slow. I think the bulk of the time is spent on the string searching itself. This is the function Im using: pblic Function SequentialSearchStringArray(ByRef sArray() As String, ByVal sFind As String) As Long Dim i As Long Dim iLBound As Long Dim iUBound As Long Dim astrFilter() As String 'Quick check - filter the current array to see if the value matches. astrFilter = Filter(sArray, sFind) 'Unfortuantly this doesn't tell us where the match is so we then have to do a search to find it If UBound(astrFilter) < 0 Then SequentialSearchStringArray = -1 Else iLBound = LBound(sArray) iUBound = UBound(sArray) For i = iLBound To iUBound If sArray(i) = sFind Then SequentialSearchStringArray = i: Exit Function Next i End If Can anyone come up with a way of really speeding this up? Is there a way of treating each string as array of ints or something rather than doing the endless casting. Ideas, code re-writes etc, greatly appreciated. |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
Change 3 letter text string to a number string | Excel Discussion (Misc queries) | |||
counting the number of instances of a string within another string | Excel Worksheet Functions | |||
Multiplying to string arrays | Excel Discussion (Misc queries) | |||
to search for a string and affect data if it finds the string? | Excel Worksheet Functions | |||
Create a formula into a String then assign string to a cell | Excel Programming |