Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 44
Default Find & Replace in Multi-Megabyte Strings

This is a re-post of "Batch Replace function for Large strings"

String processing in VBA is very slow when strings are large (1-500
MB). I have a
function that I've been using for doing batch replace ops (in the above
referenced previous post), but
it chokes on large strings with many replacements to do (like 50),
because it has to do 50 passes of the string to perform the
replacements.

Has anyone written a fast function designed to do the same thing for
large strings but only makes one pass through the data?

  #2   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 2,253
Default Find & Replace in Multi-Megabyte Strings



Rex Avery... REGEX Avery should be your name!


Have a look at regex. (regular expressions)

Following test builds a 28million character teststring, it will still
loop thru all the replacements, but regex replace is a "trifle" faster
than vba's.

Following example takes 20 seconds for the 6 replacements.
(but note that each replacement occurs 1 million times!

The example uses late bound code, it's a little bit faster when you
reference "Microsoft VBScript Regular Expression 5.5" and dim the rgx
variable as RegExp.

You'll have to learn some regex patterns though!
http://msdn.microsoft.com/library/de.../en-us/script5
6/html/vspropattern.asp

or try and buy RegExBuddy. (be aware that regexbuddy supports more
advanced flavors than vbscripts regex)

Sub TestRegExReplace()

Dim str$, arrPat, arrRep

Dim rgx As Object
Dim t!, n&
'This is slow but nevermind..
'it builds a string of 28 000 000 chars.

str = "This is the string I have." & vbLf
For n = 1 To 20
str = str & str
Next
Debug.Print Len(str)

arrPat = Array("This", "is", "the", "string", "I", "have")
arrRep = Array("That", "was", "that", "text", "you", "had")

t = Timer
Set rgx = CreateObject("vbscript.regexp")
With rgx
.Global = True
.IgnoreCase = True
For n = LBound(arrPat) To UBound(arrPat)
.Pattern = "\b" & arrPat(n) & "\b"
str = .Replace(str, arrRep(n))
Next
End With

t = Timer - t
MsgBox Left(str, InStr(50, str, vbLf)) & "in " & t & "seconds"

End Sub


HTH... (but i'm fairly sure it does)




--
keepITcool
| www.XLsupport.com | keepITcool chello nl | amsterdam


R Avery wrote :

This is a re-post of "Batch Replace function for Large strings"

String processing in VBA is very slow when strings are large (1-500
MB). I have a
function that I've been using for doing batch replace ops (in the
above referenced previous post), but
it chokes on large strings with many replacements to do (like 50),
because it has to do 50 passes of the string to perform the
replacements.

Has anyone written a fast function designed to do the same thing for
large strings but only makes one pass through the data?

  #3   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 44
Default Find & Replace in Multi-Megabyte Strings

Cool! The RegEx (early bound) in my tests is 4-5x faster than VBA's
replace. However, I think that this might even be able to be improved
more... I think it should be possible to do all of the find/replaces
in a single pass of the string, further improving the speed by a factor
of however many replaces there are to do.

Since doing 6 search and replaces requires calling the Regex replace
function 6 times, it is iterating over the string 6 times, but it
should really only need to do it once.

Perhaps some sort of clever regex could combine all of the
search/replace ops into a single call of .Replace.

Or perhaps a custom procedure would be better... i'll make a post
if/when i find the answer i'm looking for.

Thanks for the idea! And for now, i'll take the 5x improvement ;D

Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
find and replace numeric strings in larger text strings Mr Molio Excel Worksheet Functions 8 November 9th 11 05:17 PM
How to find a value with multi-column, multi-record list Dallasm Excel Worksheet Functions 1 May 30th 10 05:40 PM
How to find number of pairs of strings from list of strings? greg_overholt Excel Worksheet Functions 5 January 27th 06 10:42 PM
Using a collection class to implement mutliple find/replace strings in cells Bill Hertzing Excel Programming 2 February 18th 04 01:42 AM


All times are GMT +1. The time now is 05:19 PM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"