ExcelBanter

ExcelBanter (https://www.excelbanter.com/)
-   Excel Programming (https://www.excelbanter.com/excel-programming/)
-   -   Huge data set Find-Replace (https://www.excelbanter.com/excel-programming/452490-huge-data-set-find-replace.html)

L. Howard

Huge data set Find-Replace
 
Is it possible to use something like this example for find replace except just for single sheet with a huge data set, say 300,000 - 600,000 cells?

The built in find replace is quite slow.

This looks like it loads the sheet (actually a file) into a Variable and does the find replace on the variable... is that correct?

Howard

Sub TextFile_FindReplace()
'PURPOSE: Modify Contents of a text file using Find/Replace
'SOURCE: www.TheSpreadsheetGuru.com

Dim TextFile As Integer
Dim FilePath As String
Dim FileContent As String

'File Path of Text File
FilePath = "C:\Users\chris\Desktop\MyFile.txt"

'Determine the next file number available for use by the FileOpen function
TextFile = FreeFile

'Open the text file in a Read State
Open FilePath For Input As TextFile

'Store file content inside a variable
FileContent = Input(LOF(TextFile), TextFile)

'Clost Text File
Close TextFile

'Find/Replace
FileContent = Replace(FileContent, "Hello", "Good By")

'Determine the next file number available for use by the FileOpen function
TextFile = FreeFile

'Open the text file in a Write State
Open FilePath For Output As TextFile

'Write New Text data to file
' Print #TextFile, FileContent

'Close Text File
Close TextFile

End Sub


GS[_6_]

Huge data set Find-Replace
 
Is it possible to use something like this example for find replace
except just for single sheet with a huge data set, say 300,000 -
600,000 cells?

The built in find replace is quite slow.

This looks like it loads the sheet (actually a file) into a Variable
and does the find replace on the variable... is that correct?

Yes, that's correct! Doesn't address your data being in a worksheet
unless you export the sheet to a delimited file first, then import it
back after replacing the string.

I recall posting code to efficiently 'dump' a sheet into a
tab-delimited file and vice versa. You'll have to search topics (in a
few groups here that you frequent) for something containing "importing
from a text file" Basically, you can...
call the export procedure;
call 'ReadTextFile' to dump the data into a string var;
replace text;
call 'WriteTextFile' to dump the string back into the file;
then call the import procedure
...to 'dump' the file back into the sheet.

Might seem a bit convoluted but all the procs are reusable. There are
addition procs to convert 2D array to 1D and vice versa. All is very
fast compared to working directly in the sheet!

--
Garry

Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup!
comp.lang.basic.visual.misc
microsoft.public.vb.general.discussion

GS[_6_]

Huge data set Find-Replace
 
Is it possible to use something like this example for find replace
except just for single sheet with a huge data set, say 300,000 -
600,000 cells?

The built in find replace is quite slow.

This looks like it loads the sheet (actually a file) into a
Variable and does the find replace on the variable... is that
correct?

Yes, that's correct! Doesn't address your data being in a worksheet
unless you export the sheet to a delimited file first, then import it
back after replacing the string.

I recall posting code to efficiently 'dump' a sheet into a
tab-delimited file and vice versa. You'll have to search topics (in a
few groups here that you frequent) for something containing
"importing from a text file" Basically, you can...
call the export procedure;
call 'ReadTextFile' to dump the data into a string var;
replace text;
call 'WriteTextFile' to dump the string back into the file;
then call the import procedure
..to 'dump' the file back into the sheet.

Might seem a bit convoluted but all the procs are reusable. There are
addition procs to convert 2D array to 1D and vice versa. All is very
fast compared to working directly in the sheet!


There's more recent code to dump a sheet into a file here in this forum
posted Sep 29th...

"Save Excel VBA as tab delimited text file without quotes"

--
Garry

Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup!
comp.lang.basic.visual.misc
microsoft.public.vb.general.discussion

L. Howard

Huge data set Find-Replace
 
Okay, Garry.

I'll poke around and see what I can come up with.

I never realized how very very slow the built-in Find/Replace was for large data sets.

It actually crashed/froze my sheet after a few minutes on a 650,000 cell test sheet.

Thanks,
Howard




GS[_6_]

Huge data set Find-Replace
 
Okay, Garry.

I'll poke around and see what I can come up with.

I never realized how very very slow the built-in Find/Replace was for
large data sets.

It actually crashed/froze my sheet after a few minutes on a 650,000
cell test sheet.

Thanks,
Howard


Importing a delimited file into a worksheet is a bit more complex than
the export process because the data needs to be converted to a 1D array
of arrays OR a 2D array. In either case, the target position needs to
be resized to fit the data. So for a tab-delimited file the import
'concept' is...

Step1
- import the data to a string var
- Split(str, vbCrLf) into a variant

Dim vData, sFile$
sFile = GetOpenFilename: If sFile = "" Then Exit Sub
vData = Split(ReadTextFile(sFile), vbCrLf)
'(ReadTextFile() below returns a string and so this is a 1-liner)

Step2
- split each element into an array

Dim n&
For n = LBound(vData) To UBound(vData)
vdata(n) = Split(vData(n), vbTab)
Next 'n

Step3
- dump file data into worksheet

3a - 1D array of arrays:
Dim lRow&
For n = LBound(vData) To UBound(vData)
lRow = lRow + 1
Cells(lRow, 1).Resize(1, UBound(vData(n) + 1) = vData(n)
Next 'n


3b - 1D array to 2D:
Dim lMaxCols&, lMaxRows&, vOut()
For n = LBound(vData) To UBound(vData)
lMaxCols = Iif(UBound(vData(n)) lMaxCols, _
UBound(vData(n)), lMaxCols)
Next 'n
lMaxRows = UBound(vData)

Redim vOut(lMaxRows, lMaxCols)
For n = LBound(vData) To UBound(vData)
Application.Index(vOut, n, 0)) = vData(n) '// (**NOT tested)
Next 'n
Cells(1, 1).Resize(lMaxRows +1, lMaxCols + 1) = vOut

**
I have more complex pure VB processes for converting arrays but think
you should be able to take advantage of the Index() function here for
code brevity. Both approaches handle zero-based dynamic array
structures where the 2nd dim can also vary in the #cols.


Function ReadTextFile$(Filename$)
' Reads large amounts of data from a text file in one single step.
Dim iNum%
On Error GoTo ErrHandler
iNum = FreeFile(): Open Filename For Input As #iNum
ReadTextFile = Space$(LOF(iNum))
ReadTextFile = Input(LOF(iNum), iNum)

ErrHandler:
Close #iNum: If Err Then Err.Raise Err.Number, , Err.Description
End Function 'ReadTextFile()

--
Garry

Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup!
comp.lang.basic.visual.misc
microsoft.public.vb.general.discussion

GS[_6_]

Huge data set Find-Replace
 
The code snippets are more than you need for this task, but you should
be able to create a reusable VB_Replace(sFind$, sReplace$) procedure
from it by revising the code samples to...

1 load the sheet data into a string var
vData = ActiveSheet.UsedRange
Join() each row of data with vbTab into a tmp array
Join() the tmp array into a string with vbCrLf

2 Replace(MyStr, sFind, sReplace) '//all occurances in 1 shot

3 load the string back into the worksheet
Split the string into a tmp array with vbCrLf
Split each element with vbTab so you have an array of arrays
dump the data back into the worksheet via 1 of the methods shown

...so that you accomplish everything exampled except the file
write/read!

--
Garry

Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup!
comp.lang.basic.visual.misc
microsoft.public.vb.general.discussion

L. Howard

Huge data set Find-Replace
 
Hi Garry,

I found these in the search area you suggested.

I have no idea how to make them work, nor can I see them when I call up the Macro box with f8 from the sheet with my test data.

Howard

Sub WksToTabFile(Optional Wks As Worksheet)
' Loads worksheet data into a tab-delimited text file
' Requires WriteTextFile() to create the file

Dim vData, n&, sFile$

sFile = Application.GetSaveAsFilename
If sFile = False Then Exit Sub

If Wks Is Nothing Then Set Wks = ActiveSheet
vData = Wks.UsedRange
ReDim vTmp(1 To UBound(vData))

'Load data from 2D array to 1D array
For n = LBound(vData) To UBound(vData)
vTmp(n) = Join(Application.Index(vData, n, 0), vbTab)
Next 'n

'Write data to text file
WriteTextFile Join(vTmp, vbCrLf), sFile
End Sub

Sub WriteTextFile(TextOut$, Filename$, _
Optional AppendMode As Boolean = False)
' Reusable procedure that Writes/Overwrites or Appends
' large amounts of data to a Text file in one single step.
' **Does not create a blank line at the end of the file**
Dim iNum%
On Error GoTo ErrHandler
iNum = FreeFile()
If AppendMode Then
Open Filename For Append As #iNum: Print #iNum, vbCrLf & TextOut;
Else
Open Filename For Output As #iNum: Print #iNum, TextOut;
End If

ErrHandler:
Close #iNum: If Err Then Err.Raise Err.Number, , Err.Description
End Sub 'WriteTextFile()


GS[_6_]

Huge data set Find-Replace
 
I have no idea how to make them work, nor can I see them when I call
up the Macro box with f8 from the sheet with my test data.


They have args and so must be called by other procs; this is why they
do not show up in the Macros dialog.

WksToTabFile is what you need to load the sheet into a string var. This
is done by the line to write to file and so you just...

MyStr = Join(vTmp, vbCrlf)
Replace(MyStr, sFind, sReplace)

...then load MyStr back into an array of arrays OR a 2D array...

vTmp = Split(MyStr, vbCrLf)

...and loop thru to create array of arrays...

For n = LBound(vTmp) To UBound(vTmp)
vTmp(n) = Split(vTmp(n), vbTab)
Next 'n

...and choose 1 of the methds to put the data back into the sheet.

--
Garry

Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup!
comp.lang.basic.visual.misc
microsoft.public.vb.general.discussion


All times are GMT +1. The time now is 08:04 AM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
ExcelBanter.com