ExcelBanter

ExcelBanter (https://www.excelbanter.com/)
-   Excel Programming (https://www.excelbanter.com/excel-programming/)
-   -   File truncation with VBA (https://www.excelbanter.com/excel-programming/338057-file-truncation-vba.html)

crazybass2

File truncation with VBA
 
I have several text files that are very large (500MB+). There is a keyword
in each file ("Array") at which point I don't need the data below and
including the line in appears. Problem is, using a text editor for such a
large file makes it nearly impossible to Find, Select-to-End, Delete.

I currently have an excel file that will input each line into a cell and
until it reaches the keyword. Once the keyword is found, the file is closed
and a new file is opened and all the cells are written to the file. With
files as large as I have go thru this process it takes more than two hours to
finish. I'm sure that the majority of the time is spent writing the data to
the cells and writing it back to a new file.

Is there a way (in Excel) to find a keyword in a text file and delete
everything from that line on?

crazybass2

File truncation with VBA
 
Jake,

That was almost perfect! I needed for the entire line containting "Array"
to be deleted, not just after "Array". It was a quick mod to your code to
accomplish. Thanks for your quick response!

Much appreciated,
Mike

"Jake Marx" wrote:

Hi crazybass2,

You can do this pretty easily without bringing the file into Excel. Here's
a function that should do what you're looking for. Just pass in the full
path/filename and the string you want to truncate after (in this case,
"array").

I don't know how quickly this will execute against a 500MB+ file, but it's
worth a try to see if it's any faster. It should be IMO.


Private Const mlTEMP_FOLDER As Long = 2

Public Function gbTruncateFile(rsFullPath As String, _
rsTruncString As String) As Boolean
Dim fso As Object
Dim tsSrc As Object
Dim tsDest As Object
Dim bDone As Boolean
Dim sTemp As String
Dim lTruncPos As Long
Dim sDestPath As String

On Error GoTo ErrHandler

Set fso = CreateObject("Scripting.FileSystemObject")
If fso.FileExists(rsFullPath) Then
Set tsSrc = fso.OpenTextFile(rsFullPath)
sDestPath = fso.GetSpecialFolder(mlTEMP_FOLDER) _
& Application.PathSeparator & fso.GetFileName(rsFullPath)
Set tsDest = fso.CreateTextFile(sDestPath, True)
With tsSrc
Do While Not (.AtEndOfStream Or bDone)
sTemp = .ReadLine
lTruncPos = InStr(1, sTemp, rsTruncString, _
vbTextCompare)
If lTruncPos Then
sTemp = Left$(sTemp, lTruncPos - 1)
bDone = True
End If
tsDest.WriteLine sTemp
Loop
End With

tsSrc.Close
tsDest.Close

fso.CopyFile sDestPath, rsFullPath, True

gbTruncateFile = True
End If

ExitRoutine:
On Error Resume Next
tsSrc.Close
tsDest.Close
Kill sDestPath
Set tsSrc = Nothing
Set tsDest = Nothing
Set fso = Nothing
Exit Function
ErrHandler:
Resume ExitRoutine
End Function

--
Regards,

Jake Marx
MS MVP - Excel
www.longhead.com

[please keep replies in the newsgroup - email address unmonitored]


crazybass2 wrote:
I have several text files that are very large (500MB+). There is a
keyword in each file ("Array") at which point I don't need the data
below and including the line in appears. Problem is, using a text
editor for such a large file makes it nearly impossible to Find,
Select-to-End, Delete.

I currently have an excel file that will input each line into a cell
and until it reaches the keyword. Once the keyword is found, the
file is closed and a new file is opened and all the cells are written
to the file. With files as large as I have go thru this process it
takes more than two hours to finish. I'm sure that the majority of
the time is spent writing the data to the cells and writing it back
to a new file.

Is there a way (in Excel) to find a keyword in a text file and delete
everything from that line on?




crazybass2

File truncation with VBA
 
Oh, and BTW...it took less than 30 seconds to process a 680MB file down to a
78MB file.

Again, Thanks!



"Jake Marx" wrote:

Hi crazybass2,

You can do this pretty easily without bringing the file into Excel. Here's
a function that should do what you're looking for. Just pass in the full
path/filename and the string you want to truncate after (in this case,
"array").

I don't know how quickly this will execute against a 500MB+ file, but it's
worth a try to see if it's any faster. It should be IMO.


Private Const mlTEMP_FOLDER As Long = 2

Public Function gbTruncateFile(rsFullPath As String, _
rsTruncString As String) As Boolean
Dim fso As Object
Dim tsSrc As Object
Dim tsDest As Object
Dim bDone As Boolean
Dim sTemp As String
Dim lTruncPos As Long
Dim sDestPath As String

On Error GoTo ErrHandler

Set fso = CreateObject("Scripting.FileSystemObject")
If fso.FileExists(rsFullPath) Then
Set tsSrc = fso.OpenTextFile(rsFullPath)
sDestPath = fso.GetSpecialFolder(mlTEMP_FOLDER) _
& Application.PathSeparator & fso.GetFileName(rsFullPath)
Set tsDest = fso.CreateTextFile(sDestPath, True)
With tsSrc
Do While Not (.AtEndOfStream Or bDone)
sTemp = .ReadLine
lTruncPos = InStr(1, sTemp, rsTruncString, _
vbTextCompare)
If lTruncPos Then
sTemp = Left$(sTemp, lTruncPos - 1)
bDone = True
End If
tsDest.WriteLine sTemp
Loop
End With

tsSrc.Close
tsDest.Close

fso.CopyFile sDestPath, rsFullPath, True

gbTruncateFile = True
End If

ExitRoutine:
On Error Resume Next
tsSrc.Close
tsDest.Close
Kill sDestPath
Set tsSrc = Nothing
Set tsDest = Nothing
Set fso = Nothing
Exit Function
ErrHandler:
Resume ExitRoutine
End Function

--
Regards,

Jake Marx
MS MVP - Excel
www.longhead.com

[please keep replies in the newsgroup - email address unmonitored]


crazybass2 wrote:
I have several text files that are very large (500MB+). There is a
keyword in each file ("Array") at which point I don't need the data
below and including the line in appears. Problem is, using a text
editor for such a large file makes it nearly impossible to Find,
Select-to-End, Delete.

I currently have an excel file that will input each line into a cell
and until it reaches the keyword. Once the keyword is found, the
file is closed and a new file is opened and all the cells are written
to the file. With files as large as I have go thru this process it
takes more than two hours to finish. I'm sure that the majority of
the time is spent writing the data to the cells and writing it back
to a new file.

Is there a way (in Excel) to find a keyword in a text file and delete
everything from that line on?




Jake Marx[_3_]

File truncation with VBA
 
No problem - glad it helped!

--
Regards,

Jake Marx
MS MVP - Excel
www.longhead.com

[please keep replies in the newsgroup - email address unmonitored]


crazybass2 wrote:
Oh, and BTW...it took less than 30 seconds to process a 680MB file
down to a 78MB file.

Again, Thanks!



"Jake Marx" wrote:

Hi crazybass2,

You can do this pretty easily without bringing the file into Excel.
Here's a function that should do what you're looking for. Just pass
in the full path/filename and the string you want to truncate after
(in this case, "array").

I don't know how quickly this will execute against a 500MB+ file,
but it's worth a try to see if it's any faster. It should be IMO.


Private Const mlTEMP_FOLDER As Long = 2

Public Function gbTruncateFile(rsFullPath As String, _
rsTruncString As String) As Boolean
Dim fso As Object
Dim tsSrc As Object
Dim tsDest As Object
Dim bDone As Boolean
Dim sTemp As String
Dim lTruncPos As Long
Dim sDestPath As String

On Error GoTo ErrHandler

Set fso = CreateObject("Scripting.FileSystemObject")
If fso.FileExists(rsFullPath) Then
Set tsSrc = fso.OpenTextFile(rsFullPath)
sDestPath = fso.GetSpecialFolder(mlTEMP_FOLDER) _
& Application.PathSeparator & fso.GetFileName(rsFullPath)
Set tsDest = fso.CreateTextFile(sDestPath, True)
With tsSrc
Do While Not (.AtEndOfStream Or bDone)
sTemp = .ReadLine
lTruncPos = InStr(1, sTemp, rsTruncString, _
vbTextCompare)
If lTruncPos Then
sTemp = Left$(sTemp, lTruncPos - 1)
bDone = True
End If
tsDest.WriteLine sTemp
Loop
End With

tsSrc.Close
tsDest.Close

fso.CopyFile sDestPath, rsFullPath, True

gbTruncateFile = True
End If

ExitRoutine:
On Error Resume Next
tsSrc.Close
tsDest.Close
Kill sDestPath
Set tsSrc = Nothing
Set tsDest = Nothing
Set fso = Nothing
Exit Function
ErrHandler:
Resume ExitRoutine
End Function

--
Regards,

Jake Marx
MS MVP - Excel
www.longhead.com

[please keep replies in the newsgroup - email address unmonitored]


crazybass2 wrote:
I have several text files that are very large (500MB+). There is a
keyword in each file ("Array") at which point I don't need the data
below and including the line in appears. Problem is, using a text
editor for such a large file makes it nearly impossible to Find,
Select-to-End, Delete.

I currently have an excel file that will input each line into a cell
and until it reaches the keyword. Once the keyword is found, the
file is closed and a new file is opened and all the cells are
written to the file. With files as large as I have go thru this
process it takes more than two hours to finish. I'm sure that the
majority of the time is spent writing the data to the cells and
writing it back to a new file.

Is there a way (in Excel) to find a keyword in a text file and
delete everything from that line on?



All times are GMT +1. The time now is 08:39 AM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
ExcelBanter.com