How to put lines with certain text (from a file) in an array
Guys,

I'd like to open and parse a file such that only lines containing certain text get included in my array. How can I accomplish this? For example, let's say that the file contents are as follows:

    text in line 1
    text in line 2
    layer_1
    layer_2
    layer_3

I'd like to save the lines with layer_1, layer_2, and layer_3 in my array named line. Here's what I have so far - what should I do next?

    Sub geomsasciiparse()
        Dim Buf() As String
        Dim logical_layer As Variant
        Dim line() As String
        Dim objFSO As Object
        Dim objGeomsAsciiFile As Object

        Set objFSO = CreateObject("Scripting.FileSystemObject")
        Set objGeomsAsciiFile = objFSO.OpenTextFile(MentDesContPath & "\geoms_ascii")
        strBuffer = objGeomsAsciiFile.Readline
        Do While Not objGeomsAsciiFile.AtEndOfStream
            If InStr(strBuffer, "layer") = 1 Then
            End If

Thanks for help.
How to put lines with certain text (from a file) in an array
    Function OpenTextFileToString(strFile As String) As String
        Dim hFile As Long

        On Error GoTo ERROROUT

        'read the whole file into one string
        hFile = FreeFile
        Open strFile For Binary As #hFile
        OpenTextFileToString = Space(LOF(hFile))
        Get hFile, , OpenTextFileToString
        Close #hFile

        Exit Function
    ERROROUT:
        If hFile > 0 Then
            Close #hFile
        End If
    End Function

    Sub Test()
        Dim i As Long
        Dim n As Long
        Dim str As String
        Dim arr1
        Dim arr2() As String

        str = OpenTextFileToString("C:\testfile.txt")
        arr1 = Split(str, vbCrLf)

        'size the result for the worst case, trim it afterwards
        ReDim arr2(0 To UBound(arr1)) As String

        For i = 0 To UBound(arr1)
            If InStr(1, arr1(i), "layer_", vbBinaryCompare) > 0 Then
                arr2(n) = arr1(i)
                n = n + 1
            End If
        Next i

        'nothing matched, so there is nothing to trim or show
        If n = 0 Then Exit Sub

        ReDim Preserve arr2(0 To n - 1) As String

        'to check we got it right
        For i = 0 To UBound(arr2)
            MsgBox arr2(i), , i
        Next i
    End Sub

RBS
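For anyone who wants to stay closer to the FileSystemObject approach in the original question, here is a minimal sketch of how that loop could be finished. Two things differ from the posted fragment: the ReadLine call moves inside the loop (otherwise only the first line is ever tested), and the loop gets a closing Loop statement. The function name LinesContaining and its parameters are made up for illustration, and the test uses InStr(...) > 0 (text anywhere in the line) rather than the original = 1 (text at the start of the line).

```vba
' Hypothetical helper, not from the thread: reads strFile line by line and
' returns only the lines that contain strFind.
Function LinesContaining(strFile As String, strFind As String) As String()
    Dim objFSO As Object
    Dim objFile As Object
    Dim strBuffer As String
    Dim arrOut() As String
    Dim n As Long

    Set objFSO = CreateObject("Scripting.FileSystemObject")
    Set objFile = objFSO.OpenTextFile(strFile, 1)   ' 1 = ForReading

    ReDim arrOut(0 To 15)
    Do While Not objFile.AtEndOfStream
        strBuffer = objFile.ReadLine                ' read inside the loop
        If InStr(1, strBuffer, strFind, vbBinaryCompare) > 0 Then
            ' grow the buffer by doubling when it is full
            If n > UBound(arrOut) Then
                ReDim Preserve arrOut(0 To 2 * UBound(arrOut) + 1)
            End If
            arrOut(n) = strBuffer
            n = n + 1
        End If
    Loop
    objFile.Close

    If n = 0 Then
        LinesContaining = Split("")                 ' empty array, UBound = -1
    Else
        ReDim Preserve arrOut(0 To n - 1)           ' trim to the matches found
        LinesContaining = arrOut
    End If
End Function
```

It could be called as arr = LinesContaining("C:\testfile.txt", "layer_") with arr declared as Dim arr() As String. Reading line by line avoids holding the whole file in one string, at the cost of one ReadLine call per line.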
How to put lines with certain text (from a file) in an array
The following code will output all the lines containing the text "layer_" **anywhere** within them. Notice that you can't pick and choose a subset of all the "layer_" lines; that is, you can't use this method to only output "layer_1" and "layer_2" while skipping over "layer_3"... the search text used by the Filter function works like the search text in the InStr function. Oh, and change the file names for the input and output files.

    Sub ReadProcessOutput()
        Dim FileNum As Long
        Dim TotalFile As String
        Dim LinesOut As String
        Dim LinesIn() As String

        'read the whole input file into one string
        FileNum = FreeFile
        Open "d:\temp\Test.txt" For Binary As #FileNum
        TotalFile = Space(LOF(FileNum))
        Get #FileNum, , TotalFile
        Close #FileNum

        'split into lines, keep only those containing "layer_"
        LinesIn = Split(TotalFile, vbCrLf)
        LinesOut = Join(Filter(LinesIn, "layer_", True, vbTextCompare), vbCrLf)

        'write the matching lines to the output file
        FileNum = FreeFile
        Open "d:\temp\OutTest.txt" For Output As #FileNum
        Print #FileNum, LinesOut
        Close #FileNum
    End Sub

-- Rick (MVP - Excel)
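The way Filter matches can be seen in a small sketch. It uses an in-memory array instead of a file; the sample strings and the name FilterDemo are made up for illustration. The third argument (Include) chooses between keeping and dropping the matching elements, and the fourth chooses case-insensitive or case-sensitive comparison.

```vba
Sub FilterDemo()
    Dim arr() As String
    Dim hits() As String

    arr = Split("text in line 1|layer_1|my_layer_2|LAYER_3", "|")

    ' Include = True, text compare: substring match anywhere, any case
    hits = Filter(arr, "layer_", True, vbTextCompare)
    Debug.Print Join(hits, ", ")    ' layer_1, my_layer_2, LAYER_3

    ' Include = True, binary compare: the match is case-sensitive
    hits = Filter(arr, "layer_", True, vbBinaryCompare)
    Debug.Print Join(hits, ", ")    ' layer_1, my_layer_2

    ' Include = False: keep only the elements that do NOT match
    hits = Filter(arr, "layer_", False, vbTextCompare)
    Debug.Print Join(hits, ", ")    ' text in line 1
End Sub
```

Because the match is a plain substring test, "my_layer_2" is kept even though it does not start with "layer_".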
How to put lines with certain text (from a file) in an array
That's pretty cool Rick!

> you can't use this method to only output "layer_1" and "layer_2" skipping over "layer_3"...

maybe -

    LinesOut = Join(Filter(Filter(LinesIn, "layer_", True, vbTextCompare), _
                           "layer_3", False, vbTextCompare), vbCrLf)

Regards,
Peter T
How to put lines with certain text (from a file) in an array
> LinesOut = Join(Filter(LinesIn, "layer_", True, vbTextCompare), vbCrLf)

OK, that is another way, but as you say you still may need InStr if you want layer_1, layer_2, layer_3, but not layer_4.

RBS
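One way to pick an explicit subset with InStr, sketched under the assumption that the wanted texts are known up front. The helper name ContainsAny and the sample data are hypothetical, not from the thread.

```vba
' Hypothetical helper, not from the thread: True if strLine contains
' any of the entries in vWanted.
Function ContainsAny(strLine As String, vWanted As Variant) As Boolean
    Dim i As Long
    For i = LBound(vWanted) To UBound(vWanted)
        If InStr(1, strLine, vWanted(i), vbBinaryCompare) > 0 Then
            ContainsAny = True
            Exit Function
        End If
    Next i
End Function

Sub SubsetDemo()
    Dim arr1() As String
    Dim i As Long

    arr1 = Split("text in line 1|layer_1|layer_2|layer_3|layer_4", "|")
    For i = 0 To UBound(arr1)
        ' keeps layer_1, layer_2 and layer_3, skips layer_4
        If ContainsAny(arr1(i), Array("layer_1", "layer_2", "layer_3")) Then
            Debug.Print arr1(i)
        End If
    Next i
End Sub
```

Since this is still a substring test, "layer_1" would also match a line containing "layer_10"; if such names can occur, an exact comparison of the whole line is the safer test.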
How to put lines with certain text (from a file) in an array
I actually meant to post my message directly under the OP's posting, not yours... sorry.

-- Rick (MVP - Excel)
How to put lines with certain text (from a file) in an array
Yes, very good Peter, that does seem to work. "Pretty cool" back at you.

-- Rick (MVP - Excel)
How to put lines with certain text (from a file) in an array
OK, it is a one-liner, but is it faster than InStr in a loop? Will test in a bit, unless somebody else will do that ...

RBS
How to put lines with certain text (from a file) in an array
I found the method with Join and Filter to be about twice as slow. This is from testing on a 1 MB test file, with the 5-line repeating sequence as in the OP:

    Option Explicit

    'on 64-bit Office this needs to be Declare PtrSafe Function
    Private Declare Function timeGetTime Lib "winmm.dll" () As Long
    Private lStartTime As Long

    Function OpenTextFileToString(strFile As String) As String
        Dim hFile As Long

        On Error GoTo ERROROUT

        hFile = FreeFile
        Open strFile For Binary As #hFile
        OpenTextFileToString = Space(LOF(hFile))
        Get hFile, , OpenTextFileToString
        Close #hFile

        Exit Function
    ERROROUT:
        If hFile > 0 Then
            Close #hFile
        End If
    End Function

    Sub Test()
        Dim i As Long
        Dim n As Long
        Dim str As String
        Dim arr1
        Dim str2 As String
        Dim arr2
        Dim bJoin As Boolean

        'True times the Join/Filter method, False times the InStr loop
        bJoin = True

        str = OpenTextFileToString("C:\testfile.txt")
        arr1 = Split(str, vbCrLf)

        StartSW

        If bJoin Then
            str2 = Join(Filter(Filter(arr1, "layer_", True, vbTextCompare), _
                               "layer_3", False, vbTextCompare), vbCrLf)
            arr2 = Split(str2, vbCrLf)
        Else
            ReDim arr2(0 To UBound(arr1)) As String
            For i = 0 To UBound(arr1)
                If InStr(1, arr1(i), "layer_", vbBinaryCompare) > 0 And _
                   InStr(1, arr1(i), "layer_3", vbBinaryCompare) = 0 Then
                    arr2(n) = arr1(i)
                    n = n + 1
                End If
            Next i
            ReDim Preserve arr2(0 To n - 1) As String
        End If

        StopSW

        'to check we got it right
        For i = 0 To 3
            MsgBox arr2(i), , i
        Next i
    End Sub

    Sub StartSW()
        lStartTime = timeGetTime()
    End Sub

    Function StopSW(Optional bMsgBox As Boolean = True, _
                    Optional vMessage As Variant, _
                    Optional lMinimumTimeToShow As Long = -1) As Variant
        Dim lTime As Long

        lTime = timeGetTime() - lStartTime

        If lTime > lMinimumTimeToShow Then
            If IsMissing(vMessage) Then
                StopSW = lTime
            Else
                StopSW = lTime & " - " & vMessage
            End If
        End If

        If bMsgBox Then
            If lTime > lMinimumTimeToShow Then
                MsgBox "Done in " & lTime & " msecs", , vMessage
            End If
        End If
    End Function

RBS
How to put lines with certain text (from a file) in an array
Hi Bart,

For your test the Join and the Split are not necessary, simply

    arr2 = Filter(Filter(arr1, "layer_", True, vbTextCompare), _
                  "layer_3", False, vbTextCompare)

With small files (up to say 0.3 MB) and large files (10 MB) I didn't find much difference between the two methods. There was barely any measurable difference with the small files, although the loop was always slightly faster with the large files. Curiously though, the loop was much faster with medium-size files of 1 MB. The Filter method was only slightly slower with a 1Mb file vs a 10Mb file (not pro-rata at all). I don't understand the timing anomalies I got.

Regards,
Peter T
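A small sketch of using the Filter result directly as the array, without the Join and Split round trip. The sample data and the name FilterToArrayDemo are made up for illustration. Filter returns an empty array when nothing matches, which shows up as a UBound of -1, so it is worth checking the bounds before looping.

```vba
Sub FilterToArrayDemo()
    Dim arr1() As String
    Dim arr2() As String
    Dim i As Long

    arr1 = Split("text in line 1|layer_1|layer_2|layer_3", "|")

    ' keep the "layer_" lines, then drop the "layer_3" ones
    arr2 = Filter(Filter(arr1, "layer_", True, vbTextCompare), _
                  "layer_3", False, vbTextCompare)

    If UBound(arr2) < LBound(arr2) Then
        Debug.Print "no matching lines"
    Else
        For i = LBound(arr2) To UBound(arr2)
            Debug.Print i, arr2(i)
        Next i
    End If
End Sub
```

With the sample data this leaves layer_1 and layer_2 in arr2.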
How to put lines with certain text (from a file) in an array
Hi Peter,

Ah, yes, that was a bit silly, joining first and then splitting again. I did it all very quickly and didn't look carefully at what it was doing. So, on the whole, the loop is still somewhat faster, particularly for medium-sized files. Overall I prefer it, as it is also clearer about what it is doing.

RBS

"Peter T" <peter_t@discussions wrote in message ...

Hi Bart,

For your test the Join and the Split are not necessary, simply

    arr2 = Filter(Filter(arr1, "layer_", True, vbTextCompare), _
                  "layer_3", False, vbTextCompare)

With small files (up to say 0.3Mb) and large files (10Mb) I didn't find much difference between the two methods. There was barely any measurable difference with the small files, although the loop was always slightly faster with the large files. Curiously, though, the loop was much faster with medium size files of 1Mb. The Filter method was only slightly slower with a 1Mb file vs a 10Mb file (not pro-rata at all). I don't understand the timing anomalies I got.

Regards, Peter T

"RB Smissaert" wrote in message ...

I got the method with Join and Filter about twice as slow. This is with testing on a 1 Mb test file, with the 5-line repeating sequence as in the OP:

Option Explicit
Private Declare Function timeGetTime Lib "winmm.dll" () As Long
Private lStartTime As Long

Function OpenTextFileToString(strFile As String) As String
    Dim hFile As Long
    On Error GoTo ERROROUT
    hFile = FreeFile
    Open strFile For Binary As #hFile
    OpenTextFileToString = Space(LOF(hFile))
    Get hFile, , OpenTextFileToString
    Close #hFile
    Exit Function
ERROROUT:
    If hFile > 0 Then
        Close #hFile
    End If
End Function

Sub Test()
    Dim i As Long
    Dim n As Long
    Dim str As String
    Dim arr1
    Dim str2 As String
    Dim arr2
    Dim bJoin As Boolean

    bJoin = True    ' True = Join/Filter method, False = InStr loop
    str = OpenTextFileToString("C:\testfile.txt")
    arr1 = Split(str, vbCrLf)

    StartSW
    If bJoin Then
        str2 = Join(Filter(Filter(arr1, "layer_", True, vbTextCompare), _
                           "layer_3", False, vbTextCompare), vbCrLf)
        arr2 = Split(str2, vbCrLf)
    Else
        ReDim arr2(0 To UBound(arr1)) As String
        For i = 0 To UBound(arr1)
            If InStr(1, arr1(i), "layer_", vbBinaryCompare) > 0 And _
               InStr(1, arr1(i), "layer_3", vbBinaryCompare) = 0 Then
                arr2(n) = arr1(i)
                n = n + 1
            End If
        Next i
        ReDim Preserve arr2(0 To n - 1) As String
    End If
    StopSW

    'to check we got it right
    For i = 0 To 3
        MsgBox arr2(i), , i
    Next i
End Sub

Sub StartSW()
    lStartTime = timeGetTime()
End Sub

Function StopSW(Optional bMsgBox As Boolean = True, _
                Optional vMessage As Variant, _
                Optional lMinimumTimeToShow As Long = -1) As Variant
    Dim lTime As Long
    lTime = timeGetTime() - lStartTime
    If lTime > lMinimumTimeToShow Then
        If IsMissing(vMessage) Then
            StopSW = lTime
        Else
            StopSW = lTime & " - " & vMessage
        End If
    End If
    If bMsgBox Then
        If lTime > lMinimumTimeToShow Then
            MsgBox "Done in " & lTime & " msecs", , vMessage
        End If
    End If
End Function

RBS

"RB Smissaert" wrote in message ...

OK, it is a one-liner, but is it faster than InStr in a loop? Will test in a bit, unless somebody else will do that ...

RBS

"Rick Rothstein" wrote in message ...

Yes, very good Peter, that does seem to work. "Pretty cool" back at you.
-- Rick (MVP - Excel)

"Peter T" <peter_t@discussions wrote in message ...

That's pretty cool Rick!

    you can't use this method to only output "layer_1" and "layer_2" skipping over "layer_3"...

maybe -

    LinesOut = Join(Filter(Filter(LinesIn, "layer_", True, vbTextCompare), _
                           "layer_3", False, vbTextCompare), vbCrLf)

Regards, Peter T |
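The nested Filter call discussed above can be wrapped so that it returns the array directly, which is what the original question asked for. This is only a sketch: FilterLines and its parameter names are hypothetical. It relies on Filter returning an empty array (UBound = -1) when nothing matches, a case in which the loop version's ReDim Preserve arr2(0 To n - 1) would fail because n is 0.

```vba
Option Explicit

' Sketch only: FilterLines is a hypothetical helper, not code from the thread.
' Keeps elements containing sInclude and, if sExclude is given,
' drops those that also contain sExclude.
Function FilterLines(vLines As Variant, ByVal sInclude As String, _
                     Optional ByVal sExclude As String = "") As Variant
    Dim vOut As Variant

    vOut = Filter(vLines, sInclude, True, vbTextCompare)

    ' Filter returns an empty array (UBound = -1) when nothing matches,
    ' so only run the second pass when there is something to exclude from
    If Len(sExclude) > 0 And UBound(vOut) >= 0 Then
        vOut = Filter(vOut, sExclude, False, vbTextCompare)
    End If

    FilterLines = vOut
End Function

Sub DemoFilterLines()
    Dim v As Variant, i As Long
    v = FilterLines(Split("text in line 1|layer_1|layer_2|layer_3", "|"), _
                    "layer_", "layer_3")
    For i = 0 To UBound(v)       ' skipped when nothing matched
        Debug.Print v(i)
    Next i
End Sub
```

In the demo, the lines kept are layer_1 and layer_2. Because Filter matches substrings the way InStr does, excluding "layer_3" would also drop a line containing "layer_30"; if that matters, the InStr loop with a stricter test may be the safer choice.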
How to put lines with certain text (from a file) in an array
Out of curiosity, how much faster was "much faster" for a single loop if
your test involved multiple loops (total time divided by number of loops)?

-- Rick (MVP - Excel) |
How to put lines with certain text (from a file) in an array
I didn't save the original test. I've made a new test with somewhat different data and seem to be getting a very different set of results this time. In one sense the results are all consistent, but now the Filter approach is taking about 2x longer than the loop with InStr at all sizes. (Previously 10Mb was only about 25% slower with the Filter method, but 1Mb an odd 3x slower.) I'm pretty sure I had double-checked my results last time. Maybe I somehow got it wrong or, as I suspect, the timings are simply inconsistent with large strings, which I've also seen in the past; who knows.

Here's what I tested this time -

Option Explicit
Private Declare Function GetTickCount Lib "kernel32.dll" () As Long
Const cFILE As String = "c:\temp\TestFile#.txt"

Sub MakeTestFiles()
    Dim i As Long
    Dim sFile As String, sText As String
    Dim a(1 To 5) As String
    Dim ff As Integer

    a(1) = "This is layer_1"
    a(2) = "this line does not have any layers"
    a(3) = "Embedded at the end of this line is Layer_3"
    a(4) = "A layer_4 in this fourth line"
    a(5) = "This will be the last line with layer_5"
    sText = Join(a, vbCrLf)

    ' Keep doubling the text; write a file at each size above 20000 characters
    Do
        sText = sText & vbCrLf & sText
        If Len(sText) > 20000 Then
            i = i + 1
            sFile = Replace(cFILE, "#", i)
            ff = FreeFile
            Open sFile For Output As #ff
            Print #ff, sText
            Close #ff
            Debug.Print i, Len(sText), sFile
        End If
    Loop Until Len(sText) > 10000000
    ' 10 files from 22Kb to 11Mb
End Sub

Sub CompareFilterLoop()
    Dim ff As Integer
    Dim i As Long, k As Long, n As Long, nSize As Long
    Dim tFilter As Long, tLoop As Long
    Dim sFile As String, sText As String
    Dim arr1, arr2

    For k = 1 To 10
        ' Read test file k into an array of lines (not timed)
        ff = FreeFile
        sFile = Replace(cFILE, "#", k)
        Open sFile For Binary As #ff
        nSize = LOF(ff)
        sText = Space(nSize)
        Get #ff, , sText
        Close #ff
        arr1 = Split(sText, vbCrLf)
        If IsArray(arr2) Then Erase arr2

        ' Method 1: double Filter
        tFilter = GetTickCount
        arr2 = Filter(Filter(arr1, "layer_", True, vbTextCompare), _
                      "layer_3", False, vbTextCompare)
        tFilter = GetTickCount - tFilter
        Erase arr2

        ' Method 2: InStr in a loop
        tLoop = GetTickCount
        ReDim arr2(0 To UBound(arr1)) As String
        n = 0
        For i = 0 To UBound(arr1)
            If InStr(1, arr1(i), "layer_", vbBinaryCompare) > 0 And _
               InStr(1, arr1(i), "layer_3", vbBinaryCompare) = 0 Then
                arr2(n) = arr1(i)
                n = n + 1
            End If
        Next i
        ReDim Preserve arr2(0 To n - 1) As String
        tLoop = GetTickCount - tLoop

        Debug.Print tFilter, tLoop, nSize, UBound(arr1), UBound(arr2)
    Next
End Sub

For me the Filter method was roughly 2x slower at all sizes above 300k, where timings are meaningful.

Regards, Peter T |
How to put lines with certain text (from a file) in an array
My question was referring to physical elapsed time per loop, not relative percentage speed. The reason I asked is this: if the entire process (read, process, save) takes, say, 5 seconds to complete, and the part of the code in question takes either 1/8 second for the fast code or 1/4 second for the slow code, I would not consider that a significant time difference when compared to the entire process the code is part of, even though one is half as fast as the other. In other words, reading a file and then saving a file will more than likely take up the bulk of the time, and that is what the user will notice, not the relative time difference for a portion of the entire process.

-- Rick (MVP - Excel) |
How to put lines with certain text (from a file) in an array
I didn't understand exactly what you were asking, and actually I still don't: "total time divided by number of loops". There were no loops in the Filter method.

The relative timings in my last two posts are the times to "process" the in-array (irrespective of where it came from) and to output a processed array, for the double Filter method and for looping InStr over each element of the in-array respectively. IOW, the timings relate purely to the different methods of "processing", exclusive of, say, read and save. This is consistent with the timings Bart demonstrated (albeit with the unnecessary Split & Join).

I fully accept your point that the relevant time for the user is the overall time, and that a significant difference in a small part of the overall process may be insignificant overall, but that depends on the overall process. If you try the demo I posted, it should be easy to adapt it to time "read, process, save" vs merely "process".

Regards, Peter T |
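One way the "read, process, save" comparison might be set up is sketched below. It is an illustration under stated assumptions, not the test anyone in the thread ran: TimePhases and its parameters are hypothetical, and it swaps in VBA's built-in Timer function for the GetTickCount and timeGetTime API calls used earlier, so no Declare is needed. Timer has coarse resolution and resets at midnight, so very short phases may show as 0.

```vba
Option Explicit

' Sketch only: TimePhases is a hypothetical helper, not code from the thread.
' Times the read, process and save phases separately, in seconds.
Sub TimePhases(ByVal sIn As String, ByVal sOut As String)
    Dim ff As Integer
    Dim sText As String
    Dim vOut As Variant
    Dim t0 As Double
    Dim tRead As Double, tProcess As Double, tSave As Double

    ' Phase 1: read the whole file into a string
    t0 = Timer
    ff = FreeFile
    Open sIn For Binary As #ff
    sText = Space(LOF(ff))
    Get #ff, , sText
    Close #ff
    tRead = Timer - t0

    ' Phase 2: split into lines and filter (Split is counted here)
    t0 = Timer
    vOut = Filter(Split(sText, vbCrLf), "layer_", True, vbTextCompare)
    tProcess = Timer - t0

    ' Phase 3: write the matching lines
    t0 = Timer
    ff = FreeFile
    Open sOut For Output As #ff
    Print #ff, Join(vOut, vbCrLf)
    Close #ff
    tSave = Timer - t0

    Debug.Print "read", "process", "save"
    Debug.Print tRead, tProcess, tSave
End Sub
```

It could be called as TimePhases "d:\temp\Test.txt", "d:\temp\OutTest.txt", using the placeholder paths from earlier in the thread. Note that the process phase here includes the Split, which the timings quoted above did not, so the numbers are not directly comparable with those results.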
How to put lines with certain text (from a file) in an array
How about running the test yourself, and you will see? Just add some extra StartSW and StopSW calls and all will be revealed.

RBS |
ExcelBanter.com