View Single Post
  #15   Report Post  
Posted to microsoft.public.excel.programming
Peter T Peter T is offline
external usenet poster
 
Posts: 5,600
Default How to put lines with certain text (from a file) in an array

I didn't understand exactly what you were asking, actually I still don't -
"total time divided by number of loops"
There were no loops in the Filter method.

The relative timings I referred to in my last two posts refer to the time to
"process" the In-Array (irrespective from where it came from) and to output
a processed array, respectively for the double Filter method and looping
Instr in each element of the in-array.

IOW, timings relate purely to the different methods to "process", exclusive
of say read and save. This is consistent with the timings Bart demonstrated
(albeit with the unnecessary Split & Join).

I fully accept your point that the relevant time for the user is the overall
time, and that a significant difference in a small part of the overall
process may be insignificant overall, but that depends on the overall
process.

If you try the demo I posted it should be easy to adapt to time "read,
process, save" vs merely "process"

Regards,
Peter T






"Rick Rothstein" wrote in message
...
My question was referring to physical elapsed time per loop, not relative
percentage speed. The reason I asked that question is if the entire
process (read, process, save) takes, say, 5 seconds to complete and the
part of the code in question takes either an 1/8 second for the fast code
or 1/4 second for the slow code, I would not think that a significant time
difference, even though one is half as fast as the other, when compared to
the entire process the code is part of. In other words, reading a file and
then saving a file will more than likely take up the bulk of the time and
that is what the user will notice, not the relative time difference for a
portion of the entire process.

--
Rick (MVP - Excel)


"Peter T" <peter_t@discussions wrote in message
...
I didn't save the original test. I've made a new test with somewhat
different data and seem to be getting a very different set of results this
time. In one sense all consistent but now the Filter approach is taking
about 2x longer than the loop with Instr with all sizes. (Previously 10Mb
was only about 25% slower with the Filter method, but 1Mb an odd 3x
slower).

I'm pretty sure I had double checked my results last time. Maybe somehow
I got it wrong or as I suspect, in the past I've also had inconsistent
results with large strings, who knows. Here's what I tested this time -

Option Explicit
Private Declare Function GetTickCount Lib "kernel32.dll" () As Long
Const cFILE As String = "c:\temp\TestFile#.txt"

Sub MakeTestFiles()
Dim i As Long
Dim sFile As String, sText As String
Dim a(1 To 5) As String
Dim ff As Integer

a(1) = "This is layer_1"
a(2) = "this line does not have any layers"
a(3) = "Embedded at the end of this line is Layer_3"
a(4) = "A layer_4 in this fourth line"
a(5) = "This will be the last line with layer_5"

sText = Join(a, vbCrLf)
Do
sText = sText & vbCrLf & sText
If Len(sText) 20000 Then
i = i + 1
sFile = Replace(cFILE, "#", i)
ff = FreeFile
Open sFile For Output As #ff
Print #ff, sText
Close #ff
Debug.Print i, Len(sText), sFile
End If

Loop Until Len(sText) 10000000
' 10 files from 22Kb to 11Mb

End Sub

Sub CompareFilterLoop()
Dim ff As Integer
Dim i As Long, k As Long, n As Long, nSize As Long
Dim tFilter As Long, tLoop As Long
Dim sFile As String, sText As String
Dim arr1, arr2

For k = 1 To 10

ff = FreeFile
sFile = Replace(cFILE, "#", k)
Open sFile For Binary As #ff
nSize = LOF(ff)
sText = Space(nSize)
Get #ff, , sText
Close #ff

arr1 = Split(sText, vbCrLf)
If IsArray(arr2) Then Erase arr2

tFilter = GetTickCount
arr2 = Filter(Filter(arr1, "layer_", True, vbTextCompare), _
"layer_3", False, vbTextCompare)
tFilter = GetTickCount - tFilter

Erase arr2

tLoop = GetTickCount
ReDim arr2(0 To UBound(arr1)) As String
n = 0
For i = 0 To UBound(arr1)
If InStr(1, arr1(i), "layer_", vbBinaryCompare) 0 And _
InStr(1, arr1(i), "layer_3", vbBinaryCompare) = 0 Then
arr2(n) = arr1(i)
n = n + 1
End If
Next i
ReDim Preserve arr2(0 To n - 1) As String

tLoop = GetTickCount - tLoop
Debug.Print tFilter, tLoop, nSize, UBound(arr1), UBound(arr2)

Next

End Sub

For me the filter method was roughly 2x slower with all sizes above 300k
where timings are meaningful

Regards,
Peter T



"Rick Rothstein" wrote in message
...
Out of curiosity, how much faster was "much faster" for a single loop if
your test involved multiple loops (total time divided by number of
loops)?

--
Rick (MVP - Excel)