Home |
Search |
Today's Posts |
|
#1
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]() I am developing a macro which pulls a lot of information out of certai .txt files and does calculations and such. The problem comes i finding which files need to be parsed. There is no consistent fil naming system. All we know is that it is in a specified folder an that the 3rd line of the file will contain a string which the use specifies. Problem: This directory contains approx. 4000 files so we should no attempt to open every file in the entire directory to parse. I was thinking about trying to use the document properties to sif through the files. For example, we do know that the file has to b created on or after a certain date. Is this faster than opening th entire file for output? (I assume it is) Can anyone think of a way to find the required files faster? Also, is there a fast way to get directly to a specific line number i an input file (or specific line/col in a txt file?) Thanks in advance -- prepotenc ----------------------------------------------------------------------- prepotency's Profile: http://www.excelforum.com/member.php...fo&userid=2415 View this thread: http://www.excelforum.com/showthread.php?threadid=38663 |
#2
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]() Can anyone just answer me the latter question? Is there a way to parse a .txt file and go directly to a specified row or column in the text file? (i.e. to get to row 3 in the text file without iterating there) -- prepotency ------------------------------------------------------------------------ prepotency's Profile: http://www.excelforum.com/member.php...o&userid=24155 View this thread: http://www.excelforum.com/showthread...hreadid=386630 |
#3
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
I don't think you can "go directly" without having to count lines: in any
case it doesn't take long to count to 3... How large are the files? Even if you checked them all it wouldn't take too long (unless of course this is something you need to do every minute). For a one-time job even a couple of minutes should be acceptable. Tim. -- Tim Williams Palo Alto, CA "prepotency" wrote in message ... Can anyone just answer me the latter question? Is there a way to parse a .txt file and go directly to a specified row or column in the text file? (i.e. to get to row 3 in the text file without iterating there) -- prepotency ------------------------------------------------------------------------ prepotency's Profile: http://www.excelforum.com/member.php...o&userid=24155 View this thread: http://www.excelforum.com/showthread...hreadid=386630 |
#4
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
As far as I know Tim's spot on, in that
(a) there's no way as far as I know - unless you start to get into random access files; and (b) you will be amazed at how blindingly fast VBA can search through thousands of files, parse each line, compare strings. It's amazing. However, if you need to access specific points in a file... I could understand why this might sound a good idea -- say you want to go to line 9999, it *might* be quicker to read the file in one fell swoop rather than bit by bit. (But surely - definitely - not -as Tim wrote - if you need to just access data at the beginning of the file). To do this maybe something like: Sub ReadWholeFile() Dim myFile As String Dim myLines() As String Dim hFile As Integer hFile = FreeFile Open "c:\test.txt" For Input As #hFile myFile = Input(LOF(hFile), hFile) Close #hFile 'If file has records of identical length (say 100) 'you could access line 999 thus: myrecordlength = 100 + Len(vbCrLf) Line999 = Mid$(myFile, myrecordlength * 999, myrecordlength) 'Alternatively, you could split everything up by line into an 'array myLines = Split(myFile, vbCrLf) 'You can now address line 999 thus: Line999 = myLines(999) End Sub I have no idea if this will be quicker or (massively) slower than reading a large file piecemeal. I would be interested to know if you do a performance test! HTH, G Tim Williams wrote: I don't think you can "go directly" without having to count lines: in any case it doesn't take long to count to 3... How large are the files? Even if you checked them all it wouldn't take too long (unless of course this is something you need to do every minute). For a one-time job even a couple of minutes should be acceptable. Tim. |
#5
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
Please ignore my last post. I thought about this and decided to do a
test for curiosity's sake and was very surprised. You should definitely go through line by line. I created a file with 10,000 records of 100 char each. I then accessed a specific line in the file, closed it and repeated accessing the line 100 times. Test 1 uses Line Input to go through line by line. Test 2 loads the whole file into memory and parses the string using MID. Test 3 loads the whole file into memory and splits the string using SPLIT. Even if you go through the file line by line and need to access the penultimate record it is still almost twice as fast as loading the file into memory and using MID and 2.5 times as fast as using SPLIT. I would expect it to be faster in general but not if I'm accessing the last record. Wow! It's probably the SPLIT and MID that take the time I guess. **TEST BEGINS** Accessing line 999 Reading 100 files Test 1: 00:00 (nn:ss) Test 2: 00:07 Test 3: 00:10 **TEST ENDS** **TEST BEGINS** Accessing line 9999 Reading 100 files Test 1: 00:04 Test 2: 00:07 Test 3: 00:10 **TEST ENDS** **TEST BEGINS** Accessing line 99 Reading 1000 files Test 1: 00:00 Test 2: 01:12 Test 3: 01:40 **TEST ENDS** My test code in case you're interested: Sub ReadWholeFileTest() Dim myFile As String Dim myLines() As String Dim hFile As Integer Dim LineToAccess As Long Dim myPath As String Dim NumberOfTimesToReadFile As Integer Dim myStartTime As Date Dim myLineCounter As Integer Dim myLine As String Dim myRecordLength As Long Dim PositionOfRecord As Long LineToAccess = 99 myPath = "c:\test.txt" NumberOfTimesToReadFile = 100 Debug.Print "**TEST BEGINS**" Debug.Print "Accessing line " & LineToAccess Debug.Print "Reading " & NumberOfTimesToReadFile & " files" 'Test 1 - normal line by line myStartTime = Now For i = 1 To NumberOfTimesToReadFile hFile = FreeFile Open myPath For Input As #hFile Do While Not EOF(hFile) Line Input #hFile, myLine myLineCounter = myLineCounter + 1 If myLineCounter = LineToAccess Then Exit Do Loop Close #hFile myLineCounter = 0 Next i Debug.Print "Test 1: " & Format(myStartTime - Now, "nn:ss") Debug.Print myLine 'Test 2 - by parsing string using mid myLine = "" myRecordLength = 100 + Len(vbCrLf) myStartTime = Now For i = 1 To NumberOfTimesToReadFile hFile = FreeFile Open myPath For Input As #hFile myFile = Input(LOF(hFile), hFile) Close #hFile PositionOfRecord = (myRecordLength * (LineToAccess - 1)) + 1 myLine = Mid$(myFile, PositionOfRecord, 100) Next i Debug.Print "Test 2: " & Format(myStartTime - Now, "nn:ss") Debug.Print myLine 'Test 3 - by splitting string myLine = "" myRecordLength = 100 + Len(vbCrLf) myStartTime = Now For i = 1 To NumberOfTimesToReadFile hFile = FreeFile Open myPath For Input As #hFile myFile = Input(LOF(hFile), hFile) Close #hFile myLines = Split(myFile, vbCrLf) myLine = myLines(LineToAccess - 1) Next i Debug.Print "Test 3: " & Format(myStartTime - Now, "nn:ss") Debug.Print myLine Debug.Print "**TEST ENDS**" End Sub Gareth wrote: As far as I know Tim's spot on, in that (a) there's no way as far as I know - unless you start to get into random access files; and (b) you will be amazed at how blindingly fast VBA can search through thousands of files, parse each line, compare strings. It's amazing. However, if you need to access specific points in a file... I could understand why this might sound a good idea -- say you want to go to line 9999, it *might* be quicker to read the file in one fell swoop rather than bit by bit. (But surely - definitely - not -as Tim wrote - if you need to just access data at the beginning of the file). To do this maybe something like: Sub ReadWholeFile() Dim myFile As String Dim myLines() As String Dim hFile As Integer hFile = FreeFile Open "c:\test.txt" For Input As #hFile myFile = Input(LOF(hFile), hFile) Close #hFile 'If file has records of identical length (say 100) 'you could access line 999 thus: myrecordlength = 100 + Len(vbCrLf) Line999 = Mid$(myFile, myrecordlength * 999, myrecordlength) 'Alternatively, you could split everything up by line into an 'array myLines = Split(myFile, vbCrLf) 'You can now address line 999 thus: Line999 = myLines(999) End Sub I have no idea if this will be quicker or (massively) slower than reading a large file piecemeal. I would be interested to know if you do a performance test! HTH, G Tim Williams wrote: I don't think you can "go directly" without having to count lines: in any case it doesn't take long to count to 3... How large are the files? Even if you checked them all it wouldn't take too long (unless of course this is something you need to do every minute). For a one-time job even a couple of minutes should be acceptable. Tim. |
#6
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]() Yes exactly. I guess my question in the OP was referring to the file parsing method which uses the least memory (time taken to run). I had considered using split or Mid but what we're really trying to get around here is reading in the whole file. Furthermore with Split, it takes so much time because it parses the whole file and writes its contents into an array (the creation of the array is what takes so long). With the Mid option, we are still forced to read on the whole file. Here's a question: when you read in line by line does VBA still open the whole file? I guess that's what I'm wondering. Because I don't really want to open up the whole file. I just assumed that it opened the whole file into ram and then started pulling the lines out of it, which wouldn't necessarily save you any time. Question: What's the fastest way to tab through every file in a specified folder and how do you allow the user to browse and select a folder (NOT a file)? Nagging question I was having. G -- prepotency ------------------------------------------------------------------------ prepotency's Profile: http://www.excelforum.com/member.php...o&userid=24155 View this thread: http://www.excelforum.com/showthread...hreadid=386630 |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
![]() |
||||
Thread | Forum | |||
Input Box Question | Excel Discussion (Misc queries) | |||
Input Box Question | Excel Discussion (Misc queries) | |||
Extracting data from one xls file and input into another xls file | Excel Discussion (Misc queries) | |||
Input Box Question with Code | Excel Programming | |||
Input box question | Excel Programming |