Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 1
Default File Input Question


I am developing a macro which pulls a lot of information out of certai
.txt files and does calculations and such. The problem comes i
finding which files need to be parsed. There is no consistent fil
naming system. All we know is that it is in a specified folder an
that the 3rd line of the file will contain a string which the use
specifies.

Problem: This directory contains approx. 4000 files so we should no
attempt to open every file in the entire directory to parse.

I was thinking about trying to use the document properties to sif
through the files. For example, we do know that the file has to b
created on or after a certain date. Is this faster than opening th
entire file for output? (I assume it is)

Can anyone think of a way to find the required files faster?
Also, is there a fast way to get directly to a specific line number i
an input file (or specific line/col in a txt file?)

Thanks in advance

--
prepotenc
-----------------------------------------------------------------------
prepotency's Profile: http://www.excelforum.com/member.php...fo&userid=2415
View this thread: http://www.excelforum.com/showthread.php?threadid=38663

  #2   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 1
Default File Input Question


Can anyone just answer me the latter question?

Is there a way to parse a .txt file and go directly to a specified row
or column in the text file? (i.e. to get to row 3 in the text file
without iterating there)


--
prepotency
------------------------------------------------------------------------
prepotency's Profile: http://www.excelforum.com/member.php...o&userid=24155
View this thread: http://www.excelforum.com/showthread...hreadid=386630

  #3   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 1,588
Default File Input Question

I don't think you can "go directly" without having to count lines: in any
case it doesn't take long to count to 3...

How large are the files? Even if you checked them all it wouldn't take too
long (unless of course this is something you need to do every minute). For
a one-time job even a couple of minutes should be acceptable.


Tim.

--
Tim Williams
Palo Alto, CA


"prepotency" wrote
in message ...

Can anyone just answer me the latter question?

Is there a way to parse a .txt file and go directly to a specified row
or column in the text file? (i.e. to get to row 3 in the text file
without iterating there)


--
prepotency
------------------------------------------------------------------------
prepotency's Profile:

http://www.excelforum.com/member.php...o&userid=24155
View this thread: http://www.excelforum.com/showthread...hreadid=386630



  #4   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 158
Default File Input Question

As far as I know Tim's spot on, in that

(a) there's no way as far as I know - unless you start to get into
random access files;
and
(b) you will be amazed at how blindingly fast VBA can search through
thousands of files, parse each line, compare strings. It's amazing.

However, if you need to access specific points in a file... I could
understand why this might sound a good idea -- say you want to go to
line 9999, it *might* be quicker to read the file in one fell swoop
rather than bit by bit. (But surely - definitely - not -as Tim wrote -
if you need to just access data at the beginning of the file).

To do this maybe something like:

Sub ReadWholeFile()

Dim myFile As String
Dim myLines() As String
Dim hFile As Integer

hFile = FreeFile
Open "c:\test.txt" For Input As #hFile
myFile = Input(LOF(hFile), hFile)
Close #hFile

'If file has records of identical length (say 100)
'you could access line 999 thus:
myrecordlength = 100 + Len(vbCrLf)
Line999 = Mid$(myFile, myrecordlength * 999, myrecordlength)

'Alternatively, you could split everything up by line into an
'array
myLines = Split(myFile, vbCrLf)
'You can now address line 999 thus:
Line999 = myLines(999)

End Sub

I have no idea if this will be quicker or (massively) slower than
reading a large file piecemeal. I would be interested to know if you do
a performance test!

HTH,
G



Tim Williams wrote:
I don't think you can "go directly" without having to count lines: in any
case it doesn't take long to count to 3...

How large are the files? Even if you checked them all it wouldn't take too
long (unless of course this is something you need to do every minute). For
a one-time job even a couple of minutes should be acceptable.


Tim.

  #5   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 158
Default File Input Question

Please ignore my last post. I thought about this and decided to do a
test for curiosity's sake and was very surprised. You should definitely
go through line by line.

I created a file with 10,000 records of 100 char each. I then accessed a
specific line in the file, closed it and repeated accessing the line 100
times.

Test 1 uses Line Input to go through line by line.
Test 2 loads the whole file into memory and parses the string using MID.
Test 3 loads the whole file into memory and splits the string using SPLIT.

Even if you go through the file line by line and need to access the
penultimate record it is still almost twice as fast as loading the file
into memory and using MID and 2.5 times as fast as using SPLIT. I would
expect it to be faster in general but not if I'm accessing the last
record. Wow! It's probably the SPLIT and MID that take the time I guess.

**TEST BEGINS**
Accessing line 999
Reading 100 files
Test 1: 00:00 (nn:ss)
Test 2: 00:07
Test 3: 00:10
**TEST ENDS**

**TEST BEGINS**
Accessing line 9999
Reading 100 files
Test 1: 00:04
Test 2: 00:07
Test 3: 00:10
**TEST ENDS**

**TEST BEGINS**
Accessing line 99
Reading 1000 files
Test 1: 00:00
Test 2: 01:12
Test 3: 01:40
**TEST ENDS**


My test code in case you're interested:

Sub ReadWholeFileTest()

Dim myFile As String
Dim myLines() As String
Dim hFile As Integer
Dim LineToAccess As Long
Dim myPath As String
Dim NumberOfTimesToReadFile As Integer
Dim myStartTime As Date
Dim myLineCounter As Integer
Dim myLine As String
Dim myRecordLength As Long
Dim PositionOfRecord As Long

LineToAccess = 99
myPath = "c:\test.txt"
NumberOfTimesToReadFile = 100

Debug.Print "**TEST BEGINS**"
Debug.Print "Accessing line " & LineToAccess
Debug.Print "Reading " & NumberOfTimesToReadFile & " files"

'Test 1 - normal line by line
myStartTime = Now
For i = 1 To NumberOfTimesToReadFile
hFile = FreeFile
Open myPath For Input As #hFile
Do While Not EOF(hFile)
Line Input #hFile, myLine
myLineCounter = myLineCounter + 1
If myLineCounter = LineToAccess Then Exit Do
Loop
Close #hFile
myLineCounter = 0
Next i
Debug.Print "Test 1: " & Format(myStartTime - Now, "nn:ss")
Debug.Print myLine

'Test 2 - by parsing string using mid
myLine = ""
myRecordLength = 100 + Len(vbCrLf)
myStartTime = Now
For i = 1 To NumberOfTimesToReadFile
hFile = FreeFile
Open myPath For Input As #hFile
myFile = Input(LOF(hFile), hFile)
Close #hFile
PositionOfRecord = (myRecordLength * (LineToAccess - 1)) + 1
myLine = Mid$(myFile, PositionOfRecord, 100)
Next i
Debug.Print "Test 2: " & Format(myStartTime - Now, "nn:ss")
Debug.Print myLine


'Test 3 - by splitting string
myLine = ""
myRecordLength = 100 + Len(vbCrLf)
myStartTime = Now
For i = 1 To NumberOfTimesToReadFile
hFile = FreeFile
Open myPath For Input As #hFile
myFile = Input(LOF(hFile), hFile)
Close #hFile
myLines = Split(myFile, vbCrLf)
myLine = myLines(LineToAccess - 1)
Next i
Debug.Print "Test 3: " & Format(myStartTime - Now, "nn:ss")
Debug.Print myLine

Debug.Print "**TEST ENDS**"

End Sub




Gareth wrote:
As far as I know Tim's spot on, in that

(a) there's no way as far as I know - unless you start to get into
random access files;
and
(b) you will be amazed at how blindingly fast VBA can search through
thousands of files, parse each line, compare strings. It's amazing.

However, if you need to access specific points in a file... I could
understand why this might sound a good idea -- say you want to go to
line 9999, it *might* be quicker to read the file in one fell swoop
rather than bit by bit. (But surely - definitely - not -as Tim wrote -
if you need to just access data at the beginning of the file).

To do this maybe something like:

Sub ReadWholeFile()

Dim myFile As String
Dim myLines() As String
Dim hFile As Integer

hFile = FreeFile
Open "c:\test.txt" For Input As #hFile
myFile = Input(LOF(hFile), hFile)
Close #hFile

'If file has records of identical length (say 100)
'you could access line 999 thus:
myrecordlength = 100 + Len(vbCrLf)
Line999 = Mid$(myFile, myrecordlength * 999, myrecordlength)

'Alternatively, you could split everything up by line into an
'array
myLines = Split(myFile, vbCrLf)
'You can now address line 999 thus:
Line999 = myLines(999)

End Sub

I have no idea if this will be quicker or (massively) slower than
reading a large file piecemeal. I would be interested to know if you do
a performance test!

HTH,
G



Tim Williams wrote:

I don't think you can "go directly" without having to count lines: in
any
case it doesn't take long to count to 3...

How large are the files? Even if you checked them all it wouldn't
take too
long (unless of course this is something you need to do every
minute). For
a one-time job even a couple of minutes should be acceptable.


Tim.



  #6   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 1
Default File Input Question


Yes exactly. I guess my question in the OP was referring to the file
parsing method which uses the least memory (time taken to run). I had
considered using split or Mid but what we're really trying to get
around here is reading in the whole file. Furthermore with Split, it
takes so much time because it parses the whole file and writes its
contents into an array (the creation of the array is what takes so
long). With the Mid option, we are still forced to read on the whole
file. Here's a question: when you read in line by line does VBA still
open the whole file? I guess that's what I'm wondering. Because I
don't really want to open up the whole file. I just assumed that it
opened the whole file into ram and then started pulling the lines out
of it, which wouldn't necessarily save you any time.

Question: What's the fastest way to tab through every file in a
specified folder and how do you allow the user to browse and select a
folder (NOT a file)? Nagging question I was having.

G


--
prepotency
------------------------------------------------------------------------
prepotency's Profile: http://www.excelforum.com/member.php...o&userid=24155
View this thread: http://www.excelforum.com/showthread...hreadid=386630

Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Input Box Question bumper338 Excel Discussion (Misc queries) 4 March 20th 07 11:18 PM
Input Box Question StGermain Excel Discussion (Misc queries) 2 March 12th 07 05:14 PM
Extracting data from one xls file and input into another xls file trainerbro Excel Discussion (Misc queries) 1 November 1st 06 08:57 PM
Input Box Question with Code sebastienm Excel Programming 0 August 16th 04 07:37 PM
Input box question losmac Excel Programming 0 August 20th 03 10:32 PM


All times are GMT +1. The time now is 06:23 AM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
Copyright ©2004-2025 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"