Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 55
Default load/parse large text file

I have a data file that I'm importing into Excel (starts as a VMS report
that is dumped into the body of an outlook message, which is then saved as a
text file). I parse it based on fixed width columns (no delimiter) the same
code is used for several different files, so the parse widths are stored in
an array.

When the file was smaller, it was plenty fast- but now the file has grown to
13MB, and it takes many minutes to import. I suspect there is a better way.
Right now I'm opening the file, grabbing each line, parsing it, then pasting
it to the worksheet (code below).

Can someone recommend the best practice to do this as quickly as possible?
I've already got calculations set to manual, and screenupdating to false. I
was considering loading the entire file first, then parsing each line in
memory, then pasting all the lines, but while that groups like operations
together, I wasn't sure how it would save time if all of the same operations
were still being done a line at a time.

Thanks!!
Keith

Open LongFN For Input As #1
Do While Not EOF(1)
Line Input #1, LineofText
If Len(LineofText) 3 Then
rw = rw + 1
' now parse LineofText according to the column
widths and
' put the values in an array.
For j = 1 To 30
ParseStart = TotalFileArray(WhichFile, j)
ParseEnd = TotalFileArray(WhichFile, j + 1)
If ParseEnd 0 Then
TotalDataArray(WhichFile, j, rw) =
Trim(Mid(LineofText, ParseStart, ParseEnd - ParseStart))
ConvertCol (j)
mywrksht.Range(Usecol & Trim(Str(rw))).Value
= TotalDataArray(WhichFile, j, rw)
End If
Next
End If
Loop
Close #1 'Close the file




  #2   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 12
Default load/parse large text file

Keith

My understanding is that each "write" to the spreadsheet carries an
overhead.

So, you could try parsing something like 100 rows of data into a
variant array and then writing that to the spreadsheet by simply
setting the appropriate range equal to the array. This will also cut
out your call to your ConvertCol routine.

HTH

Peter

  #3   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 55
Default load/parse large text file

I've updated my code to read the entire file first, one line at a time into
an array (wow, that does seem faster, now down to about 30 seconds), parse
each line in memory (almost instant), and now I'm ready to set the worksheet
range equal to the array. I now have a few more questions. Using Excel 2003.

1) I'm never sure how many rows the data file will have, so I start with an
oversized array so I won't have to continually redim/preserve. My array is
dimmed as testarray(1 To 32, 1 To 50000). I'm not worried about memory, so
is it ok to dim an oversized array or will that cause some other problem I
don't know about yet?

2) I'm not sure how many rows are in the data file in advance, so I
increment a counter variable, rw, as I read each line. what is the
appropriate syntax to set a range equal to the array? I tried the following,
but it didn't work out.
Sheet1.range("A1") = testArray() 'in case Excel could autoexpand the range
to match the array size
Sheet1.range("A1:AF50000") = testArray() 'to paste the whole array at once,
risk overwriting surrounding areas with blanks
Sheet1.range("A1").resize(rw,32) = testArray() ' to paste only the same
number of rows as were in the raw file?

I keep getting a type mismatch error :(

3) I'm actually reading the raw file into (32, 1 to 50000), then parsing
each string back into fields 1-31. When I write the range back, is there an
easy way to just write the first 31 fields of the array? My alternative is
to erase field 32 before writing the whole array (probably with a
redim/preserve to make the array smaller), but I wasn't sure if I could just
write the desired section of the array- it might be useful if I later find
out I need to go back to the original string for any reason.

Many thanks!!
Keith

"Peter Grebenik" wrote in message
ups.com...
Keith

My understanding is that each "write" to the spreadsheet carries an
overhead.

So, you could try parsing something like 100 rows of data into a
variant array and then writing that to the spreadsheet by simply
setting the appropriate range equal to the array. This will also cut
out your call to your ConvertCol routine.

HTH

Peter



  #4   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 12
Default load/parse large text file

Keith

I think that you need to swap your rows and columns so that it becomes
testArray(rows,columns).

To put your array into the sheet, you need to set the range size to be
equal to the array size as the following sample code shows. I used
this to copy a range of 40000 rows by 31 columns in about 10 seconds.

Sub t()
Dim r As Long, c As Long, a As Variant
a = Selection

r = UBound(a, 1) - LBound(a, 1) + 1
c = UBound(a, 2) - LBound(a, 2) + 1

Range("ah2").Resize(r, c) = a

End Sub


Peter

  #5   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 5,600
Default load/parse large text file

As Peter said you need to swap your rows/columns

Re not dumping unnecessary columns into cells with your potentially
oversized array, while populating your array track the max number of used
columns, then redim preserve down to the max used. Can't do that of course
for the first dimension, the rows.

Regards,
Peter T

"Keith R" wrote in message
...
I've updated my code to read the entire file first, one line at a time

into
an array (wow, that does seem faster, now down to about 30 seconds), parse
each line in memory (almost instant), and now I'm ready to set the

worksheet
range equal to the array. I now have a few more questions. Using Excel

2003.

1) I'm never sure how many rows the data file will have, so I start with

an
oversized array so I won't have to continually redim/preserve. My array is
dimmed as testarray(1 To 32, 1 To 50000). I'm not worried about memory, so
is it ok to dim an oversized array or will that cause some other problem I
don't know about yet?

2) I'm not sure how many rows are in the data file in advance, so I
increment a counter variable, rw, as I read each line. what is the
appropriate syntax to set a range equal to the array? I tried the

following,
but it didn't work out.
Sheet1.range("A1") = testArray() 'in case Excel could autoexpand the range
to match the array size
Sheet1.range("A1:AF50000") = testArray() 'to paste the whole array at

once,
risk overwriting surrounding areas with blanks
Sheet1.range("A1").resize(rw,32) = testArray() ' to paste only the same
number of rows as were in the raw file?

I keep getting a type mismatch error :(

3) I'm actually reading the raw file into (32, 1 to 50000), then parsing
each string back into fields 1-31. When I write the range back, is there

an
easy way to just write the first 31 fields of the array? My alternative is
to erase field 32 before writing the whole array (probably with a
redim/preserve to make the array smaller), but I wasn't sure if I could

just
write the desired section of the array- it might be useful if I later find
out I need to go back to the original string for any reason.

Many thanks!!
Keith

"Peter Grebenik" wrote in message
ups.com...
Keith

My understanding is that each "write" to the spreadsheet carries an
overhead.

So, you could try parsing something like 100 rows of data into a
variant array and then writing that to the spreadsheet by simply
setting the appropriate range equal to the array. This will also cut
out your call to your ConvertCol routine.

HTH

Peter







  #6   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 55
Default load/parse large text file

Peter- I apologize for being so dense- I've not taken this approach before,
so the syntax is killing me. I switched the rows and columns, so that part
is done.

I believe the Ubound and LBound would need to refer to the array, so
therefore 'a' has to refer to my array. I changed the line a=selection to
a=myarray()

It gets through the Ubound/LBound lines and returns correct size paramters,
but when it tries to assign the array to my worksheet, I get a runtime error
1004 application- or object-defined error.

Sheet17.Range("a1").Resize(r, c) = a
values=
Sheet17.range("a1").Resize(5000,32)=a
I also reversed it in case the resize was based on my original (switched)
array order
Sheet17.range("a1").resize(32,5000)=a
but got the same error.

I tried both a=myarray() and a=myarray

Thank you for your continued help and patience,
Keith


"Peter Grebenik" wrote in message
oups.com...
Keith

I think that you need to swap your rows and columns so that it becomes
testArray(rows,columns).

To put your array into the sheet, you need to set the range size to be
equal to the array size as the following sample code shows. I used
this to copy a range of 40000 rows by 31 columns in about 10 seconds.

Sub t()
Dim r As Long, c As Long, a As Variant
a = Selection

r = UBound(a, 1) - LBound(a, 1) + 1
c = UBound(a, 2) - LBound(a, 2) + 1

Range("ah2").Resize(r, c) = a

End Sub


Peter



  #7   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 55
Default load/parse large text file

Ok, I have it working now-

The reason I was getting the error was I had not yet redimmed my array to
truncate the last field (the one the contained the original text string, in
case I needed it later). The error was occuring when it tried to paste the
original strings into cells. Once I redimmed the array and shrank the paste
range accordingly, everything pasted just fine.

I know there is a visible character limit of something like 255 characters
in a cell, but I thought that the cell could actually hold a good bit more
than that. I'm now assuming that I hit an upper limit on the actual cell
string length, whatever that limit is.

Thanks again to Peter G and Peter T for the help,
Keith

"Keith R" wrote in message
...
Peter- I apologize for being so dense- I've not taken this approach
before, so the syntax is killing me. I switched the rows and columns, so
that part is done.

I believe the Ubound and LBound would need to refer to the array, so
therefore 'a' has to refer to my array. I changed the line a=selection to
a=myarray()

It gets through the Ubound/LBound lines and returns correct size
paramters, but when it tries to assign the array to my worksheet, I get a
runtime error 1004 application- or object-defined error.

Sheet17.Range("a1").Resize(r, c) = a
values=
Sheet17.range("a1").Resize(5000,32)=a
I also reversed it in case the resize was based on my original (switched)
array order
Sheet17.range("a1").resize(32,5000)=a
but got the same error.

I tried both a=myarray() and a=myarray

Thank you for your continued help and patience,
Keith


"Peter Grebenik" wrote in message
oups.com...
Keith

I think that you need to swap your rows and columns so that it becomes
testArray(rows,columns).

To put your array into the sheet, you need to set the range size to be
equal to the array size as the following sample code shows. I used
this to copy a range of 40000 rows by 31 columns in about 10 seconds.

Sub t()
Dim r As Long, c As Long, a As Variant
a = Selection

r = UBound(a, 1) - LBound(a, 1) + 1
c = UBound(a, 2) - LBound(a, 2) + 1

Range("ah2").Resize(r, c) = a

End Sub


Peter





  #8   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 12
Default load/parse large text file

Keith

I hope you got your improvement in speed.

A little experimentation seems to indicate that the maximum string
length that can be copied by this method is about 900 characters in
Excel 2003.

Helpfully, Microsoft have arranged different limits in different
versions of Excel!

Best wishes

Peter

  #9   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 5,600
Default load/parse large text file

A cell can display the first 1024 characters but store 32K, whether as the
result of a formula, a value or after copy/paste

[a1].Formula = "=REPT(""a"",2^15-1)"
[a2].Formula = "=LEN(A1)"
[a3].Value = [a1].Value
[a4].Formula = "=LEN(A1)"
[a3].Copy [a5] ' copy/paste
[a6].Formula = "=LEN(A5)"

The above is a bit artificial and the respective limits may be slightly
less. Max length of a formula (beginning with an =) is 1024 or a bit less.

Regards,
Peter T

"Keith R" wrote in message
...
Ok, I have it working now-

The reason I was getting the error was I had not yet redimmed my array to
truncate the last field (the one the contained the original text string,

in
case I needed it later). The error was occuring when it tried to paste the
original strings into cells. Once I redimmed the array and shrank the

paste
range accordingly, everything pasted just fine.

I know there is a visible character limit of something like 255 characters
in a cell, but I thought that the cell could actually hold a good bit more
than that. I'm now assuming that I hit an upper limit on the actual cell
string length, whatever that limit is.

Thanks again to Peter G and Peter T for the help,
Keith

"Keith R" wrote in message
...
Peter- I apologize for being so dense- I've not taken this approach
before, so the syntax is killing me. I switched the rows and columns, so
that part is done.

I believe the Ubound and LBound would need to refer to the array, so
therefore 'a' has to refer to my array. I changed the line a=selection

to
a=myarray()

It gets through the Ubound/LBound lines and returns correct size
paramters, but when it tries to assign the array to my worksheet, I get

a
runtime error 1004 application- or object-defined error.

Sheet17.Range("a1").Resize(r, c) = a
values=
Sheet17.range("a1").Resize(5000,32)=a
I also reversed it in case the resize was based on my original

(switched)
array order
Sheet17.range("a1").resize(32,5000)=a
but got the same error.

I tried both a=myarray() and a=myarray

Thank you for your continued help and patience,
Keith


"Peter Grebenik" wrote in message
oups.com...
Keith

I think that you need to swap your rows and columns so that it becomes
testArray(rows,columns).

To put your array into the sheet, you need to set the range size to be
equal to the array size as the following sample code shows. I used
this to copy a range of 40000 rows by 31 columns in about 10 seconds.

Sub t()
Dim r As Long, c As Long, a As Variant
a = Selection

r = UBound(a, 1) - LBound(a, 1) + 1
c = UBound(a, 2) - LBound(a, 2) + 1

Range("ah2").Resize(r, c) = a

End Sub


Peter







  #10   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 12
Default load/parse large text file

Peter

I was merely pointing out the limitations of copying data by setting a
range equal to an array. It means that this method cannot be used to
copy cells containing very long text strings; they have to be copied
using the copy method.

Peter



  #11   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 5,600
Default load/parse large text file

Hi Peter,

I have to confess I only scan read parts of the thread before posting the
previous two times, eg I didn't pick up on "this method" in your adjacent
post

I see your point, for me in a very light test the array to cells method
truncates to 1823 characters

[a1].Formula = "=Rept(""a"", 20000)"
[a1:d5] = [a1].Value

varr = [a1:d5].Value
Debug.Print Len(varr(2, 2)) ' 20000, picks up fine

Range("A11:d16") = varr
Debug.Print Len(Range("A11")) ' 1823, dumps truncated

Regards,
Peter T


"Peter Grebenik" wrote in message
oups.com...
Peter

I was merely pointing out the limitations of copying data by setting a
range equal to an array. It means that this method cannot be used to
copy cells containing very long text strings; they have to be copied
using the copy method.

Peter



Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
load a CSV file and have it take all fields as text cj Excel Discussion (Misc queries) 5 March 29th 07 02:24 PM
How to parse data in text file sifar Excel Programming 4 October 8th 05 01:50 AM
Can I load text file into Excel seting field delimter using cmd li Tim Excel Discussion (Misc queries) 0 April 19th 05 01:55 AM
Load text file and write back sanjay Excel Programming 1 May 26th 04 03:28 AM
Parse Text File John[_62_] Excel Programming 5 October 22nd 03 02:50 PM


All times are GMT +1. The time now is 07:39 AM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"