Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.programming
|
|||
|
|||
load/parse large text file
I have a data file that I'm importing into Excel (starts as a VMS report
that is dumped into the body of an outlook message, which is then saved as a text file). I parse it based on fixed width columns (no delimiter) the same code is used for several different files, so the parse widths are stored in an array. When the file was smaller, it was plenty fast- but now the file has grown to 13MB, and it takes many minutes to import. I suspect there is a better way. Right now I'm opening the file, grabbing each line, parsing it, then pasting it to the worksheet (code below). Can someone recommend the best practice to do this as quickly as possible? I've already got calculations set to manual, and screenupdating to false. I was considering loading the entire file first, then parsing each line in memory, then pasting all the lines, but while that groups like operations together, I wasn't sure how it would save time if all of the same operations were still being done a line at a time. Thanks!! Keith Open LongFN For Input As #1 Do While Not EOF(1) Line Input #1, LineofText If Len(LineofText) 3 Then rw = rw + 1 ' now parse LineofText according to the column widths and ' put the values in an array. For j = 1 To 30 ParseStart = TotalFileArray(WhichFile, j) ParseEnd = TotalFileArray(WhichFile, j + 1) If ParseEnd 0 Then TotalDataArray(WhichFile, j, rw) = Trim(Mid(LineofText, ParseStart, ParseEnd - ParseStart)) ConvertCol (j) mywrksht.Range(Usecol & Trim(Str(rw))).Value = TotalDataArray(WhichFile, j, rw) End If Next End If Loop Close #1 'Close the file |
#2
Posted to microsoft.public.excel.programming
|
|||
|
|||
load/parse large text file
Keith
My understanding is that each "write" to the spreadsheet carries an overhead. So, you could try parsing something like 100 rows of data into a variant array and then writing that to the spreadsheet by simply setting the appropriate range equal to the array. This will also cut out your call to your ConvertCol routine. HTH Peter |
#3
Posted to microsoft.public.excel.programming
|
|||
|
|||
load/parse large text file
I've updated my code to read the entire file first, one line at a time into
an array (wow, that does seem faster, now down to about 30 seconds), parse each line in memory (almost instant), and now I'm ready to set the worksheet range equal to the array. I now have a few more questions. Using Excel 2003. 1) I'm never sure how many rows the data file will have, so I start with an oversized array so I won't have to continually redim/preserve. My array is dimmed as testarray(1 To 32, 1 To 50000). I'm not worried about memory, so is it ok to dim an oversized array or will that cause some other problem I don't know about yet? 2) I'm not sure how many rows are in the data file in advance, so I increment a counter variable, rw, as I read each line. what is the appropriate syntax to set a range equal to the array? I tried the following, but it didn't work out. Sheet1.range("A1") = testArray() 'in case Excel could autoexpand the range to match the array size Sheet1.range("A1:AF50000") = testArray() 'to paste the whole array at once, risk overwriting surrounding areas with blanks Sheet1.range("A1").resize(rw,32) = testArray() ' to paste only the same number of rows as were in the raw file? I keep getting a type mismatch error :( 3) I'm actually reading the raw file into (32, 1 to 50000), then parsing each string back into fields 1-31. When I write the range back, is there an easy way to just write the first 31 fields of the array? My alternative is to erase field 32 before writing the whole array (probably with a redim/preserve to make the array smaller), but I wasn't sure if I could just write the desired section of the array- it might be useful if I later find out I need to go back to the original string for any reason. Many thanks!! Keith "Peter Grebenik" wrote in message ups.com... Keith My understanding is that each "write" to the spreadsheet carries an overhead. So, you could try parsing something like 100 rows of data into a variant array and then writing that to the spreadsheet by simply setting the appropriate range equal to the array. This will also cut out your call to your ConvertCol routine. HTH Peter |
#4
Posted to microsoft.public.excel.programming
|
|||
|
|||
load/parse large text file
Keith
I think that you need to swap your rows and columns so that it becomes testArray(rows,columns). To put your array into the sheet, you need to set the range size to be equal to the array size as the following sample code shows. I used this to copy a range of 40000 rows by 31 columns in about 10 seconds. Sub t() Dim r As Long, c As Long, a As Variant a = Selection r = UBound(a, 1) - LBound(a, 1) + 1 c = UBound(a, 2) - LBound(a, 2) + 1 Range("ah2").Resize(r, c) = a End Sub Peter |
#5
Posted to microsoft.public.excel.programming
|
|||
|
|||
load/parse large text file
As Peter said you need to swap your rows/columns
Re not dumping unnecessary columns into cells with your potentially oversized array, while populating your array track the max number of used columns, then redim preserve down to the max used. Can't do that of course for the first dimension, the rows. Regards, Peter T "Keith R" wrote in message ... I've updated my code to read the entire file first, one line at a time into an array (wow, that does seem faster, now down to about 30 seconds), parse each line in memory (almost instant), and now I'm ready to set the worksheet range equal to the array. I now have a few more questions. Using Excel 2003. 1) I'm never sure how many rows the data file will have, so I start with an oversized array so I won't have to continually redim/preserve. My array is dimmed as testarray(1 To 32, 1 To 50000). I'm not worried about memory, so is it ok to dim an oversized array or will that cause some other problem I don't know about yet? 2) I'm not sure how many rows are in the data file in advance, so I increment a counter variable, rw, as I read each line. what is the appropriate syntax to set a range equal to the array? I tried the following, but it didn't work out. Sheet1.range("A1") = testArray() 'in case Excel could autoexpand the range to match the array size Sheet1.range("A1:AF50000") = testArray() 'to paste the whole array at once, risk overwriting surrounding areas with blanks Sheet1.range("A1").resize(rw,32) = testArray() ' to paste only the same number of rows as were in the raw file? I keep getting a type mismatch error :( 3) I'm actually reading the raw file into (32, 1 to 50000), then parsing each string back into fields 1-31. When I write the range back, is there an easy way to just write the first 31 fields of the array? My alternative is to erase field 32 before writing the whole array (probably with a redim/preserve to make the array smaller), but I wasn't sure if I could just write the desired section of the array- it might be useful if I later find out I need to go back to the original string for any reason. Many thanks!! Keith "Peter Grebenik" wrote in message ups.com... Keith My understanding is that each "write" to the spreadsheet carries an overhead. So, you could try parsing something like 100 rows of data into a variant array and then writing that to the spreadsheet by simply setting the appropriate range equal to the array. This will also cut out your call to your ConvertCol routine. HTH Peter |
#6
Posted to microsoft.public.excel.programming
|
|||
|
|||
load/parse large text file
Peter- I apologize for being so dense- I've not taken this approach before,
so the syntax is killing me. I switched the rows and columns, so that part is done. I believe the Ubound and LBound would need to refer to the array, so therefore 'a' has to refer to my array. I changed the line a=selection to a=myarray() It gets through the Ubound/LBound lines and returns correct size paramters, but when it tries to assign the array to my worksheet, I get a runtime error 1004 application- or object-defined error. Sheet17.Range("a1").Resize(r, c) = a values= Sheet17.range("a1").Resize(5000,32)=a I also reversed it in case the resize was based on my original (switched) array order Sheet17.range("a1").resize(32,5000)=a but got the same error. I tried both a=myarray() and a=myarray Thank you for your continued help and patience, Keith "Peter Grebenik" wrote in message oups.com... Keith I think that you need to swap your rows and columns so that it becomes testArray(rows,columns). To put your array into the sheet, you need to set the range size to be equal to the array size as the following sample code shows. I used this to copy a range of 40000 rows by 31 columns in about 10 seconds. Sub t() Dim r As Long, c As Long, a As Variant a = Selection r = UBound(a, 1) - LBound(a, 1) + 1 c = UBound(a, 2) - LBound(a, 2) + 1 Range("ah2").Resize(r, c) = a End Sub Peter |
#7
Posted to microsoft.public.excel.programming
|
|||
|
|||
load/parse large text file
Ok, I have it working now-
The reason I was getting the error was I had not yet redimmed my array to truncate the last field (the one the contained the original text string, in case I needed it later). The error was occuring when it tried to paste the original strings into cells. Once I redimmed the array and shrank the paste range accordingly, everything pasted just fine. I know there is a visible character limit of something like 255 characters in a cell, but I thought that the cell could actually hold a good bit more than that. I'm now assuming that I hit an upper limit on the actual cell string length, whatever that limit is. Thanks again to Peter G and Peter T for the help, Keith "Keith R" wrote in message ... Peter- I apologize for being so dense- I've not taken this approach before, so the syntax is killing me. I switched the rows and columns, so that part is done. I believe the Ubound and LBound would need to refer to the array, so therefore 'a' has to refer to my array. I changed the line a=selection to a=myarray() It gets through the Ubound/LBound lines and returns correct size paramters, but when it tries to assign the array to my worksheet, I get a runtime error 1004 application- or object-defined error. Sheet17.Range("a1").Resize(r, c) = a values= Sheet17.range("a1").Resize(5000,32)=a I also reversed it in case the resize was based on my original (switched) array order Sheet17.range("a1").resize(32,5000)=a but got the same error. I tried both a=myarray() and a=myarray Thank you for your continued help and patience, Keith "Peter Grebenik" wrote in message oups.com... Keith I think that you need to swap your rows and columns so that it becomes testArray(rows,columns). To put your array into the sheet, you need to set the range size to be equal to the array size as the following sample code shows. I used this to copy a range of 40000 rows by 31 columns in about 10 seconds. Sub t() Dim r As Long, c As Long, a As Variant a = Selection r = UBound(a, 1) - LBound(a, 1) + 1 c = UBound(a, 2) - LBound(a, 2) + 1 Range("ah2").Resize(r, c) = a End Sub Peter |
#8
Posted to microsoft.public.excel.programming
|
|||
|
|||
load/parse large text file
Keith
I hope you got your improvement in speed. A little experimentation seems to indicate that the maximum string length that can be copied by this method is about 900 characters in Excel 2003. Helpfully, Microsoft have arranged different limits in different versions of Excel! Best wishes Peter |
#9
Posted to microsoft.public.excel.programming
|
|||
|
|||
load/parse large text file
A cell can display the first 1024 characters but store 32K, whether as the
result of a formula, a value or after copy/paste [a1].Formula = "=REPT(""a"",2^15-1)" [a2].Formula = "=LEN(A1)" [a3].Value = [a1].Value [a4].Formula = "=LEN(A1)" [a3].Copy [a5] ' copy/paste [a6].Formula = "=LEN(A5)" The above is a bit artificial and the respective limits may be slightly less. Max length of a formula (beginning with an =) is 1024 or a bit less. Regards, Peter T "Keith R" wrote in message ... Ok, I have it working now- The reason I was getting the error was I had not yet redimmed my array to truncate the last field (the one the contained the original text string, in case I needed it later). The error was occuring when it tried to paste the original strings into cells. Once I redimmed the array and shrank the paste range accordingly, everything pasted just fine. I know there is a visible character limit of something like 255 characters in a cell, but I thought that the cell could actually hold a good bit more than that. I'm now assuming that I hit an upper limit on the actual cell string length, whatever that limit is. Thanks again to Peter G and Peter T for the help, Keith "Keith R" wrote in message ... Peter- I apologize for being so dense- I've not taken this approach before, so the syntax is killing me. I switched the rows and columns, so that part is done. I believe the Ubound and LBound would need to refer to the array, so therefore 'a' has to refer to my array. I changed the line a=selection to a=myarray() It gets through the Ubound/LBound lines and returns correct size paramters, but when it tries to assign the array to my worksheet, I get a runtime error 1004 application- or object-defined error. Sheet17.Range("a1").Resize(r, c) = a values= Sheet17.range("a1").Resize(5000,32)=a I also reversed it in case the resize was based on my original (switched) array order Sheet17.range("a1").resize(32,5000)=a but got the same error. I tried both a=myarray() and a=myarray Thank you for your continued help and patience, Keith "Peter Grebenik" wrote in message oups.com... Keith I think that you need to swap your rows and columns so that it becomes testArray(rows,columns). To put your array into the sheet, you need to set the range size to be equal to the array size as the following sample code shows. I used this to copy a range of 40000 rows by 31 columns in about 10 seconds. Sub t() Dim r As Long, c As Long, a As Variant a = Selection r = UBound(a, 1) - LBound(a, 1) + 1 c = UBound(a, 2) - LBound(a, 2) + 1 Range("ah2").Resize(r, c) = a End Sub Peter |
#10
Posted to microsoft.public.excel.programming
|
|||
|
|||
load/parse large text file
Peter
I was merely pointing out the limitations of copying data by setting a range equal to an array. It means that this method cannot be used to copy cells containing very long text strings; they have to be copied using the copy method. Peter |
#11
Posted to microsoft.public.excel.programming
|
|||
|
|||
load/parse large text file
Hi Peter,
I have to confess I only scan read parts of the thread before posting the previous two times, eg I didn't pick up on "this method" in your adjacent post I see your point, for me in a very light test the array to cells method truncates to 1823 characters [a1].Formula = "=Rept(""a"", 20000)" [a1:d5] = [a1].Value varr = [a1:d5].Value Debug.Print Len(varr(2, 2)) ' 20000, picks up fine Range("A11:d16") = varr Debug.Print Len(Range("A11")) ' 1823, dumps truncated Regards, Peter T "Peter Grebenik" wrote in message oups.com... Peter I was merely pointing out the limitations of copying data by setting a range equal to an array. It means that this method cannot be used to copy cells containing very long text strings; they have to be copied using the copy method. Peter |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
load a CSV file and have it take all fields as text | Excel Discussion (Misc queries) | |||
How to parse data in text file | Excel Programming | |||
Can I load text file into Excel seting field delimter using cmd li | Excel Discussion (Misc queries) | |||
Load text file and write back | Excel Programming | |||
Parse Text File | Excel Programming |