Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 62
Default Smart way to delete the first row of a huge file

Hello

Would anyone know a better way to remove the first line of a huge txt
file?

I am using a FileSystemObject/TextStream, doing a first readline, and
then looping:
Destination.WriteLine Source.ReadLine

but the two files are located on drives and it is taking hours to run.
Is anyone aware of a smarter way? (I guess the solution would be to
directly delete the first row in a file and save the change...without
loading it all to memory, but not sure of how to achieve that)

Thanks in advance
Charles
  #2   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 1,588
Default Smart way to delete the first row of a huge file

I think you're pretty much stuck with how you're doing it now.
How large is the file. Do you mean hours literally ?

Tim


"Charles" wrote in message
...
Hello

Would anyone know a better way to remove the first line of a huge txt
file?

I am using a FileSystemObject/TextStream, doing a first readline, and
then looping:
Destination.WriteLine Source.ReadLine

but the two files are located on drives and it is taking hours to run.
Is anyone aware of a smarter way? (I guess the solution would be to
directly delete the first row in a file and save the change...without
loading it all to memory, but not sure of how to achieve that)

Thanks in advance
Charles



  #3   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 1
Default Smart way to delete the first row of a huge file

How huge is "huge"? I would think the following should work fairly quickly
on a file up to 20 Megs in size and, while slowing down on file sizes larger
than this, it should still work much, much faster what you are describing
for your present code. Give it a try and let us know...

Sub RemoveFirstLine(PathAndFileName As String)
Dim FileNum As Long
Dim TotalFile As String
FileNum = FreeFile
Open PathAndFileName For Binary As #FileNum
TotalFile = Space(LOF(FileNum))
Get #FileNum, , TotalFile
Close #FileNum
FileNum = FreeFile
Open PathAndFileName For Output As #FileNum
Print #FileNum, Mid$(TotalFile, InStr(TotalFile, vbCrLf) + 2)
Close #FileNum
End Sub

You would call this subroutine from within your own code like this...

Sub Test()
'
' <<< Some code
'
RemoveFirstLine "C:\TEMP\TestData.txt"
'
' <<< Some more code
'
End Sub

where, of course, you would substitute your own path and filename for the
sample one I used. If you think you might need it (I don't), you can add a
slight measure of safety to the RemoveFirstLine subroutine by Output'ting
the modified TotalFile (the Mid$ function call) to a different filename in
the path, then executing a Kill statement for the original file, and finally
executing a Name..As statement to rename the Output'ted filename back to the
original filename.

Rick


"Charles" wrote in message
...
Hello

Would anyone know a better way to remove the first line of a huge txt
file?

I am using a FileSystemObject/TextStream, doing a first readline, and
then looping:
Destination.WriteLine Source.ReadLine

but the two files are located on drives and it is taking hours to run.
Is anyone aware of a smarter way? (I guess the solution would be to
directly delete the first row in a file and save the change...without
loading it all to memory, but not sure of how to achieve that)

Thanks in advance
Charles


  #4   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 62
Default Smart way to delete the first row of a huge file

Hi

by huge I mean 200Meg+. The problem is that in top of that the file is
stored on a network drive, so it's really slow. The solution I found
to improve it is to copy it to the C drive first, process it, and then
copy it back. That's substentially faster than processing it directly
from the network drive.

It doesn't take hours but it takes a good 4-5min, which feels like
hours when you are waiting in front of your screen!

Rick: I'll try your solution.

Thanks!
Charles
  #5   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 62
Default Smart way to delete the first row of a huge file

Actually, not sure it works. Your code requires to load the whole
200MB file into memory in one go. Which would be probably faster than
having an access to the drive for each line read/written. But that's
going to be a big ask for Excel. Actually can a single string contain
200MB of data?

I actually have another question: I never really understood the
meaning of the "#" character before variables, or the "$" sign after
the name of the function. How different is it from simply using mid()
or filenum?

Charles


  #6   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 1
Default Smart way to delete the first row of a huge file

Yes, a single string can contain 200MB of data; the actual limit is
approximately 2 billion (2^31) characters. So, the only restriction on how
well the code I posted will work is available RAM (I think a total of 400MB
would be necessary for your file... 200MB for the string as it is read into
the TotalFile variable and another 200MB for the temporary string I think
gets created by the Mid$ function call). If there is not enough available
RAM, the code should still work, but the system paging back out to the hard
drive would be a performance killer. (All this is why I asked you how huge
"huge" was.)

The # signs are part of the syntax for addressing a channel number to a file
(although in some of the locations I used it in, it may be optional). As for
the $ sign, String functions have two forms... without the $ sign, they
return variants with a subtype of String (variants are slow and memory
wasters... not usually a concern outside of large loops); with the $ sign,
they return a pure String data type (and hence, when used in further String
manipulations, are as efficient as the function will get).

Rick


"Charles" wrote in message
...
Actually, not sure it works. Your code requires to load the whole
200MB file into memory in one go. Which would be probably faster than
having an access to the drive for each line read/written. But that's
going to be a big ask for Excel. Actually can a single string contain
200MB of data?

I actually have another question: I never really understood the
meaning of the "#" character before variables, or the "$" sign after
the name of the function. How different is it from simply using mid()
or filenum?

Charles


  #7   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 62
Default Smart way to delete the first row of a huge file

Interesting. That's the kind of post to archive in the section "useful
stuff to know".

Thanks very much for your help!

Charles
Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
huge huge excel file... why? Josh Excel Discussion (Misc queries) 12 February 9th 06 09:55 PM
File size is huge Lewis Excel Programming 3 November 18th 05 04:21 PM
File got huge BW Excel Discussion (Misc queries) 2 January 11th 05 01:57 AM
The XLS file is huge! John Wilson Excel Programming 0 October 6th 03 03:52 PM
The XLS file is huge! Ron de Bruin Excel Programming 0 October 6th 03 03:44 PM


All times are GMT +1. The time now is 11:15 PM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"