VBA Excel: Opening Very Large Text Files
Posted to microsoft.public.excel.programming

#1

Trying to open very large text files (200 MB+) using an Excel VBA
program.

I am first using the "Open text_filename For Input As #1" command to
open the file.

I then use a "Do While Not EOF(1)" loop and within that use the "Line
Input #1" command to look at one line at a time and parse for a
specific text string.

(If line has the desired text string, I am then printing the entire
line to a new text output file; the new output file will be much
smaller)


I tested my program on smaller text files and it works, but the 200 MB
huge files are causing Excel to lock up.

Questions:
- I thought the use of "Line Input #1" would allow me to read one
line of text at a time? So even though the text file is very large, I only
look at one line at a time.

- Or is "Open filename For Input" command still causing Excel to have
to open the entire 200 MB file?

- Is there some other way I can parse out the text lines I want from
such a huge text file?

#2

Thanks.

Unfortunately it is not at all a column structure. It's basically a
big ugly logfile off a SUN server.

The only consistent structure is that every line begins with a date
and time stamp, but after the stamp it's just text (no consistent
commas or other delimiters).

I know what I really need to do: Learn Unix shell or perl scripting so
I can do the text parsing in Unix, and then bring the output file into
Excel.



On Tue, 16 Sep 2003 20:16:07 -0700, "Rob Bovey"
wrote:

If your text file is well-formed (i.e. it's got a database-like column
and row structure throughout) then you can use ADO to query the contents of
the text file. Here's some generic code that shows how to do this. Note that
you have to set a reference to the ADO object library in order to use this
code.

''' Set a reference to the Microsoft ActiveX Data
''' Objects 2.X Library under Tools/References.
Public Sub GenericQueryTextFile()

    Dim rsData As ADODB.Recordset
    Dim szPath As String
    Dim szConnect As String
    Dim szSQL As String

    ''' This is the path where the text file
    ''' is located (not the filename).
    szPath = ThisWorkbook.Path & "\"

    ''' Create the connection string.
    szConnect = "Provider=Microsoft.Jet.OLEDB.4.0;" & _
                "Data Source=" & szPath & ";" & _
                "Extended Properties=""Text;HDR=NO"";"
    ''' If the first row of the text file has column
    ''' names, the last line above should be:
    ''' "Extended Properties=Text;"

    ''' Create the SQL statement. This can be as
    ''' complex as you like to limit the number
    ''' of records returned.
    ''' "MyData.txt" should be the name of the
    ''' text file you're trying to query.
    szSQL = "SELECT * FROM MyData.txt;"

    Set rsData = New ADODB.Recordset
    rsData.Open szSQL, szConnect, adOpenForwardOnly, _
                adLockReadOnly, adCmdText

    ''' Check to make sure we received data.
    If Not rsData.EOF Then
        ''' Either dump the whole data set onto Sheet1.
        'Sheet1.UsedRange.Clear
        'Sheet1.Range("A1").CopyFromRecordset rsData
        ''' Or check each record individually.
        Do While Not rsData.EOF
            If rsData.Fields(0).Value > 100 Then
                ''' Do something here.
            End If
            rsData.MoveNext
        Loop
    Else
        MsgBox "No records returned.", vbCritical
    End If

    ''' Clean up our Recordset object.
    rsData.Close
    Set rsData = Nothing

End Sub


#3

There are text editors that can extract data.

I use one called UltraEdit-32 (http://www.ultraedit.com).

You can do Edit|Find, and one of the options is to list the lines that match the
search string. Click a button to copy the list to the clipboard, then paste it
into a new file and save that file.

It might not be as automatic as you want, but it's really neat for the one-time
shots.

IIRC, it has a 30 day evaluation period. (Or about $30 USD).

--

Dave Peterson

#4


wrote in message
...

I know what I really need to do: Learn Unix shell or perl scripting so
I can do the text parsing in Unix, and then bring the output file into
Excel.


Ask, and you shall receive:

grep search_string source_file > output_file

(obviously, replace search_string, source_file, and output_file
accordingly.)
--
HTH -

-Frank
Microsoft Excel MVP
Dolphin Technology Corp.
http://vbapro.com
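
For anyone trying the redirect form on a log file, here is a minimal sketch. The file names server.log and matches.txt and the sample lines are made up for illustration; -F and -c are standard grep options. -F is worth knowing for log text because characters such as [ would otherwise be read as regular-expression syntax.

```shell
#!/bin/sh
# Hypothetical names: server.log stands in for the big log file,
# matches.txt for the much smaller extract.
printf '%s\n' \
    'a [ERROR] disk full' \
    'b info ok' \
    'c [ERROR] disk full' > server.log

# -F treats the search text literally, so "[" is not regex syntax.
# The > redirect sends the matching lines to the output file.
grep -F '[ERROR]' server.log > matches.txt

# -c prints only the number of matching lines.
grep -c -F '[ERROR]' server.log    # prints 2
```

grep works through its input a line at a time, so the size of the source file should not be a problem in itself.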



#5

And there are versions of grep that have been ported to the Windows platform.

Take a look at www.shareware.com for grep.

or maybe a DOS equivalent:

find /i "searchstring" source_file > output_file
(that looks kind of familiar???)

But I think I'd check to see how big a file DOS's Find will work with. (I don't
recall if there's a limit.)


--

Dave Peterson



#6


Thanks for all the suggestions, I appreciate everyone taking the time.

Guess I have to open the can of worms further :-)

I was actually starting with a 600MB+ file and using grep to pull out
lines with desired text string.

But there are many, many lines with duplicate text strings (see
example below) and the resulting grep output was still 200MB+

For example, say that I have the following lines in the text file
(this is my 600MB file):

look at that dog1
look at that cat
look at that dog1
look at that mouse
look at that dog1
look at that dog2
look at that dog2
look at that bird
look at that dog3
look at that dog4
look at that dog4


I can easily grep out only those lines with "dog" in them and I get
this output (this is my 200MB file):

look at that dog1
look at that dog1
look at that dog1
look at that dog2
look at that dog2
look at that dog3
look at that dog4
look at that dog4


But I don't know of a way to use grep so that I also eliminate
duplicates so that output looks like this:

look at that dog1
look at that dog2
look at that dog3
look at that dog4


Any ideas (short of a Perl script)?

Thanks again.
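
One possible approach, sketched here under the assumption that duplicates are whole identical lines as in the example above: pipe the grep output through a second tool that drops repeats. The function names filter_unique and filter_unique_sorted and the file name sample.log are hypothetical; grep -F, sort -u and the awk idiom are standard.

```shell
#!/bin/sh
# Keep lines containing a fixed string and drop repeated lines.
# First-seen order is preserved; awk holds one entry per distinct
# matching line in memory.
filter_unique() {
    # $1 = literal search text, $2 = input file
    grep -F -- "$1" "$2" | awk '!seen[$0]++'
}

# Same set of lines, but in sorted order. sort can spill to temporary
# files, so it may cope better when there are very many distinct lines.
filter_unique_sorted() {
    grep -F -- "$1" "$2" | sort -u
}

# Demo using the sample lines from the post.
printf '%s\n' \
    'look at that dog1' 'look at that cat' 'look at that dog1' \
    'look at that mouse' 'look at that dog1' 'look at that dog2' \
    'look at that dog2' 'look at that bird' 'look at that dog3' \
    'look at that dog4' 'look at that dog4' > sample.log

filter_unique dog sample.log
```

On the sample this prints the four lines dog1 through dog4 once each. If the repeats are always on adjacent lines, piping through uniq instead would also work, but sort -u and the awk form do not depend on that.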


#7

Well, if Windows is your primary platform, I'd say Rob's ADO idea is going
to be your best bet.

If you can play around on the *nix side of things, in addition to grep,
you've got awk and sed. Since the data is starting there, you may want to do
as much of that raw processing as possible before moving it over to Excel,
and let Excel do what it's good at.
--
HTH -

-Frank
Microsoft Excel MVP
Dolphin Technology Corp.
http://vbapro.com
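
As a rough illustration of the awk route: since every line in the log starts with a date and time stamp, two lines with the same message but different stamps are not identical, and a plain sort -u would keep both. The sketch below assumes the stamp is exactly the first two whitespace-separated fields, which is a guess about the log format, and compares only the text after it. The function name unique_messages, the file name stamped.log and the sample stamps are made up.

```shell
#!/bin/sh
# Print the first line seen for each distinct message that contains
# the search text, ignoring the leading date and time fields.
unique_messages() {
    # $1 = literal search text, $2 = input file
    # Note: awk -v interprets backslash escapes in the value.
    awk -v pat="$1" '
        index($0, pat) > 0 {
            msg = $0
            # Strip the first two fields (assumed date and time) before comparing.
            sub(/^[ \t]*[^ \t]+[ \t]+[^ \t]+[ \t]+/, "", msg)
            if (!(msg in seen)) {
                seen[msg] = 1
                print
            }
        }
    ' "$2"
}

# Demo with invented stamps in front of the sample messages.
printf '%s\n' \
    '2003-09-16 20:16:07 look at that dog1' \
    '2003-09-16 20:16:09 look at that cat' \
    '2003-09-16 20:17:01 look at that dog1' \
    '2003-09-16 20:17:30 look at that dog2' > stamped.log

unique_messages dog stamped.log
```

On this input the first and last lines are printed. This does the filtering and the de-duplication in a single pass, at the cost of keeping every distinct message in memory.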

