Posted to microsoft.public.excel.programming
mwebb415@yahoo.com
Parsing a Text file into worksheet with VBA

Hi,

I'm trying to take a structured text file (saved from a PDF file) and
read it into an Excel worksheet with a macro. The problem is that the
structure isn't straightforward. Every section in the file contains
~50 rows, and the delimiters aren't consistent. For example:

Section1
Header line
Customer: Acme Rockets Address: 22 Middle Street
State: AZ
Product: Super Rocket Qty: 12
.. . .
Section2
Header line
Customer: Acme Fireworks Address: 66 B Street
State: AB
Product: Coyote Killer Qty: 24
.. . .

The "Header line" is always the same, and not needed in the Excel file.
I want the worksheet to have one row of data for each section.

Customer        Address           State  Product        Qty
Acme Rockets    22 Middle Street  AZ     Super Rocket   12
Acme Fireworks  66 B Street       AB     Coyote Killer  24

I did look at the often-linked page:
http://www.cpearson.com/excel/imptext.htm
But since my delimiters are not consistent, I wasn't sure how to adapt
that approach. Also, since the fields don't each start on a new line,
I'm having trouble coming up with the best way to break them out.

I was thinking of keeping the delimiters in an array and cycling through
it as I read each line of the file, but with the approach from the link
above that breaks down when the next field is on a different line.
Does anyone have any suggestions?
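
To make the idea concrete, here is a rough sketch of the label-array
approach. It is only a starting point under several assumptions: the
labels and the "Section" marker are taken from my sample above, the
file path is a placeholder, and the names ImportSections and ParseLine
are made up for illustration. Instead of cycling through the labels in
order, it searches each line for every label, so it should not matter
which line a field lands on.

```vba
Option Explicit

' Sketch only. Assumes every section starts with a line beginning
' with "Section" and that field values never contain a label string.
Sub ImportSections()
    Dim labels As Variant
    Dim fileNum As Integer
    Dim textLine As String
    Dim outRow As Long
    Dim i As Long

    ' Labels copied from the sample data; extend as needed.
    labels = Array("Customer:", "Address:", "State:", "Product:", "Qty:")

    ' Row 1 holds the column headings (label minus the colon).
    outRow = 1
    For i = LBound(labels) To UBound(labels)
        ActiveSheet.Cells(1, i - LBound(labels) + 1).Value = _
            Replace(labels(i), ":", "")
    Next i

    fileNum = FreeFile
    Open "C:\temp\report.txt" For Input As #fileNum   ' placeholder path
    ' Note: Line Input splits on CR or CRLF, so a file with bare LF
    ' line endings would come back as one long line.
    Do While Not EOF(fileNum)
        Line Input #fileNum, textLine
        If Left$(Trim$(textLine), 7) = "Section" Then
            outRow = outRow + 1          ' new section, new worksheet row
        ElseIf outRow > 1 Then
            ParseLine textLine, labels, outRow
        End If
    Loop
    Close #fileNum
End Sub

' Writes every labelled value found in textLine to row outRow.
' A value runs from the end of its label to the start of the next
' label on the same line, or to the end of the line.
Private Sub ParseLine(ByVal textLine As String, ByVal labels As Variant, _
                      ByVal outRow As Long)
    Dim i As Long, j As Long
    Dim startPos As Long, endPos As Long, p As Long

    For i = LBound(labels) To UBound(labels)
        startPos = InStr(1, textLine, labels(i), vbTextCompare)
        If startPos > 0 Then
            startPos = startPos + Len(labels(i))
            endPos = Len(textLine) + 1
            For j = LBound(labels) To UBound(labels)
                p = InStr(startPos, textLine, labels(j), vbTextCompare)
                If p > 0 And p < endPos Then endPos = p
            Next j
            ActiveSheet.Cells(outRow, i - LBound(labels) + 1).Value = _
                Trim$(Mid$(textLine, startPos, endPos - startPos))
        End If
    Next i
End Sub
```

Lines that contain none of the labels (the "Header line", for example)
are simply skipped. What this does not handle is a value that wraps
onto the following line, or a value that itself contains one of the
label strings.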

The text file from the PDF appears to be the only option - HTML and XML
both end up representing the details on the page as images. Same for
RTF or DOC files.

Thanks

Matt