View Single Post
  #2   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.vb.general.discussion
Tom Ogilvy Tom Ogilvy is offline
external usenet poster
 
Posts: 27,285
Default Simplest way to parse (read) HTML formatted data in via Excel VBA (or VB6)

Just open the html file in Excel using file open. then look at the result.
Most of it should be parsed into separate columns.

--
Regards,
Tom Ogilvy

"Steve" wrote in message
...
I distributed some workbook apps in 2001 that are currently being used
for budgeting and forecasting work. A main feature of the apps is the
ability to pull current-year data from the proprietary accounting software
that the businesses use. The users were able to run a general ledger
trial balance, save it in plain text format, and my apps would parse the
data and import it into the workbooks.

Well, now their accounting software has been updated to export the
reports into html format, not plain text.

So, instead of seeing:

ACCOUNT-NO
DESCRIPTION BEGINNING BALANCE ---------TOTAL--------

NET CHANGE ENDING BALANCE BUDGET PRIOR-YEAR

010-30120-00 14,606.00- DEBIT .00

8,185.00- 22,791.00- 13,952- 15,549-
IND PREFERRED CREDIT 8,185.00

TRX-DATE DR-AMOUNT CR-AMOUNT

RUNNING-BALANCE SOURCE REFERENCE
08/01/00 227.00

14,833.00- IJ01 FRIDAY
08/02/00 216.00

15,049.00- IJ02 SATURDAY
08/03/00 233.00

15,282.00- IJ03 SUNDAY


It now looks like this:


<table
<TR<TD010-30110-00 </TD<TD ALIGN=RIGHT

14,606.00-</TD<TD ALIGN=RIGHTDEBIT</TD<TD COLSPAN=2 ALIGN=RIGHT
..00 </TD<TD COLSPAN=2 ALIGN=RIGHT 8,185.00-</TD<TD ALIGN=RIGHT
22,791.00-</TD<TD ALIGN=RIGHT 13,952-</TD<TD ALIGN=RIGHT
15,549-</TD</TR
<TR<TDIND - FIT PACKAGE </TD<TD COLSPAN=2

ALIGN=RIGHTCREDIT</TD<TD COLSPAN=2 ALIGN=RIGHT 8,185.00</TD<TD
COLSPAN=6 &nbsp;</TD</TR
<TR<TD COLSPAN=10 &nbsp;</TD</TR
<TR<TD ALIGN=RIGHTTRX-DATE</TD<TD ALIGN=CENTERDR-AMOUNT</TD<TD

COLSPAN=2 ALIGN=CENTERCR-AMOUNT</TD<TD COLSPAN=2 ALIGN=RIGHT RUNNG-BAL
</TD<TD ALIGN=CENTERSOURCE</TD<TD COLSPAN=3
ALIGN=CENTERREFERENCE</TD</TR
<TR<TD ALIGN=RIGHT08/01/03</TD<TD ALIGN=RIGHT </TD<TD

COLSPAN=2 ALIGN=RIGHT 227.00</TD<TD COLSPAN=2 ALIGN=RIGHT
14,833.00-</TD<TD ALIGN=CENTERIJ01 </TD<TD COLSPAN=3 Friday
</TD</TR
<TR<TD ALIGN=RIGHT08/02/03</TD<TD ALIGN=RIGHT </TD<TD

COLSPAN=2 ALIGN=RIGHT 216.00</TD<TD COLSPAN=2 ALIGN=RIGHT
15,049.00-</TD<TD ALIGN=CENTERIJ02 </TD<TD COLSPAN=3 Saturday
</TD</TR
<TR<TD ALIGN=RIGHT08/03/03</TD<TD ALIGN=RIGHT </TD<TD

COLSPAN=2 ALIGN=RIGHT 233.00</TD<TD COLSPAN=2 ALIGN=RIGHT
15,282.00-</TD<TD ALIGN=CENTERIJ03 </TD<TD COLSPAN=3 Sunday
</TD</TR
</table


Any advice on what is the best way to parse data in HTML format?
Does VBA have any built-in support for this purpose?
The versions of Excel being used are 2000 and 2002 exclusively.

Should I bypass Excel altogether and write a VB6 app that converts
the html formatted report into the text format that Excel expects?
I'm thinking that might be the best bet, in case future software updates
to their accounting software make any html formatting changes.

In any case, I would still love to know if there is an easier way to
parse the html. It looks like all I really need to do is to be able to

read
the table cell contents in a logical manner.

Thanks

Steve