ExcelBanter

ExcelBanter (https://www.excelbanter.com/)
-   Excel Programming (https://www.excelbanter.com/excel-programming/)
-   -   Creating Table from Web-Based Material (https://www.excelbanter.com/excel-programming/288688-creating-table-web-based-material.html)

Ken[_19_]

Creating Table from Web-Based Material
 
This is a large question, and I appreciate any level of response that
can be given, including whether what I seek to accomplish is possible
or practical.

I am trying to determine how completely I can automate the process of
taking material from a web site and putting it into an Excel table.
The web-based tables show names and "x" for a vote under "Aye" or
"Nay." An example may be found at
http://frwebgate.access.gpo.gov/cgi-...8_cong_reports
by searching the page for fc-14 (there are other votes below this
one). As you will see, these vote tables are a small portion of a
larger report.

I would like each vote to be pulled into an Excel sheet and displayed
as follows:

Name Aye(vote1) Nay(vote1) Aye(vote2) Nay (vote2) etc.

Name1 x x
Name2 x x
etc.

Even better would be to pull the votes from multiple reports which may
be found by going to
http://www.gpoaccess.gov/serialset/creports/index.html then checking
House and putting "financial services" in the search box and clicking
submit. Not all of the reports that come up show votes. There are,
of course, a number of committees, but start with this one.

This can be done, of course, by a web query to the report in Excel,
but non-table text must be deleted, the votes lined up with "text to
columns" (fixed width), each set of votes copied and pasted
horizontally next to the preceding and repeated names (across)
deleted.

Since search engine spiders grab huge amounts of information and
e-mail addresses and other information are culled from the web, what I
am trying to do seems possible to this non-programmer. Your advice is
appreciated.


Ken

tolgag[_6_]

Creating Table from Web-Based Material
 
Hi Ken,

Actually I don't know, how deep you want to analyze the data embedded
in the html, but if you want a custom-solving, you have to write an
HTML-Parser (as Add-In or dll).

I needed a similar thing with Access once, but I have to admit, if your
knowledge is not far enough with VBA, it will be very difficult to code
it. And data in irregular tables is komplex to extract. I think there
are some standart solutions for Excel on the market which you can buy
and easily install.


---
Message posted from http://www.ExcelForum.com/


Tim Williams

Creating Table from Web-Based Material
 
Ken,

Your sample page is not an HTML document as such, just a lot of plain
text wrapped in <pre</pre to preserve the formatting. This makes it
more difficult to extract data, since there's no real tabular
structure to attack.

To get at the plain text you'd have to come up with some sort of
parameterized parsing routine: is definitely doable if the tables are
have similar structure. If structures vary slightly then the routine
might be able to compensate for (e.g.) different column widths, but
beyond that the amount of coding required would increase.

So yes, it's possible but as a "non-programmer" it would present a
challenge.

Regards,
Tim.


"Ken" wrote in message
om...
This is a large question, and I appreciate any level of response

that
can be given, including whether what I seek to accomplish is

possible
or practical.

I am trying to determine how completely I can automate the process

of
taking material from a web site and putting it into an Excel table.
The web-based tables show names and "x" for a vote under "Aye" or
"Nay." An example may be found at

http://frwebgate.access.gpo.gov/cgi-...8_cong_reports
by searching the page for fc-14 (there are other votes below this
one). As you will see, these vote tables are a small portion of a
larger report.

I would like each vote to be pulled into an Excel sheet and

displayed
as follows:

Name Aye(vote1) Nay(vote1) Aye(vote2) Nay (vote2)

etc.

Name1 x x
Name2 x x
etc.

Even better would be to pull the votes from multiple reports which

may
be found by going to
http://www.gpoaccess.gov/serialset/creports/index.html then checking
House and putting "financial services" in the search box and

clicking
submit. Not all of the reports that come up show votes. There are,
of course, a number of committees, but start with this one.

This can be done, of course, by a web query to the report in Excel,
but non-table text must be deleted, the votes lined up with "text to
columns" (fixed width), each set of votes copied and pasted
horizontally next to the preceding and repeated names (across)
deleted.

Since search engine spiders grab huge amounts of information and
e-mail addresses and other information are culled from the web, what

I
am trying to do seems possible to this non-programmer. Your advice

is
appreciated.


Ken





All times are GMT +1. The time now is 12:22 PM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
ExcelBanter.com