ExcelBanter

[email protected]

Extracting/Exporting HTML Tables or PDF Tables into Excel
 
Hi All,

I have a quick programming/general Excel question about my
current dilemma. Essentially, I am trying to extract financial tables
from SEC filings (published as either PDF or HTML). Ideally, I would like
to be able to search an SEC filing for a specific table
(e.g. a "Consolidated Income Statement") and then have a macro
export that table into Excel without losing the formatting.
If you have any idea how to go about doing this and
could provide some starter code, I would greatly appreciate
it.

Thanks

Mohammed

P.S. I know next to nothing about VB, so it would be very helpful if
you could explain what parameters I may need and what is going on
in the code.


Mo Money

Hmm, I have no idea... but your question seems quite interesting.


NickHK

Mohammed,
Reading the HTML files could be achieved with a web query. Look into
Data > Get External Data > New Web Query, then select the table to import.
Getting data out of a PDF and into XL can be done manually (I've not looked
into coding this):
Open the PDF in Acrobat, NOT the Reader.
Use the Select Table tool.
Right click and export or open in Excel, depending on your version of
Acrobat.
Or you can save the PDF as HTML, then web query that.
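
The web query route can also be scripted. Below is a minimal VBA sketch, not a tested solution for SEC filings: the Sub name, the URL and the table index "3" are placeholders made up for illustration, so you would need to substitute the real filing address and find the right table number by trial (or by using the manual dialog first).

```vb
' Sketch only: imports one HTML table into the active sheet via a web query.
' The URL and the table index below are hypothetical placeholders.
Sub ImportOneWebTable()
    Dim ws As Worksheet
    Dim qt As QueryTable

    Set ws = ActiveSheet
    Set qt = ws.QueryTables.Add( _
        Connection:="URL;http://www.example.com/filing.htm", _
        Destination:=ws.Range("A1"))

    With qt
        .WebSelectionType = xlSpecifiedTables   ' import only the tables listed below
        .WebTables = "3"                        ' third table on the page (assumption)
        .WebFormatting = xlWebFormattingAll     ' keep as much HTML formatting as possible
        .Refresh BackgroundQuery:=False         ' wait until the data has arrived
    End With
End Sub
```

The only parameters to change are the address after "URL;" and the WebTables value; setting WebFormatting to xlWebFormattingNone would bring in plain text only.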

NickHK

[email protected]

For HTML to Excel, you might consider using the following script
extract -
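---------------------------------------------------------------
Sub ImportHtmlTables()
    Dim objIE As Object, ws As Worksheet
    Dim theTables As Object, tbl As Object, tblRow As Object, tblCell As Object
    Dim sURL As String, RowNum As Long, ColNum As Long

    sURL = "http://www.ibm.com"
    Set ws = ActiveSheet                  ' sheet that receives the data
    On Error GoTo error_handler
    Set objIE = CreateObject("InternetExplorer.Application")
    objIE.Navigate sURL
    ' Wait until the page has finished loading (4 = READYSTATE_COMPLETE)
    Do While objIE.Busy Or objIE.ReadyState <> 4: DoEvents: Loop

    RowNum = 1
    Set theTables = objIE.Document.all.tags("table")
    For Each tbl In theTables
        For Each tblRow In tbl.Rows
            ColNum = 1                    ' start every row again at column A
            For Each tblCell In tblRow.Cells
                ws.Cells(RowNum, ColNum).Value = tblCell.innerText
                ColNum = ColNum + 1
            Next tblCell
            RowNum = RowNum + 1
        Next tblRow
    Next tbl

    objIE.Quit                            ' close the hidden browser window
    Set objIE = Nothing
    Exit Sub

error_handler:
    MsgBox "Import failed: " & Err.Description
    If Not objIE Is Nothing Then objIE.Quit
    Set objIE = Nothing
End Sub
---------------------------------------------------------------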
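
The script copies every table on the page. To pick out a single one, such as the "Consolidated Income Statement" from the original question, one option is to test each table's text first. This is only a sketch: FindTableByText is a hypothetical helper name, and it assumes the title text sits inside the table itself, which may not hold for a given filing (the heading is often a separate paragraph above the table, and with nested tables the outer one will match first).

```vb
' Hypothetical helper: returns the first <table> element whose text
' contains searchText (case-insensitive), or Nothing if none matches.
' doc is the HTML document object, e.g. objIE.Document.
Function FindTableByText(doc As Object, ByVal searchText As String) As Object
    Dim tbl As Object

    For Each tbl In doc.getElementsByTagName("table")
        If InStr(1, tbl.innerText, searchText, vbTextCompare) > 0 Then
            Set FindTableByText = tbl
            Exit Function
        End If
    Next tbl

    Set FindTableByText = Nothing
End Function
```

It could be called as Set tbl = FindTableByText(objIE.Document, "Consolidated Income Statement"), after which only the rows of tbl would be looped over instead of all tables.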
For PDF to Excel, there's no direct tool I could find, but you might
try going PDF -> HTML -> Excel.

For PDF to HTML, you can use pdf2html, freely available on
sourceforge.net
