ExcelBanter

Ron

Capture Web Page Source Code
 
I was trying to collect some real estate data from a website and found that I was unable to capture the source code using the usual techniques. For example,

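my_url = "http://www.coloproperty.com/index.cfm?&Action=ShowFrameSet&GotoApp=Listings&GotoAction=DoAddressSearch&StreetName=Eisenhower&City=61&ResultType=quickReport"

' ie was not created in the original post; late binding is assumed here
Set ie = CreateObject("InternetExplorer.Application")

With ie
    .Visible = True
    .Navigate my_url
    .Top = 50
    .Left = 530
    .Height = 400
    .Width = 400

    ' Wait until the page reports READYSTATE_COMPLETE (4)
    Do Until .ReadyState = 4
        DoEvents
    Loop
End With

my_var = ie.document.body.innerhtml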

opens the desired web page but does not capture the source code. If I right-click the web page and select "View Source", I can see the source code I'd like to capture, but the variable contains just a few lines of something different. Any help on capturing the actual source code behind this web page would be much appreciated...TIA, Ron

Don Guillett

How about an external query? (Correct for word wrap.)

MLS#    Price     SQFT  Beds  Baths  Address             Subdivision    Locale      Photo  Pano
672274  $279,900  1728  3     2      376 Eisenhower Dr   Hunters Ridge  Louisville
665060  $499,000  3076  3     3      1806 Eisenhower Dr  PONDEROSA      Louisville


On Feb 5, 9:36 am, Ron wrote:
I was trying to collect some real estate data from a website and found that I was unable to capture the source code using the usual techniques. For example,

my_url = "http://www.coloproperty.com/index.cfm?&Action=ShowFrameSet&GotoApp=Li..."

With ie
    .Visible = True
    .Navigate my_url
    .Top = 50
    .Left = 530
    .Height = 400
    .Width = 400

    Do Until .ReadyState = 4
        DoEvents
    Loop
End With

my_var = ie.document.body.innerhtml

opens the desired web page but does not capture the source code. If I right-click the web page and select "View Source", I can see the source code I'd like to capture, but the variable contains just a few lines of something different. Any help on capturing the actual source code behind this web page would be much appreciated...TIA, Ron



Ron

Thanks Don, what code did you use? When I use

With ActiveSheet.QueryTables.Add(Connection:= _
    "URL;" & my_url, _
    Destination:=Range("A1"))
    .BackgroundQuery = True
    .TablesOnlyFromHTML = False
    .Refresh BackgroundQuery:=False
    .SaveData = True
End With

nothing is imported into the active sheet. Also, do you know of any way to capture the source code without importing anything directly into the workbook?..Ron

Ron

For anyone who follows, turns out the info I wanted was in a frame and

my_var = ie.Document.frames("content").Document.documentElement.innerHTML

does the trick.

Thanks again to Don...Ron
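
For later readers, the frame approach above could be wrapped in a small helper. This is only a sketch: GetFrameHtml is a hypothetical name, not something from the thread. It assumes ie is an InternetExplorer.Application object that has already loaded the frameset page, and that the frame comes from the same domain as the parent page (IE typically refuses access to a frame's document otherwise).

```vba
' Hypothetical helper (not from the thread): returns the HTML of a named frame.
' Assumes ie has already finished loading the parent (frameset) page.
Function GetFrameHtml(ie As Object, ByVal frameName As String) As String
    ' The frame can finish loading later than the parent page, so poll
    ' its own document until it reports "complete"
    Do Until ie.Document.frames(frameName).Document.readyState = "complete"
        DoEvents
    Loop

    ' Each frame is a window with its own document
    GetFrameHtml = ie.Document.frames(frameName).Document.documentElement.innerHTML
End Function
```

With the page from the original post it would presumably be called as my_var = GetFrameHtml(ie, "content").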

Don Guillett

On Feb 5, 11:59 am, Ron wrote:
Thanks Don, what code did you use? When I use

With ActiveSheet.QueryTables.Add(Connection:= _
    "URL;" & my_url, _
    Destination:=Range("A1"))
    .BackgroundQuery = True
    .TablesOnlyFromHTML = False
    .Refresh BackgroundQuery:=False
    .SaveData = True
End With

nothing is imported into the activesheet. Also, do you know of anyway to capture the source code and not import anything directly into the workbook?..Ron


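I clicked on the URL and it changed to the address in the connection string below, which I used for the query.

With ActiveSheet.QueryTables.Add(Connection:= _
    "URL;http://www.coloproperty.com/Listings/index.cfm?" & _
    "&Action=DoAddressSearch&ResultType=quickReport&City=61&StreetName=Eisenhower", _
    Destination:=Range("A5"))
    ' The post was cut off here; a synchronous refresh is assumed
    .Refresh BackgroundQuery:=False
End With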

Ron

Don...Thanks for pointing that out. With your url, the information is not in a frame so I can simply use
my_var = ie.document.body.innerhtml

Better yet, I don't even need to open IE. With your url the "GET" method works just fine.
Set my_obj = CreateObject("MSXML2.XMLHTTP")
my_obj.Open "GET", my_url, False
my_obj.send
my_var = my_obj.responsetext
Set my_obj = Nothing

....Ron
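
The GET approach above can also be put into a function that checks the HTTP status before returning anything. This is a minimal sketch, not code from the thread: FetchSource is a hypothetical name, and treating only status 200 as success is an assumption that may be too strict for some sites.

```vba
' Hypothetical helper (not from the thread): returns the raw source of a URL
' without opening a browser window.
Function FetchSource(ByVal url As String) As String
    Dim http As Object
    Set http = CreateObject("MSXML2.XMLHTTP")

    ' Third argument False = synchronous: send does not return until the response arrives
    http.Open "GET", url, False
    http.send

    If http.Status = 200 Then
        FetchSource = http.responseText
    Else
        ' Assumption: anything other than 200 is reported as an error
        Err.Raise vbObjectError + 513, "FetchSource", _
            "HTTP " & http.Status & " " & http.statusText
    End If
End Function
```

Usage would be my_var = FetchSource(my_url). Note that this returns only what the server sends for that one URL, so for the original frameset address it would presumably return the frameset page rather than the listings inside the frame.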

