Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 25
Default getting the source code of a web page

I need to get the source code of a web page and asign it to a string, where I
can search out the specific data I'm needing from the page. (a web query of
the page gives me most of the data I need, but it doesn't give me the google
map on that page, and the gps coordinates which I need in my sheet) If I
open the page in IE, view the source, I can programmatically filter out the
coordinates.
So, my question: How to get the source saved to a string in vba
thanks.
  #2   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 9,101
Default getting the source code of a web page

You don't need to get the source. You can use the IE explorer like below to
help you filter the data. The source is under the IE.Document.all property.
You can get these lines by doing a loop like this

for each itm in IE.Document
'your code here
next itm

You can use the methods getElementsByTagname(), and getElementById() to help
you filter the data. I often for debuggin use something like this

RowCount = 1
for each itm in IE.Document
Range("A" & RowCount) = itm.tagname
Range("B" & RowCount) = itm.Classname
Range("C" & RowCount) = itm.id
Range("D" & RowCount) = left(itm.innertext,1024)

RowCount = rowcount + 1
next itm

The source is partioned into items. Has you go down the ALL property the
innertext is partioned into small pices. When you do the dump above you will
see the same innertext repeated over and over again but broken into samller
pieces each time it is repeated.

If you need more help give me the URL and I will get what you need. I help
lots of people who have had problems.




Sub GetZipCodes()


ZIPCODE = InputBox("Enter 5 digit zipcode : ")

Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True

URL = "http://zip4.usps.com/zip4/citytown_zip.jsp"

'get web page
IE.Navigate2 URL
Do While IE.readyState < 4 And _
IE.busy = True

DoEvents
Loop

Set Form = IE.document.getElementsByTagname("Form")

Set zip5 = IE.document.getElementById("zip5")
zip5.Value = ZIPCODE


Set ZipCodebutton = Form(0).onsubmit

Form(0).submit
Do While IE.busy = True
DoEvents
Loop

Set Table = IE.document.getElementsByTagname("Table")
Location = Table(0).Rows(2).innertext
IE.Quit
MsgBox ("Zip code = " & ZIPCODE & " City/State = " & Location)


End Sub



"Southern at Heart" wrote:

I need to get the source code of a web page and asign it to a string, where I
can search out the specific data I'm needing from the page. (a web query of
the page gives me most of the data I need, but it doesn't give me the google
map on that page, and the gps coordinates which I need in my sheet) If I
open the page in IE, view the source, I can programmatically filter out the
coordinates.
So, my question: How to get the source saved to a string in vba
thanks.

  #3   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 25
Default getting the source code of a web page

....so where would I put the URL in this code.
On this page,
http://www.ymca.net/maps/profile.aspx?assn=6757
I'n needing to asign the name, address, & GPS coordinates to a string.
thanks.

"Joel" wrote:

You don't need to get the source. You can use the IE explorer like below to
help you filter the data. The source is under the IE.Document.all property.
You can get these lines by doing a loop like this

for each itm in IE.Document
'your code here
next itm

You can use the methods getElementsByTagname(), and getElementById() to help
you filter the data. I often for debuggin use something like this

RowCount = 1
for each itm in IE.Document
Range("A" & RowCount) = itm.tagname
Range("B" & RowCount) = itm.Classname
Range("C" & RowCount) = itm.id
Range("D" & RowCount) = left(itm.innertext,1024)

RowCount = rowcount + 1
next itm

The source is partioned into items. Has you go down the ALL property the
innertext is partioned into small pices. When you do the dump above you will
see the same innertext repeated over and over again but broken into samller
pieces each time it is repeated.

If you need more help give me the URL and I will get what you need. I help
lots of people who have had problems.




Sub GetZipCodes()


ZIPCODE = InputBox("Enter 5 digit zipcode : ")

Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True

URL = "http://zip4.usps.com/zip4/citytown_zip.jsp"

'get web page
IE.Navigate2 URL
Do While IE.readyState < 4 And _
IE.busy = True

DoEvents
Loop

Set Form = IE.document.getElementsByTagname("Form")

Set zip5 = IE.document.getElementById("zip5")
zip5.Value = ZIPCODE


Set ZipCodebutton = Form(0).onsubmit

Form(0).submit
Do While IE.busy = True
DoEvents
Loop

Set Table = IE.document.getElementsByTagname("Table")
Location = Table(0).Rows(2).innertext
IE.Quit
MsgBox ("Zip code = " & ZIPCODE & " City/State = " & Location)


End Sub



"Southern at Heart" wrote:

I need to get the source code of a web page and asign it to a string, where I
can search out the specific data I'm needing from the page. (a web query of
the page gives me most of the data I need, but it doesn't give me the google
map on that page, and the gps coordinates which I need in my sheet) If I
open the page in IE, view the source, I can programmatically filter out the
coordinates.
So, my question: How to get the source saved to a string in vba
thanks.

  #4   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 9,101
Default getting the source code of a web page

IO got the address. I'm not sue if I can get the GPS address from here. the
zip code is passed to an activex utility and I don't think that is available
at this URL. To get the URL there is probably a Google utility that will do
thhis. I would have to look for it. Never tried it before.

Sub GetAddress()


Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True

URL = "http://www.ymca.net/maps/profile.aspx?assn=6757"

'get web page
IE.Navigate2 URL
Do While IE.readyState < 4 Or _
IE.Busy = True
DoEvents
Loop
Call Dump(IE)


a = 1
Set Title = IE.document.getElementsByTagName("Title")
YName = Title.Item(0).innertext
MsgBox (YName)
Set P = IE.document.getElementsByTagName("p")
Address = P.Item(1).innertext
MsgBox (Address)

IE.Quit


End Sub
Sub Dump(IE)
RowCount = 1
Cells.ClearContents
For Each itm In IE.document.all
Range("A" & RowCount) = itm.tagName
Range("B" & RowCount) = itm.ID
Range("C" & RowCount) = itm.className
Range("D" & RowCount) = Left(itm.innertext, 1024)

RowCount = RowCount + 1
Next itm

End Sub



  #5   Report Post  
Posted to microsoft.public.excel.programming
ron ron is offline
external usenet poster
 
Posts: 118
Default getting the source code of a web page

On Oct 6, 8:52*pm, Southern at Heart
wrote:
I need to get the source code of a web page and asign it to a string, where I
can search out the specific data I'm needing from the page. *(a web query of
the page gives me most of the data I need, but it doesn't give me the google
map on that page, and the gps coordinates which I need in my sheet) *If I
open the page in IE, view the source, I can programmatically filter out the
coordinates. *
So, my question: *How to get the source saved to a string in vba
thanks.


If you are repetitively interrogating a website, then capturing the
source code using the "GET" construction shown below can be used to
ones advantage. This method assigns the source code to a variable
(my_var in the construction shown below). In a subsequent step the
variable can be parsed for the desired data. This method is typically
much faster than opening IE and extracting the same information. I
have a macro that collects data on over 700 real estate addresses
found at a website. The macro takes over an hour when I run it
through IE, but only 14 minutes when I use the GET method...Ron

Sub Test()
my_url = "http://www.google.com"
Set my_obj = CreateObject("MSXML2.XMLHTTP")
my_obj.Open "GET", my_url, False
my_obj.send
my_var = my_obj.responsetext
Set my_obj = Nothing
End Sub


  #6   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 9,101
Default getting the source code of a web page

I now got the GPS locations. It was buried in the webpage at a spot I didn't
check.

Sub GetAddress()

Dim Latitude As String
Dim Longitude As String


Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True

URL = "http://www.ymca.net/maps/profile.aspx?assn=6757"

'get web page
IE.Navigate2 URL
Do While IE.readyState < 4 Or _
IE.Busy = True
DoEvents
Loop
'Call Dump(IE)



Set Title = IE.document.getElementsByTagName("Title")
YName = Title.Item(0).innertext
MsgBox (YName)


Set P = IE.document.getElementsByTagName("p")
Address = P.Item(1).innertext
MsgBox (Address)

Set Scripts = IE.document.getElementsByTagName("Script")
RowCount = 1
For Each Script In Scripts
If InStr(Script.outerHTML, "GLatLng") 0 Then
GLatLngStart = InStr(Script.outerHTML, "GLatLng")
GLatLng = Mid(Script.outerHTML, GLatLngStart)
'remove parenthisis
GLatLngStart = InStr(GLatLng, "(")
GLatLng = Mid(GLatLng, GLatLngStart + 1)

GLatLngEnd = InStr(GLatLng, ")")
GLatLng = Left(GLatLng, GLatLngEnd - 1)

GPS = Split(GLatLng, ",")
MsgBox ("Latitude : " & GPS(0) & vbCrLf & "Longitude : " & GPS(1))

Exit For

End If
RowCount = RowCount + 1
Next Script

IE.Quit


End Sub
Sub Dump(IE)
RowCount = 1
Cells.ClearContents
For Each itm In IE.document.all
Range("A" & RowCount) = itm.tagName
Range("B" & RowCount) = itm.ID
Range("C" & RowCount) = itm.className
Range("D" & RowCount) = Left(itm.innertext, 1024)

RowCount = RowCount + 1
Next itm

End Sub


"Joel" wrote:

IO got the address. I'm not sue if I can get the GPS address from here. the
zip code is passed to an activex utility and I don't think that is available
at this URL. To get the URL there is probably a Google utility that will do
thhis. I would have to look for it. Never tried it before.

Sub GetAddress()


Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True

URL = "http://www.ymca.net/maps/profile.aspx?assn=6757"

'get web page
IE.Navigate2 URL
Do While IE.readyState < 4 Or _
IE.Busy = True
DoEvents
Loop
Call Dump(IE)


a = 1
Set Title = IE.document.getElementsByTagName("Title")
YName = Title.Item(0).innertext
MsgBox (YName)
Set P = IE.document.getElementsByTagName("p")
Address = P.Item(1).innertext
MsgBox (Address)

IE.Quit


End Sub
Sub Dump(IE)
RowCount = 1
Cells.ClearContents
For Each itm In IE.document.all
Range("A" & RowCount) = itm.tagName
Range("B" & RowCount) = itm.ID
Range("C" & RowCount) = itm.className
Range("D" & RowCount) = Left(itm.innertext, 1024)

RowCount = RowCount + 1
Next itm

End Sub



  #7   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 25
Default getting the source code of a web page

Okay, I'm needing to do this same thing again, but can't figure out where the
item/innertext I need is:
The webpage is:
http://www.gocomics.com/calvinandhobbes/2010/01/16

....and what I need is the image (on this particular page it's):
http://imgsrv.gocomics.com/dim/?fh=8...8010a8&w=900.0

thanks, SouthernAtHeart


"Joel" wrote:

You don't need to get the source. You can use the IE explorer like below to
help you filter the data. The source is under the IE.Document.all property.
You can get these lines by doing a loop like this

for each itm in IE.Document
'your code here
next itm

You can use the methods getElementsByTagname(), and getElementById() to help
you filter the data. I often for debuggin use something like this

RowCount = 1
for each itm in IE.Document
Range("A" & RowCount) = itm.tagname
Range("B" & RowCount) = itm.Classname
Range("C" & RowCount) = itm.id
Range("D" & RowCount) = left(itm.innertext,1024)

RowCount = rowcount + 1
next itm

The source is partioned into items. Has you go down the ALL property the
innertext is partioned into small pices. When you do the dump above you will
see the same innertext repeated over and over again but broken into samller
pieces each time it is repeated.

If you need more help give me the URL and I will get what you need. I help
lots of people who have had problems.




Sub GetZipCodes()


ZIPCODE = InputBox("Enter 5 digit zipcode : ")

Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True

URL = "http://zip4.usps.com/zip4/citytown_zip.jsp"

'get web page
IE.Navigate2 URL
Do While IE.readyState < 4 And _
IE.busy = True

DoEvents
Loop

Set Form = IE.document.getElementsByTagname("Form")

Set zip5 = IE.document.getElementById("zip5")
zip5.Value = ZIPCODE


Set ZipCodebutton = Form(0).onsubmit

Form(0).submit
Do While IE.busy = True
DoEvents
Loop

Set Table = IE.document.getElementsByTagname("Table")
Location = Table(0).Rows(2).innertext
IE.Quit
MsgBox ("Zip code = " & ZIPCODE & " City/State = " & Location)


End Sub



"Southern at Heart" wrote:

I need to get the source code of a web page and asign it to a string, where I
can search out the specific data I'm needing from the page. (a web query of
the page gives me most of the data I need, but it doesn't give me the google
map on that page, and the gps coordinates which I need in my sheet) If I
open the page in IE, view the source, I can programmatically filter out the
coordinates.
So, my question: How to get the source saved to a string in vba
thanks.

Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
creating drop down lists from source on another page Shaun Excel Worksheet Functions 7 March 12th 10 07:37 PM
Get address from iFrame reference in page source [email protected] Excel Programming 11 September 15th 08 07:48 PM
Importing Web Page Source Code tx12345[_6_] Excel Programming 5 January 22nd 06 12:16 AM
Viewing source code when saved as web page Mark Excel Discussion (Misc queries) 1 March 31st 05 06:26 AM
Get source from a web page into a worksheet [email protected] Excel Programming 1 July 30th 04 03:17 PM


All times are GMT +1. The time now is 06:47 PM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"