Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 3,101
Default How to scrape web page data in VBA

I need EXCEL to open a web address and parse out a few data items from that
web location. What EXCEL macro function(s) can be used to open a web URL and
parse the page source?

The web pages in question are not set up to allow the external date
functions to find the needed data.
  #2   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 267
Default How to scrape web page data in VBA

Use joels dump sub to ascertain the tag name or ID for the data you want:
with worksheets ("sheet1")
RowCount = 1
for each itm in IE.Document.all
Range("A" & RowCount) = itm.tagname
Range("B" & RowCount) = itm.classname
Range("C" & RowCount) = itm.id 'comment out if errors
Range("D" & RowCount) = itm.name 'comment out if errors
Range("E" & RowCount) = left(itm.innertext,1024)
RowCount = Rowcount + 1
next itm
end with
once you have this information you can reference the data in your sub
look at the following piece of zip code sub as an example

Set Form = IE.document.getElementsByTagname("Form")

Set zip5 = IE.document.getElementById("zip5")
zip5.Value = ZIPCODE


Set ZipCodebutton = Form(0).onsubmit

Form(0).submit
Do While IE.busy = True
DoEvents
Loop

Set Table = IE.document.getElementsByTagname("Table")
Location = Table(0).Rows(2).innertext
IE.Quit
MsgBox ("Zip code = " & ZIPCODE & " City/State = " & Location)

good luck

"Mike" wrote:

I need EXCEL to open a web address and parse out a few data items from that
web location. What EXCEL macro function(s) can be used to open a web URL and
parse the page source?

The web pages in question are not set up to allow the external date
functions to find the needed data.

  #3   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 395
Default How to scrape web page data in VBA

Here is one I used to parse out some data. I never got it working as well as
I wanted; if you are working with simple pages it shouldn't be too hard, but
I was trying to parse a page that the designer specifically didn't want
parsed, and I was having trouble getting Excel to grab 100% of the page
content for parsing. The code below grabbed the active page and used Regex to
parse it. If you use Regex you will need to set a reference to it in the VBE.
HTH,
Keith

Sub GetTheData()

Dim strHTML As String
Dim strPartURL As String
Dim strPageHTML As String

strPartURL = "providername"

Set objShell = CreateObject("Shell.Application")
Set objShellWindows = objShell.Windows

If objShellWindows.Count = 0 Then
Exit Sub
End If

For i = 0 To objShellWindows.Count - 1
Set objIE = objShellWindows.Item(i)
If InStr(objIE.LocationURL, strPartURL) Then
'Creates a TextRange object for the element.
Set rng = objIE.document.body.createTextRange
' Set rng2 = objIE.document.body.parentelement
' Str2 = rng2.Text
strPageHTML = rng.Text
End If
Next i

Dim re As RegExp
Dim s As String
Dim ObjID As String
Dim matches As MatchCollection
Dim mcmatch As Match

s1 = strPageHTML
's = Replace(s, Chr(10), Chr(32))
's = Replace(s, Chr(13), Chr(32))
's = Replace(s, "Here to View More Results", "CustomDeletedString")
's = Replace(s, " View ", " ViewView")
Debug.Print s1

'grab the long string to find LSVs
Set re = New RegExp
re.IgnoreCase = True
re.Global = True
re.MultiLine = True
re.Pattern = "Family[\s\S]*?View"

Set s2 = re.Execute(s)
Debug.Print s2

Set re2 = New RegExp
re.IgnoreCase = True
re.Global = True
re.MultiLine = True
re.Pattern = "Family|View[\s\S]*?View" '(gives full string )
"Family|View[\s\S]*?View

Set matches = re.Execute(s)
For Each mcmatch In matches
tempstr = mcmatch
Trim (tempstr)
tempstr = Left(tempstr, Len(tempstr) - 4)
tempstr = Trim(Right(tempstr, Len(tempstr) - 4))
MsgBox (tempstr)
'MsgBox Asc(Mid(tempstr, 2, 1))
Next

Set objShellWindows = Nothing
Set objShell = Nothing
End Sub





"Mike" wrote:

I need EXCEL to open a web address and parse out a few data items from that
web location. What EXCEL macro function(s) can be used to open a web URL and
parse the page source?

The web pages in question are not set up to allow the external date
functions to find the needed data.

  #4   Report Post  
Posted to microsoft.public.excel.programming
ron ron is offline
external usenet poster
 
Posts: 118
Default How to scrape web page data in VBA

On Mar 31, 1:47*pm, Mike wrote:
I need EXCEL to open a web address and *parse out a few data items from that
web location. *What EXCEL macro function(s) can be used to open a web URL and
parse the page source?

The web pages in question are not set up to allow the external date
functions to find the needed data.


Here's another method. It doesn't involve opening and closing IE so
it runs faster. All of the web page information can be extracted from
the source code (using instr, mid, etc) which is contained in
my_var...Ron

Sub Test()
my_url = "http://www.google.com"
Set my_obj = CreateObject("MSXML2.XMLHTTP")
my_obj.Open "GET", my_url, False
my_obj.send
my_var = my_obj.responsetext
Set my_obj = Nothing
End Sub
  #5   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 395
Default How to scrape web page data in VBA

Ron's code is much better than mine if you have a static web address that you
can poll directly. Mine was more complicated because I had to follow a
complicated login and navigation process to reach the page I needed to parse.
If you have the option, use the simpler code :)

"ker_01" wrote:

Here is one I used to parse out some data. I never got it working as well as
I wanted; if you are working with simple pages it shouldn't be too hard, but
I was trying to parse a page that the designer specifically didn't want
parsed, and I was having trouble getting Excel to grab 100% of the page
content for parsing. The code below grabbed the active page and used Regex to
parse it. If you use Regex you will need to set a reference to it in the VBE.
HTH,
Keith

Sub GetTheData()

Dim strHTML As String
Dim strPartURL As String
Dim strPageHTML As String

strPartURL = "providername"

Set objShell = CreateObject("Shell.Application")
Set objShellWindows = objShell.Windows

If objShellWindows.Count = 0 Then
Exit Sub
End If

For i = 0 To objShellWindows.Count - 1
Set objIE = objShellWindows.Item(i)
If InStr(objIE.LocationURL, strPartURL) Then
'Creates a TextRange object for the element.
Set rng = objIE.document.body.createTextRange
' Set rng2 = objIE.document.body.parentelement
' Str2 = rng2.Text
strPageHTML = rng.Text
End If
Next i

Dim re As RegExp
Dim s As String
Dim ObjID As String
Dim matches As MatchCollection
Dim mcmatch As Match

s1 = strPageHTML
's = Replace(s, Chr(10), Chr(32))
's = Replace(s, Chr(13), Chr(32))
's = Replace(s, "Here to View More Results", "CustomDeletedString")
's = Replace(s, " View ", " ViewView")
Debug.Print s1

'grab the long string to find LSVs
Set re = New RegExp
re.IgnoreCase = True
re.Global = True
re.MultiLine = True
re.Pattern = "Family[\s\S]*?View"

Set s2 = re.Execute(s)
Debug.Print s2

Set re2 = New RegExp
re.IgnoreCase = True
re.Global = True
re.MultiLine = True
re.Pattern = "Family|View[\s\S]*?View" '(gives full string )
"Family|View[\s\S]*?View

Set matches = re.Execute(s)
For Each mcmatch In matches
tempstr = mcmatch
Trim (tempstr)
tempstr = Left(tempstr, Len(tempstr) - 4)
tempstr = Trim(Right(tempstr, Len(tempstr) - 4))
MsgBox (tempstr)
'MsgBox Asc(Mid(tempstr, 2, 1))
Next

Set objShellWindows = Nothing
Set objShell = Nothing
End Sub





"Mike" wrote:

I need EXCEL to open a web address and parse out a few data items from that
web location. What EXCEL macro function(s) can be used to open a web URL and
parse the page source?

The web pages in question are not set up to allow the external date
functions to find the needed data.

Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
How to "Scrape" currency conversions off web... Maybe using Google MikeZz Excel Programming 1 February 1st 08 05:08 PM
Green screen scrape Nick Excel Programming 5 July 12th 07 09:51 PM
Excel formula to change page # when data entered in other page Solograndma Excel Discussion (Misc queries) 2 March 12th 07 01:35 PM
excel fit to 1 page shows 1 page but not all data is on that page Jans Excel Programming 1 September 2nd 04 01:49 AM


All times are GMT +1. The time now is 10:31 PM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
Copyright ©2004-2025 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"