Thread
:
Web Query? Is This Possible?
View Single Post
#
8
Posted to microsoft.public.excel.programming
ryguy7272
external usenet poster
Posts: 2,836
Web Query? Is This Possible?
Ah! There WAS something weird with my IE settings. Ok, got it working
now!!! This is totally awesome!!! The only things is€¦I wanted the code to
drill down one more level. If you run the code, you will see a link in cell
E2. When E2 is selected, hit F2, then hit enter, now click on the
link€¦thats what I wanted; that data right there. I wanted to import
everything there; looks like .WebTables = "4". With all youve done, youve
made me believe that this is actually possible!! Can you please modify the
code to go one level deeper, import that data, then continue until all
imports are complete?
Please send me an email:
Thanks a ton for all of this!!
You have totally made my weekend!!
This is going to save me so much time!!
Ryan---
--
Ryan---
If this information was helpful, please indicate this by clicking ''Yes''.
"ryguy7272" wrote:
Thanks for the Effort Joel!!! This is awesome!!!
The code looks right, but does actually this work for you? When I run it
here IE opens and hangs for a long time. Then, I close the browser and this
line is yellow:
Do While IE.busy = True Or IE.readystate < 4
No data was imported in about 7 minutes or so. I have a cable connection;
pretty fast. I imagine it will take a while to run, but I don't think it was
actually doing anything. unless, everything has to import and then update
all at onece. however, I don't think that's what it is doing. Any thoughts?
Suggestions? I'm going to try to troubleshoot, but if you can think of
something that may prevent this from running (maybe a setting in IE), please
let me know.
Kindest of regards,
Ryan---
--
Ryan---
If this information was helpful, please indicate this by clicking ''Yes''.
"Joel" wrote:
Hre is the final results. Use two macros. I wrote the 1st which gets you
all the URL's. Then write the 2nd macro which gets the actual data .
Sub USA()
'define states in searching webpage
Const FindCategories = 0
Const ExtractCategory = 1
USAStates = Array("AK", "AL", "AR", "AZ", _
"CA", "CO", "CT", "DC", "DE", "FL", "GA", _
"HI", "IA", "ID", "IL", "IN", "KS", "KY", _
"LA", "MA", "MD", "ME", "MI", "MN", "MO", _
"MS", "MT", "NC", "ND", "NE", "NH", "NJ", _
"NM", "NV", "NY", "OH", "OK", "OR", "PA", _
"RI", "SC", "SD", "TN", "TX", "UT", "VA", _
"VT", "WA", "WI", "WV", "WY")
BaseName = "http://www.dsireusa.org/incentives/index.cfm?" & _
"re=1&ee=1&spv=0&st=0&srp=1&state="
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
With Sheets("USA")
.Cells.ClearContents
RowCount = 1
.Range("A" & RowCount) = "State"
.Range("B" & RowCount) = "Category"
.Range("C" & RowCount) = "Topic"
.Range("D" & RowCount) = "SubTopic"
.Range("E" & RowCount) = "URL"
RowCount = 2
For Each USAState In USAStates
URL = BaseName & USAState
IE.Navigate2 URL
Do While IE.busy = True Or _
IE.readystate < 4
DoEvents
Loop
Set document = IE.document
'Call Dump(document)
'code for extracting table
State = FindCategories
For Each itm In IE.document.all
Select Case itm.classname
Case "categorytype"
Select Case Trim(itm.innertext)
Case "Financial Incentives", _
"Rules, Regulations & Policies"
Category = Trim(itm.innertext)
State = ExtractCategory
Case Else
State = FindCategories
End Select
End Select
If State = ExtractCategory Then
Select Case itm.classname
Case "copybold"
Topic = Trim(itm.innertext)
Case "copy"
.Range("A" & RowCount) = USAState
.Range("B" & RowCount) = Category
.Range("C" & RowCount) = Topic
Subtopic = Trim(itm.innertext)
.Range("D" & RowCount) = Subtopic
HREF = itm.FirstChild.HREF
.Range("E" & RowCount) = HREF
RowCount = RowCount + 1
End Select
End If
Next itm
.Columns.AutoFit
Next USAState
End With
IE.Quit
Set IE = Nothing
End Sub
Sub Dump(document)
With Sheets("dump")
.Cells.ClearContents
RowCount = 1
For Each itm In document.all
.Range("A" & RowCount) = itm.tagname
.Range("B" & RowCount) = itm.classname
.Range("C" & RowCount) = itm.ID
.Range("D" & RowCount) = Left(itm.innertext, 1024)
RowCount = RowCount + 1
Next itm
End With
End Sub
"Joel" wrote:
This is what I have so far. I'm only doing California and NY as a test.
California is working. New York I 'm having problems with one of the links.
It is not working like the others and giving me an error. Will Look at it
tomorrow.
You need to create a worksheet called USA.
Sub USA()
'define states in searching webpage
Const FindCategories = 0
Const ExtractCategory = 1
USAStates = Array("CA", "NY")
BaseName = "http://www.dsireusa.org/incentives/index.cfm?" & _
"re=1&ee=1&spv=0&st=0&srp=1&state="
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
With Sheets("USA")
.Cells.ClearContents
RowCount = 1
For Each USAState In USAStates
URL = BaseName & USAState
IE.Navigate2 URL
Do While IE.busy = True And _
IE.readystate < 4
DoEvents
Loop
.Range("A" & RowCount) = USAState
RowCount = RowCount + 1
Set document = IE.document
'Call Dump(document)
'code for extracting table
State = FindCategories
For Each itm In IE.document.all
Select Case itm.classname
Case "categorytype"
Select Case Trim(itm.innertext)
Case "Financial Incentives", _
"Rules, Regulations & Policies"
.Range("B" & RowCount) = itm.innertext
RowCount = RowCount + 1
State = ExtractCategory
Case Else
State = FindCategories
End Select
End Select
If State = ExtractCategory Then
Select Case itm.classname
Case "copybold"
.Range("C" & RowCount) = itm.innertext
RowCount = RowCount + 1
Case "copy"
.Range("D" & RowCount) = itm.innertext
.Range("E" & RowCount) = _
itm.FirstChild.href
RowCount = RowCount + 1
End Select
End If
Next itm
.Columns.AutoFit
Next USAState
End With
IE.Quit
Set IE = Nothing
End Sub
Sub Dump(document)
With Sheets("dump")
.Cells.ClearContents
RowCount = 1
For Each itm In document.all
.Range("A" & RowCount) = itm.tagname
.Range("B" & RowCount) = itm.classname
.Range("C" & RowCount) = itm.ID
.Range("D" & RowCount) = Left(itm.innertext, 1024)
RowCount = RowCount + 1
Next itm
End With
End Sub
"Joel" wrote:
I'm thinking that you may need to open two internet explorer applications.
the links will probably href properties. I probably won't get to look at
this until late tonight or tomorrow morning.
to get to each state webpage I would simply create a table of each atates
abbreviation and use it as part of the URL
California webpage is this
http://www.dsireusa.org/incentives/i...srp=1&state=CA
New York is this
http://www.dsireusa.org/incentives/i...srp=1&state=NY
So you can put all the state abbreviation on a spreadsheet
Then your loop would be
BaseName = "http://www.dsireusa.org/incentives/index.cfm?" &
"_re=1&ee=1&spv=0&st=0&srp=1&state="
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True
for RowCount = 1 to 50
URL = BaseName & Range("A" & RowCount)
IE.Navigate2 URL
Do While IE.busy = True And _
IE.readystate < 4
DoEvents
Loop
Set document = IE.document
'add your code here
next rowCount
"ryguy7272" wrote:
Thanks for taking an interest in this Joel. I've seen your work with web
queries; quite remarkable. If anyone can do this, it's probably you.
Ok, this is the main page for the site:
http://www.dsireusa.org/
If you click on 'CA', for instance, you end up he
http://www.dsireusa.org/incentives/i...srp=1&state=CA
What I want to do on this page, is get Excel to drill down to each link,
between 'Financial Incentives' and 'Related Programs & Initiatives' (not
including the information below this).
Clicking on the first hyperlink takes you he
http://www.dsireusa.org/incentives/i...164F&re=1&ee=1
I want to import the date from .WebTables = "6,7,8", offset 2 or 3 rows and
then back out of the page, drill down to the next link, and so on and so
forth. I've done lots of web queries and recorded macros many times to
automate the process of importing this data from the web, but the loop is the
tricky thing here. I don't know how to handle the looping part. Does it
make sense so far?
Reply With Quote
ryguy7272
View Public Profile
Find all posts by ryguy7272