Web Query? Is This Possible?
Posted to microsoft.public.excel.programming by ryguy7272

Ah! There WAS something weird with my IE settings. OK, got it working
now!!! This is totally awesome!!! The only thing is… I wanted the code to
drill down one more level. If you run the code, you will see a link in cell
E2. When E2 is selected, hit F2, then hit Enter, then click on the
link… that's what I wanted; that data right there. I want to import
everything there; it looks like .WebTables = "4". With all you've done,
you've made me believe that this is actually possible!! Can you please
modify the code to go one level deeper, import that data, and then continue
until all imports are complete?
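
In case it helps show what I mean, here is the kind of second macro I'm
picturing. It's only a rough, untested sketch: it assumes the URLs from your
first macro are sitting in column E of the USA sheet (headers in row 1) and
that table "4" really is the right table on every detail page.

'Rough, untested sketch. Assumes Joel's first macro has already filled
'column E of the "USA" sheet with the detail-page URLs (headers in row 1),
'and that table "4" really is the table I want on every detail page.
Sub ImportDetailTables()
    Dim ws As Worksheet, dest As Worksheet
    Dim lastRow As Long, r As Long, outRow As Long

    Set ws = Sheets("USA")
    Set dest = Sheets.Add(After:=Sheets(Sheets.Count))   'new sheet for the results
    lastRow = ws.Cells(ws.Rows.Count, "E").End(xlUp).Row
    outRow = 1

    For r = 2 To lastRow                                 'row 1 holds the headers
        With dest.QueryTables.Add( _
                Connection:="URL;" & ws.Range("E" & r).Value, _
                Destination:=dest.Range("A" & outRow))
            .WebSelectionType = xlSpecifiedTables
            .WebTables = "4"                             'assumed table number
            .WebFormatting = xlWebFormattingNone
            .Refresh BackgroundQuery:=False
        End With
        'start the next import a couple of rows below whatever came in
        outRow = dest.Cells(dest.Rows.Count, "A").End(xlUp).Row + 2
    Next r
End Sub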

Please send me an email:

Thanks a ton for all of this!!
You have totally made my weekend!!
This is going to save me so much time!!


Ryan---



--
Ryan---
If this information was helpful, please indicate this by clicking ''Yes''.


"ryguy7272" wrote:

Thanks for the Effort Joel!!! This is awesome!!!

The code looks right, but does this actually work for you? When I run it
here, IE opens and hangs for a long time. Then I close the browser and this
line is yellow:
Do While IE.busy = True Or IE.readystate < 4

No data was imported in about 7 minutes or so. I have a cable connection;
pretty fast. I imagine it will take a while to run, but I don't think it was
actually doing anything. Unless everything has to import and then update
all at once; however, I don't think that's what it is doing. Any thoughts?
Suggestions? I'm going to try to troubleshoot, but if you can think of
something that may prevent this from running (maybe a setting in IE), please
let me know.
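
While I troubleshoot, one thing I might try is putting a timeout on that
wait loop so the macro gives up instead of hanging forever if IE never
reports ready. Untested, and the 60 seconds is just a guess:

'Same wait loop, but bail out after roughly a minute instead of spinning
'forever if IE never becomes ready.
Dim StartTime As Single
StartTime = Timer
Do While IE.busy = True Or IE.readystate < 4
    DoEvents
    If Timer - StartTime > 60 Then Exit Do   'give up after ~60 seconds
Loop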

Kindest of regards,
Ryan---

--
Ryan---
If this information was helpful, please indicate this by clicking ''Yes''.


"Joel" wrote:

Here are the final results. Use two macros. I wrote the 1st, which gets you
all the URLs. Then write the 2nd macro, which gets the actual data.

Sub USA()

    'define states in searching webpage
    Const FindCategories = 0
    Const ExtractCategory = 1

    USAStates = Array("AK", "AL", "AR", "AZ", _
                      "CA", "CO", "CT", "DC", "DE", "FL", "GA", _
                      "HI", "IA", "ID", "IL", "IN", "KS", "KY", _
                      "LA", "MA", "MD", "ME", "MI", "MN", "MO", _
                      "MS", "MT", "NC", "ND", "NE", "NH", "NJ", _
                      "NM", "NV", "NY", "OH", "OK", "OR", "PA", _
                      "RI", "SC", "SD", "TN", "TX", "UT", "VA", _
                      "VT", "WA", "WI", "WV", "WY")

    BaseName = "http://www.dsireusa.org/incentives/index.cfm?" & _
               "re=1&ee=1&spv=0&st=0&srp=1&state="
    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = True

    With Sheets("USA")
        .Cells.ClearContents
        RowCount = 1
        .Range("A" & RowCount) = "State"
        .Range("B" & RowCount) = "Category"
        .Range("C" & RowCount) = "Topic"
        .Range("D" & RowCount) = "SubTopic"
        .Range("E" & RowCount) = "URL"

        RowCount = 2

        For Each USAState In USAStates
            URL = BaseName & USAState
            IE.Navigate2 URL
            Do While IE.busy = True Or _
                     IE.readystate < 4
                DoEvents
            Loop

            Set document = IE.document
            'Call Dump(document)
            'code for extracting table
            State = FindCategories
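            'Scan every element on the page with a simple two-state machine:
            'a "categorytype" heading reading "Financial Incentives" or
            '"Rules, Regulations & Policies" switches to ExtractCategory;
            'any other heading switches back to FindCategories. While in
            'ExtractCategory, "copybold" items become the Topic and "copy"
            'items are written out as SubTopic rows along with their link.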
            For Each itm In IE.document.all
                Select Case itm.classname

                    Case "categorytype"
                        Select Case Trim(itm.innertext)
                            Case "Financial Incentives", _
                                 "Rules, Regulations & Policies"

                                Category = Trim(itm.innertext)
                                State = ExtractCategory

                            Case Else
                                State = FindCategories
                        End Select

                End Select

                If State = ExtractCategory Then
                    Select Case itm.classname
                        Case "copybold"
                            Topic = Trim(itm.innertext)

                        Case "copy"
                            .Range("A" & RowCount) = USAState
                            .Range("B" & RowCount) = Category
                            .Range("C" & RowCount) = Topic
                            Subtopic = Trim(itm.innertext)
                            .Range("D" & RowCount) = Subtopic
                            HREF = itm.FirstChild.HREF
                            .Range("E" & RowCount) = HREF
                            RowCount = RowCount + 1
                    End Select

                End If

            Next itm

            .Columns.AutoFit

        Next USAState
    End With
    IE.Quit
    Set IE = Nothing

End Sub

Sub Dump(document)
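    'Diagnostic helper: lists every element's tag name, class name, ID and
    'the first 1,024 characters of its inner text on a worksheet named "dump".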

With Sheets("dump")
.Cells.ClearContents
RowCount = 1
For Each itm In document.all
.Range("A" & RowCount) = itm.tagname
.Range("B" & RowCount) = itm.classname
.Range("C" & RowCount) = itm.ID
.Range("D" & RowCount) = Left(itm.innertext, 1024)

RowCount = RowCount + 1
Next itm
End With
End Sub


"Joel" wrote:

This is what I have so far. I'm only doing California and NY as a test.
California is working. With New York I'm having problems with one of the
links; it is not working like the others and is giving me an error. I'll
look at it tomorrow.

You need to create a worksheet called USA.

Sub USA()

'define states in searching webpage
Const FindCategories = 0
Const ExtractCategory = 1

USAStates = Array("CA", "NY")
BaseName = "http://www.dsireusa.org/incentives/index.cfm?" & _
"re=1&ee=1&spv=0&st=0&srp=1&state="
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True

With Sheets("USA")
.Cells.ClearContents
RowCount = 1

For Each USAState In USAStates
URL = BaseName & USAState
IE.Navigate2 URL
Do While IE.busy = True And _
IE.readystate < 4

DoEvents
Loop

.Range("A" & RowCount) = USAState
RowCount = RowCount + 1

Set document = IE.document
'Call Dump(document)
'code for extracting table
State = FindCategories
For Each itm In IE.document.all
Select Case itm.classname

Case "categorytype"
Select Case Trim(itm.innertext)
Case "Financial Incentives", _
"Rules, Regulations & Policies"

.Range("B" & RowCount) = itm.innertext
RowCount = RowCount + 1
State = ExtractCategory

Case Else
State = FindCategories
End Select

End Select

If State = ExtractCategory Then
Select Case itm.classname
Case "copybold"
.Range("C" & RowCount) = itm.innertext
RowCount = RowCount + 1

Case "copy"
.Range("D" & RowCount) = itm.innertext
.Range("E" & RowCount) = _
itm.FirstChild.href
RowCount = RowCount + 1
End Select

End If

Next itm

.Columns.AutoFit

Next USAState
End With
IE.Quit
Set IE = Nothing

End Sub

Sub Dump(document)

With Sheets("dump")
.Cells.ClearContents
RowCount = 1
For Each itm In document.all
.Range("A" & RowCount) = itm.tagname
.Range("B" & RowCount) = itm.classname
.Range("C" & RowCount) = itm.ID
.Range("D" & RowCount) = Left(itm.innertext, 1024)

RowCount = RowCount + 1
Next itm
End With
End Sub


"Joel" wrote:

I'm thinking that you may need to open two Internet Explorer applications.
The links will probably have href properties. I probably won't get to look
at this until late tonight or tomorrow morning.

To get to each state's webpage I would simply create a table of each
state's abbreviation and use it as part of the URL.

California's webpage is this:

http://www.dsireusa.org/incentives/i...srp=1&state=CA

New York's is this:

http://www.dsireusa.org/incentives/i...srp=1&state=NY

So you can put all the state abbreviations on a spreadsheet.

Then your loop would be

BaseName = "http://www.dsireusa.org/incentives/index.cfm?" &
"_re=1&ee=1&spv=0&st=0&srp=1&state="
Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True

for RowCount = 1 to 50
URL = BaseName & Range("A" & RowCount)
IE.Navigate2 URL
Do While IE.busy = True And _
IE.readystate < 4

DoEvents
Loop
Set document = IE.document
'add your code here
next rowCount

"ryguy7272" wrote:

Thanks for taking an interest in this, Joel. I've seen your work with web
queries; quite remarkable. If anyone can do this, it's probably you.

Ok, this is the main page for the site:
http://www.dsireusa.org/

If you click on 'CA', for instance, you end up here:
http://www.dsireusa.org/incentives/i...srp=1&state=CA

What I want to do on this page is get Excel to drill down into each link
between 'Financial Incentives' and 'Related Programs & Initiatives' (not
including the information below this).

Clicking on the first hyperlink takes you here:
http://www.dsireusa.org/incentives/i...164F&re=1&ee=1

I want to import the data from .WebTables = "6,7,8", offset 2 or 3 rows,
and then back out of the page, drill down to the next link, and so on and
so forth. I've done lots of web queries and recorded macros many times to
automate the process of importing this data from the web, but the loop is
the tricky thing here. I don't know how to handle the looping part. Does it
make sense so far?
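
Roughly, this is the shape of the loop I'm picturing, though I don't know
how to make it work reliably. It's only a rough, untested sketch for one
state, and the "copy" class name and the table positions are guesses on my
part:

Sub DrillDownSketch()
    'Rough, untested sketch of the two-phase loop, for California only.
    'Assumptions (mine, not verified): the links of interest sit in elements
    'whose classname is "copy", and tables 6-8 on each detail page hold the
    'data (DOM indexes 5-7).
    Dim IE As Object, itm As Object, tbls As Object
    Dim links As New Collection, link As Variant
    Dim dest As Worksheet, outRow As Long, i As Long

    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = True

    'Phase 1: load the CA summary page and collect every incentive link.
    IE.Navigate2 "http://www.dsireusa.org/incentives/index.cfm?" & _
                 "re=1&ee=1&spv=0&st=0&srp=1&state=CA"
    Do While IE.Busy = True Or IE.ReadyState < 4
        DoEvents
    Loop
    For Each itm In IE.Document.all
        If itm.classname = "copy" Then links.Add itm.FirstChild.href
    Next itm

    'Phase 2: visit each collected link in turn and copy out the tables.
    Set dest = Sheets.Add(After:=Sheets(Sheets.Count))
    outRow = 1
    For Each link In links
        IE.Navigate2 link
        Do While IE.Busy = True Or IE.ReadyState < 4
            DoEvents
        Loop
        Set tbls = IE.Document.getElementsByTagName("TABLE")
        For i = 5 To 7                        'assumed table positions
            If i <= tbls.Length - 1 Then
                dest.Range("A" & outRow) = tbls(i).innertext
                outRow = outRow + 2
            End If
        Next i
    Next link

    IE.Quit
    Set IE = Nothing
End Sub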