Home |
Search |
Today's Posts |
|
#1
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
I'm a newbie with VBA, but want to automate a procedure we are doing.
How would I write a function to look at the following HTML source: '<td<a href="/history/airport/KNYC/2008/7/2/DailyHistory.html"2</a</td ' <td class="bl gb" ' 85 '</td ' <td class="gb" ' 76 '</td ' <td class="br gb" ' 67 '</td And put the 85 into cell B1, 76 into B2, 67 into B3? Thanks! |
#2
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
See my posting
http://www.microsoft.com/office/comm...e-ea1664ee9d7d I wouldn't use class, instead use ID in your HTML. With ID is easy find find the items using the internet browser Set SearchResults = IE.document.getElementById("bl gb") Range("A1") = SearchResults.innertext "Jason" wrote: I'm a newbie with VBA, but want to automate a procedure we are doing. How would I write a function to look at the following HTML source: '<td<a href="/history/airport/KNYC/2008/7/2/DailyHistory.html"2</a</td ' <td class="bl gb" ' 85 '</td ' <td class="gb" ' 76 '</td ' <td class="br gb" ' 67 '</td And put the 85 into cell B1, 76 into B2, 67 into B3? Thanks! |
#3
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
On Wed, 17 Jun 2009 13:16:01 -0700, Jason
wrote: I'm a newbie with VBA, but want to automate a procedure we are doing. How would I write a function to look at the following HTML source: '<td<a href="/history/airport/KNYC/2008/7/2/DailyHistory.html"2</a</td ' <td class="bl gb" ' 85 '</td ' <td class="gb" ' 76 '</td ' <td class="br gb" ' 67 '</td And put the 85 into cell B1, 76 into B2, 67 into B3? Thanks! In general, it is a simple problem if you define how you are identifying these three numbers, but you don't give sufficient information. I assume that the 85, 76 and 67 are merely examples of what could be in those locations, otherwise you could just write those number into the respective cells. A few of the possibilities from examining your data: Are the numbers identified by being the last three values in the source, and are they all integers? Are the values always on the 3rd, 6th and 9th line? Will the values always be integers; are could they be decimal values (or even fractions)? Or is there something else special about these particular numbers to identify them? For example, is what you really want in B1:B3 the numbers associated with the different td classes "bl gb", "gb", and "br gb" ??? --ron |
#4
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
Jason,
Is your HTML source a document, or a webpage? There are all sorts of different ways to approach this - depending on what the source actually is. For Example, reading the document from a web page: Dim myStr As String Dim test As Variant Dim Val As Variant Dim IE As Object Dim i As Integer Dim j As Integer j = 1 Set IE = CreateObject("InternetExplorer.Application") IE.Navigate "http://somewebpage.shtml" myStr = IE.Document.body.innerText test = Split(myStr, "</td") For i = LBound(test) To UBound(test) If test(i) < "" Then Val = Split(test(i), "") Cells(j, 2).Value = CDbl(Replace(Replace(Val(UBound(Val)), " ", ""), "'", "")) j = j + 1 End If Next i HTH, Bernie MS Excel MVP "Jason" wrote in message ... I'm a newbie with VBA, but want to automate a procedure we are doing. How would I write a function to look at the following HTML source: '<td<a href="/history/airport/KNYC/2008/7/2/DailyHistory.html"2</a</td ' <td class="bl gb" ' 85 '</td ' <td class="gb" ' 76 '</td ' <td class="br gb" ' 67 '</td And put the 85 into cell B1, 76 into B2, 67 into B3? Thanks! |
#5
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
Sorry if I was too ambiguous with my original post...
This HTML source is from the following webpage on temperature data. http://www.wunderground.com/history/...tory.html?MR=1 In this case, it is for NYC on July 2, 2008 (from the ...NYC/2008/7/2/... in the address) 85 is the Max temperatue 76 is the mean 67 is the low temperature. Essentially we want to setup a process to 1. look up the date in column A 2. go the website and parse the data for that date 3. enter the max in column B, low in column C, mean in column D 4. then repeat this process for the next date in the next row Hopefully this helps. "Jason" wrote: I'm a newbie with VBA, but want to automate a procedure we are doing. How would I write a function to look at the following HTML source: '<td<a href="/history/airport/KNYC/2008/7/2/DailyHistory.html"2</a</td ' <td class="bl gb" ' 85 '</td ' <td class="gb" ' 76 '</td ' <td class="br gb" ' 67 '</td And put the 85 into cell B1, 76 into B2, 67 into B3? Thanks! |
#6
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
Jason
Something like this perhaps? Private Sub MaxMinMeanTemps() Dim IE As Object Dim doc As Object Dim tbls As Object Dim tbl As Object Dim tblData As Object Dim tblRow As Object Dim rwCells As Object Dim ws As Worksheet Dim rng As Range Dim strDate As String Dim strURL As String Dim I As Long Set ws = Worksheets(1) Set rng = ws.Range("A2") Set IE = CreateObject("InternetExplorer.Application") IE.Visible = True While rng.Value < "" strDate = Year(rng.Value) & "/" & Month(rng.Value) & "/" & Day(rng.Value) strURL = "http://www.wunderground.com/history/airport/KNYC/" & strDate & "/DailyHistory.html" IE.Navigate strURL Do While IE.Busy: DoEvents: Loop Do While IE.ReadyState < 4: DoEvents: Loop Set doc = IE.Document Set tbls = doc.getElementsByTagName("TABLE") For Each tbl In tbls If tbl.className = "dataTable tm10" Then Set tblData = tbl Exit For End If Next For I = 2 To 4 Set tblRow = tblData.Rows(I) Set rwCells = tblRow.Cells rng.Offset(, I - 1) = rwCells(1).innertext Next I Set rng = rng.Offset(1) Wend IE.Quit: Set IE = Nothing End Sub "Jason" wrote in message ... Sorry if I was too ambiguous with my original post... This HTML source is from the following webpage on temperature data. http://www.wunderground.com/history/...tory.html?MR=1 In this case, it is for NYC on July 2, 2008 (from the ...NYC/2008/7/2/... in the address) 85 is the Max temperatue 76 is the mean 67 is the low temperature. Essentially we want to setup a process to 1. look up the date in column A 2. go the website and parse the data for that date 3. enter the max in column B, low in column C, mean in column D 4. then repeat this process for the next date in the next row Hopefully this helps. "Jason" wrote: I'm a newbie with VBA, but want to automate a procedure we are doing. How would I write a function to look at the following HTML source: '<td<a href="/history/airport/KNYC/2008/7/2/DailyHistory.html"2</a</td ' <td class="bl gb" ' 85 '</td ' <td class="gb" ' 76 '</td ' <td class="br gb" ' 67 '</td And put the 85 into cell B1, 76 into B2, 67 into B3? Thanks! |
#7
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
Norie,
Thanks, that did it. How can I get it to enter just the number, say "85" instead of "85 <degree signF"? I know that is simple, but I'm still learning. I changed Set rng = ws.Range("A2") to something similar to Joel's post RowCount = 1 Set rng = ws.Range("A" & RowCount) RowCount = RowCount + 1 so that I can enter in a whole list of dates and have it go through and do it for all the dates listed. Of course, now my mind is going... Any idea how I could have it look at the table, find the last date listed in column A, enter the next date below it, and fill in all the dates up to present day minus 1? Meaning from the last date until "yesterday". Of course "yesterday" would depend on when it is run. This way, we could run the script to fill in all our missing data "Norie" wrote: Jason Something like this perhaps? Private Sub MaxMinMeanTemps() Dim IE As Object Dim doc As Object Dim tbls As Object Dim tbl As Object Dim tblData As Object Dim tblRow As Object Dim rwCells As Object Dim ws As Worksheet Dim rng As Range Dim strDate As String Dim strURL As String Dim I As Long Set ws = Worksheets(1) Set rng = ws.Range("A2") Set IE = CreateObject("InternetExplorer.Application") IE.Visible = True While rng.Value < "" strDate = Year(rng.Value) & "/" & Month(rng.Value) & "/" & Day(rng.Value) strURL = "http://www.wunderground.com/history/airport/KNYC/" & strDate & "/DailyHistory.html" IE.Navigate strURL Do While IE.Busy: DoEvents: Loop Do While IE.ReadyState < 4: DoEvents: Loop Set doc = IE.Document Set tbls = doc.getElementsByTagName("TABLE") For Each tbl In tbls If tbl.className = "dataTable tm10" Then Set tblData = tbl Exit For End If Next For I = 2 To 4 Set tblRow = tblData.Rows(I) Set rwCells = tblRow.Cells rng.Offset(, I - 1) = rwCells(1).innertext Next I Set rng = rng.Offset(1) Wend IE.Quit: Set IE = Nothing End Sub "Jason" wrote in message ... Sorry if I was too ambiguous with my original post... This HTML source is from the following webpage on temperature data. http://www.wunderground.com/history/...tory.html?MR=1 In this case, it is for NYC on July 2, 2008 (from the ...NYC/2008/7/2/... in the address) 85 is the Max temperatue 76 is the mean 67 is the low temperature. Essentially we want to setup a process to 1. look up the date in column A 2. go the website and parse the data for that date 3. enter the max in column B, low in column C, mean in column D 4. then repeat this process for the next date in the next row Hopefully this helps. "Jason" wrote: I'm a newbie with VBA, but want to automate a procedure we are doing. How would I write a function to look at the following HTML source: '<td<a href="/history/airport/KNYC/2008/7/2/DailyHistory.html"2</a</td ' <td class="bl gb" ' 85 '</td ' <td class="gb" ' 76 '</td ' <td class="br gb" ' 67 '</td And put the 85 into cell B1, 76 into B2, 67 into B3? Thanks! |
#8
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
Jason
I'm not quite sure what you are asking and it seems actually be more to do with inserting dates into a worksheet. The code I posted assumes you have dates starting in row 2, column A. It goes down the column, gets the data, inserts it in the relevant columns (B,C and D) and then moves on. It stops when a blank cell is found. So in theory if you have all the dates you want, in whatever order it should work. If you need help filling in dates there are various ways to do that but they really depend on what you've currently got and what you want to end up with. Perhaps a topic for another thread. As for just getting the number, shouldn't be a problem - I'll see if I can do something. Would you like it in Fahrenheit or Celsius?:) PS If you just need the number for calculations then you could probably do that without code, a simple formula should suffice. "Jason" wrote in message ... Norie, Thanks, that did it. How can I get it to enter just the number, say "85" instead of "85 <degree signF"? I know that is simple, but I'm still learning. I changed Set rng = ws.Range("A2") to something similar to Joel's post RowCount = 1 Set rng = ws.Range("A" & RowCount) RowCount = RowCount + 1 so that I can enter in a whole list of dates and have it go through and do it for all the dates listed. Of course, now my mind is going... Any idea how I could have it look at the table, find the last date listed in column A, enter the next date below it, and fill in all the dates up to present day minus 1? Meaning from the last date until "yesterday". Of course "yesterday" would depend on when it is run. This way, we could run the script to fill in all our missing data "Norie" wrote: Jason Something like this perhaps? Private Sub MaxMinMeanTemps() Dim IE As Object Dim doc As Object Dim tbls As Object Dim tbl As Object Dim tblData As Object Dim tblRow As Object Dim rwCells As Object Dim ws As Worksheet Dim rng As Range Dim strDate As String Dim strURL As String Dim I As Long Set ws = Worksheets(1) Set rng = ws.Range("A2") Set IE = CreateObject("InternetExplorer.Application") IE.Visible = True While rng.Value < "" strDate = Year(rng.Value) & "/" & Month(rng.Value) & "/" & Day(rng.Value) strURL = "http://www.wunderground.com/history/airport/KNYC/" & strDate & "/DailyHistory.html" IE.Navigate strURL Do While IE.Busy: DoEvents: Loop Do While IE.ReadyState < 4: DoEvents: Loop Set doc = IE.Document Set tbls = doc.getElementsByTagName("TABLE") For Each tbl In tbls If tbl.className = "dataTable tm10" Then Set tblData = tbl Exit For End If Next For I = 2 To 4 Set tblRow = tblData.Rows(I) Set rwCells = tblRow.Cells rng.Offset(, I - 1) = rwCells(1).innertext Next I Set rng = rng.Offset(1) Wend IE.Quit: Set IE = Nothing End Sub "Jason" wrote in message ... Sorry if I was too ambiguous with my original post... This HTML source is from the following webpage on temperature data. http://www.wunderground.com/history/...tory.html?MR=1 In this case, it is for NYC on July 2, 2008 (from the ...NYC/2008/7/2/... in the address) 85 is the Max temperatue 76 is the mean 67 is the low temperature. Essentially we want to setup a process to 1. look up the date in column A 2. go the website and parse the data for that date 3. enter the max in column B, low in column C, mean in column D 4. then repeat this process for the next date in the next row Hopefully this helps. "Jason" wrote: I'm a newbie with VBA, but want to automate a procedure we are doing. How would I write a function to look at the following HTML source: '<td<a href="/history/airport/KNYC/2008/7/2/DailyHistory.html"2</a</td ' <td class="bl gb" ' 85 '</td ' <td class="gb" ' 76 '</td ' <td class="br gb" ' 67 '</td And put the 85 into cell B1, 76 into B2, 67 into B3? Thanks! |
#9
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
On Thu, 18 Jun 2009 06:52:00 -0700, Jason
wrote: Sorry if I was too ambiguous with my original post... This HTML source is from the following webpage on temperature data. http://www.wunderground.com/history/...tory.html?MR=1 In this case, it is for NYC on July 2, 2008 (from the ...NYC/2008/7/2/... in the address) 85 is the Max temperatue 76 is the mean 67 is the low temperature. Essentially we want to setup a process to 1. look up the date in column A 2. go the website and parse the data for that date 3. enter the max in column B, low in column C, mean in column D 4. then repeat this process for the next date in the next row Hopefully this helps. Much more clear description of your requirements. Perhaps the following Macro would help: To enter this Macro (Sub), <alt-F11 opens the Visual Basic Editor. Ensure your project is highlighted in the Project Explorer window. Then, from the top menu, select Insert/Module and paste the code below into the window that opens. To use this Macro (Sub), <alt-F8 opens the macro dialog box. Select the macro by name, and <RUN. As written, the entry cell is set to A2, and the results are in B2, C2 and D2. This can be easily changed. ========================================== Option Explicit Sub gettemps() Dim c As Range Dim myStr As String Dim test As Variant Dim Val As Variant Dim IE As Object Const sURL1 As String = "http://www.wunderground.com/history/airport/KNYC/" Dim sURLdate As String Const sURL2 As String = "/DailyHistory.html?MR=1" Set c = Range("A2") sURLdate = Format(c.Value2, "yyyy/m/d") Application.Cursor = xlWait Set IE = CreateObject("InternetExplorer.Application") IE.Navigate sURL1 & sURLdate & sURL2 While IE.ReadyState < 4 DoEvents Wend myStr = IE.Document.body.innerhtml c.Offset(0, 1).Value = RegexMid(myStr, "max") c.Offset(0, 2).Value = RegexMid(myStr, "min") c.Offset(0, 3).Value = RegexMid(myStr, "mean") Application.Cursor = xlDefault End Sub Private Function RegexMid(s As String, sTempType As String) As String Dim re As Object, mc As Object Set re = CreateObject("vbscript.regexp") re.IgnoreCase = True re.MultiLine = True re.Pattern = "\b" & sTempType & "\b\D+(\d+)" If re.test(s) = True Then Set mc = re.Execute(s) RegexMid = mc(0).submatches(0) End If End Function ================================== --ron |
#11
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
On Thu, 18 Jun 2009 11:14:01 -0700, Jason
wrote: Ron, Thanks, I like it alot. If you look at my reply to Norie. I setup a way to repeat it for a list of dates, but could not do that with yours...I am sure I missing something besides just changing c=A2 to: RowCount = 1 Set c = Range("A" & RowCount) RowCount = RowCount + 1 Also, if you notice from my reply to Norie, I now want to add a way to fill in column A Thanks! Jason It's relatively easy, but perhaps not intuitive, to extend my macro to handle a range of cells: ================================= Option Explicit Sub gettemps() Dim c As Range, rng As Range Dim j As Long Dim myStr As String Dim test As Variant Dim Val As Variant Dim IE As Object Const sURL1 As String = "http://www.wunderground.com/history/airport/KNYC/" Dim sURLdate As String Const sURL2 As String = "/DailyHistory.html?MR=1" Application.Cursor = xlWait Set IE = CreateObject("InternetExplorer.Application") Set rng = Range("A2").CurrentRegion Set rng = rng.Resize(rng.Rows.Count, 1) For Each c In rng sURLdate = Format(c.Value2, "yyyy/m/d") IE.Navigate sURL1 & sURLdate & sURL2 While IE.ReadyState < 4 DoEvents Wend myStr = IE.Document.body.innerhtml c.Offset(0, 1).Value = RegexMid(myStr, "max") c.Offset(0, 2).Value = RegexMid(myStr, "min") c.Offset(0, 3).Value = RegexMid(myStr, "mean") j = j + 1 Next c Application.Cursor = xlDefault End Sub Private Function RegexMid(s As String, sTempType As String) As String Dim re As Object, mc As Object Set re = CreateObject("vbscript.regexp") re.IgnoreCase = True re.MultiLine = True re.Pattern = "\b" & sTempType & "\b\D+(\d+)" If re.test(s) = True Then Set mc = re.Execute(s) RegexMid = mc(0).submatches(0) End If End Function ========================================= But this just reads in each page one at a time. Whether it is worthwhile to do the coding to try to download the custom range of dates from the site, and then parse out the appropriate data, depends on how often you'll be downloading multiple dates, and how many dates each time. --ron |
#12
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
It works great!
However, once I saw it run, I tried to run it in an existing spreadsheet where the first cell would be A3838 instead of A2 and that did not go too well... Right now, I'm just trying to speed it up, which you are helping with immensely. I'm sure the next step would be to put all this into a database and have it run automatically daily. "Ron Rosenfeld" wrote: On Thu, 18 Jun 2009 11:14:01 -0700, Jason wrote: Ron, Thanks, I like it alot. If you look at my reply to Norie. I setup a way to repeat it for a list of dates, but could not do that with yours...I am sure I missing something besides just changing c=A2 to: RowCount = 1 Set c = Range("A" & RowCount) RowCount = RowCount + 1 Also, if you notice from my reply to Norie, I now want to add a way to fill in column A Thanks! Jason It's relatively easy, but perhaps not intuitive, to extend my macro to handle a range of cells: ================================= Option Explicit Sub gettemps() Dim c As Range, rng As Range Dim j As Long Dim myStr As String Dim test As Variant Dim Val As Variant Dim IE As Object Const sURL1 As String = "http://www.wunderground.com/history/airport/KNYC/" Dim sURLdate As String Const sURL2 As String = "/DailyHistory.html?MR=1" Application.Cursor = xlWait Set IE = CreateObject("InternetExplorer.Application") Set rng = Range("A2").CurrentRegion Set rng = rng.Resize(rng.Rows.Count, 1) For Each c In rng sURLdate = Format(c.Value2, "yyyy/m/d") IE.Navigate sURL1 & sURLdate & sURL2 While IE.ReadyState < 4 DoEvents Wend myStr = IE.Document.body.innerhtml c.Offset(0, 1).Value = RegexMid(myStr, "max") c.Offset(0, 2).Value = RegexMid(myStr, "min") c.Offset(0, 3).Value = RegexMid(myStr, "mean") j = j + 1 Next c Application.Cursor = xlDefault End Sub Private Function RegexMid(s As String, sTempType As String) As String Dim re As Object, mc As Object Set re = CreateObject("vbscript.regexp") re.IgnoreCase = True re.MultiLine = True re.Pattern = "\b" & sTempType & "\b\D+(\d+)" If re.test(s) = True Then Set mc = re.Execute(s) RegexMid = mc(0).submatches(0) End If End Function ========================================= But this just reads in each page one at a time. Whether it is worthwhile to do the coding to try to download the custom range of dates from the site, and then parse out the appropriate data, depends on how often you'll be downloading multiple dates, and how many dates each time. --ron |
#13
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]() You want to extract some numbers from a web page. The web page is http://www.wunderground.com/history/...DailyHistory.h tml?MR=1 In this case, it is for NYC on July 2, 2008 (from the ...NYC/2008/7/2/... in the address) You want to extract: 85 is the Max temperatue 76 is the mean 67 is the low temperature. Here is a quick script in biterscripting. # Script WeatherHistory.txt var str html ; cat "http://www.wunderground.com/history/airport/KNYC/2008/7/2/DailyHistory. html?MR=1" $html while ( { sen -c "^<span class=\"b\"^" $html } 0 ) do stex -c "^<span class=\"b\"^]" $html null stex -c "]^</span^" $html done Script is in biterscripting. I tested this script. It works. (The markers are <span class="b"...</span.) To try, 1. Save the above script as C:\WeatherHistory.txt. 2. Download biterscripting from http://www.biterscripting.com. 3. Start biterscripting. Enter the following command. script "C:\WeatherHistory.txt" You will get the numbers you want. I am seeing the following. 76 75 85 .. .. .. Email me if you have questions. Patrick *** Sent via Developersdex http://www.developersdex.com *** |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
![]() |
||||
Thread | Forum | |||
Parse cells with line breaks | Excel Programming | |||
Ways to parse cells in mulitple selected locations | Excel Programming | |||
How in to parse constants in formula to cells | Excel Programming | |||
Generate charts with HMTL | Excel Discussion (Misc queries) | |||
Is it possible to merge columns for HMTL output? | Excel Discussion (Misc queries) |