ExcelBanter

ExcelBanter (https://www.excelbanter.com/)
-   Excel Programming (https://www.excelbanter.com/excel-programming/)
-   -   Parse HMTL source to fill in cells? (https://www.excelbanter.com/excel-programming/429991-parse-hmtl-source-fill-cells.html)

Jason

Parse HMTL source to fill in cells?
 
I'm a newbie with VBA, but want to automate a procedure we are doing.

How would I write a function to look at the following HTML source:

'<td<a href="/history/airport/KNYC/2008/7/2/DailyHistory.html"2</a</td
' <td class="bl gb"
' 85
'</td
' <td class="gb"
' 76
'</td
' <td class="br gb"
' 67
'</td

And put the 85 into cell B1, 76 into B2, 67 into B3?

Thanks!

joel

Parse HMTL source to fill in cells?
 
See my posting
http://www.microsoft.com/office/comm...e-ea1664ee9d7d

I wouldn't use class, instead use ID in your HTML. With ID is easy find
find the items using the internet browser

Set SearchResults = IE.document.getElementById("bl gb")
Range("A1") = SearchResults.innertext



"Jason" wrote:

I'm a newbie with VBA, but want to automate a procedure we are doing.

How would I write a function to look at the following HTML source:

'<td<a href="/history/airport/KNYC/2008/7/2/DailyHistory.html"2</a</td
' <td class="bl gb"
' 85
'</td
' <td class="gb"
' 76
'</td
' <td class="br gb"
' 67
'</td

And put the 85 into cell B1, 76 into B2, 67 into B3?

Thanks!


Ron Rosenfeld

Parse HMTL source to fill in cells?
 
On Wed, 17 Jun 2009 13:16:01 -0700, Jason
wrote:

I'm a newbie with VBA, but want to automate a procedure we are doing.

How would I write a function to look at the following HTML source:

'<td<a href="/history/airport/KNYC/2008/7/2/DailyHistory.html"2</a</td
' <td class="bl gb"
' 85
'</td
' <td class="gb"
' 76
'</td
' <td class="br gb"
' 67
'</td

And put the 85 into cell B1, 76 into B2, 67 into B3?

Thanks!


In general, it is a simple problem if you define how you are identifying these
three numbers, but you don't give sufficient information.

I assume that the 85, 76 and 67 are merely examples of what could be in those
locations, otherwise you could just write those number into the respective
cells.

A few of the possibilities from examining your data:

Are the numbers identified by being the last three values in the source, and
are they all integers?

Are the values always on the 3rd, 6th and 9th line?

Will the values always be integers; are could they be decimal values (or even
fractions)?

Or is there something else special about these particular numbers to identify
them? For example, is what you really want in B1:B3 the numbers associated
with the different td classes "bl gb", "gb", and "br gb" ???

--ron

Bernie Deitrick

Parse HMTL source to fill in cells?
 
Jason,

Is your HTML source a document, or a webpage? There are all sorts of different ways to approach
this - depending on what the source actually is.

For Example, reading the document from a web page:

Dim myStr As String
Dim test As Variant
Dim Val As Variant
Dim IE As Object
Dim i As Integer
Dim j As Integer
j = 1

Set IE = CreateObject("InternetExplorer.Application")
IE.Navigate "http://somewebpage.shtml"
myStr = IE.Document.body.innerText

test = Split(myStr, "</td")
For i = LBound(test) To UBound(test)
If test(i) < "" Then
Val = Split(test(i), "")
Cells(j, 2).Value = CDbl(Replace(Replace(Val(UBound(Val)), " ", ""), "'", ""))
j = j + 1
End If
Next i

HTH,
Bernie
MS Excel MVP


"Jason" wrote in message
...
I'm a newbie with VBA, but want to automate a procedure we are doing.

How would I write a function to look at the following HTML source:

'<td<a href="/history/airport/KNYC/2008/7/2/DailyHistory.html"2</a</td
' <td class="bl gb"
' 85
'</td
' <td class="gb"
' 76
'</td
' <td class="br gb"
' 67
'</td

And put the 85 into cell B1, 76 into B2, 67 into B3?

Thanks!




Jason

Parse HMTL source to fill in cells?
 
Sorry if I was too ambiguous with my original post...

This HTML source is from the following webpage on temperature data.
http://www.wunderground.com/history/...tory.html?MR=1

In this case, it is for NYC on July 2, 2008 (from the ...NYC/2008/7/2/... in
the address)
85 is the Max temperatue
76 is the mean
67 is the low temperature.

Essentially we want to setup a process to
1. look up the date in column A
2. go the website and parse the data for that date
3. enter the max in column B, low in column C, mean in column D
4. then repeat this process for the next date in the next row

Hopefully this helps.


"Jason" wrote:

I'm a newbie with VBA, but want to automate a procedure we are doing.

How would I write a function to look at the following HTML source:

'<td<a href="/history/airport/KNYC/2008/7/2/DailyHistory.html"2</a</td
' <td class="bl gb"
' 85
'</td
' <td class="gb"
' 76
'</td
' <td class="br gb"
' 67
'</td

And put the 85 into cell B1, 76 into B2, 67 into B3?

Thanks!


Norie

Parse HMTL source to fill in cells?
 
Jason

Something like this perhaps?

Private Sub MaxMinMeanTemps()

Dim IE As Object
Dim doc As Object
Dim tbls As Object
Dim tbl As Object
Dim tblData As Object
Dim tblRow As Object
Dim rwCells As Object
Dim ws As Worksheet
Dim rng As Range
Dim strDate As String
Dim strURL As String
Dim I As Long

Set ws = Worksheets(1)

Set rng = ws.Range("A2")

Set IE = CreateObject("InternetExplorer.Application")

IE.Visible = True

While rng.Value < ""

strDate = Year(rng.Value) & "/" & Month(rng.Value) & "/" &
Day(rng.Value)

strURL = "http://www.wunderground.com/history/airport/KNYC/" &
strDate & "/DailyHistory.html"

IE.Navigate strURL

Do While IE.Busy: DoEvents: Loop
Do While IE.ReadyState < 4: DoEvents: Loop

Set doc = IE.Document

Set tbls = doc.getElementsByTagName("TABLE")

For Each tbl In tbls
If tbl.className = "dataTable tm10" Then
Set tblData = tbl
Exit For
End If
Next

For I = 2 To 4

Set tblRow = tblData.Rows(I)

Set rwCells = tblRow.Cells

rng.Offset(, I - 1) = rwCells(1).innertext

Next I

Set rng = rng.Offset(1)

Wend

IE.Quit: Set IE = Nothing

End Sub

"Jason" wrote in message
...
Sorry if I was too ambiguous with my original post...

This HTML source is from the following webpage on temperature data.
http://www.wunderground.com/history/...tory.html?MR=1

In this case, it is for NYC on July 2, 2008 (from the ...NYC/2008/7/2/...
in
the address)
85 is the Max temperatue
76 is the mean
67 is the low temperature.

Essentially we want to setup a process to
1. look up the date in column A
2. go the website and parse the data for that date
3. enter the max in column B, low in column C, mean in column D
4. then repeat this process for the next date in the next row

Hopefully this helps.


"Jason" wrote:

I'm a newbie with VBA, but want to automate a procedure we are doing.

How would I write a function to look at the following HTML source:

'<td<a
href="/history/airport/KNYC/2008/7/2/DailyHistory.html"2</a</td
' <td class="bl gb"
' 85
'</td
' <td class="gb"
' 76
'</td
' <td class="br gb"
' 67
'</td

And put the 85 into cell B1, 76 into B2, 67 into B3?

Thanks!



Ron Rosenfeld

Parse HMTL source to fill in cells?
 
On Thu, 18 Jun 2009 06:52:00 -0700, Jason
wrote:

Sorry if I was too ambiguous with my original post...

This HTML source is from the following webpage on temperature data.
http://www.wunderground.com/history/...tory.html?MR=1

In this case, it is for NYC on July 2, 2008 (from the ...NYC/2008/7/2/... in
the address)
85 is the Max temperatue
76 is the mean
67 is the low temperature.

Essentially we want to setup a process to
1. look up the date in column A
2. go the website and parse the data for that date
3. enter the max in column B, low in column C, mean in column D
4. then repeat this process for the next date in the next row

Hopefully this helps.


Much more clear description of your requirements.

Perhaps the following Macro would help:

To enter this Macro (Sub), <alt-F11 opens the Visual Basic Editor.
Ensure your project is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.

To use this Macro (Sub), <alt-F8 opens the macro dialog box. Select the macro
by name, and <RUN.

As written, the entry cell is set to A2, and the results are in B2, C2 and D2.
This can be easily changed.

==========================================
Option Explicit
Sub gettemps()
Dim c As Range
Dim myStr As String
Dim test As Variant
Dim Val As Variant
Dim IE As Object
Const sURL1 As String = "http://www.wunderground.com/history/airport/KNYC/"
Dim sURLdate As String
Const sURL2 As String = "/DailyHistory.html?MR=1"

Set c = Range("A2")

sURLdate = Format(c.Value2, "yyyy/m/d")

Application.Cursor = xlWait

Set IE = CreateObject("InternetExplorer.Application")
IE.Navigate sURL1 & sURLdate & sURL2
While IE.ReadyState < 4
DoEvents
Wend

myStr = IE.Document.body.innerhtml

c.Offset(0, 1).Value = RegexMid(myStr, "max")
c.Offset(0, 2).Value = RegexMid(myStr, "min")
c.Offset(0, 3).Value = RegexMid(myStr, "mean")

Application.Cursor = xlDefault
End Sub

Private Function RegexMid(s As String, sTempType As String) As String
Dim re As Object, mc As Object
Set re = CreateObject("vbscript.regexp")
re.IgnoreCase = True
re.MultiLine = True
re.Pattern = "\b" & sTempType & "\b\D+(\d+)"

If re.test(s) = True Then
Set mc = re.Execute(s)
RegexMid = mc(0).submatches(0)
End If
End Function
==================================
--ron

Jason

Parse HMTL source to fill in cells?
 
Norie,

Thanks, that did it. How can I get it to enter just the number, say "85"
instead of "85 <degree signF"? I know that is simple, but I'm still
learning.

I changed
Set rng = ws.Range("A2")
to something similar to Joel's post
RowCount = 1
Set rng = ws.Range("A" & RowCount)
RowCount = RowCount + 1
so that I can enter in a whole list of dates and have it go through and do
it for all the dates listed.

Of course, now my mind is going...

Any idea how I could have it look at the table, find the last date listed in
column A, enter the next date below it, and fill in all the dates up to
present day minus 1? Meaning from the last date until "yesterday". Of course
"yesterday" would depend on when it is run.

This way, we could run the script to fill in all our missing data

"Norie" wrote:

Jason

Something like this perhaps?

Private Sub MaxMinMeanTemps()

Dim IE As Object
Dim doc As Object
Dim tbls As Object
Dim tbl As Object
Dim tblData As Object
Dim tblRow As Object
Dim rwCells As Object
Dim ws As Worksheet
Dim rng As Range
Dim strDate As String
Dim strURL As String
Dim I As Long

Set ws = Worksheets(1)

Set rng = ws.Range("A2")

Set IE = CreateObject("InternetExplorer.Application")

IE.Visible = True

While rng.Value < ""

strDate = Year(rng.Value) & "/" & Month(rng.Value) & "/" &
Day(rng.Value)

strURL = "http://www.wunderground.com/history/airport/KNYC/" &
strDate & "/DailyHistory.html"

IE.Navigate strURL

Do While IE.Busy: DoEvents: Loop
Do While IE.ReadyState < 4: DoEvents: Loop

Set doc = IE.Document

Set tbls = doc.getElementsByTagName("TABLE")

For Each tbl In tbls
If tbl.className = "dataTable tm10" Then
Set tblData = tbl
Exit For
End If
Next

For I = 2 To 4

Set tblRow = tblData.Rows(I)

Set rwCells = tblRow.Cells

rng.Offset(, I - 1) = rwCells(1).innertext

Next I

Set rng = rng.Offset(1)

Wend

IE.Quit: Set IE = Nothing

End Sub

"Jason" wrote in message
...
Sorry if I was too ambiguous with my original post...

This HTML source is from the following webpage on temperature data.
http://www.wunderground.com/history/...tory.html?MR=1

In this case, it is for NYC on July 2, 2008 (from the ...NYC/2008/7/2/...
in
the address)
85 is the Max temperatue
76 is the mean
67 is the low temperature.

Essentially we want to setup a process to
1. look up the date in column A
2. go the website and parse the data for that date
3. enter the max in column B, low in column C, mean in column D
4. then repeat this process for the next date in the next row

Hopefully this helps.


"Jason" wrote:

I'm a newbie with VBA, but want to automate a procedure we are doing.

How would I write a function to look at the following HTML source:

'<td<a
href="/history/airport/KNYC/2008/7/2/DailyHistory.html"2</a</td
' <td class="bl gb"
' 85
'</td
' <td class="gb"
' 76
'</td
' <td class="br gb"
' 67
'</td

And put the 85 into cell B1, 76 into B2, 67 into B3?

Thanks!




Jason

Parse HMTL source to fill in cells?
 
Ron,

Thanks, I like it alot.

If you look at my reply to Norie. I setup a way to repeat it for a list of
dates, but could not do that with yours...I am sure I missing something
besides just changing c=A2 to:

RowCount = 1
Set c = Range("A" & RowCount)
RowCount = RowCount + 1

Also, if you notice from my reply to Norie, I now want to add a way to fill
in column A

Thanks!

Jason



"Ron Rosenfeld" wrote:

On Thu, 18 Jun 2009 06:52:00 -0700, Jason
wrote:

Sorry if I was too ambiguous with my original post...

This HTML source is from the following webpage on temperature data.
http://www.wunderground.com/history/...tory.html?MR=1

In this case, it is for NYC on July 2, 2008 (from the ...NYC/2008/7/2/... in
the address)
85 is the Max temperatue
76 is the mean
67 is the low temperature.

Essentially we want to setup a process to
1. look up the date in column A
2. go the website and parse the data for that date
3. enter the max in column B, low in column C, mean in column D
4. then repeat this process for the next date in the next row

Hopefully this helps.


Much more clear description of your requirements.

Perhaps the following Macro would help:

To enter this Macro (Sub), <alt-F11 opens the Visual Basic Editor.
Ensure your project is highlighted in the Project Explorer window.
Then, from the top menu, select Insert/Module and
paste the code below into the window that opens.

To use this Macro (Sub), <alt-F8 opens the macro dialog box. Select the macro
by name, and <RUN.

As written, the entry cell is set to A2, and the results are in B2, C2 and D2.
This can be easily changed.

==========================================
Option Explicit
Sub gettemps()
Dim c As Range
Dim myStr As String
Dim test As Variant
Dim Val As Variant
Dim IE As Object
Const sURL1 As String = "http://www.wunderground.com/history/airport/KNYC/"
Dim sURLdate As String
Const sURL2 As String = "/DailyHistory.html?MR=1"

Set c = Range("A2")

sURLdate = Format(c.Value2, "yyyy/m/d")

Application.Cursor = xlWait

Set IE = CreateObject("InternetExplorer.Application")
IE.Navigate sURL1 & sURLdate & sURL2
While IE.ReadyState < 4
DoEvents
Wend

myStr = IE.Document.body.innerhtml

c.Offset(0, 1).Value = RegexMid(myStr, "max")
c.Offset(0, 2).Value = RegexMid(myStr, "min")
c.Offset(0, 3).Value = RegexMid(myStr, "mean")

Application.Cursor = xlDefault
End Sub

Private Function RegexMid(s As String, sTempType As String) As String
Dim re As Object, mc As Object
Set re = CreateObject("vbscript.regexp")
re.IgnoreCase = True
re.MultiLine = True
re.Pattern = "\b" & sTempType & "\b\D+(\d+)"

If re.test(s) = True Then
Set mc = re.Execute(s)
RegexMid = mc(0).submatches(0)
End If
End Function
==================================
--ron


Ron Rosenfeld

Parse HMTL source to fill in cells?
 
On Thu, 18 Jun 2009 11:14:01 -0700, Jason
wrote:

Ron,

Thanks, I like it alot.

If you look at my reply to Norie. I setup a way to repeat it for a list of
dates, but could not do that with yours...I am sure I missing something
besides just changing c=A2 to:

RowCount = 1
Set c = Range("A" & RowCount)
RowCount = RowCount + 1

Also, if you notice from my reply to Norie, I now want to add a way to fill
in column A

Thanks!

Jason


It's relatively easy, but perhaps not intuitive, to extend my macro to handle a
range of cells:

=================================
Option Explicit
Sub gettemps()
Dim c As Range, rng As Range
Dim j As Long
Dim myStr As String
Dim test As Variant
Dim Val As Variant
Dim IE As Object
Const sURL1 As String = "http://www.wunderground.com/history/airport/KNYC/"
Dim sURLdate As String
Const sURL2 As String = "/DailyHistory.html?MR=1"

Application.Cursor = xlWait
Set IE = CreateObject("InternetExplorer.Application")

Set rng = Range("A2").CurrentRegion
Set rng = rng.Resize(rng.Rows.Count, 1)

For Each c In rng
sURLdate = Format(c.Value2, "yyyy/m/d")
IE.Navigate sURL1 & sURLdate & sURL2
While IE.ReadyState < 4
DoEvents
Wend
myStr = IE.Document.body.innerhtml

c.Offset(0, 1).Value = RegexMid(myStr, "max")
c.Offset(0, 2).Value = RegexMid(myStr, "min")
c.Offset(0, 3).Value = RegexMid(myStr, "mean")
j = j + 1
Next c

Application.Cursor = xlDefault
End Sub

Private Function RegexMid(s As String, sTempType As String) As String
Dim re As Object, mc As Object
Set re = CreateObject("vbscript.regexp")
re.IgnoreCase = True
re.MultiLine = True
re.Pattern = "\b" & sTempType & "\b\D+(\d+)"

If re.test(s) = True Then
Set mc = re.Execute(s)
RegexMid = mc(0).submatches(0)
End If
End Function
=========================================

But this just reads in each page one at a time. Whether it is worthwhile to do
the coding to try to download the custom range of dates from the site, and then
parse out the appropriate data, depends on how often you'll be downloading
multiple dates, and how many dates each time.
--ron

Jason

Parse HMTL source to fill in cells?
 
It works great!

However, once I saw it run, I tried to run it in an existing spreadsheet
where the first cell would be A3838 instead of A2 and that did not go too
well...

Right now, I'm just trying to speed it up, which you are helping with
immensely. I'm sure the next step would be to put all this into a database
and have it run automatically daily.

"Ron Rosenfeld" wrote:

On Thu, 18 Jun 2009 11:14:01 -0700, Jason
wrote:

Ron,

Thanks, I like it alot.

If you look at my reply to Norie. I setup a way to repeat it for a list of
dates, but could not do that with yours...I am sure I missing something
besides just changing c=A2 to:

RowCount = 1
Set c = Range("A" & RowCount)
RowCount = RowCount + 1

Also, if you notice from my reply to Norie, I now want to add a way to fill
in column A

Thanks!

Jason


It's relatively easy, but perhaps not intuitive, to extend my macro to handle a
range of cells:

=================================
Option Explicit
Sub gettemps()
Dim c As Range, rng As Range
Dim j As Long
Dim myStr As String
Dim test As Variant
Dim Val As Variant
Dim IE As Object
Const sURL1 As String = "http://www.wunderground.com/history/airport/KNYC/"
Dim sURLdate As String
Const sURL2 As String = "/DailyHistory.html?MR=1"

Application.Cursor = xlWait
Set IE = CreateObject("InternetExplorer.Application")

Set rng = Range("A2").CurrentRegion
Set rng = rng.Resize(rng.Rows.Count, 1)

For Each c In rng
sURLdate = Format(c.Value2, "yyyy/m/d")
IE.Navigate sURL1 & sURLdate & sURL2
While IE.ReadyState < 4
DoEvents
Wend
myStr = IE.Document.body.innerhtml

c.Offset(0, 1).Value = RegexMid(myStr, "max")
c.Offset(0, 2).Value = RegexMid(myStr, "min")
c.Offset(0, 3).Value = RegexMid(myStr, "mean")
j = j + 1
Next c

Application.Cursor = xlDefault
End Sub

Private Function RegexMid(s As String, sTempType As String) As String
Dim re As Object, mc As Object
Set re = CreateObject("vbscript.regexp")
re.IgnoreCase = True
re.MultiLine = True
re.Pattern = "\b" & sTempType & "\b\D+(\d+)"

If re.test(s) = True Then
Set mc = re.Execute(s)
RegexMid = mc(0).submatches(0)
End If
End Function
=========================================

But this just reads in each page one at a time. Whether it is worthwhile to do
the coding to try to download the custom range of dates from the site, and then
parse out the appropriate data, depends on how often you'll be downloading
multiple dates, and how many dates each time.
--ron


Norie

Parse HMTL source to fill in cells?
 
Jason

I'm not quite sure what you are asking and it seems actually be more to do
with inserting
dates into a worksheet.

The code I posted assumes you have dates starting in row 2, column A.

It goes down the column, gets the data, inserts it in the relevant columns
(B,C and D) and then
moves on.

It stops when a blank cell is found.

So in theory if you have all the dates you want, in whatever order it should
work.

If you need help filling in dates there are various ways to do that but they
really depend on what you've currently
got and what you want to end up with.

Perhaps a topic for another thread.

As for just getting the number, shouldn't be a problem - I'll see if I can
do something.

Would you like it in Fahrenheit or Celsius?:)

PS If you just need the number for calculations then you could probably do
that without
code, a simple formula should suffice.

"Jason" wrote in message
...
Norie,

Thanks, that did it. How can I get it to enter just the number, say "85"
instead of "85 <degree signF"? I know that is simple, but I'm still
learning.

I changed
Set rng = ws.Range("A2")
to something similar to Joel's post
RowCount = 1
Set rng = ws.Range("A" & RowCount)
RowCount = RowCount + 1
so that I can enter in a whole list of dates and have it go through and do
it for all the dates listed.

Of course, now my mind is going...

Any idea how I could have it look at the table, find the last date listed
in
column A, enter the next date below it, and fill in all the dates up to
present day minus 1? Meaning from the last date until "yesterday". Of
course
"yesterday" would depend on when it is run.

This way, we could run the script to fill in all our missing data

"Norie" wrote:

Jason

Something like this perhaps?

Private Sub MaxMinMeanTemps()

Dim IE As Object
Dim doc As Object
Dim tbls As Object
Dim tbl As Object
Dim tblData As Object
Dim tblRow As Object
Dim rwCells As Object
Dim ws As Worksheet
Dim rng As Range
Dim strDate As String
Dim strURL As String
Dim I As Long

Set ws = Worksheets(1)

Set rng = ws.Range("A2")

Set IE = CreateObject("InternetExplorer.Application")

IE.Visible = True

While rng.Value < ""

strDate = Year(rng.Value) & "/" & Month(rng.Value) & "/" &
Day(rng.Value)

strURL = "http://www.wunderground.com/history/airport/KNYC/" &
strDate & "/DailyHistory.html"

IE.Navigate strURL

Do While IE.Busy: DoEvents: Loop
Do While IE.ReadyState < 4: DoEvents: Loop

Set doc = IE.Document

Set tbls = doc.getElementsByTagName("TABLE")

For Each tbl In tbls
If tbl.className = "dataTable tm10" Then
Set tblData = tbl
Exit For
End If
Next

For I = 2 To 4

Set tblRow = tblData.Rows(I)

Set rwCells = tblRow.Cells

rng.Offset(, I - 1) = rwCells(1).innertext

Next I

Set rng = rng.Offset(1)

Wend

IE.Quit: Set IE = Nothing

End Sub

"Jason" wrote in message
...
Sorry if I was too ambiguous with my original post...

This HTML source is from the following webpage on temperature data.
http://www.wunderground.com/history/...tory.html?MR=1

In this case, it is for NYC on July 2, 2008 (from the
...NYC/2008/7/2/...
in
the address)
85 is the Max temperatue
76 is the mean
67 is the low temperature.

Essentially we want to setup a process to
1. look up the date in column A
2. go the website and parse the data for that date
3. enter the max in column B, low in column C, mean in column D
4. then repeat this process for the next date in the next row

Hopefully this helps.


"Jason" wrote:

I'm a newbie with VBA, but want to automate a procedure we are doing.

How would I write a function to look at the following HTML source:

'<td<a
href="/history/airport/KNYC/2008/7/2/DailyHistory.html"2</a</td
' <td class="bl gb"
' 85
'</td
' <td class="gb"
' 76
'</td
' <td class="br gb"
' 67
'</td

And put the 85 into cell B1, 76 into B2, 67 into B3?

Thanks!





Ron Rosenfeld

Parse HMTL source to fill in cells?
 
On Thu, 18 Jun 2009 13:17:01 -0700, Jason
wrote:

It works great!


Good to hear! Thanks for the feedback.


However, once I saw it run, I tried to run it in an existing spreadsheet
where the first cell would be A3838 instead of A2 and that did not go too
well...


You really need to be specific. And you will probably find that by doing so,
and by clearly elaborating your problem, a solution will become apparent.

I'm guessing that the "did not go too well" problem perhaps has to do with
repetitive calls to the web page? If that is the case, then you probably are
not properly setting rng. But with limited information, it's hard to advise
further.

Right now, I'm just trying to speed it up, which you are helping with
immensely.


I suppose by "it", you mean the process of entering these dates.


I'm sure the next step would be to put all this into a database
and have it run automatically daily.


If you are going to run it every day, then calling the web page as I've done
should be adequate. If you were going to run it weekly or monthly, then a call
to the web page that downloads a customizable series of dates might be better.
--ron

Jason

Parse HMTL source to fill in cells?
 
Ron,

I changed
Range("A2").CurrentRegion
to
Range("A3838").CurrentRegion

and it started putting temperatures in B1, C1, D1, and then on down instead
of B3838, C3838, D3838 and then down.

My last request, is more of a wish, and not an urgent need, but I want to
setup a sheet with a list of city airport codes, for example:
Sheet name "Airports"
In "Airports" Cells B1-B25 would contain a list of 25 airport codes in a
specific order
A modified version of your existing script would then
-take the airport code in cell B1, likely NYC as we have already done
-insert it into the website address for sURL1
http://www.wunderground.com/history/airport/k
-run your script, inserting the temps in B*,C*,D*
-then read the next airport from "Airports" B2, likely LAX
-modify the sURL1 to
-http://www.wunderground.com/history/airport/kLAX
-insert the temperatures into E*,F*,G*

Essentially column A is the date, then next 3 columns are the Max, Min, Mean
for City 1, then the next 3 columns are the Max, Min, Mean for City 2 and so
on...

I think what we have now is already leaps and bounds ahead of the manual
copying and pasting. So this will definitely work for us, if we need more I
am sure I will post those needs under a new thread.

However, I would like to know in your script how you got it to look at the
website and pull the data...if you wouldn't mind adding a couple comments to
your code that would great. This way, I can hopefully learn a little instead
of just copying and pasting your work.

Thank you immensely,

Jason

"Ron Rosenfeld" wrote:

On Thu, 18 Jun 2009 13:17:01 -0700, Jason
wrote:

It works great!


Good to hear! Thanks for the feedback.


However, once I saw it run, I tried to run it in an existing spreadsheet
where the first cell would be A3838 instead of A2 and that did not go too
well...


You really need to be specific. And you will probably find that by doing so,
and by clearly elaborating your problem, a solution will become apparent.

I'm guessing that the "did not go too well" problem perhaps has to do with
repetitive calls to the web page? If that is the case, then you probably are
not properly setting rng. But with limited information, it's hard to advise
further.

Right now, I'm just trying to speed it up, which you are helping with
immensely.


I suppose by "it", you mean the process of entering these dates.


I'm sure the next step would be to put all this into a database
and have it run automatically daily.


If you are going to run it every day, then calling the web page as I've done
should be adequate. If you were going to run it weekly or monthly, then a call
to the web page that downloads a customizable series of dates might be better.
--ron


Ron Rosenfeld

Parse HMTL source to fill in cells?
 
On Fri, 19 Jun 2009 07:14:02 -0700, Jason
wrote:

Ron,

I changed
Range("A2").CurrentRegion
to
Range("A3838").CurrentRegion

and it started putting temperatures in B1, C1, D1, and then on down instead
of B3838, C3838, D3838 and then down.


And what, exactly, is the range that results from that command? Take a look at
HELP for the CurrentRegion property and you may be able to figure out for
yourself what's going on and how to fix it. Or else you'll have to provide
more details of your setup.


My last request, is more of a wish, and not an urgent need, but I want to
setup a sheet with a list of city airport codes, for example:
Sheet name "Airports"
In "Airports" Cells B1-B25 would contain a list of 25 airport codes in a
specific order
A modified version of your existing script would then
-take the airport code in cell B1, likely NYC as we have already done
-insert it into the website address for sURL1
http://www.wunderground.com/history/airport/k
-run your script, inserting the temps in B*,C*,D*
-then read the next airport from "Airports" B2, likely LAX
-modify the sURL1 to
-http://www.wunderground.com/history/airport/kLAX
-insert the temperatures into E*,F*,G*

Essentially column A is the date, then next 3 columns are the Max, Min, Mean
for City 1, then the next 3 columns are the Max, Min, Mean for City 2 and so
on...


That should be easy to do. But perhaps that is something you'd rather work on?
I know I learn a lot by doing...



I think what we have now is already leaps and bounds ahead of the manual
copying and pasting. So this will definitely work for us, if we need more I
am sure I will post those needs under a new thread.

However, I would like to know in your script how you got it to look at the
website and pull the data...if you wouldn't mind adding a couple comments to
your code that would great. This way, I can hopefully learn a little instead
of just copying and pasting your work.


That is a matter of knowing a little (unfortunately too little) about the
InternetExplorer object (see

http://msdn.microsoft.com/en-us/libr...84(VS.85).aspx

Constructing the proper URL to download the web page
This was done by examining the URL when I had navigated to the web page
in question.

Examining the HTML source to try to figure out an unambiguous way of defining
the data I wanted to obtain, using the VBScript Regular Expression engine.

The URL parts all have "URL" as part of their name.

The Regular Expression pattern parts all have "Pat" in their name.


Hope this helps.




Thank you immensely,

Jason

"Ron Rosenfeld" wrote:

On Thu, 18 Jun 2009 13:17:01 -0700, Jason
wrote:

It works great!


Good to hear! Thanks for the feedback.


However, once I saw it run, I tried to run it in an existing spreadsheet
where the first cell would be A3838 instead of A2 and that did not go too
well...


You really need to be specific. And you will probably find that by doing so,
and by clearly elaborating your problem, a solution will become apparent.

I'm guessing that the "did not go too well" problem perhaps has to do with
repetitive calls to the web page? If that is the case, then you probably are
not properly setting rng. But with limited information, it's hard to advise
further.

Right now, I'm just trying to speed it up, which you are helping with
immensely.


I suppose by "it", you mean the process of entering these dates.


I'm sure the next step would be to put all this into a database
and have it run automatically daily.


If you are going to run it every day, then calling the web page as I've done
should be adequate. If you were going to run it weekly or monthly, then a call
to the web page that downloads a customizable series of dates might be better.
--ron

--ron

Patrick Mc

Parse HMTL source to fill in cells?
 




You want to extract some numbers from a web page. The web page is
http://www.wunderground.com/history/...DailyHistory.h
tml?MR=1

In this case, it is for NYC on July 2, 2008 (from the
...NYC/2008/7/2/... in
the address)

You want to extract:

85 is the Max temperatue
76 is the mean
67 is the low temperature.


Here is a quick script in biterscripting.

# Script WeatherHistory.txt
var str html ; cat
"http://www.wunderground.com/history/airport/KNYC/2008/7/2/DailyHistory.
html?MR=1" $html
while ( { sen -c "^<span class=\"b\"^" $html } 0 )
do
stex -c "^<span class=\"b\"^]" $html null
stex -c "]^</span^" $html
done


Script is in biterscripting. I tested this script. It works. (The
markers are <span class="b"...</span.) To try,

1. Save the above script as C:\WeatherHistory.txt.

2. Download biterscripting from http://www.biterscripting.com.

3. Start biterscripting. Enter the following command.

script "C:\WeatherHistory.txt"

You will get the numbers you want. I am seeing the following.

76
75
85
..
..
..



Email me if you have questions.

Patrick




*** Sent via Developersdex http://www.developersdex.com ***

Jason

Parse HMTL source to fill in cells?
 

Ron,

Just wanted to say thanks for your help. I had to leave town for a few
days, but will work on this and hopefully get the results I need.

Thanks!

"Ron Rosenfeld" wrote:

On Fri, 19 Jun 2009 07:14:02 -0700, Jason
wrote:

Ron,

I changed
Range("A2").CurrentRegion
to
Range("A3838").CurrentRegion

and it started putting temperatures in B1, C1, D1, and then on down instead
of B3838, C3838, D3838 and then down.


And what, exactly, is the range that results from that command? Take a look at
HELP for the CurrentRegion property and you may be able to figure out for
yourself what's going on and how to fix it. Or else you'll have to provide
more details of your setup.


My last request, is more of a wish, and not an urgent need, but I want to
setup a sheet with a list of city airport codes, for example:
Sheet name "Airports"
In "Airports" Cells B1-B25 would contain a list of 25 airport codes in a
specific order
A modified version of your existing script would then
-take the airport code in cell B1, likely NYC as we have already done
-insert it into the website address for sURL1
http://www.wunderground.com/history/airport/k
-run your script, inserting the temps in B*,C*,D*
-then read the next airport from "Airports" B2, likely LAX
-modify the sURL1 to
-http://www.wunderground.com/history/airport/kLAX
-insert the temperatures into E*,F*,G*

Essentially column A is the date, then next 3 columns are the Max, Min, Mean
for City 1, then the next 3 columns are the Max, Min, Mean for City 2 and so
on...


That should be easy to do. But perhaps that is something you'd rather work on?
I know I learn a lot by doing...



I think what we have now is already leaps and bounds ahead of the manual
copying and pasting. So this will definitely work for us, if we need more I
am sure I will post those needs under a new thread.

However, I would like to know in your script how you got it to look at the
website and pull the data...if you wouldn't mind adding a couple comments to
your code that would great. This way, I can hopefully learn a little instead
of just copying and pasting your work.


That is a matter of knowing a little (unfortunately too little) about the
InternetExplorer object (see

http://msdn.microsoft.com/en-us/libr...84(VS.85).aspx

Constructing the proper URL to download the web page
This was done by examining the URL when I had navigated to the web page
in question.

Examining the HTML source to try to figure out an unambiguous way of defining
the data I wanted to obtain, using the VBScript Regular Expression engine.

The URL parts all have "URL" as part of their name.

The Regular Expression pattern parts all have "Pat" in their name.


Hope this helps.




Thank you immensely,

Jason

"Ron Rosenfeld" wrote:

On Thu, 18 Jun 2009 13:17:01 -0700, Jason
wrote:

It works great!

Good to hear! Thanks for the feedback.


However, once I saw it run, I tried to run it in an existing spreadsheet
where the first cell would be A3838 instead of A2 and that did not go too
well...

You really need to be specific. And you will probably find that by doing so,
and by clearly elaborating your problem, a solution will become apparent.

I'm guessing that the "did not go too well" problem perhaps has to do with
repetitive calls to the web page? If that is the case, then you probably are
not properly setting rng. But with limited information, it's hard to advise
further.

Right now, I'm just trying to speed it up, which you are helping with
immensely.

I suppose by "it", you mean the process of entering these dates.


I'm sure the next step would be to put all this into a database
and have it run automatically daily.

If you are going to run it every day, then calling the web page as I've done
should be adequate. If you were going to run it weekly or monthly, then a call
to the web page that downloads a customizable series of dates might be better.
--ron

--ron


Ron Rosenfeld

Parse HMTL source to fill in cells?
 
On Thu, 25 Jun 2009 10:43:02 -0700, Jason
wrote:

Ron,

Just wanted to say thanks for your help. I had to leave town for a few
days, but will work on this and hopefully get the results I need.

Thanks!


You're very welcome. Thanks for the feedback. Let me know what you need.
--ron

-----------------------------------------------------------------------------
Our Peering Groups change
Visit : http://spacesst.com/peerin

Jason

Parse HMTL source to fill in cells?
 
Ron,

Okay, I'm back...

I was able to actually take your code and reference the sheet with the 25
airport codes and add the data for each code in 3 columns.

Of course, then I got greedy and thought I wanted something faster. So I
tried taking your code and getting it to work for the Custom Date Range so
that data could be entered in much quicker.

I'm still learning, so I just want to have it work for 1 city and a hardset
date range. I think I can get it to repeat for the different airports later.

For this example, I am trying to run for Houston from 9/1/2008 to 6/28/2009.
I have setup a series of dates in column A of my Excel spreadsheet,
beginning with 9/1/2008 in A1 and daily increments down. I have claimed a
small victory in that when run, it does go the site and grab the correct max,
min and mean from the list.

The problem is that it enters the max, min, and mean for 9/1/2008 as the
temperatures for every day following. Hopefully it is something simple, that
will allow it to move down to the next date and grab that temperature.

Hopefully you can understand my explanation, if not, you can simply enter a
list of dates from 9/1/2008 to 9/10/2008 in A1:A10 and run the code below and
see the repeat.

As always thanks!


Option Explicit
Sub getCityTemps()
Dim AirCode As Range, ACrng As Range
Dim c As Range, rng As Range
Dim j As Long
Dim sURLairport As String
Dim myStr As String
Dim test As Variant
Dim Val As Variant
Dim IE As Object

'Set ACrng = Sheets("City_Airport").Range("B2:B26")
'For Each AirCode In ACrng
' sURLairport = AirCode
Const sURL1 As String =
"http://www.wunderground.com/history/airport/KHOU/2008/9/1/CustomHistory.html?dayend=28&monthend=6&yearend=20 09&req_city=NA&req_state=NA&req_statename=NA"
'Const sURL2 As String = "/"
Dim sURLdate As String
'Const sURL3 As String = "/DailyHistory.html?MR=1"

Application.Cursor = xlWait
Set IE = CreateObject("InternetExplorer.Application")


IE.Navigate sURL1
While IE.ReadyState < 4
DoEvents
Wend
myStr = IE.Document.body.innerhtml

Set rng = Range("A2").CurrentRegion
Set rng = rng.Resize(rng.Rows.Count, 1)
For Each c In rng
sURLdate = Format(c.Value2, "yyyy/m/d")
c.Offset(0, 1).Value = RegexMid(myStr, "bl gb")
c.Offset(0, 2).Value = RegexMid(myStr, "br gb")
c.Offset(0, 3).Value = RegexMid(myStr, "class=gb")
j = j + 1
Next c



'i = i + 3

'Next AirCode

Application.Cursor = xlDefault
End Sub

Private Function RegexMid(s As String, sTempType As String) As String
Dim re As Object, mc As Object
Set re = CreateObject("vbscript.regexp")
re.IgnoreCase = True
re.MultiLine = True
re.Pattern = "\b" & sTempType & "\b\D+(\d+)"

If re.test(s) = True Then
Set mc = re.Execute(s)
RegexMid = mc(0).submatches(0)
End If
End Function




"Ron Rosenfeld" wrote:

On Thu, 25 Jun 2009 10:43:02 -0700, Jason
wrote:

Ron,

Just wanted to say thanks for your help. I had to leave town for a few
days, but will work on this and hopefully get the results I need.

Thanks!


You're very welcome. Thanks for the feedback. Let me know what you need.
--ron


-----------------------------------------------------------------------------
Less Spam Better enjoyable experience
Visit : news://spacesst.com

Ron Rosenfeld

Parse HMTL source to fill in cells?
 
On Mon, 29 Jun 2009 15:17:03 -0700, Jason
wrote:

Of course, then I got greedy and thought I wanted something faster. So I
tried taking your code and getting it to work for the Custom Date Range so
that data could be entered in much quicker.


I looked at that a bit when I was coming up with my original recommendations.
But I did not pursue the Custom Date Range approach. I don't know when I'll
have a chance to look at that in depth, though. This may be a busy week for
me, but if I have time ...

If I recall correctly, one problem is that the identifiers you are using:

"bl gb"
"br gb"
"class=gb"

may not uniquely identify that which you are looking for.
--ron

-----------------------------------------------------------------------------
Less Spam Better enjoyable experience
Visit : news://spacesst.com

Jason

Parse HMTL source to fill in cells?
 

Ron,

No worries, thanks!

Yeah, you are right about the un-unique identifiers. I am trying to come up
with a way to iterate the code below for each date then look for those
identifiers. I will take a look at the original link you sent a while back
on the IE Object.

<td<a href="/history/airport/KHOU/2008/9/1/DailyHistory.html"1</a</td
<td class="bl gb"
91
</td
<td class="gb"
84
</td
<td class="br gb"
76
</td

"Ron Rosenfeld" wrote:

On Mon, 29 Jun 2009 15:17:03 -0700, Jason
wrote:

Of course, then I got greedy and thought I wanted something faster. So I
tried taking your code and getting it to work for the Custom Date Range so
that data could be entered in much quicker.


I looked at that a bit when I was coming up with my original recommendations.
But I did not pursue the Custom Date Range approach. I don't know when I'll
have a chance to look at that in depth, though. This may be a busy week for
me, but if I have time ...

If I recall correctly, one problem is that the identifiers you are using:

"bl gb"
"br gb"
"class=gb"

may not uniquely identify that which you are looking for.
--ron


Ron Rosenfeld

Parse HMTL source to fill in cells?
 

On Tue, 30 Jun 2009 09:19:01 -0700, Jason
wrote:

Ron,

No worries, thanks!

Yeah, you are right about the un-unique identifiers. I am trying to come up
with a way to iterate the code below for each date then look for those
identifiers. I will take a look at the original link you sent a while back
on the IE Object.

<td<a href="/history/airport/KHOU/2008/9/1/DailyHistory.html"1</a</td
<td class="bl gb"
91
</td
<td class="gb"
84
</td
<td class="br gb"
76
</td


At least in a single test, I believe the following modifications will uniquely
identify the required segments. I suspect there are easier ways to do this,
but ...

Note that I changed the pattern; I also changed the RegexMid function and added
some "clean-up" to both the main Sub and the private Sub.

(If you don't explicitly quit IE, you wind up with multiple IE processes
running; and it will eventually crash. In some other program, the limit was
about sixteen).

Anyway try this:

====================================
Option Explicit
Sub getCityTemps()
Dim AirCode As Range, ACrng As Range
Dim c As Range, rng As Range
Dim j As Long
Dim sURLairport As String
Dim myStr As String
Dim test As Variant
Dim Val As Variant
Dim IE As Object

'Set ACrng = Sheets("City_Airport").Range("B2:B26")
'For Each AirCode In ACrng
' sURLairport = AirCode
Const sURL1 As String = _
"http://www.wunderground.com/history/airport/KHOU/2008/9/1/CustomHistory.html?dayend=28&monthend=6&yearend=20 09&req_city=NA&req_state=NA&req_statename=NA"
'Const sURL2 As String = "/"
Dim sURLdate As String
'Const sURL3 As String = "/DailyHistory.html?MR=1"

Application.Cursor = xlWait
Set IE = CreateObject("InternetExplorer.Application")

IE.Navigate sURL1
While IE.ReadyState < 4
DoEvents
Wend
myStr = IE.Document.body.innerhtml

Set rng = Range("A2").CurrentRegion
Set rng = rng.Resize(rng.Rows.Count, 1)

For Each c In rng
sURLdate = Format(c.Value2, "yyyy/m/d")
c.Offset(0, 1).Value = RegexMid(myStr, sURLdate, "bl gb")
c.Offset(0, 2).Value = RegexMid(myStr, sURLdate, "br gb")
c.Offset(0, 3).Value = RegexMid(myStr, sURLdate, "class=gb")
Next c


'i = i + 3

'Next AirCode
IE.Quit
Set IE = Nothing
Application.Cursor = xlDefault
End Sub
Private Function RegexMid(s As String, sDate As String, sTempType As String) _
As String
Dim re As Object, mc As Object
Set re = CreateObject("vbscript.regexp")
re.IgnoreCase = True
re.MultiLine = True
re.Global = True
re.Pattern = "\b" & sDate & "/DailyHistory[\s\S]+?" & sTempType _
& "\D+(\d+)"

If re.test(s) = True Then
Set mc = re.Execute(s)
RegexMid = mc(0).submatches(0)
End If
Set re = Nothing
End Function
====================================
--ron

Jason

Parse HMTL source to fill in cells?
 

Ron,

Worked like a champ! I was able to edit to include referencing the airport
codes and get it to run for all the cities needed.

I am still a bit confused on why I cannot get it to run on a worksheet that
say, already has a couple hundred days of information listed. It will work
when I want it to start with Row 1 and then go down, but if I wanted to start
at Row 250 and go down it will not work. I have tried adjust rng and
researching CurrentRegion, but keep getting stuck.

Don't worry too much about this since it is not a big issue, just more of
curiosity on my part. The existing code works great, since we can just run it
in a blank sheet and then copy and and paste it into our existing data.

THANKS!


"Ron Rosenfeld" wrote:

On Tue, 30 Jun 2009 09:19:01 -0700, Jason
wrote:

Ron,

No worries, thanks!

Yeah, you are right about the un-unique identifiers. I am trying to come up
with a way to iterate the code below for each date then look for those
identifiers. I will take a look at the original link you sent a while back
on the IE Object.

<td<a href="/history/airport/KHOU/2008/9/1/DailyHistory.html"1</a</td
<td class="bl gb"
91
</td
<td class="gb"
84
</td
<td class="br gb"
76
</td


At least in a single test, I believe the following modifications will uniquely
identify the required segments. I suspect there are easier ways to do this,
but ...

Note that I changed the pattern; I also changed the RegexMid function and added
some "clean-up" to both the main Sub and the private Sub.

(If you don't explicitly quit IE, you wind up with multiple IE processes
running; and it will eventually crash. In some other program, the limit was
about sixteen).

Anyway try this:

====================================
Option Explicit
Sub getCityTemps()
Dim AirCode As Range, ACrng As Range
Dim c As Range, rng As Range
Dim j As Long
Dim sURLairport As String
Dim myStr As String
Dim test As Variant
Dim Val As Variant
Dim IE As Object

'Set ACrng = Sheets("City_Airport").Range("B2:B26")
'For Each AirCode In ACrng
' sURLairport = AirCode
Const sURL1 As String = _
"http://www.wunderground.com/history/airport/KHOU/2008/9/1/CustomHistory.html?dayend=28&monthend=6&yearend=20 09&req_city=NA&req_state=NA&req_statename=NA"
'Const sURL2 As String = "/"
Dim sURLdate As String
'Const sURL3 As String = "/DailyHistory.html?MR=1"

Application.Cursor = xlWait
Set IE = CreateObject("InternetExplorer.Application")

IE.Navigate sURL1
While IE.ReadyState < 4
DoEvents
Wend
myStr = IE.Document.body.innerhtml

Set rng = Range("A2").CurrentRegion
Set rng = rng.Resize(rng.Rows.Count, 1)

For Each c In rng
sURLdate = Format(c.Value2, "yyyy/m/d")
c.Offset(0, 1).Value = RegexMid(myStr, sURLdate, "bl gb")
c.Offset(0, 2).Value = RegexMid(myStr, sURLdate, "br gb")
c.Offset(0, 3).Value = RegexMid(myStr, sURLdate, "class=gb")
Next c


'i = i + 3

'Next AirCode
IE.Quit
Set IE = Nothing
Application.Cursor = xlDefault
End Sub
Private Function RegexMid(s As String, sDate As String, sTempType As String) _
As String
Dim re As Object, mc As Object
Set re = CreateObject("vbscript.regexp")
re.IgnoreCase = True
re.MultiLine = True
re.Global = True
re.Pattern = "\b" & sDate & "/DailyHistory[\s\S]+?" & sTempType _
& "\D+(\d+)"

If re.test(s) = True Then
Set mc = re.Execute(s)
RegexMid = mc(0).submatches(0)
End If
Set re = Nothing
End Function
====================================
--ron


Ron Rosenfeld

Parse HMTL source to fill in cells?
 

On Tue, 30 Jun 2009 12:27:00 -0700, Jason
wrote:

Ron,

Worked like a champ! I was able to edit to include referencing the airport
codes and get it to run for all the cities needed.

I am still a bit confused on why I cannot get it to run on a worksheet that
say, already has a couple hundred days of information listed. It will work
when I want it to start with Row 1 and then go down, but if I wanted to start
at Row 250 and go down it will not work. I have tried adjust rng and
researching CurrentRegion, but keep getting stuck.

Don't worry too much about this since it is not a big issue, just more of
curiosity on my part. The existing code works great, since we can just run it
in a blank sheet and then copy and and paste it into our existing data.

THANKS!


You're very welcome. Glad to help. Thanks for the feedback.

Hopefully, the HTML code won't change in such a way as to break your routine.

I suspect your problem is here, or whatever is the equivalent code for your
real data.

-------------------
Set rng = Range("A2").CurrentRegion
Set rng = rng.Resize(rng.Rows.Count, 1)

For Each c In rng
---------------------

Something like the following would look for the first row that is not
completely filled in, and set rg = to start at that row, and continue down as
far as you have entered dates.

For example, if you have dates entered A2:A30; and data in B2:D11, then rg
would be set to A12:A30.

This routine assumes that there is nothing in Column E or below row 30 that
might extend CurrentRegion.

You'd have to set this up for your own data range. You also need to make sure
that the initial setting of rg is only where your data might be. But there are
a variety of ways to show that.

============================
Option Explicit
Sub SetRange()
Dim rg As Range, c As Range
Dim StartRow As Long
Set rg = Range("A2").CurrentRegion
'look for last filled in row
Set c = rg.SpecialCells(xlCellTypeBlanks).Areas(1)
Set c = c.Resize(1, 1)

StartRow = c.Row - rg.Row

Set rg = rg.Offset(rowoffset:=StartRow).Resize( _
rowsize:=rg.Rows.Count - StartRow, columnsize:=1)

Debug.Print rg.Address

End Sub
===============================
--ron

Jason

Parse HMTL source to fill in cells?
 

I guess this wear my newness comes in. Am I supposed to run this sub, then
run the original code or add this in?

I've tried running this new sub then running the original code, but it seems
to clear out the first 3 columns. Again, I know we are close and this is not
major, but it is more of a learning thing now. It's not urgent, but if you
would prefer me to just email you the spreadsheet, I can do that as well if
that would be quicker.

Ron Rosenfeld

Parse HMTL source to fill in cells?
 

On Tue, 30 Jun 2009 20:30:01 -0700, Jason
wrote:

I guess this wear my newness comes in. Am I supposed to run this sub, then
run the original code or add this in?

I've tried running this new sub then running the original code, but it seems
to clear out the first 3 columns. Again, I know we are close and this is not
major, but it is more of a learning thing now. It's not urgent, but if you
would prefer me to just email you the spreadsheet, I can do that as well if
that would be quicker.


It's just an example of code you can use to set rg to the range you want to
process. You would include it in your own code, modified to appropriately
select your range. You will need to study and understand what it is doing,
though, in order to modify it.
--ron

Jason

Parse HMTL source to fill in cells?
 

Okay, I'll tinker with the range and hopefully get it to work. I believe the
problem lies in some blank cells above the data where the header and title
information are located.

I've already managed to use the code to get ideas for other tasks and had
one final question in regards to Reg Exp syntax.

I modified the Pattern you included in the original code to grab a number
from a different site.

re.Pattern = "\b" & "nbsp;" & sYear & "[\s\S]+?" & sMonthTotal & "\D+(\d+)"

Where the sMonthTotal is a monthly total number

However the number includes commas and so the modified pattern only grabs
the numbers before the first comma.
Example:
-The number is 13,456,876 then the next line is 457,754
-The code grabs 13 and 457 for the next one

Thanks again! I am working with it and have no doubt I will benefit greatly
from your expertise.

Jason

Parse HMTL source to fill in cells?
 

Ron,

Okay I got the range to accurately work. The problem was the first 2 rows
had merged cells in them. If I delete the merged cells it works fine. Which
I can make work.

Is there a way to get the search for the first blank row to ignore merged
cells or not start the search until Row 4? I tried changing the Range"A2" to
Range "A4", Range "A10" etc, but no luck.

Thanks!

Jason

Ron Rosenfeld

Parse HMTL source to fill in cells?
 

On Wed, 1 Jul 2009 11:41:01 -0700, Jason
wrote:

Ron,

Okay I got the range to accurately work. The problem was the first 2 rows
had merged cells in them. If I delete the merged cells it works fine. Which
I can make work.

Is there a way to get the search for the first blank row to ignore merged
cells or not start the search until Row 4? I tried changing the Range"A2" to
Range "A4", Range "A10" etc, but no luck.

Thanks!

Jason



I'll look into that. Merged cells, in general are a PITA.
--ron

Ron Rosenfeld

Parse HMTL source to fill in cells?
 

On Wed, 1 Jul 2009 07:58:14 -0700, Jason
wrote:

Okay, I'll tinker with the range and hopefully get it to work. I believe the
problem lies in some blank cells above the data where the header and title
information are located.

I've already managed to use the code to get ideas for other tasks and had
one final question in regards to Reg Exp syntax.

I modified the Pattern you included in the original code to grab a number
from a different site.

re.Pattern = "\b" & "nbsp;" & sYear & "[\s\S]+?" & sMonthTotal & "\D+(\d+)"

Where the sMonthTotal is a monthly total number

However the number includes commas and so the modified pattern only grabs
the numbers before the first comma.
Example:
-The number is 13,456,876 then the next line is 457,754
-The code grabs 13 and 457 for the next one

Thanks again! I am working with it and have no doubt I will benefit greatly
from your expertise.


I have not tested this, but you might try using "[\d,]+" in place of the "\d".
It is not as robust as a regex that looks for proper comma placement, but after
an adult beverage, it might be good enough.
--ron

Ron Rosenfeld

Parse HMTL source to fill in cells?
 

On Wed, 1 Jul 2009 11:41:01 -0700, Jason
wrote:

Ron,

Okay I got the range to accurately work. The problem was the first 2 rows
had merged cells in them. If I delete the merged cells it works fine. Which
I can make work.

Is there a way to get the search for the first blank row to ignore merged
cells or not start the search until Row 4? I tried changing the Range"A2" to
Range "A4", Range "A10" etc, but no luck.

Thanks!

Jason


Here's one way to start at Row 4, even if there is confusing stuff in rows 1-3:

==========================
Sub SetRange()
Dim rg As Range, c As Range
Const FirstValidRow As Long = 4
Dim StartRow As Long
Set rg = Range("A2").CurrentRegion
Set rg = rg.Resize(rowsize:=rg.Rows.Count - _
FirstValidRow + rg.Row)
Set rg = rg.Offset(rowoffset:=FirstValidRow - rg.Row)
'look for last filled in row
Set c = rg.SpecialCells(xlCellTypeBlanks).Areas(1)
Set c = c.Resize(1, 1)

StartRow = c.Row - rg.Row

Set rg = rg.Offset(rowoffset:=StartRow).Resize( _
rowsize:=rg.Rows.Count - StartRow, columnsize:=1)

Debug.Print rg.Address

End Sub
==============================

So far as ignoring merged cells, and whether that would be effective, it could
be done. How to do it would depend on your precise layout. But there are VBA
properties and methods available to deal with merged cells.
--ron


All times are GMT +1. The time now is 11:22 AM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
ExcelBanter.com