Home |
Search |
Today's Posts |
|
#1
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]() Okay so the title is stupid, but I don't know how to explain in just a few words my issue. We have a range of dates in Column A Jan 1973 Feb 1973 .... Apr 2009 I am trying to pull numbers from a website that correspond to totals for each month, but the website ( http://tonto.eia.doe.gov/dnav/ng/hist/n9132cn2m.htm) only lists the years, and then each line below it corresponds to a total for each month in that year. Here is an example of the source for year 1973: <tr <td class='B4' 1973</td <td class='B3'60</td <td class='B3'384</td <td class='B3'1,167</td <td class='B3'931</td <td class='B3'1,670</td <td class='B3'1,598</td <td class='B3'1,758</td <td class='B3'1,829</td <td class='B3'1,022</td <td class='B3'1,465</td <td class='B3'1,483</td <td class='B3'1,456</td </tr I used the following code to test getting the January total for each year, but how can I set my loop to skip down 12 cells to the next year? Currently it is putting the January value in each cell for each year Set rng = Range("A2").CurrentRegion Set rng = rng.Resize(rng.Rows.Count, 1) For Each c In rng sURLdate = Format(c.Value2, "yyyy") c.Offset(0, 1).Value = RegexMid(myStr, sURLdate, "class=b3") |
#2
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]() I can't tell if this is what you want or not, but the code takes all the data from that site and sweeps it into two columns. Try it and you will see: Sub settupp() Set s1 = Sheets("Sheet1") Set s2 = Sheets("Sheet2") s1.Activate n = Cells(Rows.Count, 1).End(xlUp).Row k = 1 For i = 1 To n namee = Cells(i, 1).Value For j = 1 To 13 numberr = Cells(i, j).Value s2.Cells(k, 1) = namee s2.Cells(k, 2) = numberr k = k + 1 Next Next End Sub Also, just for fun, here is a recorded macro that imports the data from that site: Sub Macro1() With ActiveSheet.QueryTables.Add(Connection:= _ "URL;http://tonto.eia.doe.gov/dnav/ng/hist/n9132cn2m.htm", Destination:=Range _ ("A1")) .Name = "n9132cn2m" .FieldNames = True .RowNumbers = False .FillAdjacentFormulas = False .PreserveFormatting = True .RefreshOnFileOpen = False .BackgroundQuery = True .RefreshStyle = xlInsertDeleteCells .SavePassword = False .SaveData = True .AdjustColumnWidth = True .RefreshPeriod = 0 .WebSelectionType = xlSpecifiedTables .WebFormatting = xlWebFormattingNone .WebTables = "10" .WebPreFormattedTextToColumns = True .WebConsecutiveDelimitersAsOne = True .WebSingleBlockTextImport = False .WebDisableDateRecognition = False .WebDisableRedirections = False .Refresh BackgroundQuery:=False End With End Sub You probably know that already............. HTH, Ryan--- -- Ryan--- If this information was helpful, please indicate this by clicking ''Yes''. "Jason" wrote: Okay so the title is stupid, but I don't know how to explain in just a few words my issue. We have a range of dates in Column A Jan 1973 Feb 1973 ... Apr 2009 I am trying to pull numbers from a website that correspond to totals for each month, but the website ( http://tonto.eia.doe.gov/dnav/ng/hist/n9132cn2m.htm) only lists the years, and then each line below it corresponds to a total for each month in that year. Here is an example of the source for year 1973: <tr <td class='B4' 1973</td <td class='B3'60</td <td class='B3'384</td <td class='B3'1,167</td <td class='B3'931</td <td class='B3'1,670</td <td class='B3'1,598</td <td class='B3'1,758</td <td class='B3'1,829</td <td class='B3'1,022</td <td class='B3'1,465</td <td class='B3'1,483</td <td class='B3'1,456</td </tr I used the following code to test getting the January total for each year, but how can I set my loop to skip down 12 cells to the next year? Currently it is putting the January value in each cell for each year Set rng = Range("A2").CurrentRegion Set rng = rng.Resize(rng.Rows.Count, 1) For Each c In rng sURLdate = Format(c.Value2, "yyyy") c.Offset(0, 1).Value = RegexMid(myStr, sURLdate, "class=b3") |
#3
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]() Maybe... Dim FirstRow as long dim LastRow as long dim iRow as long dim wks as worksheet set wks = activesheet with wks firstrow = 2 'A2 lastrow = .cells(.rows.count,"A").end(xlup).row for irow = firstrow to lastrow step 12 'sURLdate = Format(c.Value2, "yyyy") sURLdate = Format(.cells(irow,"A").Value2, "yyyy") 'I would have used (if I could use a real number sURLdate = year(.cells(irow,"A").value) .... next irow end with Jason wrote: Okay so the title is stupid, but I don't know how to explain in just a few words my issue. We have a range of dates in Column A Jan 1973 Feb 1973 ... Apr 2009 I am trying to pull numbers from a website that correspond to totals for each month, but the website ( http://tonto.eia.doe.gov/dnav/ng/hist/n9132cn2m.htm) only lists the years, and then each line below it corresponds to a total for each month in that year. Here is an example of the source for year 1973: <tr <td class='B4' 1973</td <td class='B3'60</td <td class='B3'384</td <td class='B3'1,167</td <td class='B3'931</td <td class='B3'1,670</td <td class='B3'1,598</td <td class='B3'1,758</td <td class='B3'1,829</td <td class='B3'1,022</td <td class='B3'1,465</td <td class='B3'1,483</td <td class='B3'1,456</td </tr I used the following code to test getting the January total for each year, but how can I set my loop to skip down 12 cells to the next year? Currently it is putting the January value in each cell for each year Set rng = Range("A2").CurrentRegion Set rng = rng.Resize(rng.Rows.Count, 1) For Each c In rng sURLdate = Format(c.Value2, "yyyy") c.Offset(0, 1).Value = RegexMid(myStr, sURLdate, "class=b3") -- Dave Peterson |
#4
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
It isn't entirely clear from your code snippet how the text is placed in the
myStr variable nor whether it contains the entire html table from the website or if it contains a line-by-line "read ins" of that table. The following code (not RegEx) assumes the **entire** HTML table has been saved to a file on the hard drive; it then reads that entire file in, parses the data and place it on the active worksheet with Column A containing the years and Columns B through M containing January through December's values for that year appearing in Column A. I have marked the code that reads the entire file into the myStr variable in case you are using a different method to fill the myStr variable with the **entire** HTML table of data and need to change that part of my code. If you keep my code as posted, you will need to change the example location of the data file and filename that I used in the Open statement. Sub DistributeYearData() Dim Rng As Range Dim X As Long Dim Z As Long Dim FileNum As Long Dim myStr As String Dim MonthData() As String Dim YearParts() As String ' Read in entire file all at once ' {change file location in line below} FileNum = FreeFile Open "d:\temp\WebYearData.txt" For Binary As #FileNum myStr = Space(LOF(FileNum)) Get #FileNum, , myStr Close #FileNum ' Entire file now resides in the myStr variable Set Rng = Range("A2") YearParts = Split(myStr, "<td class='B4' ") For X = 0 To UBound(YearParts) - 1 Rng.Offset(X).Value = Val(YearParts(X + 1)) MonthData = Split(YearParts(X + 1), "<td class='B3'") For Z = 1 To 12 Rng.Offset(X, Z).Value = Val(Replace(MonthData(Z), ",", "")) Next Next End Sub -- Rick (MVP - Excel) "Jason" wrote in message ... Okay so the title is stupid, but I don't know how to explain in just a few words my issue. We have a range of dates in Column A Jan 1973 Feb 1973 ... Apr 2009 I am trying to pull numbers from a website that correspond to totals for each month, but the website ( http://tonto.eia.doe.gov/dnav/ng/hist/n9132cn2m.htm) only lists the years, and then each line below it corresponds to a total for each month in that year. Here is an example of the source for year 1973: <tr <td class='B4' 1973</td <td class='B3'60</td <td class='B3'384</td <td class='B3'1,167</td <td class='B3'931</td <td class='B3'1,670</td <td class='B3'1,598</td <td class='B3'1,758</td <td class='B3'1,829</td <td class='B3'1,022</td <td class='B3'1,465</td <td class='B3'1,483</td <td class='B3'1,456</td </tr I used the following code to test getting the January total for each year, but how can I set my loop to skip down 12 cells to the next year? Currently it is putting the January value in each cell for each year Set rng = Range("A2").CurrentRegion Set rng = rng.Resize(rng.Rows.Count, 1) For Each c In rng sURLdate = Format(c.Value2, "yyyy") c.Offset(0, 1).Value = RegexMid(myStr, sURLdate, "class=b3") |
#5
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]() I am pulling the data from the HTML source code off the website. Here is my full code below to help. To replicate the problem I am seeing, column A with Jan 2000 to say Dec 2004 incrementing by month Jan 2000 Feb 2000 ..... Running the code will put the value for January total for 2000 in every month with the year 2000, Janurary total for 2001 in every month for 2001 and so on. We want to use a program (that works correctly) similar to the one below, so we can add on later for other countries, etc. Option Explicit Sub getExportCanada() Dim c As Range, rng As Range Dim myStr As String Dim test As Variant Dim Val As Variant Dim IE As Object Const sURL1 As String = "http://tonto.eia.doe.gov/dnav/ng/hist/n9132cn2m.htm" Dim sURLdate As String Application.Cursor = xlWait Set IE = CreateObject("InternetExplorer.Application") IE.Navigate sURL1 While IE.ReadyState < 4 DoEvents Wend myStr = IE.Document.body.innerhtml Dim StartRow As Long Set rng = Range("A2").CurrentRegion 'look for last filled in row Set c = rng.SpecialCells(xlCellTypeBlanks).Areas(1) Set c = c.Resize(1, 1) StartRow = c.Row - rng.Row Set rng = rng.Offset(rowoffset:=StartRow).Resize( _ rowsize:=rng.Rows.Count - StartRow, columnsize:=1) For Each c In rng sURLdate = Format(c.Value2, "yyyy") c.Offset(0, 1).Value = RegexMid(myStr, sURLdate, "class=b3") Next c IE.Quit Set IE = Nothing Application.Cursor = xlDefault End Sub Private Function RegexMid(s As String, sYear As String, sMonthTotal) As String Dim re As Object, mc As Object Set re = CreateObject("vbscript.regexp") re.IgnoreCase = True re.MultiLine = True re.Global = True re.Pattern = "\b" & "nbsp;" & sYear & "[\s\S]+?" & sMonthTotal & "\D+(\d+)" ' Yes I know the syntax for the line above is wrong, I am working on that as well If re.test(s) = True Then Set mc = re.Execute(s) RegexMid = mc(0).SubMatches(0) End If Set re = Nothing End Function |
#6
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]() Hi Jason, Your codes only get the first matched class="B3", so it will always return the first month data for a specified year. We need to modify the regular expression to make it work. The regular expression should include 12 months' sub matches. And we also need to modify the RegexMid function to make it has another parameter month. So this function can return the desired month data depending on the passed in parameter. The followings are codes work on my side now, ----------------------------------------------------------------------- Sub getExportCanada() Dim c As Range, rng As Range Dim myStr As String Dim test As Variant Dim Val As Variant Dim IE As Object Const sURL1 As String = "http://tonto.eia.doe.gov/dnav/ng/hist/n9132cn2m.htm" Dim sURLdate As String Dim m As String Application.Cursor = xlWait Set IE = CreateObject("InternetExplorer.Application") IE.Navigate sURL1 While IE.ReadyState < 4 DoEvents Wend myStr = IE.Document.body.innerhtml Dim StartRow As Long Set rng = Range("A2").CurrentRegion For Each c In rng sURLdate = Format(c.Value2, "yyyy") m = Format(c.Value2, "m") c.Offset(0, 1).Value = RegexMid(myStr, sURLdate, CInt(m), "class=b3") Next c IE.Quit Set IE = Nothing Application.Cursor = xlDefault End Sub Private Function RegexMid(s As String, sYear As String, iMonth As Integer, sMonthTotal) As String Dim re As Object, mc As Object Set re = CreateObject("vbscript.regexp") re.IgnoreCase = True re.MultiLine = True re.Global = True re.Pattern = "\b" & "nbsp;" & sYear & _ "[\s\S]+?" & sMonthTotal & "\D+(\d\D?\d+)" & _ "[\s\S]+?" & sMonthTotal & "\D+(\d\D?\d+)" & _ "[\s\S]+?" & sMonthTotal & "\D+(\d\D?\d+)" & _ "[\s\S]+?" & sMonthTotal & "\D+(\d\D?\d+)" & _ "[\s\S]+?" & sMonthTotal & "\D+(\d\D?\d+)" & _ "[\s\S]+?" & sMonthTotal & "\D+(\d\D?\d+)" & _ "[\s\S]+?" & sMonthTotal & "\D+(\d\D?\d+)" & _ "[\s\S]+?" & sMonthTotal & "\D+(\d\D?\d+)" & _ "[\s\S]+?" & sMonthTotal & "\D+(\d\D?\d+)" & _ "[\s\S]+?" & sMonthTotal & "\D+(\d\D?\d+)" & _ "[\s\S]+?" & sMonthTotal & "\D+(\d\D?\d+)" & _ "[\s\S]+?" & sMonthTotal & "\D+(\d\D?\d+)" ' Yes I know the syntax for the line above is wrong, I am working on that as well If re.test(s) = True Then Set mc = re.Execute(s) RegexMid = mc(0).SubMatches(iMonth - 1) End If Set re = Nothing End Function ------------------------------------------------------------------ Please note, to make the codes work correctly, we must pass valid month parsed from the Column A. That is to say, we need to fill Date in ColumnA using 1/1/1973 instead of Jan 1973 as you used. Best regards, Colbert Zhou Microsoft Newsgroup Support Team |
#7
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]() Colbert Thanks! That was exactly what I was looking for! You even fixed my syntax problem in regular expression. However, I noticed that for some reason, if the total for the month is greater than 10,000 it only grabs the numbers before the comma. Or if the number is just a single digit, it messes that up as well... If you scroll down the months for the totals after Dec 1976, you'll see the single digit problem. And after say, year 2001, you can see the issue with the numbers over 10,000... You can compare to the totals on the website http://tonto.eia.doe.gov/dnav/ng/hist/n9132cn2m.htm I know its just a syntax issue in the Pattern Thanks again for your work so far. |
#8
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]() If it helps you any, here is the macro I posted earlier modified to read the table of values into myStr using the method you posted in your code. Again, it places the data on the active worksheet with Column A containing the years and Columns B through M containing January through December's values for that year appearing in Column A starting on Row 2. This leaves room for headers on Row 1 (if the layout I used is what you want, the header placements can be added to the code quite easily). Sub DistributeYearData() Dim IE As Object Dim Rng As Range Dim X As Long Dim Z As Long Dim FileNum As Long Dim myStr As String Dim sURLdate As String Dim MonthData() As String Dim YearParts() As String Const sURL1 As String = "http://tonto.eia.doe.gov/dnav/ng/hist/n9132cn2m.htm" Set IE = CreateObject("InternetExplorer.Application") IE.Navigate sURL1 While IE.ReadyState < 4 DoEvents Wend myStr = IE.Document.body.innerhtml Set Rng = Range("A2") YearParts = Split(myStr, "<td class=B4 ", , vbTextCompare) For X = 0 To UBound(YearParts) - 1 Rng.Offset(X).Value = Val(YearParts(X + 1)) MonthData = Split(YearParts(X + 1), "<td class=B3", , vbTextCompare) For Z = 1 To 12 Rng.Offset(X, Z).Value = Val(Replace(MonthData(Z), ",", "")) Next Next End Sub -- Rick (MVP - Excel) "Jason" wrote in message ... Colbert Thanks! That was exactly what I was looking for! You even fixed my syntax problem in regular expression. However, I noticed that for some reason, if the total for the month is greater than 10,000 it only grabs the numbers before the comma. Or if the number is just a single digit, it messes that up as well... If you scroll down the months for the totals after Dec 1976, you'll see the single digit problem. And after say, year 2001, you can see the issue with the numbers over 10,000... You can compare to the totals on the website http://tonto.eia.doe.gov/dnav/ng/hist/n9132cn2m.htm I know its just a syntax issue in the Pattern Thanks again for your work so far. |
#9
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]() Why not do a web query? Data/Import External Data/New Web Query... You can pull in that entire table, and then rearrange it any way you want. HTH, Eric "Jason" wrote: Okay so the title is stupid, but I don't know how to explain in just a few words my issue. We have a range of dates in Column A Jan 1973 Feb 1973 ... Apr 2009 I am trying to pull numbers from a website that correspond to totals for each month, but the website ( http://tonto.eia.doe.gov/dnav/ng/hist/n9132cn2m.htm) only lists the years, and then each line below it corresponds to a total for each month in that year. Here is an example of the source for year 1973: <tr <td class='B4' 1973</td <td class='B3'60</td <td class='B3'384</td <td class='B3'1,167</td <td class='B3'931</td <td class='B3'1,670</td <td class='B3'1,598</td <td class='B3'1,758</td <td class='B3'1,829</td <td class='B3'1,022</td <td class='B3'1,465</td <td class='B3'1,483</td <td class='B3'1,456</td </tr I used the following code to test getting the January total for each year, but how can I set my loop to skip down 12 cells to the next year? Currently it is putting the January value in each cell for each year Set rng = Range("A2").CurrentRegion Set rng = rng.Resize(rng.Rows.Count, 1) For Each c In rng sURLdate = Format(c.Value2, "yyyy") c.Offset(0, 1).Value = RegexMid(myStr, sURLdate, "class=b3") |
#10
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]() In VBA, it looks like this: With ActiveSheet.QueryTables.Add(Connection:= _ "URL;http://tonto.eia.doe.gov/dnav/ng/hist/n9132cn2m.htm", Destination:=Range _ ("A1")) .Name = "n9132cn2m" .FieldNames = True .RowNumbers = False .FillAdjacentFormulas = True .PreserveFormatting = True .RefreshOnFileOpen = False .BackgroundQuery = True .RefreshStyle = xlInsertDeleteCells .SavePassword = False .SaveData = True .AdjustColumnWidth = True .RefreshPeriod = 0 .WebSelectionType = xlSpecifiedTables .WebFormatting = xlWebFormattingNone .WebTables = "10" .WebPreFormattedTextToColumns = True .WebConsecutiveDelimitersAsOne = True .WebSingleBlockTextImport = False .WebDisableDateRecognition = False .WebDisableRedirections = False .Refresh BackgroundQuery:=False End With |
#11
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]() Thanks! My only problem with web queries, and I am sure there is an easy way to get around this, is usually I need to the data in a different format than the way it is after the web query import. I am new to the programming side of Excel, so I am not sure if there is a way to run the web query, then re-arrange it. |
#12
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]() Here's one way to do it. Add a temporary sheet, import the data from the web site, the bring the data into memory using array variables. Once you have it in memory, you can do whatever you want with it, including adding it to another worksheet in a different order. In the example I assume that there are always 12 months, so I don't bother to store the months in memory. Stick the example in a general VBA module and run it. I left off the part of what you do with the data after you get it in memory. Option Explicit Option Base 1 ' ' This routine adds a temporary sheet to ' grab data from a URL table, then once the ' data are in memory, deletes the sheet. ' Sub Grab_and_Reorder() Dim i As Long, j As Long Dim nRows As Long Dim nYears As Long Dim wsh As Worksheet Dim theYears() As Integer, theData() As Double ' Set wsh = ActiveWorkbook.Sheets.Add ' ' First grab the table of data and put in on ' the active worksheet. ' With ActiveSheet.QueryTables.Add(Connection:= _ "URL;http://tonto.eia.doe.gov/dnav/ng/hist/n9132cn2m.htm", _ Destination:=Range("A1")) .Name = "n9132cn2m" .FieldNames = True .RowNumbers = False .FillAdjacentFormulas = True .PreserveFormatting = True .RefreshOnFileOpen = False .BackgroundQuery = True .RefreshStyle = xlInsertDeleteCells .SavePassword = False .SaveData = True .AdjustColumnWidth = True .RefreshPeriod = 0 .WebSelectionType = xlSpecifiedTables .WebFormatting = xlWebFormattingNone .WebTables = "10" .WebPreFormattedTextToColumns = True .WebConsecutiveDelimitersAsOne = True .WebSingleBlockTextImport = False .WebDisableDateRecognition = False .WebDisableRedirections = False .Refresh BackgroundQuery:=False End With ' ' Delete any blank rows (those whose first cell is blank) ' Range("A:A").SpecialCells(xlCellTypeBlanks).Entire Row.Delete ' ' Next, determine how much data you just grabbed. ' ActiveSheet.Cells(1, 1).Select nYears = ActiveCell.CurrentRegion.Rows.Count - 1 ' Ignore header row ' ' Grab the data and store in memory ' ReDim theYears(nYears) ReDim theData(nYears * 12) ' For i = 1 To nYears theYears(i) = ActiveSheet.Cells(i + 1, 1) For j = 1 To 12 theData((i - 1) * 12 + j) = ActiveSheet.Cells(i + 1, j + 1) Next j Next i ' wsh.Delete Set wsh = Nothing ' ' Now reorder the data as you please... ' End Sub "Jason" wrote: Thanks! My only problem with web queries, and I am sure there is an easy way to get around this, is usually I need to the data in a different format than the way it is after the web query import. I am new to the programming side of Excel, so I am not sure if there is a way to run the web query, then re-arrange it. |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
![]() |
||||
Thread | Forum | |||
If a date range contains a leap year (date) | Excel Worksheet Functions | |||
Date - I would like to skip weekends in a date range | Excel Worksheet Functions | |||
Locate month n year from range of date | New Users to Excel | |||
Query on Date range does not qualify the Year | Excel Programming | |||
Excel VBA macro - need to edit code to skip a year | Excel Programming |