Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.programming
|
|||
|
|||
Importing several web pages with macro/VBA?
I am attempting to import a large amount of webpages containing historical
odds on NBA games (about 4 years of data). The data is available on webpages similar to this example: http://archive.scoresandodds.com/grid_20070418.html As you can see, the date is used in the URL, meaning it should be possible to make a macro to automatically import each page from then until now into Excel. The problem is, however, complicated further due to these factors: 1) NBA odds aren't posted daily meaning some dates/URLs wont have any data to import (for instance the next two pages/days in the sequence, that is 20070419.html and 20070420.html doesnt contain NBA data, while 20070421.html once again has the NBA table) 2) Each page also contains tables with data for others sports such as baseball, but I only need data from the NBA table to be imported. What is the easiest way to achieve this goal? Is it possible to make a macro/VBA that loops through all dates since april 2007 and imports only the NBA table and discards pages where no NBA table is present? I am new to Excel and would appreciate any help and solutions you can provide. Thank you! |
#2
Posted to microsoft.public.excel.programming
|
|||
|
|||
Importing several web pages with macro/VBA?
On Aug 21, 8:35*am, "KG" wrote:
I am attempting to import a large amount of webpages containing historical odds on NBA games (about 4 years of data). The data is available on webpages similar to this example: http://archive.scoresandodds.com/grid_20070418.html As you can see, the date is used in the URL, meaning it should be possible to make a macro to automatically import each page from then until now into Excel. The problem is, however, complicated further due to these factors: 1) NBA odds aren't posted daily meaning some dates/URLs wont have any data to import (for instance the next two pages/days in the sequence, that is 20070419.html and 20070420.html doesn’t contain NBA data, while 20070421.html once again has the NBA table) 2) Each page also contains tables with data for others sports such as baseball, but I only need data from the NBA table to be imported. What is the easiest way to achieve this goal? Is it possible to make a macro/VBA that loops through all dates since april 2007 and imports only the NBA table and discards pages where no NBA table is present? I am new to Excel and would appreciate any help and solutions you can provide. Thank you! This is the sort of thing I do for clients. What is this for? Contact me privately dguillett1 @gmail.com NBA05/22/2009 · Live betting trends on all these games at sports.com · Team Open Line Movements Current Moneyline Scores Notes 8:35 PM EDT 507 orlando magic 188o08 188 / 188.5 / 189 188.5 460 95 Over 188.5 508 cleveland cavaliers -18 -0.105263158 -14 -620 96 final EASTERN CONFERENCE FINALS - Gm#2 [TNT]ORL. MAGIC LEAD SERIES 1-0SERIES PRICE: CLE -280 / ORL +220CLE-F-James-Probable |
#3
Posted to microsoft.public.excel.programming
|
|||
|
|||
Importing several web pages with macro/VBA?
Yes, this can be done in a relatively straightforward manner. I don't think opening IE and scraping the data is the best way to go. For so many web pages such an approach would take a long time. A better way might be to use the following construction, it will take the source code behind the web page and assign it to a variable. You can then use the "instr" and "mid" functions to extract the data you are interested in and place it in your workbook.
my_url = "your_url" Set my_obj = CreateObject("MSXML2.XMLHTTP") my_obj.Open "GET", my_url, False my_obj.send my_var = my_obj.responsetext Set my_obj = Nothing This code would placed inside a loop where you increment the start date by 1 each time through and then use that date to build the next url. The lop could also contain an if/then/else construction to see if the word "NBA" appears on the current webpage and react accordingly...Ron |
#4
Posted to microsoft.public.excel.programming
|
|||
|
|||
Importing several web pages with macro/VBA?
On Aug 21, 12:46*pm, ron wrote:
Yes, this can be done in a relatively straightforward manner. *I don't think opening IE and scraping the data is the best way to go. *For so many web pages such an approach would take a long time. *A better way might be to use the following construction, it will take the source code behind the web page and assign it to a variable. *You can then use the "instr" and "mid" functions to extract the data you are interested in and place it in your workbook. * * my_url = "your_url" * * Set my_obj = CreateObject("MSXML2.XMLHTTP") * * my_obj.Open "GET", my_url, False * * my_obj.send * * my_var = my_obj.responsetext * * Set my_obj = Nothing This code would placed inside a loop where you increment the start date by 1 each time through and then use that date to build the next url. *The lop could also contain an if/then/else construction to see if the word "NBA" appears on the current webpage and react accordingly...Ron ================== Why don't you provide a complete sub or send a file to me at dguillett1 @gmail.com. Using the query only takes about 1 second for each date and results in: Team Open Line Movements Current Moneyline Scores NBA 04/21/2007 501 new jersey nets 191 193 / 193.5 / 194 194.5 165 96 Under 194.5 502 toronto raptors -4.5 -14 -4.5 -190 91 final 503 miami heat 183 183 / 182.5 / 183 183.5 175 91 Over 183.5 504 chicago bulls -4.5 -1 -5 -210 96 final 505 orlando magic 180.5 181 / 180.5 / 181 181.5 475 92 Over 181.5 506 detroit pistons -8 -24.65789474 -10 -650 100 final 507 utah jazz 184 188.5 / 189 / 189.5 189 240 75 Under 189 508 houston rockets -6 -20.91666667 -6.5 -280 84 final NBA 04/22/2007 511 los angeles lakers 214.5 212.5 / 212 / 211.5 212 550 87 Under 212 512 phoenix suns -10.5 -0.090909091 -15.5 -750 95 final 513 denver nuggets 196 196.5 / 197 / 197.5 197 320 95 Under 197 514 san antonio spurs -8 0.941176471 -8 -400 89 final 515 golden state warriors 214.5 214.5 / 215 / 214.5 215 500 97 Under 215 516 dallas mavericks -10 -15.15 -10 -700 85 final NBA 04/23/2007 701 orlando magic 182.5 182.5 / 183 / 183.5 184 400 90 Over 184 702 detroit pistons -8.5 1.0625 -23 -500 98 final 703 utah jazz 185.5 185 / 184.5 / 185 185.5 190 90 Over 185.5 704 houston rockets -5.5 -0.181818182 -10 -230 98 final NBA 04/24/2007 705 new jersey nets 194 194 / 194.5 / 194 193.5 200 83 Under 193.5 706 toronto raptors -5 -5.545454545 -20.5 -240 89 final 707 miami heat 184.5 184.5 / 185 / 184.5 184 135 89 Over 184 708 chicago bulls -4.5 -9.408163265 -3 -155 107 final 709 los angeles lakers 208 208 / 207.5 / 208 208.5 380 98 Over 208.5 710 phoenix suns -9.5 -9 -05 / -9 EVEN / -8.5 -14 -480 126 final NBA 04/25/2007 711 denver nuggets 195 194.5 / 194 / 194.5 194 330 88 Under 194 712 san antonio spurs -8.5 -8.5 -13.5 -420 97 final 715 golden state warriors 210 210.5 / 210 / 209.5 210 425 99 Over 210 716 dallas mavericks -9.5 -14.66620499 -9.5 EVEN -550 112 final NBA 04/26/2007 501 detroit pistons -2 -20.2 -22.5 -155 93 Under 183 502 orlando magic 183 183 / 183.5 183 135 77 final 503 houston rockets 186 186 / 186.5 / 187.5 188 170 67 Under 188 504 utah jazz -4 -0.222222222 -4.5 -200 81 final 505 phoenix suns -4 -10.25 -24 -190 89 Under 211 506 los angeles lakers 210 210 / 211.5 211 165 95 final NBA 04/27/2007 507 toronto raptors 192 192 / 191.5 191 165 89 Push 191 508 new jersey nets -4.5 -4.5 -4 -190 102 final 509 chicago bulls 185 185 / 184.5 / 185 185.5 200 104 Over 185.5 510 miami heat -5.5 -5.5 -10.5 -240 96 final 511 dallas mavericks -4.5 -13.875 -4 -180 91 Under 212 512 golden state warriors 211 211 / 211.5 212 160 109 |
#5
Posted to microsoft.public.excel.programming
|
|||
|
|||
Importing several web pages with macro/VBA?
Don...I was comparing the "MSXML2.XMLHTTP" method of capturing web information via assigning the source code to a variable to the method that actually opens IE
Set ie = CreateObject("InternetExplorer.Application") With ie .Visible = True .Navigate "http://www.your_url" .Top = 50 .Left = 530 .Height = 400 .Width = 400 ' Loop until the page is fully loaded Do Until Not ie.Busy And ie.ReadyState = 4 DoEvents Loop The former method is much faster then the latter. I haven't used the web query approach so I did not comment on it. Here is a link to an earlier post in this NG providing a bit more detail around the "MSXML2.XMLHTTP" method. https://groups.google.com/forum/#!searchin/microsoft.public.excel.programming/oitbso|sort:date/microsoft.public.excel.programming/pJryKM1aCh4/VBbkT8YCUdoJ ....Ron |
#6
Posted to microsoft.public.excel.programming
|
|||
|
|||
Importing several web pages with macro/VBA?
Thank you for the suggestion. I have tried to use this solution but have ran
into trouble when trying to extract the info I need and put it into Excel since the HTML code is a bit more complex than in the example you linked to. I actually found a better site to extract from since it only contains the NBA data and in a clear way. http://www.oddsportal.com/basketball...esults/page/2/ The problem is to extract the exact strings i need from the source code as I haven't worked with VB much. Basically, I will need these 5 pieces of data placed into their own column in Excel: Teamname1 Teamname2 Match result Odds1 Odds2 I have discovered that each row in the table on the webpage contains the word "name table-participant" in the source code, so I tried to use this to extract the strings and get the data I want, but it requires more coding to get the 5 strings above than I'm capable of. Do you have any suggestion to how I can get these 5 pieces of information extracted from the source code to the page above? I find it difficult to narrow it down so I just get a string containing each of the above. |
#7
Posted to microsoft.public.excel.programming
|
|||
|
|||
Importing several web pages with macro/VBA?
I think this will do most of what you want. You'll need to add a loop to move from web page to web page...Ron
Sub NBA() ' Set the url for the desired web page my_url = "http://www.oddsportal.com/basketball/usa/nba-2008-2009/results/page/2/" ' Get the source code behind the desired web page Set my_obj = CreateObject("MSXML2.XMLHTTP") my_obj.Open "GET", my_url, False my_obj.send my_var = my_obj.responsetext Set my_obj = Nothing ' Find the two team names, final score and 2 odds pos_0 = 1 Do Until InStr(pos_0, my_var, "name table-participant") = 0 pos_1 = InStr(pos_0, my_var, "name table-participant") pos_2 = InStr(pos_0, my_var, "deactivate") If InStr(Mid(my_var, pos_2, 25), "span class") 0 Then ' first team is in boldface pos_3 = InStr(pos_2, my_var, "span class") pos_4 = InStr(pos_3, my_var, "") pos_5 = InStr(pos_4, my_var, "<") teamname_1 = Mid(my_var, 1 + pos_4, pos_5 - (1 + pos_4)) pos_6 = InStr(pos_5, my_var, "-") pos_7 = InStr(pos_6, my_var, "<") teamname_2 = Trim(Mid(my_var, 1 + pos_6, pos_7 - (1 + pos_6))) Else ' the second team is in boldface pos_4 = InStr(pos_2, my_var, "") pos_5 = InStr(pos_4, my_var, "-") teamname_1 = Trim(Mid(my_var, 1 + pos_4, pos_5 - (1 + pos_4))) pos_6 = InStr(pos_5, my_var, "") pos_7 = InStr(pos_6, my_var, "<") teamname_2 = Mid(my_var, 1 + pos_6, pos_7 - (1 + pos_6)) End If pos_7 = InStr(pos_6, my_var, "table-odds") pos_8 = InStr(pos_7, my_var, "") pos_9 = InStr(pos_8, my_var, "<") score = Mid(my_var, 1 + pos_8, pos_9 - (1 + pos_8)) pos_10 = InStr(pos_9, my_var, "addMatch") pos_11 = InStr(pos_10, my_var, "") pos_12 = InStr(pos_11, my_var, "<") odds_1 = Mid(my_var, 1 + pos_11, pos_12 - (1 + pos_11)) pos_13 = InStr(pos_12, my_var, "addMatch") pos_14 = InStr(pos_13, my_var, "") pos_15 = InStr(pos_14, my_var, "<") odds_2 = Mid(my_var, 1 + pos_14, pos_15 - (1 + pos_14)) pos_0 = pos_15 ' Post the data to the worksheet ActiveCell = teamname_1 ActiveCell.Offset(0, 1) = teamname_2 ActiveCell.Offset(0, 2) = score ActiveCell.Offset(0, 3) = odds_1 ActiveCell.Offset(0, 4) = odds_2 ActiveCell.Offset(1, 0).Select Loop ' Format the columns Columns("A:E").Select Selection.Columns.AutoFit Columns("C:C").Select With Selection .HorizontalAlignment = xlCenter .VerticalAlignment = xlCenter .WrapText = False .Orientation = 0 .AddIndent = False .IndentLevel = 0 .ShrinkToFit = False .ReadingOrder = xlContext .MergeCells = False End With Range("A1").Select End Sub |
#8
Posted to microsoft.public.excel.programming
|
|||
|
|||
Importing several web pages with macro/VBA?
Thanks a lot! I added a loop with a static loop count and it's working just
fine. I was struggling with the fact that both teams could be bold in the HTML code and that the odds could be both three and four digits, but you clearly solved that in a great way. I get an "1004 - Application-Defined or Object-Defined Error" error though, which seem to be related to the snip that formats the columns. If I remove that part the error disappears. Any ideas on that? The formatting isn't really necessary for me, but just thought I would mention it anyway :) |
#9
Posted to microsoft.public.excel.programming
|
|||
|
|||
Importing several web pages with macro/VBA?
Which line throws the error?
|
#10
Posted to microsoft.public.excel.programming
|
|||
|
|||
Importing several web pages with macro/VBA?
I can't seem to recreate the error now for some reason. The formatting works
fine now, so I must have done something wrong. Thanks again for the help, it is much appreciated! |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
Number of pages in worksheet doesn't match Print Preview pages | Excel Discussion (Misc queries) | |||
Importing Specific Data From Multiple Pages From The Web | Excel Programming | |||
macro to print only pages that are filled in | Excel Discussion (Misc queries) | |||
Importing Yahoo Yellow Pages Into Excel | New Users to Excel | |||
Importing French (Canadian) format html pages | Setting up and Configuration of Excel |