Home |
Search |
Today's Posts |
|
#1
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
I use hyperlinks for chemicals that access a webpage for the individual
compounds showing tables of data for that chemical. I have a large list of chemicals and I want to automate the downloading of specific data in the tables from the web pages. The chemical data is stored in the same format of tables on each webpage. Can anyone help? Thanks, Roger |
#2
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
Ther are tow basic mathods that can be used.
1) Perform a web query. First setup a Reord Macro (Tools - Macro - Start Recording).. then perform one query manually by going to Data - Import External Data - New Web Query. Next modify the the recorded macro as required to add a loop changing the chemicals and the destination location so the data doesn't over-write each other. If one above doesn't work 2) Open an Internet Explorer application in Excel. through the Internet Explorer request each chemical and extract each results through the Internet Explorer Application. Let me know which approach you want to use. I can help with both approaches. "Roger on Excel" wrote: I use hyperlinks for chemicals that access a webpage for the individual compounds showing tables of data for that chemical. I have a large list of chemicals and I want to automate the downloading of specific data in the tables from the web pages. The chemical data is stored in the same format of tables on each webpage. Can anyone help? Thanks, Roger |
#3
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
Hi Joel,
Thanks for responding. The first approach would work best for me - i have over 1000 chemicals in my list. here is an example of the website i would like to gather specific data tables from : http://www.ilo.org/public/english/pr...5/icsc0553.htm each chemical has a hyperlink to its own web page like this one. Let me know what you think Best regards, Roger ---------------------------------- "Joel" wrote: Ther are tow basic mathods that can be used. 1) Perform a web query. First setup a Reord Macro (Tools - Macro - Start Recording).. then perform one query manually by going to Data - Import External Data - New Web Query. Next modify the the recorded macro as required to add a loop changing the chemicals and the destination location so the data doesn't over-write each other. If one above doesn't work 2) Open an Internet Explorer application in Excel. through the Internet Explorer request each chemical and extract each results through the Internet Explorer Application. Let me know which approach you want to use. I can help with both approaches. "Roger on Excel" wrote: I use hyperlinks for chemicals that access a webpage for the individual compounds showing tables of data for that chemical. I have a large list of chemicals and I want to automate the downloading of specific data in the tables from the web pages. The chemical data is stored in the same format of tables on each webpage. Can anyone help? Thanks, Roger |
#4
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
Thhe code below was very simple. I did it in about 15 minutes. It gets the
names of all the chemicals. I will work on the rest later. Sub Getchemicals() Set TempSht = Sheets.Add(after:=Sheets(Sheets.Count)) TempSht.Name = "Temp" Set ChemicalSht = Sheets.Add(after:=Sheets(Sheets.Count)) ChemicalSht.Name = "Summary" URLFolder = "URL;http://www.ilo.org/public/english/protection/safework/cis/products/icsc/dtasht/" ChemicalRowCount = 1 For i = 0 To 25 AlphaLetter = Chr(Asc("a") + i) TempSht.Cells.ClearContents With TempSht.QueryTables.Add(Connection:= _ URLFolder & AlphaLetter & "_index.htm", _ Destination:=TempSht.Range("A1")) .Name = "a_index" .FieldNames = True .RowNumbers = False .FillAdjacentFormulas = False .PreserveFormatting = True .RefreshOnFileOpen = False .BackgroundQuery = True .RefreshStyle = xlInsertDeleteCells .SavePassword = False .SaveData = True .AdjustColumnWidth = True .RefreshPeriod = 0 .WebSelectionType = xlEntirePage .WebFormatting = xlWebFormattingNone .WebPreFormattedTextToColumns = True .WebConsecutiveDelimitersAsOne = True .WebSingleBlockTextImport = False .WebDisableDateRecognition = False .WebDisableRedirections = False .Refresh BackgroundQuery:=False End With 'move data from tempory sheet to chemical sheet TempRowCount = 17 Do While Range("C" & TempRowCount) < "" ChemicalSht.Range("A" & ChemicalRowCount) = _ TempSht.Range("C" & TempRowCount) ChemicalRowCount = ChemicalRowCount + 1 TempRowCount = TempRowCount + 1 Loop Next i TempSht.Cells.ClearContents End Sub "Roger on Excel" wrote: Hi Joel, Thanks for responding. The first approach would work best for me - i have over 1000 chemicals in my list. here is an example of the website i would like to gather specific data tables from : http://www.ilo.org/public/english/pr...5/icsc0553.htm each chemical has a hyperlink to its own web page like this one. Let me know what you think Best regards, Roger ---------------------------------- "Joel" wrote: Ther are tow basic mathods that can be used. 1) Perform a web query. First setup a Reord Macro (Tools - Macro - Start Recording).. then perform one query manually by going to Data - Import External Data - New Web Query. Next modify the the recorded macro as required to add a loop changing the chemicals and the destination location so the data doesn't over-write each other. If one above doesn't work 2) Open an Internet Explorer application in Excel. through the Internet Explorer request each chemical and extract each results through the Internet Explorer Application. Let me know which approach you want to use. I can help with both approaches. "Roger on Excel" wrote: I use hyperlinks for chemicals that access a webpage for the individual compounds showing tables of data for that chemical. I have a large list of chemicals and I want to automate the downloading of specific data in the tables from the web pages. The chemical data is stored in the same format of tables on each webpage. Can anyone help? Thanks, Roger |
#5
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
I had to switch to method 2 to get each chemical webpage. Try this code.
Which data do you need? try using method one manually (Data - Import External Data - New Web Query) with one of the webpages from the code below and select one or more tables and see if the results are usable. I will help as required. Sub Getchemicals2() Set ChemicalSht = Sheets.Add(after:=Sheets(Sheets.Count)) ChemicalSht.Name = "Chemicals" Set IE = CreateObject("InternetExplorer.Application") IE.Visible = True URLFolder = _ "http://www.ilo.org/public/english/protection/safework/cis/products/icsc/dtasht/" ChemicalRowCount = 1 For Letters = 0 To 25 AlphaLetter = Chr(Asc("a") + Letters) URL = URLFolder & AlphaLetter & "_index.htm" 'get web page IE.Navigate2 URL Do While IE.readyState < 4 DoEvents Loop Do While IE.busy = True DoEvents Loop H2Found = False For Each itm In IE.document.all If H2Found = False Then If itm.tagname = "H2" Then H2Found = True End If Else If itm.tagname = "A" Then If itm.innertext = "" Then Exit For 'chemical name ChemicalSht.Range("A" & ChemicalRowCount) = itm.innertext 'webpage ChemicalSht.Range("B" & ChemicalRowCount) = itm.href ChemicalRowCount = ChemicalRowCount + 1 End If End If Next itm Next Letters End Sub "Joel" wrote: Thhe code below was very simple. I did it in about 15 minutes. It gets the names of all the chemicals. I will work on the rest later. Sub Getchemicals() Set TempSht = Sheets.Add(after:=Sheets(Sheets.Count)) TempSht.Name = "Temp" Set ChemicalSht = Sheets.Add(after:=Sheets(Sheets.Count)) ChemicalSht.Name = "Summary" URLFolder = "URL;http://www.ilo.org/public/english/protection/safework/cis/products/icsc/dtasht/" ChemicalRowCount = 1 For i = 0 To 25 AlphaLetter = Chr(Asc("a") + i) TempSht.Cells.ClearContents With TempSht.QueryTables.Add(Connection:= _ URLFolder & AlphaLetter & "_index.htm", _ Destination:=TempSht.Range("A1")) .Name = "a_index" .FieldNames = True .RowNumbers = False .FillAdjacentFormulas = False .PreserveFormatting = True .RefreshOnFileOpen = False .BackgroundQuery = True .RefreshStyle = xlInsertDeleteCells .SavePassword = False .SaveData = True .AdjustColumnWidth = True .RefreshPeriod = 0 .WebSelectionType = xlEntirePage .WebFormatting = xlWebFormattingNone .WebPreFormattedTextToColumns = True .WebConsecutiveDelimitersAsOne = True .WebSingleBlockTextImport = False .WebDisableDateRecognition = False .WebDisableRedirections = False .Refresh BackgroundQuery:=False End With 'move data from tempory sheet to chemical sheet TempRowCount = 17 Do While Range("C" & TempRowCount) < "" ChemicalSht.Range("A" & ChemicalRowCount) = _ TempSht.Range("C" & TempRowCount) ChemicalRowCount = ChemicalRowCount + 1 TempRowCount = TempRowCount + 1 Loop Next i TempSht.Cells.ClearContents End Sub "Roger on Excel" wrote: Hi Joel, Thanks for responding. The first approach would work best for me - i have over 1000 chemicals in my list. here is an example of the website i would like to gather specific data tables from : http://www.ilo.org/public/english/pr...5/icsc0553.htm each chemical has a hyperlink to its own web page like this one. Let me know what you think Best regards, Roger ---------------------------------- "Joel" wrote: Ther are tow basic mathods that can be used. 1) Perform a web query. First setup a Reord Macro (Tools - Macro - Start Recording).. then perform one query manually by going to Data - Import External Data - New Web Query. Next modify the the recorded macro as required to add a loop changing the chemicals and the destination location so the data doesn't over-write each other. If one above doesn't work 2) Open an Internet Explorer application in Excel. through the Internet Explorer request each chemical and extract each results through the Internet Explorer Application. Let me know which approach you want to use. I can help with both approaches. "Roger on Excel" wrote: I use hyperlinks for chemicals that access a webpage for the individual compounds showing tables of data for that chemical. I have a large list of chemicals and I want to automate the downloading of specific data in the tables from the web pages. The chemical data is stored in the same format of tables on each webpage. Can anyone help? Thanks, Roger |
#6
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
Joel,
This is excellent. Thanks for your advice. How would I download other information from other tables - for example I would like to input further data from the tables into further adjacent columns so one can use the table as a vlookup source? I see you use references like itm.innertext - does this select individual line items from the tables on the web page? Thanks, Roger "Joel" wrote: I had to switch to method 2 to get each chemical webpage. Try this code. Which data do you need? try using method one manually (Data - Import External Data - New Web Query) with one of the webpages from the code below and select one or more tables and see if the results are usable. I will help as required. Sub Getchemicals2() Set ChemicalSht = Sheets.Add(after:=Sheets(Sheets.Count)) ChemicalSht.Name = "Chemicals" Set IE = CreateObject("InternetExplorer.Application") IE.Visible = True URLFolder = _ "http://www.ilo.org/public/english/protection/safework/cis/products/icsc/dtasht/" ChemicalRowCount = 1 For Letters = 0 To 25 AlphaLetter = Chr(Asc("a") + Letters) URL = URLFolder & AlphaLetter & "_index.htm" 'get web page IE.Navigate2 URL Do While IE.readyState < 4 DoEvents Loop Do While IE.busy = True DoEvents Loop H2Found = False For Each itm In IE.document.all If H2Found = False Then If itm.tagname = "H2" Then H2Found = True End If Else If itm.tagname = "A" Then If itm.innertext = "" Then Exit For 'chemical name ChemicalSht.Range("A" & ChemicalRowCount) = itm.innertext 'webpage ChemicalSht.Range("B" & ChemicalRowCount) = itm.href ChemicalRowCount = ChemicalRowCount + 1 End If End If Next itm Next Letters End Sub "Joel" wrote: Thhe code below was very simple. I did it in about 15 minutes. It gets the names of all the chemicals. I will work on the rest later. Sub Getchemicals() Set TempSht = Sheets.Add(after:=Sheets(Sheets.Count)) TempSht.Name = "Temp" Set ChemicalSht = Sheets.Add(after:=Sheets(Sheets.Count)) ChemicalSht.Name = "Summary" URLFolder = "URL;http://www.ilo.org/public/english/protection/safework/cis/products/icsc/dtasht/" ChemicalRowCount = 1 For i = 0 To 25 AlphaLetter = Chr(Asc("a") + i) TempSht.Cells.ClearContents With TempSht.QueryTables.Add(Connection:= _ URLFolder & AlphaLetter & "_index.htm", _ Destination:=TempSht.Range("A1")) .Name = "a_index" .FieldNames = True .RowNumbers = False .FillAdjacentFormulas = False .PreserveFormatting = True .RefreshOnFileOpen = False .BackgroundQuery = True .RefreshStyle = xlInsertDeleteCells .SavePassword = False .SaveData = True .AdjustColumnWidth = True .RefreshPeriod = 0 .WebSelectionType = xlEntirePage .WebFormatting = xlWebFormattingNone .WebPreFormattedTextToColumns = True .WebConsecutiveDelimitersAsOne = True .WebSingleBlockTextImport = False .WebDisableDateRecognition = False .WebDisableRedirections = False .Refresh BackgroundQuery:=False End With 'move data from tempory sheet to chemical sheet TempRowCount = 17 Do While Range("C" & TempRowCount) < "" ChemicalSht.Range("A" & ChemicalRowCount) = _ TempSht.Range("C" & TempRowCount) ChemicalRowCount = ChemicalRowCount + 1 TempRowCount = TempRowCount + 1 Loop Next i TempSht.Cells.ClearContents End Sub "Roger on Excel" wrote: Hi Joel, Thanks for responding. The first approach would work best for me - i have over 1000 chemicals in my list. here is an example of the website i would like to gather specific data tables from : http://www.ilo.org/public/english/pr...5/icsc0553.htm each chemical has a hyperlink to its own web page like this one. Let me know what you think Best regards, Roger ---------------------------------- "Joel" wrote: Ther are tow basic mathods that can be used. 1) Perform a web query. First setup a Reord Macro (Tools - Macro - Start Recording).. then perform one query manually by going to Data - Import External Data - New Web Query. Next modify the the recorded macro as required to add a loop changing the chemicals and the destination location so the data doesn't over-write each other. If one above doesn't work 2) Open an Internet Explorer application in Excel. through the Internet Explorer request each chemical and extract each results through the Internet Explorer Application. Let me know which approach you want to use. I can help with both approaches. "Roger on Excel" wrote: I use hyperlinks for chemicals that access a webpage for the individual compounds showing tables of data for that chemical. I have a large list of chemicals and I want to automate the downloading of specific data in the tables from the web pages. The chemical data is stored in the same format of tables on each webpage. Can anyone help? Thanks, Roger |
#7
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
On Oct 13, 5:36*pm, Roger on Excel
wrote: I use hyperlinks for chemicals that access a webpage for the individual compounds showing tables of data for that chemical. I have a large list of chemicals and I want to automate the downloading of specific data in the tables from the web pages. The chemical data is stored in the same format of tables on each webpage. Can anyone help? Thanks, Roger Do you always want to obtain the same info, just for a diffrerent chemical? If so, specifically what info do you want?..ron |
#8
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
Ron: Look at the webpage below. There are over 5000 chemicals each with
different properties and warnings! http://www.ilo.org/public/english/pr...asht/index.htm "ron" wrote: On Oct 13, 5:36 pm, Roger on Excel wrote: I use hyperlinks for chemicals that access a webpage for the individual compounds showing tables of data for that chemical. I have a large list of chemicals and I want to automate the downloading of specific data in the tables from the web pages. The chemical data is stored in the same format of tables on each webpage. Can anyone help? Thanks, Roger Do you always want to obtain the same info, just for a diffrerent chemical? If so, specifically what info do you want?..ron |
#9
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
On Oct 16, 1:14*pm, Joel wrote:
Ron: Look at the webpage below. *There are over 5000 chemicals each with different properties and warnings! http://www.ilo.org/public/english/pr...is/products/ic... "ron" wrote: On Oct 13, 5:36 pm, Roger on Excel wrote: I use hyperlinks for chemicals that access a webpage for the individual compounds showing tables of data for that chemical. I have a large list of chemicals and I want to automate the downloading of specific data in the tables from the web pages. The chemical data is stored in the same format of tables on each webpage. Can anyone help? Thanks, Roger Do you always want to obtain the same info, just for a diffrerent chemical? *If so, specifically what info do you want?..ron- Hide quoted text - - Show quoted text - Yes, but depending what information is required, perhaps capturing it a) from the source code or b) by obtaining specific tables might be a reasonable solution. For example, Table 10 contains all of the "Important Data" information. Set cTables = ie.Document.getElementsByTagname("table") s = cTables(10).innertext Debug.Print s and "s" then contains IMPORTANT DATA Physical State; Appearance COLOURLESS LIQUID, WITH CHARACTERISTIC ODOUR. Physical dangers The vapour is heavier than air and may travel along the ground; distant ignition possible. Chemical dangers The substance can form explosive peroxides on contact with strong oxidants such as acetic acid, nitric acid, hydrogen peroxide. Reacts with chloroform and bromoform under basic conditions, causing fire and explosion hazard. Attacks plastic. Occupational exposure limits TLV: 500 ppm as TWA, 750 ppm as STEL; A4 (not classifiable as a human carcinogen); BEI issued; (ACGIH 2004). MAK: 500 ppm 1200 mg/m³ Peak limitation category: I(2); Pregnancy risk group: D; (DFG 2006).Routes of exposure The substance can be absorbed into the body by inhalation and through the skin. Inhalation risk A harmful contamination of the air can be reached rather quickly on evaporation of this substance at 20°C ; on spraying or dispersing, however, much faster. Effects of short-term exposure The vapour irritates the eyes and the respiratory tract. The substance may cause effects on the central nervous system, liver, kidneys and gastrointestinal tract. Effects of long-term or repeated exposure Repeated or prolonged contact with skin may cause dermatitis. The substance may have effects on the blood and bone marrow. Again, depending upon which pieces of data for each chemical are required, perhaps one of these approaches might prove useful...ron |
#10
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
On Oct 16, 2:39*pm, ron wrote:
On Oct 16, 1:14*pm, Joel wrote: Ron: Look at the webpage below. *There are over 5000 chemicals each with different properties and warnings! http://www.ilo.org/public/english/pr...is/products/ic... "ron" wrote: On Oct 13, 5:36 pm, Roger on Excel wrote: I use hyperlinks for chemicals that access a webpage for the individual compounds showing tables of data for that chemical. I have a large list of chemicals and I want to automate the downloading of specific data in the tables from the web pages. The chemical data is stored in the same format of tables on each webpage. Can anyone help? Thanks, Roger Do you always want to obtain the same info, just for a diffrerent chemical? *If so, specifically what info do you want?..ron- Hide quoted text - - Show quoted text - Yes, but depending what information is required, perhaps capturing it a) from the source code or b) by obtaining specific tables might be a reasonable solution. *For example, Table 10 contains all of the "Important Data" information. * * Set cTables = ie.Document.getElementsByTagname("table") * * s = cTables(10).innertext * * Debug.Print s and "s" then contains IMPORTANT DATA Physical State; Appearance COLOURLESS LIQUID, WITH CHARACTERISTIC ODOUR. Physical dangers The vapour is heavier than air and may travel along the ground; distant ignition possible. Chemical dangers The substance can form explosive peroxides on contact with strong oxidants such as acetic acid, nitric acid, hydrogen peroxide. Reacts with chloroform and bromoform under basic conditions, causing fire and explosion hazard. Attacks plastic. Occupational exposure limits TLV: 500 ppm as TWA, 750 ppm as STEL; A4 (not classifiable as a human carcinogen); BEI issued; (ACGIH 2004). MAK: 500 ppm 1200 mg/m³ Peak limitation category: I(2); Pregnancy risk group: D; (DFG 2006).Routes of exposure The substance can be absorbed into the body by inhalation and through the skin. Inhalation risk A harmful contamination of the air can be reached rather quickly on evaporation of this substance at 20°C ; on spraying or dispersing, however, much faster. Effects of short-term exposure The vapour irritates the eyes and the respiratory tract. The substance may cause effects on the central nervous system, liver, kidneys and gastrointestinal tract. Effects of long-term or repeated exposure Repeated or prolonged contact with skin may cause dermatitis. The substance may have effects on the blood and bone marrow. Again, depending upon which pieces of data for each chemical are required, perhaps one of these approaches might prove useful...ron- Hide quoted text - - Show quoted text - Whoops, I checked a few more chemicals and found that Table 10 is not always the "Important Data" table. Still, I'd like to see an answer to my original question - what specific data fields does Roger want to capture?..ron |
#11
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
Ron: It looks a little easier to get the data from the webpage your way than
mine. Your has an advantage of getting the formated data from the innerhtml property. "ron" wrote: On Oct 16, 1:14 pm, Joel wrote: Ron: Look at the webpage below. There are over 5000 chemicals each with different properties and warnings! http://www.ilo.org/public/english/pr...is/products/ic... "ron" wrote: On Oct 13, 5:36 pm, Roger on Excel wrote: I use hyperlinks for chemicals that access a webpage for the individual compounds showing tables of data for that chemical. I have a large list of chemicals and I want to automate the downloading of specific data in the tables from the web pages. The chemical data is stored in the same format of tables on each webpage. Can anyone help? Thanks, Roger Do you always want to obtain the same info, just for a diffrerent chemical? If so, specifically what info do you want?..ron- Hide quoted text - - Show quoted text - Yes, but depending what information is required, perhaps capturing it a) from the source code or b) by obtaining specific tables might be a reasonable solution. For example, Table 10 contains all of the "Important Data" information. Set cTables = ie.Document.getElementsByTagname("table") s = cTables(10).innertext Debug.Print s and "s" then contains IMPORTANT DATA Physical State; Appearance COLOURLESS LIQUID, WITH CHARACTERISTIC ODOUR. Physical dangers The vapour is heavier than air and may travel along the ground; distant ignition possible. Chemical dangers The substance can form explosive peroxides on contact with strong oxidants such as acetic acid, nitric acid, hydrogen peroxide. Reacts with chloroform and bromoform under basic conditions, causing fire and explosion hazard. Attacks plastic. Occupational exposure limits TLV: 500 ppm as TWA, 750 ppm as STEL; A4 (not classifiable as a human carcinogen); BEI issued; (ACGIH 2004). MAK: 500 ppm 1200 mg/m³ Peak limitation category: I(2); Pregnancy risk group: D; (DFG 2006).Routes of exposure The substance can be absorbed into the body by inhalation and through the skin. Inhalation risk A harmful contamination of the air can be reached rather quickly on evaporation of this substance at 20°C ; on spraying or dispersing, however, much faster. Effects of short-term exposure The vapour irritates the eyes and the respiratory tract. The substance may cause effects on the central nervous system, liver, kidneys and gastrointestinal tract. Effects of long-term or repeated exposure Repeated or prolonged contact with skin may cause dermatitis. The substance may have effects on the blood and bone marrow. Again, depending upon which pieces of data for each chemical are required, perhaps one of these approaches might prove useful...ron |
#12
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
Thanks guys for all your help.
I am incredibly thankful for all your time trying to help me. I wish you all the best. Have a great weekend, Roger |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
![]() |
||||
Thread | Forum | |||
automating import of several text files from specified folder | Excel Programming | |||
Automating Import of Date Information | Excel Programming | |||
Automating import of Word tables into Excel | Excel Discussion (Misc queries) | |||
Automating data import and separation | Excel Programming | |||
Automating import of a certain type of 'txt' file | Excel Programming |