Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.programming
|
|||
|
|||
HTML / web page processing
I need to extract specific data from a web page I have pasted into Excel.
The page contains many hyperlinks and objects like pictures, controls, embedded controls related to 'flash player' as well as the required data. I've tried the following simplistic "code". The hyperlinks remove OK but the rest of the code results in a lock up and enforced shutdown. Since I have no way of referencing the required data eg by name, I need to remove all the "clutter" then I can manipulate what's left to get what I need. On Error Resume Next ActiveSheet.Hyperlinks.Delete For Each Control In ActiveSheet.Controls Control.Select Control.Delete Next Control For Each Picture In ActiveSheet.Pictures Picture.Select Selection.Delete Next Picture For Each Shape In ActiveSheet.Shapes Shape.Select Selection.Delete Next Shape End Sub -- donwb |
#2
Posted to microsoft.public.excel.programming
|
|||
|
|||
HTML / web page processing
Don't paste it if you can help it, use the WebQuery option in;
Data Import External Data New Web Query If you do this and use the default options, no pictures or other elements will be imported, just data. Post back if you have any more questions. |
#3
Posted to microsoft.public.excel.programming
|
|||
|
|||
HTML / web page processing
Try pasting special with text only
"donbowyer" wrote in message ... I need to extract specific data from a web page I have pasted into Excel. The page contains many hyperlinks and objects like pictures, controls, embedded controls related to 'flash player' as well as the required data. I've tried the following simplistic "code". The hyperlinks remove OK but the rest of the code results in a lock up and enforced shutdown. Since I have no way of referencing the required data eg by name, I need to remove all the "clutter" then I can manipulate what's left to get what I need. On Error Resume Next ActiveSheet.Hyperlinks.Delete For Each Control In ActiveSheet.Controls Control.Select Control.Delete Next Control For Each Picture In ActiveSheet.Pictures Picture.Select Selection.Delete Next Picture For Each Shape In ActiveSheet.Shapes Shape.Select Selection.Delete Next Shape End Sub -- donwb |
#4
Posted to microsoft.public.excel.programming
|
|||
|
|||
HTML / web page processing
if you get any help ... I am serching for something similar - I need to get the value currently selected in a html "select" control the Html snipit looks like this" <form <pSelect Item<br <SELECT NAME="Item" onChange="go()" <OPTION VALUE="3" Item-3-description <OPTION VALUE="1" Item-1-description <OPTION VALUE="5" Item-5-description <OPTION VALUE="4" Item-4-description <OPTION SELECTED VALUE="product-2-description" <table<tr<tdproduct-2-description </SELECT </p </form ? Word.Documents(1).Fields(1).Code.Text HTMLCONTROL Forms.HTML:Select.1 The following has the entire html text word.Documents(2).HTMLProject.HTMLProjectItems(1). Text but I was hoping not to have to parse this since word seems to know its a control. Using excel : ?ActiveSheet.OLEObjects(1).progid Forms.HTML:Select.1 ActiveSheet.OLEObjects(1).oletype = xlOLEControl (documentation only referes to xlOLEType constants: xlOLELink or xlOLEEmbed) documentation for "OLE Programmatic Identifiers" includes information for ActiveX Controls, Microsoft Access, Microsoft Excel, Microsoft Graph Microsoft Office Web Components, Microsoft Outlook, Microsoft PowerPoint Microsoft Word, ActiveX Controls ActiveX Controls To create the ActiveX controls listed in the following table, use the corresponding OLE programmatic identifier. To create this control Use this identifier CheckBox Forms.CheckBox.1 ComboBox Forms.ComboBox.1 etc. -- markbpdx ------------------------------------------------------------------------ markbpdx's Profile: http://www.excelforum.com/member.php...o&userid=37610 View this thread: http://www.excelforum.com/showthread...hreadid=525928 |
#5
Posted to microsoft.public.excel.programming
|
|||
|
|||
HTML / web page processing
if you get any help ... I am serching for something similar - I need to get the value currently selected in a html "select" control the Html snipit looks like this" <form <pSelect Item<br <SELECT NAME="Item" onChange="go()" <OPTION VALUE="3" Item-3-description <OPTION VALUE="1" Item-1-description <OPTION VALUE="5" Item-5-description <OPTION VALUE="4" Item-4-description <OPTION SELECTED VALUE="2"product-2-description </SELECT </p </form ? Word.Documents(1).Fields(1).Code.Text HTMLCONTROL Forms.HTML:Select.1 The following has the entire html text word.Documents(2).HTMLProject.HTMLProjectItems(1). Text but I was hoping not to have to parse this since word seems to know its a control. Using excel : ?ActiveSheet.OLEObjects(1).progid Forms.HTML:Select.1 ActiveSheet.OLEObjects(1).oletype = xlOLEControl (documentation only referes to xlOLEType constants: xlOLELink or xlOLEEmbed) documentation for "OLE Programmatic Identifiers" includes information for ActiveX Controls, Microsoft Access, Microsoft Excel, Microsoft Graph Microsoft Office Web Components, Microsoft Outlook, Microsoft PowerPoint Microsoft Word, ActiveX Controls ActiveX Controls To create the ActiveX controls listed in the following table, use the corresponding OLE programmatic identifier. To create this control Use this identifier CheckBox Forms.CheckBox.1 ComboBox Forms.ComboBox.1 etc. Idealy I would like to be able to determine the current selected item and its associated Value(s) -- markbpdx ------------------------------------------------------------------------ markbpdx's Profile: http://www.excelforum.com/member.php...o&userid=37610 View this thread: http://www.excelforum.com/showthread...hreadid=525928 |
#6
Posted to microsoft.public.excel.programming
|
|||
|
|||
HTML / web page processing
Where is the HTML page loaded? In IE ?
The HTML portion is clear, but the relevance of all the stuff below is not. ....and the thread you're tagging on to is long gone from my newsgroup server so no clues there either. Tim "markbpdx" wrote in message ... if you get any help ... I am serching for something similar - I need to get the value currently selected in a html "select" control the Html snipit looks like this" <form <pSelect Item<br <SELECT NAME="Item" onChange="go()" <OPTION VALUE="3" Item-3-description <OPTION VALUE="1" Item-1-description <OPTION VALUE="5" Item-5-description <OPTION VALUE="4" Item-4-description <OPTION SELECTED VALUE="product-2-description" <table<tr<tdproduct-2-description </SELECT </p </form ? Word.Documents(1).Fields(1).Code.Text HTMLCONTROL Forms.HTML:Select.1 The following has the entire html text word.Documents(2).HTMLProject.HTMLProjectItems(1). Text but I was hoping not to have to parse this since word seems to know its a control. Using excel : ?ActiveSheet.OLEObjects(1).progid Forms.HTML:Select.1 ActiveSheet.OLEObjects(1).oletype = xlOLEControl (documentation only referes to xlOLEType constants: xlOLELink or xlOLEEmbed) documentation for "OLE Programmatic Identifiers" includes information for ActiveX Controls, Microsoft Access, Microsoft Excel, Microsoft Graph Microsoft Office Web Components, Microsoft Outlook, Microsoft PowerPoint Microsoft Word, ActiveX Controls ActiveX Controls To create the ActiveX controls listed in the following table, use the corresponding OLE programmatic identifier. To create this control Use this identifier CheckBox Forms.CheckBox.1 ComboBox Forms.ComboBox.1 etc. -- markbpdx ------------------------------------------------------------------------ markbpdx's Profile: http://www.excelforum.com/member.php...o&userid=37610 View this thread: http://www.excelforum.com/showthread...hreadid=525928 |
#7
Posted to microsoft.public.excel.programming
|
|||
|
|||
HTML / web page processing
I am serching for something that will tell the value currently selected in a html "select" control using VBA from Excel or Word (or for that matter any ole controls that have an oletype=xlOLEControl and a progid=forms.HTML:[xxxxx] where [xxxxx] is some html form tag) the Html snipit looks like this" <form <pSelect Item<br <SELECT NAME="Item" onChange="go()" <OPTION VALUE="3" Item-3-description <OPTION VALUE="1" Item-1-description <OPTION VALUE="5" Item-5-description <OPTION VALUE="4" Item-4-description <OPTION SELECTED VALUE="2"product-2-description </SELECT </p </form the web page is opened in word (or excel) using an open method similar to the following Word.Documents.Open "http://xxx.yyy.com/page.php?" & paramater_list the web page is displayed in the Word or Excel Document/worksheet ------------------------------------------ In the immediate window ?Word.Documents(1).Fields(1).Code.Text returns this as a value without the quotes "HTMLCONTROL Forms.HTML:Select.1" (the fields property seems to be the one you would use if it was a word or excel form) The following property returns the entire html text: word.Documents(1).HTMLProject.HTMLProjectItems(1). Text But I was hoping not to have to parse this since word seems to know its a control. Using excel in a similar fasion: ?ActiveSheet.OLEObjects(1).progid returns "Forms.HTML:Select.1" Excel also has an oleObjects collection which returns oletype xlOLEControl (documentation only referes to xlOLEType constants: xlOLELink or xlOLEEmbed as being returned ) ActiveSheet.OLEObjects(1).oletype = xlOLEControl -------------------------------------------------------------------- The following is excerpted from vba documentation for "OLE Programmatic Identifiers" ----------------------------------------------------------------------- ActiveX Controls, Microsoft Access, Microsoft Excel, Microsoft Graph Microsoft Office Web Components, Microsoft Outlook, Microsoft PowerPoint Microsoft Word, ActiveX Controls ActiveX Controls To create the ActiveX controls listed in the following table, use the corresponding OLE programmatic identifier. _To_create_this_contro_l _Use_this_identifier__ CheckBox Forms.CheckBox.1 ComboBox Forms.ComboBox.1 etc. ------------------------------------------------------------------------ that section doesn't seem to refer to any identifiers with HTML in them but it seems like it would be the likely place to find them. Idealy I would like to be able to determine the current selected item and its associated Value(s) -- markbpdx ------------------------------------------------------------------------ markbpdx's Profile: http://www.excelforum.com/member.php...o&userid=37610 View this thread: http://www.excelforum.com/showthread...hreadid=525928 |
#8
Posted to microsoft.public.excel.programming
|
|||
|
|||
HTML / web page processing
Have you tried automating IE to load the page and then reading the values
from there? Navigating a page loaded in Word/Excel doesn't seem to expose the controls in a predictable way, and the "native" HTML methods/properties are not available. Lots of past posts in this group on using IE. Tim "markbpdx" wrote in message ... I am serching for something that will tell the value currently selected in a html "select" control using VBA from Excel or Word (or for that matter any ole controls that have an oletype=xlOLEControl and a progid=forms.HTML:[xxxxx] where [xxxxx] is some html form tag) the Html snipit looks like this" <form <pSelect Item<br <SELECT NAME="Item" onChange="go()" <OPTION VALUE="3" Item-3-description <OPTION VALUE="1" Item-1-description <OPTION VALUE="5" Item-5-description <OPTION VALUE="4" Item-4-description <OPTION SELECTED VALUE="2"product-2-description </SELECT </p </form the web page is opened in word (or excel) using an open method similar to the following Word.Documents.Open "http://xxx.yyy.com/page.php?" & paramater_list the web page is displayed in the Word or Excel Document/worksheet ------------------------------------------ In the immediate window ?Word.Documents(1).Fields(1).Code.Text returns this as a value without the quotes "HTMLCONTROL Forms.HTML:Select.1" (the fields property seems to be the one you would use if it was a word or excel form) The following property returns the entire html text: word.Documents(1).HTMLProject.HTMLProjectItems(1). Text But I was hoping not to have to parse this since word seems to know its a control. Using excel in a similar fasion: ?ActiveSheet.OLEObjects(1).progid returns "Forms.HTML:Select.1" Excel also has an oleObjects collection which returns oletype xlOLEControl (documentation only referes to xlOLEType constants: xlOLELink or xlOLEEmbed as being returned ) ActiveSheet.OLEObjects(1).oletype = xlOLEControl -------------------------------------------------------------------- The following is excerpted from vba documentation for "OLE Programmatic Identifiers" ----------------------------------------------------------------------- ActiveX Controls, Microsoft Access, Microsoft Excel, Microsoft Graph Microsoft Office Web Components, Microsoft Outlook, Microsoft PowerPoint Microsoft Word, ActiveX Controls ActiveX Controls To create the ActiveX controls listed in the following table, use the corresponding OLE programmatic identifier. _To_create_this_contro_l _Use_this_identifier__ CheckBox Forms.CheckBox.1 ComboBox Forms.ComboBox.1 etc. ------------------------------------------------------------------------ that section doesn't seem to refer to any identifiers with HTML in them but it seems like it would be the likely place to find them. Idealy I would like to be able to determine the current selected item and its associated Value(s) -- markbpdx ------------------------------------------------------------------------ markbpdx's Profile: http://www.excelforum.com/member.php...o&userid=37610 View this thread: http://www.excelforum.com/showthread...hreadid=525928 |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
Stop excel from processing HTML? IE paste raw html. | Excel Discussion (Misc queries) | |||
How do I bookmark an HTML page? | Excel Discussion (Misc queries) | |||
How do I create a HTML text for my web page? | New Users to Excel | |||
Excel < HTML Web page - How ? | Excel Programming | |||
saving as HTML page | Excel Discussion (Misc queries) |