Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.programming
|
|||
|
|||
Can I import raw HTML source code into Excel?
I want to run a web query from Excel that retreives the raw HTML code, not the parsed HTML page. Specifically, instead of retreiving text from the page, I want to return the page's HTML title (read: the text that falls between the <title tags). Unfortunately, the web query interface only lets you pick page content. I recorded the web query process and tried hacking the VBA by hand, but it doesn't appear that there are any objects/parameters/commands that allow me to import the page as pure source code. Does anyone know how to trick the web query into loading the source code into worksheet? If so, I can then write some VBA to strip the <title value out for me. Any gurus out there have an answer? Thanks in advance- Jon |
#2
Posted to microsoft.public.excel.programming
|
|||
|
|||
Can I import raw HTML source code into Excel?
Hi Jon,
Here's a function that should return the page title of an HTML document (or an empty string if URL is invalid or can't be found): Public Function gsGetURLTitle(rsURL As String) As String Dim ie As Object Set ie = CreateObject("InternetExplorer.Application") With ie .Navigate rsURL Do While .Busy And Not .ReadyState = 4 DoEvents Loop gsGetURLTitle = IIf(StrComp(.Document.Title, _ "Cannot find server", vbTextCompare) = 0, _ vbNullString, .Document.Title) .Quit End With Set ie = Nothing End Function -- Regards, Jake Marx MS MVP - Excel www.longhead.com [please keep replies in the newsgroup - email address unmonitored] J Diorio wrote: I want to run a web query from Excel that retreives the raw HTML code, not the parsed HTML page. Specifically, instead of retreiving text from the page, I want to return the page's HTML title (read: the text that falls between the <title tags). Unfortunately, the web query interface only lets you pick page content. I recorded the web query process and tried hacking the VBA by hand, but it doesn't appear that there are any objects/parameters/commands that allow me to import the page as pure source code. Does anyone know how to trick the web query into loading the source code into worksheet? If so, I can then write some VBA to strip the <title value out for me. Any gurus out there have an answer? Thanks in advance- Jon |
#3
Posted to microsoft.public.excel.programming
|
|||
|
|||
Can I import raw HTML source code into Excel?
THANKS JAKE! I can't believe you did that in 4 minutes. -Jon "Jake Marx" wrote: Hi Jon, Here's a function that should return the page title of an HTML document (or an empty string if URL is invalid or can't be found): Public Function gsGetURLTitle(rsURL As String) As String Dim ie As Object Set ie = CreateObject("InternetExplorer.Application") With ie .Navigate rsURL Do While .Busy And Not .ReadyState = 4 DoEvents Loop gsGetURLTitle = IIf(StrComp(.Document.Title, _ "Cannot find server", vbTextCompare) = 0, _ vbNullString, .Document.Title) .Quit End With Set ie = Nothing End Function -- Regards, Jake Marx MS MVP - Excel www.longhead.com [please keep replies in the newsgroup - email address unmonitored] J Diorio wrote: I want to run a web query from Excel that retreives the raw HTML code, not the parsed HTML page. Specifically, instead of retreiving text from the page, I want to return the page's HTML title (read: the text that falls between the <title tags). Unfortunately, the web query interface only lets you pick page content. I recorded the web query process and tried hacking the VBA by hand, but it doesn't appear that there are any objects/parameters/commands that allow me to import the page as pure source code. Does anyone know how to trick the web query into loading the source code into worksheet? If so, I can then write some VBA to strip the <title value out for me. Any gurus out there have an answer? Thanks in advance- Jon |
#4
Posted to microsoft.public.excel.programming
|
|||
|
|||
Can I import raw HTML source code into Excel?
Jon d wrote:
THANKS JAKE! I can't believe you did that in 4 minutes. No problem - glad to help! And actually, it was more like 15 minutes. But who's counting? <g -- Regards, Jake Marx MS MVP - Excel www.longhead.com [please keep replies in the newsgroup - email address unmonitored] |
#5
Posted to microsoft.public.excel.programming
|
|||
|
|||
Can I import raw HTML source code into Excel?
okay, that works for the title, but i'm trying to parse the actual html to
pick out data in the middle of the page. how would you alter this to search to just get the raw html code on the page? "Jake Marx" wrote: Hi Jon, Here's a function that should return the page title of an HTML document (or an empty string if URL is invalid or can't be found): Public Function gsGetURLTitle(rsURL As String) As String Dim ie As Object Set ie = CreateObject("InternetExplorer.Application") With ie .Navigate rsURL Do While .Busy And Not .ReadyState = 4 DoEvents Loop gsGetURLTitle = IIf(StrComp(.Document.Title, _ "Cannot find server", vbTextCompare) = 0, _ vbNullString, .Document.Title) .Quit End With Set ie = Nothing End Function -- Regards, Jake Marx MS MVP - Excel www.longhead.com [please keep replies in the newsgroup - email address unmonitored] J Diorio wrote: I want to run a web query from Excel that retreives the raw HTML code, not the parsed HTML page. Specifically, instead of retreiving text from the page, I want to return the page's HTML title (read: the text that falls between the <title tags). Unfortunately, the web query interface only lets you pick page content. I recorded the web query process and tried hacking the VBA by hand, but it doesn't appear that there are any objects/parameters/commands that allow me to import the page as pure source code. Does anyone know how to trick the web query into loading the source code into worksheet? If so, I can then write some VBA to strip the <title value out for me. Any gurus out there have an answer? Thanks in advance- Jon |
#6
Posted to microsoft.public.excel.programming
|
|||
|
|||
Can I import raw HTML source code into Excel?
..Document.Body.outerHTML Tim "got mike?" wrote in message ... okay, that works for the title, but i'm trying to parse the actual html to pick out data in the middle of the page. how would you alter this to search to just get the raw html code on the page? "Jake Marx" wrote: Hi Jon, Here's a function that should return the page title of an HTML document (or an empty string if URL is invalid or can't be found): Public Function gsGetURLTitle(rsURL As String) As String Dim ie As Object Set ie = CreateObject("InternetExplorer.Application") With ie .Navigate rsURL Do While .Busy And Not .ReadyState = 4 DoEvents Loop gsGetURLTitle = IIf(StrComp(.Document.Title, _ "Cannot find server", vbTextCompare) = 0, _ vbNullString, .Document.Title) .Quit End With Set ie = Nothing End Function -- Regards, Jake Marx MS MVP - Excel www.longhead.com |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
How to import html file in excel 2007 | Excel Discussion (Misc queries) | |||
HTML TAGS that format Data for import into Excel | Excel Programming | |||
Import HTML reports into Excel with page formatting options | Excel Programming | |||
Parsing Data From HTML Source into Excel using VBA | Excel Programming | |||
Calling HTML Source code from within VBA for Excel | Excel Programming |