ExcelBanter

ExcelBanter (https://www.excelbanter.com/)
-   Excel Programming (https://www.excelbanter.com/excel-programming/)
-   -   Can I import raw HTML source code into Excel? (https://www.excelbanter.com/excel-programming/325621-can-i-import-raw-html-source-code-into-excel.html)

J Diorio

Can I import raw HTML source code into Excel?
 

I want to run a web query from Excel that retreives the raw HTML code, not
the parsed HTML page.

Specifically, instead of retreiving text from the page, I want to return the
page's HTML title (read: the text that falls between the <title tags).

Unfortunately, the web query interface only lets you pick page content. I
recorded the web query process and tried hacking the VBA by hand, but it
doesn't appear that there are any objects/parameters/commands that allow me
to import the page as pure source code.

Does anyone know how to trick the web query into loading the source code
into worksheet? If so, I can then write some VBA to strip the <title value
out for me.

Any gurus out there have an answer?

Thanks in advance-
Jon

Jake Marx[_3_]

Can I import raw HTML source code into Excel?
 
Hi Jon,

Here's a function that should return the page title of an HTML document (or
an empty string if URL is invalid or can't be found):

Public Function gsGetURLTitle(rsURL As String) As String
Dim ie As Object

Set ie = CreateObject("InternetExplorer.Application")

With ie
.Navigate rsURL
Do While .Busy And Not .ReadyState = 4
DoEvents
Loop

gsGetURLTitle = IIf(StrComp(.Document.Title, _
"Cannot find server", vbTextCompare) = 0, _
vbNullString, .Document.Title)
.Quit
End With

Set ie = Nothing
End Function

--
Regards,

Jake Marx
MS MVP - Excel
www.longhead.com

[please keep replies in the newsgroup - email address unmonitored]


J Diorio wrote:
I want to run a web query from Excel that retreives the raw HTML
code, not the parsed HTML page.

Specifically, instead of retreiving text from the page, I want to
return the page's HTML title (read: the text that falls between the
<title tags).

Unfortunately, the web query interface only lets you pick page
content. I recorded the web query process and tried hacking the VBA
by hand, but it doesn't appear that there are any
objects/parameters/commands that allow me to import the page as pure
source code.

Does anyone know how to trick the web query into loading the source
code into worksheet? If so, I can then write some VBA to strip the
<title value out for me.

Any gurus out there have an answer?

Thanks in advance-
Jon



Jon d

Can I import raw HTML source code into Excel?
 

THANKS JAKE! I can't believe you did that in 4 minutes.

-Jon


"Jake Marx" wrote:

Hi Jon,

Here's a function that should return the page title of an HTML document (or
an empty string if URL is invalid or can't be found):

Public Function gsGetURLTitle(rsURL As String) As String
Dim ie As Object

Set ie = CreateObject("InternetExplorer.Application")

With ie
.Navigate rsURL
Do While .Busy And Not .ReadyState = 4
DoEvents
Loop

gsGetURLTitle = IIf(StrComp(.Document.Title, _
"Cannot find server", vbTextCompare) = 0, _
vbNullString, .Document.Title)
.Quit
End With

Set ie = Nothing
End Function

--
Regards,

Jake Marx
MS MVP - Excel
www.longhead.com

[please keep replies in the newsgroup - email address unmonitored]


J Diorio wrote:
I want to run a web query from Excel that retreives the raw HTML
code, not the parsed HTML page.

Specifically, instead of retreiving text from the page, I want to
return the page's HTML title (read: the text that falls between the
<title tags).

Unfortunately, the web query interface only lets you pick page
content. I recorded the web query process and tried hacking the VBA
by hand, but it doesn't appear that there are any
objects/parameters/commands that allow me to import the page as pure
source code.

Does anyone know how to trick the web query into loading the source
code into worksheet? If so, I can then write some VBA to strip the
<title value out for me.

Any gurus out there have an answer?

Thanks in advance-
Jon




Jake Marx[_3_]

Can I import raw HTML source code into Excel?
 
Jon d wrote:
THANKS JAKE! I can't believe you did that in 4 minutes.


No problem - glad to help! And actually, it was more like 15 minutes. But
who's counting? <g

--
Regards,

Jake Marx
MS MVP - Excel
www.longhead.com

[please keep replies in the newsgroup - email address unmonitored]


got mike?[_2_]

Can I import raw HTML source code into Excel?
 
okay, that works for the title, but i'm trying to parse the actual html to
pick out data in the middle of the page. how would you alter this to search
to just get the raw html code on the page?

"Jake Marx" wrote:

Hi Jon,

Here's a function that should return the page title of an HTML document (or
an empty string if URL is invalid or can't be found):

Public Function gsGetURLTitle(rsURL As String) As String
Dim ie As Object

Set ie = CreateObject("InternetExplorer.Application")

With ie
.Navigate rsURL
Do While .Busy And Not .ReadyState = 4
DoEvents
Loop

gsGetURLTitle = IIf(StrComp(.Document.Title, _
"Cannot find server", vbTextCompare) = 0, _
vbNullString, .Document.Title)
.Quit
End With

Set ie = Nothing
End Function

--
Regards,

Jake Marx
MS MVP - Excel
www.longhead.com

[please keep replies in the newsgroup - email address unmonitored]


J Diorio wrote:
I want to run a web query from Excel that retreives the raw HTML
code, not the parsed HTML page.

Specifically, instead of retreiving text from the page, I want to
return the page's HTML title (read: the text that falls between the
<title tags).

Unfortunately, the web query interface only lets you pick page
content. I recorded the web query process and tried hacking the VBA
by hand, but it doesn't appear that there are any
objects/parameters/commands that allow me to import the page as pure
source code.

Does anyone know how to trick the web query into loading the source
code into worksheet? If so, I can then write some VBA to strip the
<title value out for me.

Any gurus out there have an answer?

Thanks in advance-
Jon




Tim Williams

Can I import raw HTML source code into Excel?
 

..Document.Body.outerHTML

Tim


"got mike?" wrote in message
...
okay, that works for the title, but i'm trying to parse the actual html to
pick out data in the middle of the page. how would you alter this to
search
to just get the raw html code on the page?

"Jake Marx" wrote:

Hi Jon,

Here's a function that should return the page title of an HTML document
(or
an empty string if URL is invalid or can't be found):

Public Function gsGetURLTitle(rsURL As String) As String
Dim ie As Object

Set ie = CreateObject("InternetExplorer.Application")

With ie
.Navigate rsURL
Do While .Busy And Not .ReadyState = 4
DoEvents
Loop

gsGetURLTitle = IIf(StrComp(.Document.Title, _
"Cannot find server", vbTextCompare) = 0, _
vbNullString, .Document.Title)
.Quit
End With

Set ie = Nothing
End Function

--
Regards,

Jake Marx
MS MVP - Excel
www.longhead.com





All times are GMT +1. The time now is 08:07 AM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
ExcelBanter.com