Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 395
Default Excel VBA to acess currently open web page (existing IE6 insta

Don- thank you for your generous offer. Unfortunately, the URL is on a secure
site, and I have no way to give access to others; the good news is that I can
right-click and I see that the page source contains the information I want to
scrape with RegEx, so really any web page will do.

Here is a test case:
* Open IE
* Navigate to www.google.com (or an alternate address of your preference)
* Click View/Source - this will bring up the page source in notepad

The part I'm missing is how to get this source automatically from an
"already open" IE window into Excel as a string. With Google it would be easy
to open the URL directly from Excel VBA, but my challenge is to capture that
text string directly from an already open browser window.

I've been abusing code snippets from all sorts of sources (non-VBA sites,
etc) but haven't gotten anywhere. I expect it will be something like:

Dim IE As SHDocVw.InternetExplorer
Do
MsgBox IE.document '.body.innerHTML?
Loop

I appreciate your time and expertise, and any advice you might be able to
offer.

Thank you!
Keith


"Don Guillett" wrote:

There may be a way but witout seeing all the info, it's hard to say. Give
urls and the desired. OR,
If desired, send your file to my address below along with this msg and
a clear explanation of what you want and before/after examples.

--
Don Guillett
Microsoft MVP Excel
SalesAid Software

"ker_01" wrote in message
...
Thanks to Ron and ryguy for responses to an earlier thread- I asked my
question poorly and the thread died, so here goes another try.

I have a series of web pages that for the sake of convenience, I really
can't access programatically with Excel (I know it is probably possible,
but
there are several pages of data entry to get to the target pages, and it
isn't practical)

So, I'm perfectly willing to access the web page directly in IE6. However,
I
want to scrape the target page's source and pull a bunch of data into
Excel.
This web page design prevents copy/paste, so my only way to scrape the
data
is from the page source (or hand-retype all the data, which is what I'm
trying to avoid).

I haven't found any information on how to get Excel to interact with an
existing IE6 instance and grab the page source on the fly (when a macro
runs). Can anyone point me in the right direction?

Also, I'm currently on IE6, but if a newer version of IE would make this
easier to program, I'd consider upgrading.

Many thanks,
Keith



  #2   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 117
Default Excel VBA to acess currently open web page (existing IE6 insta

Keith,

What I show below might get you close. The line
strPartURL = "google"
should be amended so the string on the right is some portion of the URL for
the page you will have open (this is to prevent problems if you happen to
have multiple web pages open at once). I used google for the test.

Note that this delivers the inner html for the body of the document which is
not delivering as much as you get when you right click and use View Source.
If it doesn't deliver what you need, I suspect you can work with the "All"
collection and retrieve the specific text you're after. This at least shows
you how to capture content from already open IE windows.


'-------------------------------------------

Sub GetInnerHTML()

Dim strHTML As String
Dim strPartURL As String
Dim strPageHTML As String


strPartURL = "google"

Set objShell = CreateObject("Shell.Application")
Set objShellWindows = objShell.Windows

If objShellWindows.Count = 0 Then
Exit Sub
End If

For i = 0 To objShellWindows.Count - 1
Set objIE = objShellWindows.Item(i)
If InStr(objIE.LocationURL, strPartURL) Then
Set rng = objIE.Document.Body.CreateTextRange
strPageHTML = rng.htmlText
End If
Next i

MsgBox strPageHTML

Set objShellWindows = Nothing
Set objShell = Nothing
End Sub


'------------------------------------------

Steve Yandl



"ker_01" wrote in message
...
Don- thank you for your generous offer. Unfortunately, the URL is on a
secure
site, and I have no way to give access to others; the good news is that I
can
right-click and I see that the page source contains the information I want
to
scrape with RegEx, so really any web page will do.

Here is a test case:
* Open IE
* Navigate to www.google.com (or an alternate address of your preference)
* Click View/Source - this will bring up the page source in notepad

The part I'm missing is how to get this source automatically from an
"already open" IE window into Excel as a string. With Google it would be
easy
to open the URL directly from Excel VBA, but my challenge is to capture
that
text string directly from an already open browser window.

I've been abusing code snippets from all sorts of sources (non-VBA sites,
etc) but haven't gotten anywhere. I expect it will be something like:

Dim IE As SHDocVw.InternetExplorer
Do
MsgBox IE.document '.body.innerHTML?
Loop

I appreciate your time and expertise, and any advice you might be able to
offer.

Thank you!
Keith


"Don Guillett" wrote:

There may be a way but witout seeing all the info, it's hard to say. Give
urls and the desired. OR,
If desired, send your file to my address below along with this msg
and
a clear explanation of what you want and before/after examples.

--
Don Guillett
Microsoft MVP Excel
SalesAid Software

"ker_01" wrote in message
...
Thanks to Ron and ryguy for responses to an earlier thread- I asked my
question poorly and the thread died, so here goes another try.

I have a series of web pages that for the sake of convenience, I really
can't access programatically with Excel (I know it is probably
possible,
but
there are several pages of data entry to get to the target pages, and
it
isn't practical)

So, I'm perfectly willing to access the web page directly in IE6.
However,
I
want to scrape the target page's source and pull a bunch of data into
Excel.
This web page design prevents copy/paste, so my only way to scrape the
data
is from the page source (or hand-retype all the data, which is what I'm
trying to avoid).

I haven't found any information on how to get Excel to interact with an
existing IE6 instance and grab the page source on the fly (when a macro
runs). Can anyone point me in the right direction?

Also, I'm currently on IE6, but if a newer version of IE would make
this
easier to program, I'd consider upgrading.

Many thanks,
Keith




  #3   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 395
Default Excel VBA to acess currently open web page (existing IE6 insta

Thanks Steve and Tim- this definitely has me on the right track.

Turns out that the content is not within the innerHTML, and I've been
googling and checking MSDN for syntax to return all, but so far no luck. If
you happen to come back to this thread, I'd welcome any syntax hints on how
to return the full page source, which I can then parse with RegEx.

Many thanks,
Keith

"Steve Yandl" wrote:

Keith,

What I show below might get you close. The line
strPartURL = "google"
should be amended so the string on the right is some portion of the URL for
the page you will have open (this is to prevent problems if you happen to
have multiple web pages open at once). I used google for the test.

Note that this delivers the inner html for the body of the document which is
not delivering as much as you get when you right click and use View Source.
If it doesn't deliver what you need, I suspect you can work with the "All"
collection and retrieve the specific text you're after. This at least shows
you how to capture content from already open IE windows.


'-------------------------------------------

Sub GetInnerHTML()

Dim strHTML As String
Dim strPartURL As String
Dim strPageHTML As String


strPartURL = "google"

Set objShell = CreateObject("Shell.Application")
Set objShellWindows = objShell.Windows

If objShellWindows.Count = 0 Then
Exit Sub
End If

For i = 0 To objShellWindows.Count - 1
Set objIE = objShellWindows.Item(i)
If InStr(objIE.LocationURL, strPartURL) Then
Set rng = objIE.Document.Body.CreateTextRange
strPageHTML = rng.htmlText
End If
Next i

MsgBox strPageHTML

Set objShellWindows = Nothing
Set objShell = Nothing
End Sub


'------------------------------------------

Steve Yandl



"ker_01" wrote in message
...
Don- thank you for your generous offer. Unfortunately, the URL is on a
secure
site, and I have no way to give access to others; the good news is that I
can
right-click and I see that the page source contains the information I want
to
scrape with RegEx, so really any web page will do.

Here is a test case:
* Open IE
* Navigate to www.google.com (or an alternate address of your preference)
* Click View/Source - this will bring up the page source in notepad

The part I'm missing is how to get this source automatically from an
"already open" IE window into Excel as a string. With Google it would be
easy
to open the URL directly from Excel VBA, but my challenge is to capture
that
text string directly from an already open browser window.

I've been abusing code snippets from all sorts of sources (non-VBA sites,
etc) but haven't gotten anywhere. I expect it will be something like:

Dim IE As SHDocVw.InternetExplorer
Do
MsgBox IE.document '.body.innerHTML?
Loop

I appreciate your time and expertise, and any advice you might be able to
offer.

Thank you!
Keith


"Don Guillett" wrote:

There may be a way but witout seeing all the info, it's hard to say. Give
urls and the desired. OR,
If desired, send your file to my address below along with this msg
and
a clear explanation of what you want and before/after examples.

--
Don Guillett
Microsoft MVP Excel
SalesAid Software

"ker_01" wrote in message
...
Thanks to Ron and ryguy for responses to an earlier thread- I asked my
question poorly and the thread died, so here goes another try.

I have a series of web pages that for the sake of convenience, I really
can't access programatically with Excel (I know it is probably
possible,
but
there are several pages of data entry to get to the target pages, and
it
isn't practical)

So, I'm perfectly willing to access the web page directly in IE6.
However,
I
want to scrape the target page's source and pull a bunch of data into
Excel.
This web page design prevents copy/paste, so my only way to scrape the
data
is from the page source (or hand-retype all the data, which is what I'm
trying to avoid).

I haven't found any information on how to get Excel to interact with an
existing IE6 instance and grab the page source on the fly (when a macro
runs). Can anyone point me in the right direction?

Also, I'm currently on IE6, but if a newer version of IE would make
this
easier to program, I'd consider upgrading.

Many thanks,
Keith





  #4   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 117
Default Excel VBA to acess currently open web page (existing IE6 insta

Keith,

I'll experiment a bit and see if I can come up with something a bit
different to try.

The object I've used to retrieve the complete source text for a page is
"MSXML2.XMLHTTP". It works great but I've only used it where I had a URL
for a web page not yet open on my desktop. I'm just not sure how it would
behave when the page has essentially been updated by other pages in the
current active session.


Steve



"ker_01" wrote in message
...
Thanks Steve and Tim- this definitely has me on the right track.

Turns out that the content is not within the innerHTML, and I've been
googling and checking MSDN for syntax to return all, but so far no luck.
If
you happen to come back to this thread, I'd welcome any syntax hints on
how
to return the full page source, which I can then parse with RegEx.

Many thanks,
Keith

"Steve Yandl" wrote:

Keith,

What I show below might get you close. The line
strPartURL = "google"
should be amended so the string on the right is some portion of the URL
for
the page you will have open (this is to prevent problems if you happen to
have multiple web pages open at once). I used google for the test.

Note that this delivers the inner html for the body of the document which
is
not delivering as much as you get when you right click and use View
Source.
If it doesn't deliver what you need, I suspect you can work with the
"All"
collection and retrieve the specific text you're after. This at least
shows
you how to capture content from already open IE windows.


'-------------------------------------------

Sub GetInnerHTML()

Dim strHTML As String
Dim strPartURL As String
Dim strPageHTML As String


strPartURL = "google"

Set objShell = CreateObject("Shell.Application")
Set objShellWindows = objShell.Windows

If objShellWindows.Count = 0 Then
Exit Sub
End If

For i = 0 To objShellWindows.Count - 1
Set objIE = objShellWindows.Item(i)
If InStr(objIE.LocationURL, strPartURL) Then
Set rng = objIE.Document.Body.CreateTextRange
strPageHTML = rng.htmlText
End If
Next i

MsgBox strPageHTML

Set objShellWindows = Nothing
Set objShell = Nothing
End Sub


'------------------------------------------

Steve Yandl



"ker_01" wrote in message
...
Don- thank you for your generous offer. Unfortunately, the URL is on a
secure
site, and I have no way to give access to others; the good news is that
I
can
right-click and I see that the page source contains the information I
want
to
scrape with RegEx, so really any web page will do.

Here is a test case:
* Open IE
* Navigate to www.google.com (or an alternate address of your
preference)
* Click View/Source - this will bring up the page source in notepad

The part I'm missing is how to get this source automatically from an
"already open" IE window into Excel as a string. With Google it would
be
easy
to open the URL directly from Excel VBA, but my challenge is to capture
that
text string directly from an already open browser window.

I've been abusing code snippets from all sorts of sources (non-VBA
sites,
etc) but haven't gotten anywhere. I expect it will be something like:

Dim IE As SHDocVw.InternetExplorer
Do
MsgBox IE.document '.body.innerHTML?
Loop

I appreciate your time and expertise, and any advice you might be able
to
offer.

Thank you!
Keith


"Don Guillett" wrote:

There may be a way but witout seeing all the info, it's hard to say.
Give
urls and the desired. OR,
If desired, send your file to my address below along with this
msg
and
a clear explanation of what you want and before/after examples.

--
Don Guillett
Microsoft MVP Excel
SalesAid Software

"ker_01" wrote in message
...
Thanks to Ron and ryguy for responses to an earlier thread- I asked
my
question poorly and the thread died, so here goes another try.

I have a series of web pages that for the sake of convenience, I
really
can't access programatically with Excel (I know it is probably
possible,
but
there are several pages of data entry to get to the target pages,
and
it
isn't practical)

So, I'm perfectly willing to access the web page directly in IE6.
However,
I
want to scrape the target page's source and pull a bunch of data
into
Excel.
This web page design prevents copy/paste, so my only way to scrape
the
data
is from the page source (or hand-retype all the data, which is what
I'm
trying to avoid).

I haven't found any information on how to get Excel to interact with
an
existing IE6 instance and grab the page source on the fly (when a
macro
runs). Can anyone point me in the right direction?

Also, I'm currently on IE6, but if a newer version of IE would make
this
easier to program, I'd consider upgrading.

Many thanks,
Keith






  #5   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 114
Default Excel VBA to acess currently open web page (existing IE6 insta


document.body.innerHTML

should do it. Without seeing the page you're trying to capture it's
hard to know what else to suggest.

Tim

On Jul 14, 3:30*pm, ker_01 wrote:
Thanks Steve and Tim- this definitely has me on the right track.

Turns out that the content is not within the innerHTML, and I've been
googling and checking MSDN for syntax to return all, but so far no luck. If
you happen to come back to this thread, I'd welcome any syntax hints on how
to return the full page source, which I can then parse with RegEx.

Many thanks,
Keith



"Steve Yandl" wrote:
Keith,


What I show below might get you close. *The line
strPartURL = "google"
should be amended so the string on the right is some portion of the URL for
the page you will have open (this is to prevent problems if you happen to
have multiple web pages open at once). *I used google for the test.


Note that this delivers the inner html for the body of the document which is
not delivering as much as you get when you right click and use View Source.
If it doesn't deliver what you need, I suspect you can work with the "All"
collection and retrieve the specific text you're after. *This at least shows
you how to capture content from already open IE windows.


'-------------------------------------------


Sub GetInnerHTML()


Dim strHTML As String
Dim strPartURL As String
Dim strPageHTML As String


strPartURL = "google"


Set objShell = CreateObject("Shell.Application")
Set objShellWindows = objShell.Windows


If objShellWindows.Count = 0 Then
* * Exit Sub
End If


For i = 0 To objShellWindows.Count - 1
* * Set objIE = objShellWindows.Item(i)
* * If InStr(objIE.LocationURL, strPartURL) Then
* * * * Set rng = objIE.Document.Body.CreateTextRange
* * * * strPageHTML = rng.htmlText
* * End If
Next i


MsgBox strPageHTML


Set objShellWindows = Nothing
Set objShell = Nothing
End Sub


'------------------------------------------


Steve Yandl


"ker_01" wrote in message
...
Don- thank you for your generous offer. Unfortunately, the URL is on a
secure
site, and I have no way to give access to others; the good news is that I
can
right-click and I see that the page source contains the information I want
to
scrape with RegEx, so really any web page will do.


Here is a test case:
* Open IE
* Navigate towww.google.com(or an alternate address of your preference)
* Click View/Source - this will bring up the page source in notepad


The part I'm missing is how to get this source automatically from an
"already open" IE window into Excel as a string. With Google it would be
easy
to open the URL directly from Excel VBA, but my challenge is to capture
that
text string directly from an already open browser window.


I've been abusing code snippets from all sorts of sources (non-VBA sites,
etc) but haven't gotten anywhere. I expect it will be something like:


Dim IE As SHDocVw.InternetExplorer
Do
* *MsgBox IE.document '.body.innerHTML?
Loop


I appreciate your time and expertise, and any advice you might be able to
offer.


Thank you!
Keith


"Don Guillett" wrote:


There may be a way but witout seeing all the info, it's hard to say. Give
urls and the desired. OR,
* * * If desired, send your file to my address below along with this msg
and
a clear explanation of what you want and before/after examples.


--
Don Guillett
Microsoft MVP Excel
SalesAid Software

"ker_01" wrote in message
...
Thanks to Ron and ryguy for responses to an earlier thread- I asked my
question poorly and the thread died, so here goes another try.


I have a series of web pages that for the sake of convenience, I really
can't access programatically with Excel (I know it is probably
possible,
but
there are several pages of data entry to get to the target pages, and
it
isn't practical)


So, I'm perfectly willing to access the web page directly in IE6.
However,
I
want to scrape the target page's source and pull a bunch of data into
Excel.
This web page design prevents copy/paste, so my only way to scrape the
data
is from the page source (or hand-retype all the data, which is what I'm
trying to avoid).


I haven't found any information on how to get Excel to interact with an
existing IE6 instance and grab the page source on the fly (when a macro
runs). Can anyone point me in the right direction?


Also, I'm currently on IE6, but if a newer version of IE would make
this
easier to program, I'd consider upgrading.


Many thanks,
Keith- Hide quoted text -


- Show quoted text -




  #6   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 117
Default Excel VBA to acess currently open web page (existing IE6 insta

Tim,

I agree with you but when I tested with the google home page there is a
significant difference between what you get for innerHTML and what I can get
going to 'View Source' from the IE context menu. I tried to pull the text
from the temporary internet files using a script and while that might work
for Keith, I failed on a PC running a Vista system with IE8. I think that
if he can share where the data is (in a table, in a text area or wherever)
we might find a solution. It's just tough without the file but I understand
completely that he can't and shouldn't share the file in a newsgroup.

Steve



"Tim Williams" wrote in message
...

document.body.innerHTML

should do it. Without seeing the page you're trying to capture it's
hard to know what else to suggest.

Tim

On Jul 14, 3:30 pm, ker_01 wrote:
Thanks Steve and Tim- this definitely has me on the right track.

Turns out that the content is not within the innerHTML, and I've been
googling and checking MSDN for syntax to return all, but so far no luck.
If
you happen to come back to this thread, I'd welcome any syntax hints on
how
to return the full page source, which I can then parse with RegEx.

Many thanks,
Keith



"Steve Yandl" wrote:
Keith,


What I show below might get you close. The line
strPartURL = "google"
should be amended so the string on the right is some portion of the URL
for
the page you will have open (this is to prevent problems if you happen
to
have multiple web pages open at once). I used google for the test.


Note that this delivers the inner html for the body of the document
which is
not delivering as much as you get when you right click and use View
Source.
If it doesn't deliver what you need, I suspect you can work with the
"All"
collection and retrieve the specific text you're after. This at least
shows
you how to capture content from already open IE windows.


'-------------------------------------------


Sub GetInnerHTML()


Dim strHTML As String
Dim strPartURL As String
Dim strPageHTML As String


strPartURL = "google"


Set objShell = CreateObject("Shell.Application")
Set objShellWindows = objShell.Windows


If objShellWindows.Count = 0 Then
Exit Sub
End If


For i = 0 To objShellWindows.Count - 1
Set objIE = objShellWindows.Item(i)
If InStr(objIE.LocationURL, strPartURL) Then
Set rng = objIE.Document.Body.CreateTextRange
strPageHTML = rng.htmlText
End If
Next i


MsgBox strPageHTML


Set objShellWindows = Nothing
Set objShell = Nothing
End Sub


'------------------------------------------


Steve Yandl


"ker_01" wrote in message
...
Don- thank you for your generous offer. Unfortunately, the URL is on a
secure
site, and I have no way to give access to others; the good news is
that I
can
right-click and I see that the page source contains the information I
want
to
scrape with RegEx, so really any web page will do.


Here is a test case:
* Open IE
* Navigate towww.google.com(or an alternate address of your
preference)
* Click View/Source - this will bring up the page source in notepad


The part I'm missing is how to get this source automatically from an
"already open" IE window into Excel as a string. With Google it would
be
easy
to open the URL directly from Excel VBA, but my challenge is to
capture
that
text string directly from an already open browser window.


I've been abusing code snippets from all sorts of sources (non-VBA
sites,
etc) but haven't gotten anywhere. I expect it will be something like:


Dim IE As SHDocVw.InternetExplorer
Do
MsgBox IE.document '.body.innerHTML?
Loop


I appreciate your time and expertise, and any advice you might be able
to
offer.


Thank you!
Keith


"Don Guillett" wrote:


There may be a way but witout seeing all the info, it's hard to say.
Give
urls and the desired. OR,
If desired, send your file to my address below along with this msg
and
a clear explanation of what you want and before/after examples.


--
Don Guillett
Microsoft MVP Excel
SalesAid Software

"ker_01" wrote in message
...
Thanks to Ron and ryguy for responses to an earlier thread- I asked
my
question poorly and the thread died, so here goes another try.


I have a series of web pages that for the sake of convenience, I
really
can't access programatically with Excel (I know it is probably
possible,
but
there are several pages of data entry to get to the target pages,
and
it
isn't practical)


So, I'm perfectly willing to access the web page directly in IE6.
However,
I
want to scrape the target page's source and pull a bunch of data
into
Excel.
This web page design prevents copy/paste, so my only way to scrape
the
data
is from the page source (or hand-retype all the data, which is what
I'm
trying to avoid).


I haven't found any information on how to get Excel to interact
with an
existing IE6 instance and grab the page source on the fly (when a
macro
runs). Can anyone point me in the right direction?


Also, I'm currently on IE6, but if a newer version of IE would make
this
easier to program, I'd consider upgrading.


Many thanks,
Keith- Hide quoted text -


- Show quoted text -


  #7   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 298
Default Excel VBA to acess currently open web page (existing IE6 insta

Agreed: any page which generates content on the client-side once the page
has loaded (eg. via javascript/AJAX) will show differences between
view-source (the HTML as delivered from the server) vs. getting the
innerHTML of the final page. However, the latter should reflect the final
content, and so it's not clear why innerHTML isn't giving the correct thing.

To test, I'd suggest using a bookmarklet script to view the final HTML:
http://sniptools.com/vault/view-sour...-a-bookmarklet

If that doesn't show what you expect then you'll have to post a URL or give
some description of what's on the page.

Tim


"Steve Yandl" wrote in message
...
Tim,

I agree with you but when I tested with the google home page there is a
significant difference between what you get for innerHTML and what I can
get going to 'View Source' from the IE context menu. I tried to pull the
text from the temporary internet files using a script and while that might
work for Keith, I failed on a PC running a Vista system with IE8. I think
that if he can share where the data is (in a table, in a text area or
wherever) we might find a solution. It's just tough without the file but
I understand completely that he can't and shouldn't share the file in a
newsgroup.

Steve



"Tim Williams" wrote in message
...

document.body.innerHTML

should do it. Without seeing the page you're trying to capture it's
hard to know what else to suggest.

Tim

On Jul 14, 3:30 pm, ker_01 wrote:
Thanks Steve and Tim- this definitely has me on the right track.

Turns out that the content is not within the innerHTML, and I've been
googling and checking MSDN for syntax to return all, but so far no luck.
If
you happen to come back to this thread, I'd welcome any syntax hints on
how
to return the full page source, which I can then parse with RegEx.

Many thanks,
Keith



"Steve Yandl" wrote:
Keith,


What I show below might get you close. The line
strPartURL = "google"
should be amended so the string on the right is some portion of the URL
for
the page you will have open (this is to prevent problems if you happen
to
have multiple web pages open at once). I used google for the test.


Note that this delivers the inner html for the body of the document
which is
not delivering as much as you get when you right click and use View
Source.
If it doesn't deliver what you need, I suspect you can work with the
"All"
collection and retrieve the specific text you're after. This at least
shows
you how to capture content from already open IE windows.


'-------------------------------------------


Sub GetInnerHTML()


Dim strHTML As String
Dim strPartURL As String
Dim strPageHTML As String


strPartURL = "google"


Set objShell = CreateObject("Shell.Application")
Set objShellWindows = objShell.Windows


If objShellWindows.Count = 0 Then
Exit Sub
End If


For i = 0 To objShellWindows.Count - 1
Set objIE = objShellWindows.Item(i)
If InStr(objIE.LocationURL, strPartURL) Then
Set rng = objIE.Document.Body.CreateTextRange
strPageHTML = rng.htmlText
End If
Next i


MsgBox strPageHTML


Set objShellWindows = Nothing
Set objShell = Nothing
End Sub


'------------------------------------------


Steve Yandl


"ker_01" wrote in message
...
Don- thank you for your generous offer. Unfortunately, the URL is on
a
secure
site, and I have no way to give access to others; the good news is
that I
can
right-click and I see that the page source contains the information I
want
to
scrape with RegEx, so really any web page will do.


Here is a test case:
* Open IE
* Navigate towww.google.com(or an alternate address of your
preference)
* Click View/Source - this will bring up the page source in notepad


The part I'm missing is how to get this source automatically from an
"already open" IE window into Excel as a string. With Google it would
be
easy
to open the URL directly from Excel VBA, but my challenge is to
capture
that
text string directly from an already open browser window.


I've been abusing code snippets from all sorts of sources (non-VBA
sites,
etc) but haven't gotten anywhere. I expect it will be something like:


Dim IE As SHDocVw.InternetExplorer
Do
MsgBox IE.document '.body.innerHTML?
Loop


I appreciate your time and expertise, and any advice you might be
able to
offer.


Thank you!
Keith


"Don Guillett" wrote:


There may be a way but witout seeing all the info, it's hard to say.
Give
urls and the desired. OR,
If desired, send your file to my address below along with this msg
and
a clear explanation of what you want and before/after examples.


--
Don Guillett
Microsoft MVP Excel
SalesAid Software

"ker_01" wrote in message
...
Thanks to Ron and ryguy for responses to an earlier thread- I
asked my
question poorly and the thread died, so here goes another try.


I have a series of web pages that for the sake of convenience, I
really
can't access programatically with Excel (I know it is probably
possible,
but
there are several pages of data entry to get to the target pages,
and
it
isn't practical)


So, I'm perfectly willing to access the web page directly in IE6.
However,
I
want to scrape the target page's source and pull a bunch of data
into
Excel.
This web page design prevents copy/paste, so my only way to scrape
the
data
is from the page source (or hand-retype all the data, which is
what I'm
trying to avoid).


I haven't found any information on how to get Excel to interact
with an
existing IE6 instance and grab the page source on the fly (when a
macro
runs). Can anyone point me in the right direction?


Also, I'm currently on IE6, but if a newer version of IE would
make
this
easier to program, I'd consider upgrading.


Many thanks,
Keith- Hide quoted text -


- Show quoted text -




  #8   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 395
Default Excel VBA to acess currently open web page (existing IE6 insta

Steve, Tim, JLGWhiz, and Don-

Thank you all for your help. I now have it working (details below).
Unfortunately I haven't done any web programming, so while I originally
thought I needed the innerHTML, I didn't know what was client side, etc, so I
was just fumbling to get everything so I could parse out the little bits I
need.

Turns out I was working on the wrong line of code; I was trying to edit
Set rng = objIE.Document.Body.CreateTextRange
to things like ...body.all
but when I left that line alone and went to the subsequent line
strPageHTML = rng.htmlText
and changed it to
strPageHTML = rng.text

Now the resulting text string includes the content I need to parse.

Thank you all- I would not have figured this all out (pulling an active IE
instance, etc) without the help of this newsgroup- it is much appreciated!

Keith

"Tim Williams" wrote:

Agreed: any page which generates content on the client-side once the page
has loaded (eg. via javascript/AJAX) will show differences between
view-source (the HTML as delivered from the server) vs. getting the
innerHTML of the final page. However, the latter should reflect the final
content, and so it's not clear why innerHTML isn't giving the correct thing.

To test, I'd suggest using a bookmarklet script to view the final HTML:
http://sniptools.com/vault/view-sour...-a-bookmarklet

If that doesn't show what you expect then you'll have to post a URL or give
some description of what's on the page.

Tim


"Steve Yandl" wrote in message
...
Tim,

I agree with you but when I tested with the google home page there is a
significant difference between what you get for innerHTML and what I can
get going to 'View Source' from the IE context menu. I tried to pull the
text from the temporary internet files using a script and while that might
work for Keith, I failed on a PC running a Vista system with IE8. I think
that if he can share where the data is (in a table, in a text area or
wherever) we might find a solution. It's just tough without the file but
I understand completely that he can't and shouldn't share the file in a
newsgroup.

Steve



"Tim Williams" wrote in message
...

document.body.innerHTML

should do it. Without seeing the page you're trying to capture it's
hard to know what else to suggest.

Tim

On Jul 14, 3:30 pm, ker_01 wrote:
Thanks Steve and Tim- this definitely has me on the right track.

Turns out that the content is not within the innerHTML, and I've been
googling and checking MSDN for syntax to return all, but so far no luck.
If
you happen to come back to this thread, I'd welcome any syntax hints on
how
to return the full page source, which I can then parse with RegEx.

Many thanks,
Keith



"Steve Yandl" wrote:
Keith,

What I show below might get you close. The line
strPartURL = "google"
should be amended so the string on the right is some portion of the URL
for
the page you will have open (this is to prevent problems if you happen
to
have multiple web pages open at once). I used google for the test.

Note that this delivers the inner html for the body of the document
which is
not delivering as much as you get when you right click and use View
Source.
If it doesn't deliver what you need, I suspect you can work with the
"All"
collection and retrieve the specific text you're after. This at least
shows
you how to capture content from already open IE windows.

'-------------------------------------------

Sub GetInnerHTML()

Dim strHTML As String
Dim strPartURL As String
Dim strPageHTML As String

strPartURL = "google"

Set objShell = CreateObject("Shell.Application")
Set objShellWindows = objShell.Windows

If objShellWindows.Count = 0 Then
Exit Sub
End If

For i = 0 To objShellWindows.Count - 1
Set objIE = objShellWindows.Item(i)
If InStr(objIE.LocationURL, strPartURL) Then
Set rng = objIE.Document.Body.CreateTextRange
strPageHTML = rng.htmlText
End If
Next i

MsgBox strPageHTML

Set objShellWindows = Nothing
Set objShell = Nothing
End Sub

'------------------------------------------

Steve Yandl

"ker_01" wrote in message
...
Don- thank you for your generous offer. Unfortunately, the URL is on
a
secure
site, and I have no way to give access to others; the good news is
that I
can
right-click and I see that the page source contains the information I
want
to
scrape with RegEx, so really any web page will do.

Here is a test case:
* Open IE
* Navigate towww.google.com(or an alternate address of your
preference)
* Click View/Source - this will bring up the page source in notepad

The part I'm missing is how to get this source automatically from an
"already open" IE window into Excel as a string. With Google it would
be
easy
to open the URL directly from Excel VBA, but my challenge is to
capture
that
text string directly from an already open browser window.

I've been abusing code snippets from all sorts of sources (non-VBA
sites,
etc) but haven't gotten anywhere. I expect it will be something like:

Dim IE As SHDocVw.InternetExplorer
Do
MsgBox IE.document '.body.innerHTML?
Loop

I appreciate your time and expertise, and any advice you might be
able to
offer.

Thank you!
Keith

"Don Guillett" wrote:

There may be a way but witout seeing all the info, it's hard to say.
Give
urls and the desired. OR,
If desired, send your file to my address below along with this msg
and
a clear explanation of what you want and before/after examples.

--
Don Guillett
Microsoft MVP Excel
SalesAid Software

"ker_01" wrote in message
...
Thanks to Ron and ryguy for responses to an earlier thread- I
asked my
question poorly and the thread died, so here goes another try.

I have a series of web pages that for the sake of convenience, I
really
can't access programatically with Excel (I know it is probably
possible,
but
there are several pages of data entry to get to the target pages,
and
it
isn't practical)

So, I'm perfectly willing to access the web page directly in IE6.
However,
I
want to scrape the target page's source and pull a bunch of data
into
Excel.
This web page design prevents copy/paste, so my only way to scrape
the
data
is from the page source (or hand-retype all the data, which is
what I'm
trying to avoid).

I haven't found any information on how to get Excel to interact
with an
existing IE6 instance and grab the page source on the fly (when a
macro
runs). Can anyone point me in the right direction?

Also, I'm currently on IE6, but if a newer version of IE would
make
this
easier to program, I'd consider upgrading.

Many thanks,
Keith- Hide quoted text -

- Show quoted text -





Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
can I change page layout in an entire existing Excel workbook? LJL Excel Discussion (Misc queries) 2 July 17th 09 03:10 PM
workbooks.open function fails to open an existing excel file when used in ASP, but works in VB. san Excel Programming 1 January 3rd 06 03:22 AM
Open existing Excel file Bill Coupe Excel Discussion (Misc queries) 3 January 18th 05 07:53 PM
Trying to open an existing Excel file in C# AJ[_8_] Excel Programming 0 October 26th 04 05:28 PM
Open only existing Excel Files in a given Range Matty C[_2_] Excel Programming 1 October 5th 04 01:32 PM


All times are GMT +1. The time now is 07:31 PM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"