#1
Posted to microsoft.public.excel.programming
HTML Document.Links Issues

Thanks to dicks-blog.com I've just recently managed to find out how to access
web pages and bring the data back to Excel (2000) VBA even when it's not in a
table. The possibilities are very exciting. But I have run into two vexing
problems.

1) The same code that retrieves the collection of links on a page --
essentially, Document.Links -- works with two homepages and fails on a third.
The links on the third look the same to me -- static links with plain text
anchors.

2) One of my tasks is to list out the links in a sort of tree structure. It
would make sense to do this recursively, up to a user-specified number of
levels deep. But I started with a non-recursive model, figuring it would be
easier to debug. Moving to the second level, I load the page that the first
link points to, and grab ITS link collection, storing that in a different
variable. I process the page and come back to the first level. At that point,
trying to access the links collection so I can process the next link gives me
an Error 70, Permission Denied. This happens even if I reload the first-level
page and then try to re-retrieve the collection.

Since I've already stored the links collection in a variable, it shouldn't
matter that I've left the page where I got it from, should it? And why can't
I re-retrieve it even if I reload the page? Reloading is an abuse of
bandwidth and would make the macro take much longer to run, so I really want
to preserve the loaded links collection and just keep referring to it.

TIA,
Gregg Roberts
#2

Gregg,

Hard to tell exactly what the problem is without seeing any code.
Are you automating IE to do your work?

Tim
#3

> Hard to tell exactly what the problem is without seeing any code.
> Are you automating IE to do your work?


Yes.

I solved the first problem, which was occurring because it was a frames site.
Here's the relevant code for the second problem:

IeApp.Navigate LinkURL
Do
Loop Until IeApp.ReadyState = READYSTATE_COMPLETE
Set IEDoc = IeApp.Document
<snip>
Set LinkTags1 = IEDoc.links

<snip>

CurRow = 3
CurCol = 1
MsgBox IsEmpty(LinkTags1) ' <-- This comes back False
MsgBox LinkTags1.Length ' <-- Error 70 occurs here on the SECOND iteration...
For LinkIndex1 = 0 To LinkTags1.Length - 1 ' <-- ...and here.
    CurRow = CurRow + 1
    LinkURL = LinkTags1(LinkIndex1).href
    If LinkURL <> "" Then ' An A tag with blank href is a bookmark, so ignore
        Range(Cells(CurRow, CurCol), Cells(CurRow, CurCol)).Value = _
            LinkTags1(LinkIndex1).innerText
        Range(Cells(CurRow, CurCol + 1), Cells(CurRow, CurCol + 1)).Value = _
            LinkURL
        ' if unique and internal, then get links at next level down
        If URL_Is_Unique(LinkURL) And URL_Is_Internal(LinkURL) Then
            IeApp.Navigate LinkURL
            Do
            Loop Until IeApp.ReadyState = READYSTATE_COMPLETE
            Set IEDoc = IeApp.Document
            Set LinkTags2 = IEDoc.links

<snip ... process second-level page and return to top>
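
On the first problem (the frames site): the top-level document of a frameset page usually holds only the frameset, so its Links collection can come back empty while the real links sit in each frame's own document. The following is a minimal sketch of one way to gather them, assuming late binding and frames served from the same site. The procedure name ListLinksIncludingFrames and the Found collection are hypothetical and not part of the poster's code.

```vba
' Hypothetical helper: collect link URLs from a document and, recursively,
' from the document inside each of its frames.
Sub ListLinksIncludingFrames(ByVal Doc As Object, ByVal Found As Collection)
    Dim i As Long
    Dim FrameDoc As Object

    ' Links that live directly in this document
    For i = 0 To Doc.Links.Length - 1
        Found.Add Doc.Links(i).href & ""
    Next i

    ' Links inside each frame (a frames site keeps its content here)
    For i = 0 To Doc.frames.Length - 1
        Set FrameDoc = Nothing
        ' A frame loaded from another domain refuses access to its document,
        ' so skip any frame whose document cannot be read.
        On Error Resume Next
        Set FrameDoc = Doc.frames(i).Document
        On Error GoTo 0
        If Not FrameDoc Is Nothing Then
            ListLinksIncludingFrames FrameDoc, Found
        End If
    Next i
End Sub
```

It could be called as `ListLinksIncludingFrames IeApp.Document, MyLinks` once the page has finished loading, where MyLinks is a `New Collection`.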

#4

I'm not certain you can still access the links collection if the page has
already been unloaded.
You might have to copy the links into a collection, dictionary or array
while the page is still loaded, and only then loop through them to get the
sub-links.

Tim
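
Tim's suggestion can be sketched as follows. `Set LinkTags1 = IEDoc.links` stores a reference to a live object owned by the page, not a copy, so the reference stops being usable once IE navigates elsewhere. Copying each link's text and URL into plain strings while the page is still loaded avoids that. This is only a sketch under stated assumptions: SnapshotLinks, LoadAndSnapshot and ListTwoLevels are hypothetical names, late binding is used, and 4 is the numeric value of READYSTATE_COMPLETE.

```vba
' Hypothetical helper: copy the text and URL of every link into plain strings
' so the result no longer depends on the page staying loaded.
Function SnapshotLinks(ByVal Doc As Object) As Collection
    Dim Result As New Collection
    Dim i As Long
    Dim Href As String

    For i = 0 To Doc.Links.Length - 1
        Href = Doc.Links(i).href & ""
        If Href <> "" Then      ' blank href is a bookmark, so ignore it
            Result.Add Array(Doc.Links(i).innerText & "", Href)
        End If
    Next i
    Set SnapshotLinks = Result
End Function

' Hypothetical helper: load a page, wait for it, and return its link snapshot.
Function LoadAndSnapshot(ByVal IeApp As Object, ByVal URL As String) As Collection
    IeApp.Navigate URL
    Do While IeApp.Busy Or IeApp.ReadyState <> 4   ' 4 = READYSTATE_COMPLETE
        DoEvents                                   ' let IE and Excel breathe
    Loop
    Set LoadAndSnapshot = SnapshotLinks(IeApp.Document)
End Function

' Two-level listing: the first-level snapshot stays valid while IE moves on.
Sub ListTwoLevels(ByVal IeApp As Object, ByVal StartURL As String)
    Dim Level1 As Collection, Level2 As Collection
    Dim Item1 As Variant, Item2 As Variant
    Dim CurRow As Long

    Set Level1 = LoadAndSnapshot(IeApp, StartURL)
    CurRow = 3
    For Each Item1 In Level1
        CurRow = CurRow + 1
        Cells(CurRow, 1).Value = Item1(0)   ' link text
        Cells(CurRow, 2).Value = Item1(1)   ' link URL
        ' The poster's URL_Is_Unique / URL_Is_Internal checks would go here;
        ' without them, mailto: or javascript: links are navigated to as well.
        Set Level2 = LoadAndSnapshot(IeApp, CStr(Item1(1)))
        For Each Item2 In Level2
            CurRow = CurRow + 1
            Cells(CurRow, 3).Value = Item2(0)
            Cells(CurRow, 4).Value = Item2(1)
        Next Item2
    Next Item1
End Sub
```

Because Level1 holds only strings, each page is loaded once and nothing has to be reloaded to continue with the next first-level link. The same two helpers could also be called from a recursive procedure that takes the current depth and a user-specified maximum depth.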