#1
Posted to microsoft.public.excel.programming
programmatically retrieve links from web page
Hi there
I am using the Microsoft XML v6.0 library to retrieve a web page from the Internet, as follows:

    Dim oHttp As Object
    Set oHttp = CreateObject("MSXML2.XMLHTTP")
    oHttp.Open "GET", "http://www.microsoft.com/default.aspx", False
    oHttp.Send
    content = oHttp.responseText

Once downloaded, I want to search through the page for all URLs that link through to other web pages (i.e. contained within <a>...</a> tags). The problem is that, given the huge diversity of formats for links (relative and absolute references, URL-encoding, etc.), I'm struggling to write out all the possibilities in code.

Is there an easier way to retrieve the contents of a specific element in a web page, or even better, to scroll through collections of elements? I've tried XML proper (MSXML2.DOMDocument40) but this doesn't seem to work with HTML pages' loose structure.

Best regards
Loane
#2
Posted to microsoft.public.excel.programming
programmatically retrieve links from web page
Hi Loane,
Different approach, but see the following:

http://www.dicks-blog.com/archives/2...rnet-explorer/

Regards,
Nate Oliver

Loane Sharp wrote:
> Once downloaded, I want to search through the page for all URLs that link through to other web pages (i.e. contained within <a>...</a> tags). The problem is that, given the huge diversity of formats for links (relative and absolute references, URL-encoding, etc.), I'm struggling to write out all the possibilities in code.
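The linked post is not reproduced in this thread, so the following is only a rough sketch of what an Internet Explorer based approach might look like, not necessarily what that post does. It assumes Internet Explorer is installed and can be automated through the InternetExplorer.Application object. The procedure name ListLinksWithIE and the choice to write results to column A of the active sheet are illustrative assumptions.

```vb
Sub ListLinksWithIE()
    ' Sketch only: the procedure name and output location are assumptions.
    Dim ie As Object
    Dim lnk As Object
    Dim r As Long

    Set ie = CreateObject("InternetExplorer.Application")
    ie.Visible = False
    ie.Navigate "http://www.microsoft.com/default.aspx"

    ' Wait until the page has finished loading (READYSTATE_COMPLETE = 4)
    Do While ie.Busy Or ie.ReadyState <> 4
        DoEvents
    Loop

    ' Document.Links holds the anchors (and image-map areas) that have an href
    r = 1
    For Each lnk In ie.Document.Links
        ActiveSheet.Cells(r, 1).Value = lnk.href
        r = r + 1
    Next lnk

    ie.Quit
    Set ie = Nothing
End Sub
```

Because the browser parses the page, the Links collection copes with loosely structured HTML, and the href property normally comes back as an absolute URL, so relative links should not need special handling.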
#3
Posted to microsoft.public.excel.programming
programmatically retrieve links from web page
Hi Nate
What a simple, elegant solution. Thanks a stack.

Best regards
Loane

"Nate Oliver" wrote:
> Hi Loane,
>
> Different approach, but see the following:
>
> http://www.dicks-blog.com/archives/2...rnet-explorer/
>
> Regards,
> Nate Oliver
>
> Loane Sharp wrote:
>> Once downloaded, I want to search through the page for all URLs that link through to other web pages (i.e. contained within <a>...</a> tags). The problem is that, given the huge diversity of formats for links (relative and absolute references, URL-encoding, etc.), I'm struggling to write out all the possibilities in code.
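For anyone who would rather keep the original MSXML2.XMLHTTP download, one possible alternative (not suggested in the thread itself) is to hand the downloaded text to a late-bound "htmlfile" document and walk its anchor collection. This is a minimal sketch, assuming the htmlfile object is available on the machine. The procedure name ListLinksWithHtmlFile is hypothetical.

```vb
Sub ListLinksWithHtmlFile()
    ' Sketch only: the procedure name is an assumption, not from the thread.
    Dim oHttp As Object
    Dim doc As Object
    Dim lnk As Object
    Dim content As String

    ' Download the page exactly as in the original post
    Set oHttp = CreateObject("MSXML2.XMLHTTP")
    oHttp.Open "GET", "http://www.microsoft.com/default.aspx", False
    oHttp.Send
    content = oHttp.responseText

    ' Let the HTML parser, rather than an XML parser, read the markup
    Set doc = CreateObject("htmlfile")
    doc.body.innerHTML = content

    ' Loop through every <a> element and print its target
    For Each lnk In doc.getElementsByTagName("a")
        Debug.Print lnk.href
    Next lnk
End Sub
```

One caveat with this sketch: the in-memory document does not know which site the text came from, so relative links may come back prefixed with about: rather than resolved against the original address, and would then need to be combined with the page URL by hand.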