Posted to microsoft.public.excel.programming
Mark[_66_]
Need to open HTML Document

On May 15, 9:47 am, T Lavedas wrote:
On May 15, 10:35 am, Mark wrote:

Hello, I have code in my program that can open HTML Files and
search them for hrefs. There are also HTML Documents that I need to
search. If selected they open in IE, if I right click and say open
with, then they open as a text file. Is there a way to do this in my
code?


What do you mean by HTML *files* as opposed to HTML *documents*? Are
the files local and the documents accessed over the web? Also, what
do you mean by *selected*? I thought you said they were being
accessed by code - as text files with FSO, I suppose. If so, I don't
see how they could *open*. Please explain how this happens.

I believe you posted some code yesterday, but since I didn't
understand the distinction between files and documents, I didn't
respond. Since you are asking the question again, I thought I'd chime
in to get clarification. Then maybe I or someone else will be able to
respond with something useful.

Tom Lavedas
===========http://members.cox.net/tglbatch/wsh/


Hello

Basically, there were two types of files. When double-clicked, one
would open in IE and the other would open in Notepad, but both are
HTML. The one that opens in IE is named HTML Document, and the one
that opens in Notepad is named HTML File. I went into Folder Options
and changed it so that both of them open in Notepad, so that my
program can search the source code for hrefs. I did this because it
was finding hrefs in some files but not in others. After researching
a bit more, I have found what the problem is. Where the hrefs are
being found, each one is on a single line of the source code, between
the <P></P> tags. Where they are not being found, there are multiple
hrefs within the tags. The code that I use to search for the hrefs is
below, and I think I need to add more code to grab these others. Any
ideas?

Private Function GetHrefs(ByVal html, ByVal strFilex, ByVal strTitlex)
    Dim re, matches, match, d, uri, name, r As Long, c As Range, _
        Lrow As Long
    Dim saveLink
    Dim iRet
    saveLink = False

    ' Match each <a ... href=...>text</a>; submatch 1 is the URL,
    ' submatch 2 is the link text.
    Set re = CreateObject("vbscript.regexp")
    re.Pattern = "<a\s+.*?href=[""\']?([^""\' >]*)[""\']?[^>]*>(.*?)<\/a>"
    re.IgnoreCase = True
    re.MultiLine = True
    re.Global = True
    Set matches = re.Execute(html)

    For Each match In matches
        iRet = InspectLink(GetURLAddress(match))
        ' Assumption: ">" is the intended comparison (0 taken to mean "skip").
        If (iRet > 0) Then
            ' Record file, page title, link title, URL and link type.
            Cells(Globalindx, 2) = strFilex
            Cells(Globalindx, 3) = strTitlex
            Cells(Globalindx, 4) = GetURLTitle(match)
            Cells(Globalindx, 5) = GetURLAddress(match)
            Cells(Globalindx, 6) = GetType(iRet)

            Globalindx = Globalindx + 1
        End If
    Next

    Set matches = Nothing
    Set re = Nothing
End Function
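
One thing that may be worth checking, assuming the anchors being
missed have their tag or link text split across more than one line of
the source: in VBScript regular expressions "." does not match a line
break, and MultiLine only changes how ^ and $ behave, so the (.*?) in
the pattern above cannot run past the end of a line. A minimal sketch
of a variant that uses [\s\S] instead is below. CountAnchors is a
hypothetical helper used only for illustration and is not part of the
program above.

```vb
' Hypothetical helper (not from the original program): counts <a href>
' anchors even when the tag or its link text spans several lines.
Function CountAnchors(ByVal html As String) As Long
    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    ' [^>] and [\s\S] both match line breaks, whereas "." does not.
    re.Pattern = "<a\s+[^>]*?href=[""']?([^""' >]*)[""']?[^>]*>([\s\S]*?)</a>"
    re.IgnoreCase = True
    re.Global = True          ' find every anchor, not just the first
    CountAnchors = re.Execute(html).Count
End Function

Sub DemoCountAnchors()
    Dim sample As String
    ' Two anchors inside one <P>; the first has a line break in its text.
    sample = "<P><a href=""a.htm"">One" & vbCrLf & "</a> " & _
             "<a href=""b.htm"">Two</a></P>"
    Debug.Print CountAnchors(sample)
End Sub
```

If this variant picks up anchors that the current pattern skips, the
same change could be tried in re.Pattern inside GetHrefs. If the counts
match, the cause is probably somewhere else.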