Robert Baer wrote:
Robert Baer wrote:
Auric__ wrote:
Robert Baer wrote:
Excel macros are SO... undocumented.
Sure they are. Plenty of documentation for individual keywords. You're
not looking for that, you're looking for a broader concept than
individual keywords.
Need a WORKING example for reading the HTML source of a URL (say
http://www.oil4lessllc.org/gTX.htm)
Downloading a web page (or any URI, really) is easy:
Declare Function URLDownloadToFile Lib "urlmon" _
    Alias "URLDownloadToFileA" (ByVal pCaller As Long, _
    ByVal szURL As String, ByVal szFileName As String, _
    ByVal dwReserved As Long, ByVal lpfnCB As Long) As Long

Sub foo()
    Dim tmp As String, contents As String
    Dim result As Long, f As Integer
    'Environ("TEMP") normally has no trailing backslash, so add one
    tmp = Environ("TEMP") & "\" & Format$(Now, "yyyymmdd-hhmmss-") & "gTX.htm"
    'download; URLDownloadToFile returns 0 (S_OK) on success
    result = URLDownloadToFile(0, "http://www.oil4lessllc.org/gTX.htm", _
        tmp, 0, 0)
    If result <> 0 Then
        'failed to download; error handler here
    Else
        'read from file
        f = FreeFile
        Open tmp For Binary As #f
        contents = Space$(LOF(f))
        Get #f, 1, contents
        Close #f
        'parse file here
        '[...]
        'cleanup; the file only exists if the download succeeded
        Kill tmp
    End If
End Sub
(Note that URLDownloadToFile must be declared PtrSafe on 64-bit
versions of Excel, with the pointer arguments pCaller and lpfnCB
declared As LongPtr.)
Dealing with the data downloaded is very dependent on what the page
contains and what you want to extract from it. That page you mentioned
contains 2 images and 2 JavaScript arrays; assuming you want the data
from the arrays, you could search for "awls[" or "aprd[" and get your
data that way.
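As a rough sketch of that search, assuming the page source is already in a string and that each array assignment sits on its own line (an assumption about the page layout, not something checked here), a helper along these lines could collect the matching lines. The name LinesContaining is made up for illustration.

```vb
'Hypothetical helper: returns every line of "contents" that contains "marker".
'Assumes one array assignment per line in the page source.
Function LinesContaining(ByVal contents As String, _
                         ByVal marker As String) As Collection
    Dim found As New Collection
    Dim srcLines() As String
    Dim i As Long
    'normalise line endings so Split works on CRLF, CR and LF alike
    contents = Replace(contents, vbCrLf, vbLf)
    contents = Replace(contents, vbCr, vbLf)
    srcLines = Split(contents, vbLf)
    For i = LBound(srcLines) To UBound(srcLines)
        'case-sensitive match on the marker text
        If InStr(1, srcLines(i), marker, vbBinaryCompare) > 0 Then
            found.Add srcLines(i)
        End If
    Next i
    Set LinesContaining = found
End Function
```

Calling LinesContaining(contents, "awls[") would then give a Collection to loop over with For Each; pulling the values out of each line still depends on how the page writes them.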
Rather than downloading to a file, it is possible to download straight
to memory, but I find it's simpler to use a temp file. Among other
things, downloading to memory requires opening and closing the
connection yourself; URLDownloadToFile handles that for you.
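For comparison, here is a minimal sketch of one way to fetch a page into memory. It uses the MSXML2.XMLHTTP object through late binding, which is a different technique from the manual connection handling mentioned above and is only one of several options. The function name FetchText is hypothetical, and it assumes MSXML is installed (it normally is on Windows).

```vb
'Hypothetical helper: returns the body of "url" as text, or "" on failure.
'Uses MSXML2.XMLHTTP rather than URLDownloadToFile; no temp file is involved.
Function FetchText(ByVal url As String) As String
    Dim http As Object
    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "GET", url, False    'False = synchronous request
    http.send
    If http.Status = 200 Then
        FetchText = http.responseText
    Else
        FetchText = ""             'caller decides how to handle the failure
    End If
End Function
```

A call such as contents = FetchText("http://www.oil4lessllc.org/gTX.htm") would replace the download, Open, Get and Kill steps, at the cost of depending on that COM object.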
Well, i am in a pickle.
Firstly, i did a bit of experimenting, and i discovered a few things.
1) The variable "tmp" is nice, but does not have to have the date and
time; that would fill the HD since i have thousands of files to process.
* Error: if i have umpteen files to process, then i must use their
unique names.
Fix is easy - just have a constant name for use.
* Error: i can issue a KILL each time (a "KILL *.*" does not work).
2) The file "tmp" is created ONLY if the value of "result" is zero.
3) The problem i have seems to be due to the fact that the online files
have no filetype:
"https://www.nsncenter.com/NSNSearch?q=5960%20regulator&PageNumber=5"
And i need to process page numbers from 5 to 999 (the reason for the
program).
I have processed page numbers from 1 to 4 by hand...PITA.
Ideas for a fix?
Thanks.
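On points 1 and 3, a rough sketch of the loop over page numbers follows. The local file name passed to URLDownloadToFile is chosen by the caller and does not depend on the URL, so a URL without a file type should not matter by itself; whether that site answers automated requests is not something this sketch can confirm. The temp file name nsn-page.tmp and the Sub name FetchPages are made up for illustration.

```vb
#If VBA7 Then
Private Declare PtrSafe Function URLDownloadToFile Lib "urlmon" _
    Alias "URLDownloadToFileA" (ByVal pCaller As LongPtr, _
    ByVal szURL As String, ByVal szFileName As String, _
    ByVal dwReserved As Long, ByVal lpfnCB As LongPtr) As Long
#Else
Private Declare Function URLDownloadToFile Lib "urlmon" _
    Alias "URLDownloadToFileA" (ByVal pCaller As Long, _
    ByVal szURL As String, ByVal szFileName As String, _
    ByVal dwReserved As Long, ByVal lpfnCB As Long) As Long
#End If

Sub FetchPages()
    Const BASE_URL As String = _
        "https://www.nsncenter.com/NSNSearch?q=5960%20regulator&PageNumber="
    Dim tmp As String, contents As String
    Dim page As Long, f As Integer
    'one constant temp file name, reused for every page (assumed name)
    tmp = Environ("TEMP") & "\nsn-page.tmp"
    For page = 5 To 999
        If URLDownloadToFile(0, BASE_URL & page, tmp, 0, 0) = 0 Then
            f = FreeFile
            Open tmp For Binary Access Read As #f
            contents = Space$(LOF(f))
            Get #f, 1, contents
            Close #f
            'parse contents here
            Kill tmp    'the file exists only after a successful download
        Else
            Debug.Print "Download failed for page " & page
        End If
    Next page
End Sub
```

Reusing one temp file and deleting it after each page keeps the disk from filling up, and Kill is only reached when the file was actually created.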
Well, it is rather bad..
I created a file with no filetype and put it on the web.
The process URLDownloadToFile worked, with "tmp" and "result" being
correct.
BUT....
The line "Open tmp For Binary As 1" barfed with error message "Bad
filename or number" (remember, it worked with my gTX.htm file).
So, there is "something" about that site that seems to prevent read
access.
Is there a way to get a clue (and maybe fix it)?
And assuming a fix, what can i do about the OPEN command/syntax?
// What i did in Excel:
S$ = "D:\Website\Send .Hot\****"
tmp = Environ("TEMP") & "\" & S$
result = URLDownloadToFile(0, S$, tmp, 0, 0)
'The file is at "http://www.oil4lessllc.com/****" ; an Excel file
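In the lines as posted, tmp is built by appending a full path (drive letter included) to the temp folder, which does not form a valid Windows file name. That could plausibly explain the error from Open, though this is only a guess from the snippet. A minimal sketch of deriving the local name from the last part of the URL instead follows; TempPathFor and the fallback name download.tmp are hypothetical.

```vb
'Hypothetical helper: builds a temp-folder path from the last segment of a URL.
Function TempPathFor(ByVal url As String) As String
    Dim leaf As String
    'text after the last "/" (the whole string if there is no "/")
    leaf = Mid$(url, InStrRev(url, "/") + 1)
    'drop any query string, since "?" is not allowed in a Windows file name
    If InStr(leaf, "?") > 0 Then leaf = Left$(leaf, InStr(leaf, "?") - 1)
    If Len(leaf) = 0 Then leaf = "download.tmp"    'assumed fallback name
    TempPathFor = Environ("TEMP") & "\" & leaf
End Function
```

The URL itself would go in as the second argument of URLDownloadToFile, and TempPathFor(url) as the third.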