  #1   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 93
Default Read (and parse) file on the web

Excel macros are SO... undocumented.
Need a WORKING example for reading the HTML source of a URL (say
http://www.oil4lessllc.org/gTX.htm)

Thanks.
  #2   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 538
Default Read (and parse) file on the web

Robert Baer wrote:

Excel macros are SO... undocumented.


Sure they are. Plenty of documentation for individual keywords. You're not
looking for that, you're looking for a broader concept than individual
keywords.

Need a WORKING example for reading the HTML source of a URL (say
http://www.oil4lessllc.org/gTX.htm)


Downloading a web page (or any URI, really) is easy:

Declare Function URLDownloadToFile Lib "urlmon" _
    Alias "URLDownloadToFileA" (ByVal pCaller As Long, _
    ByVal szURL As String, ByVal szFileName As String, _
    ByVal dwReserved As Long, ByVal lpfnCB As Long) As Long

Sub foo()
    Dim tmp, result, contents
    tmp = Environ("TEMP") & Format$(Now, "yyyymmdd-hhmmss-") & "gTX.htm"
    'download
    result = URLDownloadToFile(0, "http://www.oil4lessllc.org/gTX.htm", _
        tmp, 0, 0)
    If result < 0 Then
        'failed to download; error handler here
    Else
        'read from file
        Open tmp For Binary As 1
        contents = Space$(LOF(1))
        Get #1, 1, contents
        Close 1
        'parse file here
        '[...]
        'cleanup
    End If
    Kill tmp
End Sub

(Note that URLDownloadToFile must be declared PtrSafe on 64-bit versions of
Excel.)

Dealing with the data downloaded is very dependent on what the page contains
and what you want to extract from it. That page you mentioned contains 2
images and 2 Javascript arrays; assuming you want the data from the arrays,
you could search for "awls[" or "aprd[" and get your data that way.

Rather than downloading to a file, it is possible to download straight to
memory, but I find it's simpler to use a temp file. Among other things,
downloading to memory requires opening and closing the connection yourself;
URLDownloadToFile handles that for you.
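
For example, a minimal sketch of that search (the Sub name is invented here, and it assumes the literal text "awls[" appears verbatim in the downloaded HTML):

```vba
'Sketch: locate the "awls" Javascript array in the downloaded page.
'Assumes contents holds the raw HTML from the Get # read above.
Sub parseArrays(contents As String)
    Dim p As Long, q As Long
    p = InStr(1, contents, "awls[")         'start of the array text
    If p > 0 Then
        q = InStr(p, contents, ";")         'end of the first statement
        If q > p Then Debug.Print Mid$(contents, p, q - p)
    End If
End Sub
```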

--
- We lose the ones we love; we cannot change it. Put it aside.
- How? How can I do what is needed, when all I feel is... hate?
  #3   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 538
Default Read (and parse) file on the web

I wrote:

tmp = Environ("TEMP") & Format$(Now, "yyyymmdd-hhmmss-") & "gTX.htm"


Damn, missed a backslash.

tmp = Environ("TEMP") & "\" & Format$(Now, "yyyymmdd-hhmmss-") & "gTX.htm"

--
I'm willing to table this discussion for now, and sneak out, if you are.
  #4   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 93
Default Read (and parse) file on the web

Auric__ wrote:
I wrote:

tmp = Environ("TEMP") & Format$(Now, "yyyymmdd-hhmmss-") & "gTX.htm"


Damn, missed a backslash.

tmp = Environ("TEMP") & "\" & Format$(Now, "yyyymmdd-hhmmss-") & "gTX.htm"

Thanks for the correction to the "path" of destruction.

  #5   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 93
Default Read (and parse) file on the web

Auric__ wrote:
Robert Baer wrote:

Excel macros are SO... undocumented.


Sure they are. Plenty of documentation for individual keywords. You're not
looking for that, you're looking for a broader concept than individual
keywords.

Need a WORKING example for reading the HTML source of a URL (say
http://www.oil4lessllc.org/gTX.htm)


Downloading a web page (or any URI, really) is easy:

Declare Function URLDownloadToFile Lib "urlmon" _
Alias "URLDownloadToFileA" (ByVal pCaller As Long, _
ByVal szURL As String, ByVal szFileName As String, _
ByVal dwReserved As Long, ByVal lpfnCB As Long) As Long

Sub foo()
Dim tmp, result, contents
tmp = Environ("TEMP") & Format$(Now, "yyyymmdd-hhmmss-") & "gTX.htm"
'download
result = URLDownloadToFile(0, "http://www.oil4lessllc.org/gTX.htm", _
tmp, 0, 0)
If result < 0 Then
'failed to download; error handler here
Else
'read from file
Open tmp For Binary As 1
contents = Space$(LOF(1))
Get #1, 1, contents
Close 1
'parse file here
'[...]
'cleanup
End If
Kill tmp
End Sub

(Note that URLDownloadToFile must be declared PtrSafe on 64-bit versions of
Excel.)

Dealing with the data downloaded is very dependent on what the page contains
and what you want to extract from it. That page you mentioned contains 2
images and 2 Javascript arrays; assuming you want the data from the arrays,
you could search for "awls[" or "aprd[" and get your data that way.

Rather than downloading to a file, it is possible to download straight to
memory, but I find it's simpler to use a temp file. Among other things,
downloading to memory requires opening and closing the connection yourself;
URLDownloadToFile handles that for you.

Thanks.
I know that the parsing is very dependent on the contents and what
one wants.
That is why i gave an example having at least one data array; similar
to what i may be parsing.
I, too, like temp files because i can open them in random and/or
binary mode if needed.
A bit of fun to read such backwards (like BMP files).

Thanks again.



  #6   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 93
Default Read (and parse) file on the web

Auric__ wrote:
Robert Baer wrote:

Excel macros are SO... undocumented.


Sure they are. Plenty of documentation for individual keywords. You're not
looking for that, you're looking for a broader concept than individual
keywords.

Need a WORKING example for reading the HTML source of a URL (say
http://www.oil4lessllc.org/gTX.htm)


Downloading a web page (or any URI, really) is easy:

Declare Function URLDownloadToFile Lib "urlmon" _
Alias "URLDownloadToFileA" (ByVal pCaller As Long, _
ByVal szURL As String, ByVal szFileName As String, _
ByVal dwReserved As Long, ByVal lpfnCB As Long) As Long

Sub foo()
Dim tmp, result, contents
tmp = Environ("TEMP") & Format$(Now, "yyyymmdd-hhmmss-") & "gTX.htm"
'download
result = URLDownloadToFile(0, "http://www.oil4lessllc.org/gTX.htm", _
tmp, 0, 0)
If result < 0 Then
'failed to download; error handler here
Else
'read from file
Open tmp For Binary As 1
contents = Space$(LOF(1))
Get #1, 1, contents
Close 1
'parse file here
'[...]
'cleanup
End If
Kill tmp
End Sub

(Note that URLDownloadToFile must be declared PtrSafe on 64-bit versions of
Excel.)

Dealing with the data downloaded is very dependent on what the page contains
and what you want to extract from it. That page you mentioned contains 2
images and 2 Javascript arrays; assuming you want the data from the arrays,
you could search for "awls[" or "aprd[" and get your data that way.

Rather than downloading to a file, it is possible to download straight to
memory, but I find it's simpler to use a temp file. Among other things,
downloading to memory requires opening and closing the connection yourself;
URLDownloadToFile handles that for you.

Well, i am in a pickle.
Firstly, i did a bit of experimenting, and i discovered a few things.
1) The variable "tmp" is nice, but does not have to have the date and
time; that would fill the HD since i have thousands of files to process.
Fix is easy - just have a constant name for use.
2) The file "tmp" is created ONLY if the value of "result" is zero.
3) The problem i have seems to be due to the fact that the online files
have no filetype:
"https://www.nsncenter.com/NSNSearch?q=5960%20regulator&PageNumber=5"
And i need to process page numbers from 5 to 999 (the reason for the
program).
I have processed page numbers from 1 to 4 by hand...PITA.

Ideas for a fix?
Thanks.
  #7   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 93
Default Read (and parse) file on the web CORRECTION

Robert Baer wrote:
Auric__ wrote:
Robert Baer wrote:

Excel macros are SO... undocumented.


Sure they are. Plenty of documentation for individual keywords. You're
not
looking for that, you're looking for a broader concept than individual
keywords.

Need a WORKING example for reading the HTML source of a URL (say
http://www.oil4lessllc.org/gTX.htm)


Downloading a web page (or any URI, really) is easy:

Declare Function URLDownloadToFile Lib "urlmon" _
Alias "URLDownloadToFileA" (ByVal pCaller As Long, _
ByVal szURL As String, ByVal szFileName As String, _
ByVal dwReserved As Long, ByVal lpfnCB As Long) As Long

Sub foo()
Dim tmp, result, contents
tmp = Environ("TEMP") & Format$(Now, "yyyymmdd-hhmmss-") & "gTX.htm"
'download
result = URLDownloadToFile(0, "http://www.oil4lessllc.org/gTX.htm", _
tmp, 0, 0)
If result < 0 Then
'failed to download; error handler here
Else
'read from file
Open tmp For Binary As 1
contents = Space$(LOF(1))
Get #1, 1, contents
Close 1
'parse file here
'[...]
'cleanup
End If
Kill tmp
End Sub

(Note that URLDownloadToFile must be declared PtrSafe on 64-bit
versions of
Excel.)

Dealing with the data downloaded is very dependent on what the page
contains
and what you want to extract from it. That page you mentioned contains 2
images and 2 Javascript arrays; assuming you want the data from the
arrays,
you could search for "awls[" or "aprd[" and get your data that way.

Rather than downloading to a file, it is possible to download straight to
memory, but I find it's simpler to use a temp file. Among other things,
downloading to memory requires opening and closing the connection
yourself;
URLDownloadToFile handles that for you.

Well, i am in a pickle.
Firstly, i did a bit of experimenting, and i discovered a few things.
1) The variable "tmp" is nice, but does not have to have the date and
time; that would fill the HD since i have thousands of files to process.

* Error: if i have umpteen files to process, then i must use their
unique names.

Fix is easy - just have a constant name for use.

* Error: i can issue a KILL each time (a "KILL *.*" does not work).

2) The file "tmp" is created ONLY if the value of "result" is zero.
3) The problem i have seems to be due to the fact that the online files
have no filetype:
"https://www.nsncenter.com/NSNSearch?q=5960%20regulator&PageNumber=5"
And i need to process page numbers from 5 to 999 (the reason for the
program).
I have processed page numbers from 1 to 4 by hand...PITA.

Ideas for a fix?
Thanks.


  #8   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 93
Default Read (and parse) file on the web CORRECTION#2

Robert Baer wrote:
Robert Baer wrote:
Auric__ wrote:
Robert Baer wrote:

Excel macros are SO... undocumented.

Sure they are. Plenty of documentation for individual keywords. You're
not
looking for that, you're looking for a broader concept than individual
keywords.

Need a WORKING example for reading the HTML source of a URL (say
http://www.oil4lessllc.org/gTX.htm)

Downloading a web page (or any URI, really) is easy:

Declare Function URLDownloadToFile Lib "urlmon" _
Alias "URLDownloadToFileA" (ByVal pCaller As Long, _
ByVal szURL As String, ByVal szFileName As String, _
ByVal dwReserved As Long, ByVal lpfnCB As Long) As Long

Sub foo()
Dim tmp, result, contents
tmp = Environ("TEMP") & Format$(Now, "yyyymmdd-hhmmss-") & "gTX.htm"
'download
result = URLDownloadToFile(0, "http://www.oil4lessllc.org/gTX.htm", _
tmp, 0, 0)
If result < 0 Then
'failed to download; error handler here
Else
'read from file
Open tmp For Binary As 1
contents = Space$(LOF(1))
Get #1, 1, contents
Close 1
'parse file here
'[...]
'cleanup
End If
Kill tmp
End Sub

(Note that URLDownloadToFile must be declared PtrSafe on 64-bit
versions of
Excel.)

Dealing with the data downloaded is very dependent on what the page
contains
and what you want to extract from it. That page you mentioned contains 2
images and 2 Javascript arrays; assuming you want the data from the
arrays,
you could search for "awls[" or "aprd[" and get your data that way.

Rather than downloading to a file, it is possible to download
straight to
memory, but I find it's simpler to use a temp file. Among other things,
downloading to memory requires opening and closing the connection
yourself;
URLDownloadToFile handles that for you.

Well, i am in a pickle.
Firstly, i did a bit of experimenting, and i discovered a few things.
1) The variable "tmp" is nice, but does not have to have the date and
time; that would fill the HD since i have thousands of files to process.

* Error: if i have umpteen files to process, then i must use their
unique names.

Fix is easy - just have a constant name for use.

* Error: i can issue a KILL each time (a "KILL *.*" does not work).

2) The file "tmp" is created ONLY if the value of "result" is zero.
3) The problem i have seems to be due to the fact that the online files
have no filetype:
"https://www.nsncenter.com/NSNSearch?q=5960%20regulator&PageNumber=5"
And i need to process page numbers from 5 to 999 (the reason for the
program).
I have processed page numbers from 1 to 4 by hand...PITA.

Ideas for a fix?
Thanks.


Well, it is rather bad...
I created a file with no filetype and put it on the web.
The process URLDownloadToFile worked, with "tmp" and "result" being
correct.
BUT....
The line "Open tmp For Binary As 1" barfed with error message "Bad
filename or number" (remember, it worked with my gTX.htm file).

So, there is "something" about that site that seems to prevent read
access.
Is there a way to get a clue (and maybe fix it)?

And assuming a fix, what can i do about the OPEN command/syntax?
// What i did in Excel:
S$ = "D:\Website\Send .Hot\****"
tmp = Environ("TEMP") & "\" & S$
result = URLDownloadToFile(0, S$, tmp, 0, 0)
'The file is at "http://www.oil4lessllc.com/****" ; an Excel file
  #9   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 538
Default Read (and parse) file on the web CORRECTION#2

Robert Baer wrote:

And assuming a fix, what can i do about the OPEN command/syntax?
// What i did in Excel:
S$ = "D:\Website\Send .Hot\****"
tmp = Environ("TEMP") & "\" & S$


The contents of the variable S$ at this point:

S$ = "C:\Users\auric\D:\Website\Send .Hot\****"

Do you see the problem?

Also, as Garry pointed out, cleanup should happen automatically. The "Kill"
keyword deletes files.

Try this code:

Declare Function URLDownloadToFile Lib "urlmon" _
    Alias "URLDownloadToFileA" (ByVal pCaller As Long, _
    ByVal szURL As String, ByVal szFileName As String, _
    ByVal dwReserved As Long, ByVal lpfnCB As Long) As Long

Function downloadFile(what As String) As String
    'returns tmp file's path on success or empty string on failure
    Dim tmp, result, contents, fnum
    tmp = Environ("TEMP") & "\" & Format$(Now, "yyyymmdd-hhmmss-") & _
        "downloaded.tmp"
    'download
    result = URLDownloadToFile(0, what, tmp, 0, 0)
    If result < 0 Then
        'failed to download; error handler here, if any
        On Error Resume Next 'can be avoided by checking if the file exists
        Kill tmp 'cleanup
        On Error GoTo 0
        downloadFile = ""
    Else
        'read from file
        fnum = FreeFile
        Open tmp For Binary As fnum
        contents = Space$(LOF(fnum))
        Get #fnum, 1, contents
        Close fnum
        downloadFile = tmp
    End If
End Function

Sub foo()
    Dim what As String, files_to_get As Variant, L0
    'specify files to download
    files_to_get = Array("http://www.oil4lessllc.com/****", _
        "http://www.oil4lessllc.org/gTX.htm")
    For L0 = LBound(files_to_get) To UBound(files_to_get)
        what = downloadFile(files_to_get(L0))
        If Len(what) Then
            '****parse file here*****
            Kill what 'cleanup
        End If
    Next
End Sub
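
For the NSNSearch pages discussed elsewhere in the thread, the same downloadFile helper could be driven by a page-number loop instead of a fixed array (a sketch; the URL format is copied from the earlier post, and https may need the caveats discussed further down):

```vba
Sub fooAllPages()
    Dim pg As Long, what As String
    For pg = 5 To 999
        what = downloadFile("https://www.nsncenter.com/NSNSearch?" & _
            "q=5960%20regulator&PageNumber=" & pg)
        If Len(what) Then
            '****parse file here*****
            Kill what 'cleanup
        End If
    Next
End Sub
```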

--
THIS IS NOT AN APPROPRIATE ANSWER.
  #10   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 1,182
Default Read (and parse) file on the web

Well, i am in a pickle.
Firstly, i did a bit of experimenting, and i discovered a few
things.
1) The variable "tmp" is nice, but does not have to have the date and
time; that would fill the HD since i have thousands of files to
process.


So you did not 'catch' that the tmp file is deleted when this line...

Kill tmp

...gets executed!

--
Garry

Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup!
comp.lang.basic.visual.misc
microsoft.public.vb.general.discussion

---
This email has been checked for viruses by Avast antivirus software.
https://www.avast.com/antivirus



  #11   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 93
Default Read (and parse) file on the web

GS wrote:
Well, i am in a pickle.
Firstly, i did a bit of experimenting, and i discovered a few things.
1) The variable "tmp" is nice, but does not have to have the date and
time; that would fill the HD since i have thousands of files to process.


So you did not 'catch' that the tmp file is deleted when this line...

Kill tmp

..gets executed!

I know about that; i have been stepping thru the execution (F8), and
stop long before that.
Then i go to CMD prompt and check files and DEL *.* when appropriate
for next test.

  #12   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 93
Default Read (and parse) file on the web

GS wrote:
Well, i am in a pickle.
Firstly, i did a bit of experimenting, and i discovered a few things.
1) The variable "tmp" is nice, but does not have to have the date and
time; that would fill the HD since i have thousands of files to process.


So you did not 'catch' that the tmp file is deleted when this line...

Kill tmp

..gets executed!

Been thru this already... also makes no sense to have a name miles long.

  #13   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 1,182
Default Read (and parse) file on the web

but does not have to have the date and time; that would fill the HD
since i have thousands of files to process.


Filenames have nothing to do with storage space; it's the file size!
Given Auric_'s suggestion creates text files, the size of 999 txt files
would hardly be more than 1MB total! If you append each page to the 1st
file then all pages could be in 1 file...
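
Appending each page is one Open mode away; a sketch (the target filename here is an assumed example):

```vba
'Sketch: append one page's parsed text to a single accumulating file.
Sub appendPage(parsed As String)
    Dim fnum As Integer
    fnum = FreeFile
    Open Environ("TEMP") & "\NSN_pages.txt" For Append As fnum
    Print #fnum, parsed
    Close fnum
End Sub
```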

--
Garry

Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup!
comp.lang.basic.visual.misc
microsoft.public.vb.general.discussion


  #14   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 1,182
Default Read (and parse) file on the web

After looking at your link to p5, I see what you mean by the amount of
storage space, but filename is not a factor. Parsing will certainly
downsize each file considerably...

--
Garry

Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup!
comp.lang.basic.visual.misc
microsoft.public.vb.general.discussion


  #15   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 93
Default Read (and parse) file on the web

GS wrote:
After looking at your link to p5, I see what you mean by the amount of
storage space, but filename is not a factor. Parsing will certainly
downsize each file considerably...

One has to open the file to parse, so that is not logical.
The function URLDownloadToFile gives zero options - it copies ALL of
the source into a TEMP file; one hopes that the source is not equal to
or larger than 4GB in size!

For pages on the web, that is extremely unlikely; a webpage's max
size is probably 10MB; maybe 300K worst case on average.

So, once in TEMP, it can be opened for input (text), for random (may
specify buffer size), or for binary (may specify buffer size).
Here, one can optimize read speed vs. string space used.
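
The two usual patterns look like this (a sketch; tmp is the temp file produced by URLDownloadToFile):

```vba
'Sketch: text mode reads a line at a time (small strings);
'binary mode slurps the whole file into one string (fast to search).
Sub readTemp(tmp As String)
    Dim fnum As Integer, oneLine As String, whole As String
    fnum = FreeFile
    Open tmp For Input As fnum
    Do While Not EOF(fnum)
        Line Input #fnum, oneLine
        'process oneLine here
    Loop
    Close fnum
    fnum = FreeFile
    Open tmp For Binary As fnum
    whole = Space$(LOF(fnum))
    Get #fnum, 1, whole
    Close fnum
End Sub
```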




  #16   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 93
Default Read (and parse) file on the web

GS wrote:
but does not have to have the date and time; that would fill the HD
since i have thousands of files to process.


Filenames have nothing to do with storage space; -it's the file size!
Given Auric_'s suggestion creates text files, the size of 999 txt files
would hardly be more than 1MB total! If you append each page to the 1st
file then all pages could be in 1 file...

True, BUT the files can be large:
"result = URLDownloadToFile(0, S$, tmp, 0, 0)" creates a file in TEMP
the size of the source - which can be multi-megabytes; 999 of them can
eat the HD space fast.
Hopefully a URL file size does not exceed the space limit allowed in
Excel 2003 string space (anyone know what that might be?).

I have found that the stringvalue AKA TEMP filename can be fixed to
anything reasonable, and does not have to include
parts/substrings/subsets of the file one wants to download.

It can be "a good thing" (quoting Martha Stewart) to delete the file
when done.

I have also found the following:
1) one does not have to use FreeFile for a file number (when all else is
OK).
2) cannot use "contents" for string storage space.
3) one cannot mix use of "/" and "\" in a string for a given file name.
4) one cannot have a space in the file name, so that gives a serious
problem for some web URLs (work-around anyone?)
5) method fails for "https:" (work-around anyone?)
  #17   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 1,182
Default Read (and parse) file on the web

I have also found the following:
1) one does not have to use FreeFile for a file number (when all else
is OK).


True, however not considered 'best practice'. Freefile() ensures a
unique ID is assigned to your var.

2) cannot use "contents" for string storage space.

Why not? It's not a VB[A] keyword and so qualifies for use as a var.


3) one cannot mix use of "/" and "\" in a string for a given file
name.


Not sure why you'd use "/" in a path string! Forward slash is not a
legal filename/path character. Backslash is the default Windows path
delimiter. If choosing folders, the last backslash is not followed by a
filename.

4) one cannot have a space in the file name, so that gives a serious
problem for some web URLs (work-around anyone?)


Web paths encode spaces so the string is contiguous. The usual
encoding is "%20" (or "+" in a query string).
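
In VBA that substitution is a one-liner (a sketch, using the NSNSearch URL from earlier in the thread):

```vba
'Sketch: encode spaces before handing the URL to URLDownloadToFile.
Dim url As String
url = Replace("https://www.nsncenter.com/NSNSearch?q=5960 regulator", _
    " ", "%20")
```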

5) method fails for "https:" (work-around anyone?)


What method?

--
Garry

Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup!
comp.lang.basic.visual.misc
microsoft.public.vb.general.discussion


  #18   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 1,182
Default Read (and parse) file on the web

5) method fails for "https:" (work-around anyone?)

These URLs usually require some kind of 'login' be done, which needs to
be included in the URL string.
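
An alternative worth trying for https pages is a late-bound MSXML2 request, which performs the SSL handshake itself (a sketch; success depends on the installed MSXML version and the site's requirements):

```vba
'Sketch: fetch a page over https without URLDownloadToFile.
Function fetchPage(url As String) As String
    Dim req As Object
    Set req = CreateObject("MSXML2.XMLHTTP.6.0")
    req.Open "GET", url, False      'synchronous request
    req.send
    If req.Status = 200 Then fetchPage = req.responseText
End Function
```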

--
Garry

Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup!
comp.lang.basic.visual.misc
microsoft.public.vb.general.discussion


  #21   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 93
Default Read (and parse) file on the web

What is with this a*hole posting this sh*t here?
  #22   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 538
Default Read (and parse) file on the web

Robert Baer wrote:

What is with this a*hole posting this sh*t here?


It's just spam. Ignore it.

--
If you would succeed,
you must reduce your strategy to its point of application.
  #25   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 93
Default Read (and parse) file on the web

everonvietnam2016 wrote:
sản phẩm tốt, giá rẻ, chất lượng, an to*n cho người
dùng, dịch vụ tuyệt trần, lúc n*o có điều kiện qua
ủng hộ nhé. chúc c*a h*ng l*m ăn hiệu quả




No kapish.
Firstly, on my computer, all i see is a strange mix of characters
from the 512 ASCII set.
Secondly, i would not be able to read or understand your language
even if it was elegantly rendered.




  #26   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 538
Default Read (and parse) file on the web

Robert Baer wrote:

everonvietnam2016 wrote:
sản phẩm tốt, giá rẻ, chất lượng, an to*n cho người
dùng, dịch vụ tuyệt trần, lúc n*o có điều kiện qua
ủng hộ nhé. chúc c*a h*ng l*m ăn hiệu quả

No kapish.
Firstly, on my computer, all i see is a strange mix of characters
from the 512 ASCII set.
Secondly, i would not be able to read or understand your language
even if it was elegantly rendered.


It's Vietnamese. *Lots* of accented characters. Also, it's spam.

--
Who says life is sacred? God? Hey, if you read your history,
God is one of the leading causes of death.
-- George Carlin
  #27   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 93
Default Read (and parse) file on the web

Auric__ wrote:
Robert Baer wrote:

everonvietnam2016 wrote:
sản phẩm tốt, giá rẻ, chất lượng, an to*n cho người
dùng, dịch vụ tuyệt trần, lúc n*o có điều kiện qua
ủng hộ nhé. chúc c*a h*ng l*m ăn hiệu quả

No kapish.
Firstly, on my computer, all i see is a strange mix of characters
from the 512 ASCII set.
Secondly, i would not be able to read or understand your language
even if it was elegantly rendered.


It's Vietnamese. *Lots* of accented characters. Also, it's spam.

Yes, it was easy to recognize that was Vietnamese.
How the heck did you figure out that it was spam?

  #29   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 1,182
Default Read (and parse) file on the web

Excel macros are SO... undocumented.
Need a WORKING example for reading the HTML source of a URL (say
http://www.oil4lessllc.org/gTX.htm)

Thanks.


Look here...

https://app.box.com/s/23yqum8auvzx17h04u4f

...for *ParseWebPages.zip*, which contains:

ParseWebPages.xls
NSN_5960.txt
(Blank data file with fieldnames only in line 1)
NSN_5960_Test.txt
(Results for 1st 20 pages)

--
Garry

Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup!
comp.lang.basic.visual.misc
microsoft.public.vb.general.discussion


  #30   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 93
Default Read (and parse) file on the web

GS wrote:
Excel macros are SO... undocumented.
Need a WORKING example for reading the HTML source of a URL (say
http://www.oil4lessllc.org/gTX.htm)

Thanks.


Look here...

https://app.box.com/s/23yqum8auvzx17h04u4f

..for *ParseWebPages.zip*, which contains:

ParseWebPages.xls
NSN_5960.txt
(Blank data file with fieldnames only in line 1)
NSN_5960_Test.txt
(Results for 1st 20 pages)

I did not even try cURL as the explanation was just too dern complicated.
Fiddled in Excel, as it has so many different ways to do something
specific.

So, this is skeleton of what i have:
Workbooks.Open Filename:=openFYL$ 'opens as R/O, no HD space taken

then..
With Worksheets(1)
    ' .Copy ''do not need; saves BOOK space
    .SaveAs sav$ 'do not know how to close when done
    ' above creates the file described; that takes HD space, about 300K
End With

IMMEDIATELY after the "End With", a folder is created with useless
metadata info; do not know how to close when done.

WARNING: Scheme works only in XP and Win7.
If in XP, at about 150 files, one gets a PHONY "HD is full" warning
and one must exit Excel so as to be able to delete processed (and so
unwanted) files.
I say PHONY because the system showed NO CHANGE in HD free space,
never mind those files take about 500MB.

Furthermore, in Win7, these files show up in a folder the system
KNOWS NOTHING ABOUT... Windows Explorer does not show C:\Documents which
IS accessible; C:\<sysname>\My Documents is shown and CANNOT be accessed.
Instead of the Excel program crashing, the system is shut down and
locked.
YET other reasons I hate Win7.
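
On the "do not know how to close when done" point: the opened workbook can be closed explicitly after the SaveAs (a sketch; openFYL$ and sav$ are the variables from the skeleton above):

```vba
Dim wb As Workbook
Set wb = Workbooks.Open(Filename:=openFYL$, ReadOnly:=True)
wb.Worksheets(1).SaveAs sav$
wb.Close SaveChanges:=False  'releases the file and any temp folder
```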



  #31   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 1,182
Default Read (and parse) file on the web

GS wrote:
Excel macros are SO... undocumented.
Need a WORKING example for reading the HTML source of a URL (say
http://www.oil4lessllc.org/gTX.htm)

Thanks.


Look here...

https://app.box.com/s/23yqum8auvzx17h04u4f

..for *ParseWebPages.zip*, which contains:

ParseWebPages.xls
NSN_5960.txt
(Blank data file with fieldnames only in line 1)
NSN_5960_Test.txt
(Results for 1st 20 pages)

I did not even try cURL as the explanation was just too dern
complicated.
Fiddled in Excel, as it has so many different ways to do something
specific.

So, this is skeleton of what i have:
Workbooks.Open Filename:=openFYL$ 'opens as R/O, no HD space taken

then..
With Worksheets(1)
' .Copy ''do not need; saves BOOK space
.SaveAs sav$ 'do not know how to close when done
' above creates the file described; that takes HD space, about
300K
End With

IMMEDIATELY after the "End With", a folder is created with useless
metadata info; do not know how to close when done.

WARNING: Scheme works only in XP and Win7.
If in XP, at about 150 files, one gets a PHONY "HD is full" warning
and one must exit Excel so as to be able to delete processed (and so
unwanted) files.
I say PHONY because the system showed NO CHANGE in HD free space,
never mind those files take about 500MB.

Furthermore, in Win7, these files show up in a folder the system
KNOWS NOTHING ABOUT..Windows Explorer does not show C:\Documents
which IS accessible; C:\<sysname>\My Documents is shown and CANNOT be
accessed.
Instead of the Excel program crashing, the system is shut down and
locked.
YET other reasons I hate Win7.


I don't follow what you're talking about here! What does it have to do
with the download I linked to?

--
Garry

Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup!
comp.lang.basic.visual.misc
microsoft.public.vb.general.discussion


  #32   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 93
Default Read (and parse) file on the web

GS wrote:
GS wrote:
Excel macros are SO... undocumented.
Need a WORKING example for reading the HTML source a URL (say
http://www.oil4lessllc.org/gTX.htm)

Thanks.

Look here...

https://app.box.com/s/23yqum8auvzx17h04u4f

..for *ParseWebPages.zip*, which contains:

ParseWebPages.xls
NSN_5960.txt
(Blank data file with fieldnames only in line1
NSN_5960_Test.txt
(Results for 1st 20 pages)

I did not even try cURL as the explanation was just too dern complicated.
Fiddled in Excel,as it has so many different ways to do something
specific.

So, this is skeleton of what i have:
Workbooks.Open Filename:=openFYL$ 'opens as R/O, no HD space taken

then..
With Worksheets(1)
' .Copy ''do not need; saves BOOK space
.SaveAs sav$ 'do not know how to close when done
' above creates the file described; that takes HD space, about 300K
End With

IMMEDIATELY after the "End With", a folder is created with useless
metadata info; do not know how to close when done.

WARNING: Scheme works only in XP and Win7.
If in XP, at about 150 files,one gets a PHONY "HD is full" warning and
one must exit Excel so as to be able to delete processed (and so
unwanted) files.
I say PHONY because the system showed NO CHANGE in HD free space,
never mind those files take about 500MB.

Furthermore, in Win7, these files show up in a folder the system KNOWS
NOTHING ABOUT..Windows Explorer does not show C:\Documents which IS
accessible; C:\<sysname>\My Documents is shown and CANNOT be accessed.
Instead of the Excel program crashing, the system is shut down and
locked.
YET other reasons I hate Win7.


I don't follow what you're talking about here! What does it have to do
with the download I linked to?

In the meantime, i took a stab of a "pure" Excel program to get the data.

Whatever you do, and more explicitly how you do the search, it yields
results that I do not see.

Manually downloading the first page for a manual search, I get:

5960 REGULATOR AND "ELECTRON TUBE"
About 922 results (1 ms)
5960-00-503-9529
5960-00-504-8401
5960-01-035-3901
5960-01-029-2766
5960-00-617-4105
5960-00-729-5602
5960-00-826-1280
5960-00-754-5316
5960-00-962-5391
5960-00-944-4671
5960-00-897-8418
and
5960 AND REGULATOR AND "ELECTRON TUBE"
About 104 results (16 ms)
5960-00-503-9529
5960-00-504-8401
5960-01-035-3901
5960-01-029-2766
5960-00-617-4105
5960-00-729-5602
5960-00-826-1280
5960-00-754-5316
5960-00-962-5391
5960-00-944-4671
5960-00-897-8418

Note they are very different, and the second search "gets" a lot less.
Also neither search gets anything you got, and i am interested in how
you did it.

  #33   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 1,182
Default Read (and parse) file on the web

Also neither search gets anything you got, and i am interested in
how you did it.


If you study the file I gave you, you'll see how both methods are
working. The worksheet implements all manual parsing so you can study
each part of the process as well as the web page source structure; the
*AutoParse* macro collects the data and writes it to the file.

--
Garry


  #34   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 1,182
Default Read (and parse) file on the web

If you're referring to the substitute 'page error' text put in place of
missing item info, ..well that might be misleading you. Fact is,
starting with item7 on pg7 there is no item info on any pages I checked
manually in the browser (up to pg100). Perhaps you could rephrase that
to "No Data Available"!?

--
Garry


  #35   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 93
Default Read (and parse) file on the web

GS wrote:
If you're referring to the substitute 'page error' text put in place of
missing item info, ..well that might be misleading you. Fact is,
starting with item7 on pg7 there is no item info on any pages I checked
manually in the browser (up to pg100). Perhaps you could rephrase that
to "No Data Available"!?

Doesn't matter.
I also looked manually and you are correct.
Why the heck they have NSNs that do not relate to a part is puzzling,
but, hey, it *IS* the government.
Not useful to what i need, but still nice to know.



  #36   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 93
Default Read (and parse) file on the web

GS wrote:
Excel macros are SO... undocumented.
Need a WORKING example for reading the HTML source a URL (say
http://www.oil4lessllc.org/gTX.htm)

Thanks.


Look here...

https://app.box.com/s/23yqum8auvzx17h04u4f

..for *ParseWebPages.zip*, which contains:

ParseWebPages.xls
NSN_5960.txt
(Blank data file with fieldnames only in line1
NSN_5960_Test.txt
(Results for 1st 20 pages)

Holy S*!
I did about 30 pages by hand; quit as it was rather tiresome and the
total page count unknown (MORE than 999).
Never saw the fail you saw.

Difference is that you used the word "and"; technically (i think)
that should not affect results.
Also, you got items I am interested in, and after processing 503
pages, i did NOT get those.

In both cases, there were a lot of duplicate records (government
data, what else can you expect?).

In your sample, there were 73 useful records containing 43 unique
records. There may be some that i am not interested in, but there
definitely ARE those i did not find that i am interested in.

In my sample, there were 3782 unique records, and (better sit down)
only 15 were interesting. Crappy odds.

Hopefully, when i call them, someone who has some experience and
knowledge of how their sort criteria work will answer the phone.
Last time i called, i got a new guy; no help other than "use Google".

You have done a masterful job!
Label it !DONE! please.

Thanks a lot.
  #37   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 1,182
Default Read (and parse) file on the web

GS wrote:
Excel macros are SO... undocumented.
Need a WORKING example for reading the HTML source a URL (say
http://www.oil4lessllc.org/gTX.htm)

Thanks.


Look here...

https://app.box.com/s/23yqum8auvzx17h04u4f

..for *ParseWebPages.zip*, which contains:

ParseWebPages.xls
NSN_5960.txt
(Blank data file with fieldnames only in line1
NSN_5960_Test.txt
(Results for 1st 20 pages)

Holy S*!
I did about 30 pages by hand; quit as rather tiresome and total
pages unknown (MORE than 999).
Never saw the fail you saw.

Difference is that you used the word "and"; technically (i think)
that should not affect results.
Also, you got items I am interested in, and after processing 503
pages, i did NOT get those.

In both cases, there were a lot of duplicate records (government
data, what else can you expect?).

In your sample, there were 73 useful records containing 43 unique
records. There may be some that i am not interested in, but there
definitely ARE those i did not find that i am interested in.


You could 'dump' the file into a worksheet and filter out the dupes
easily enough.
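
That dupe filtering can be sketched in a few lines of VBA. A minimal, hedged example assuming the records sit in column A of the first worksheet with no header row (Range.RemoveDuplicates needs Excel 2007 or later):

```vba
Sub FilterDupes()
    'collapse column A of the first sheet down to unique records
    Dim ws As Worksheet
    Set ws = ActiveWorkbook.Worksheets(1)
    With ws
        .Range("A1", .Cells(.Rows.Count, "A").End(xlUp)) _
            .RemoveDuplicates Columns:=1, Header:=xlNo
    End With
End Sub
```

On older Excel versions the same effect can be had from AdvancedFilter with Unique:=True.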

In my sample, there were 3782 unique records,and (better sit
down), only 15 were interesting. Crappy odds.

Hopefully, when i call them, someone that has some experience and
knowledge of how their sort criteria works, will answer the phone.
Last time i called,i got a new guy; no help other than "use
Google".


Those are dynamic web pages and so are database driven. Surely there's
a repository database for this info somewhere other than NSN?

You have done a masterful job!
Label it !DONE! please.

Thanks a lot.


Happy to be of help; -I found the project rather interesting!

--
Garry


  #38   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 93
Default Read (and parse) file on the web

GS wrote:
GS wrote:
Excel macros are SO... undocumented.
Need a WORKING example for reading the HTML source a URL (say
http://www.oil4lessllc.org/gTX.htm)

Thanks.

Look here...

https://app.box.com/s/23yqum8auvzx17h04u4f

..for *ParseWebPages.zip*, which contains:

ParseWebPages.xls
NSN_5960.txt
(Blank data file with fieldnames only in line1
NSN_5960_Test.txt
(Results for 1st 20 pages)

Holy S*!
I did about 30 pages by hand; quit as rather tiresome and total pages
unknown (MORE than 999).
Never saw the fail you saw.

Difference is that you used the word "and"; technically (i think) that
should not affect results.
Also, you got items I am interested in, and after processing 503
pages, i did NOT get those.

In both cases, there were a lot of duplicate records (government data,
what else can you expect?).

In your sample, there were 73 useful records containing 43 unique
records. There may be some that i am not interested in, but there
definitely ARE those i did not find that i am interested in.


You could 'dump' the file into a worksheet and filter out the dupes
easily enough.

* Yes, i did that; getting those 43 unique records.


In my sample, there were 3782 unique records,and (better sit down),
only 15 were interesting. Crappy odds.

Hopefully, when i call them, someone that has some experience and
knowledge of how their sort criteria works, will answer the phone.
Last time i called,i got a new guy; no help other than "use Google".


Those are dynamic web pages and so are database driven. Surely there's a
repository database for this info somewhere other than NSN?

* Prolly not dynamic, as NSNs do not change except for possible
additions on rare occasion.
Certainly, a SPECIFIC search (so far) always gives the same results.
Look at a previous response concerning 100% manual search results,
first page only.
5960 AND REGULATOR AND "ELECTRON TUBE" About 104 results
5960 REGULATOR AND "ELECTRON TUBE" About 922 results
Totally different and neither match your results.
And your results look superior.


You have done a masterful job!
Label it !DONE! please.

Thanks a lot.


Happy to be of help; -I found the project rather interesting!


  #39   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 93
Default Read (and parse) file on the web

Robert Baer wrote:
GS wrote:
GS wrote:
Excel macros are SO... undocumented.
Need a WORKING example for reading the HTML source a URL (say
http://www.oil4lessllc.org/gTX.htm)

Thanks.

Look here...

https://app.box.com/s/23yqum8auvzx17h04u4f

..for *ParseWebPages.zip*, which contains:

ParseWebPages.xls
NSN_5960.txt
(Blank data file with fieldnames only in line1
NSN_5960_Test.txt
(Results for 1st 20 pages)

Holy S*!
I did about 30 pages by hand; quit as rather tiresome and total pages
unknown (MORE than 999).
Never saw the fail you saw.

Difference is that you used the word "and"; technically (i think) that
should not affect results.
Also, you got items I am interested in, and after processing 503
pages, i did NOT get those.

In both cases, there were a lot of duplicate records (government data,
what else can you expect?).

In your sample, there were 73 useful records containing 43 unique
records. There may be some that i am not interested in, but there
definitely ARE those i did not find that i am interested in.


You could 'dump' the file into a worksheet and filter out the dupes
easily enough.

* Yes, i did that; getting those 43 unique records.


In my sample, there were 3782 unique records,and (better sit down),
only 15 were interesting. Crappy odds.

Hopefully, when i call them, someone that has some experience and
knowledge of how their sort criteria works, will answer the phone.
Last time i called,i got a new guy; no help other than "use Google".


Those are dynamic web pages and so are database driven. Surely there's a
repository database for this info somewhere other than NSN?

* Prolly not dynamic as NSNs do not change except possible additions on
rare occasion.
Certainly, a SPECIFIC search (so far) always gives the same results.
Look at a previous response concerning 100% manual search results, first
page only.
5960 AND REGULATOR AND "ELECTRON TUBE" About 104 results
5960 REGULATOR AND "ELECTRON TUBE" About 922 results
Totally different and neither match your results.
And your results look superior.


You have done a masterful job!
Label it !DONE! please.

Thanks a lot.


Happy to be of help; -I found the project rather interesting!


I am getting more confused. Search term used and response:
5960 AND REGULATOR AND "ELECTRON TUBE" About 104 results
5960 REGULATOR AND "ELECTRON TUBE" About 922 results
5960 regulator and "ELECTRON TUBE" About 3134377 results

Use of the second two gives exactly the same list for the first page,
and the last term is the one you used in your program.
The results, like i previously said, are completely different WRT the
first term and your program (3 different results).
Notice the 3.1 million results when lower case is used for
"regulator"; i think the database "engine" is thrashing around in what
is almost a useless attempt.
BUT that thrashing produces very useful results (after sort and
consolidate).

SO.
(1) results are dependent on the form / format of the search term used.
(2) results depend on the (in this case) Excel procedure used that does
the access and fetch.

Now i know rather well that Excel mangles the HTML information when
it is imported, most especially the primary page.
I had my Excel program working to parse the primary HTML page AS SEEN
BY THE HUMAN EYE ON THE WEB, and i had to make a number of changes to
accommodate what Excel gave me.
Therefore, on that basis, i have a rather strong suspicion that what
Excel SENDS to the web for a search is quite different from what we
think.

Comments?
Suggestions, primarily to make it more efficient, BUT ALSO to get it
to give us everything?
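
One suggestion on the fetch side: rather than Workbooks.Open on a URL, which forces the page through Excel's HTML import (the very mangling complained about above), the raw source can be pulled into a plain string with MSXML2.XMLHTTP and parsed untouched. A hedged sketch, synchronous and without retry/error handling:

```vba
Sub FetchRawSource()
    'fetch the exact HTML source of a page into a string
    Dim http As Object, src As String
    Set http = CreateObject("MSXML2.XMLHTTP")
    http.Open "GET", "http://www.oil4lessllc.org/gTX.htm", False
    http.send
    If http.Status = 200 Then
        src = http.responseText   'unmangled source, ready to parse
        Debug.Print Left$(src, 100)
    End If
End Sub
```

Whatever search string is put into the URL this way is sent verbatim, which would also settle the suspicion about what Excel "SENDS to the web".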

PS: How i get the data:
Workbooks.Open Filename:=openFYL$ 'YES..opens as R/O
With Worksheets(1)
' .Copy ''do not need; saves BOOKn space
.SaveAs sav$ 'do not know how to close when do not need
End With
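
For the repeated "do not know how to close when done" question: Worksheet.SaveAs leaves the workbook open under the new name, so it can be closed explicitly afterwards. A sketch using the same placeholder names (openFYL$, sav$) as above:

```vba
    Dim wb As Workbook
    Set wb = Workbooks.Open(Filename:=openFYL$)   'opens R/O
    wb.Worksheets(1).SaveAs sav$                  'wb is now the file sav$
    wb.Close SaveChanges:=False                   'close it when done
```

Closing the workbook should also release the temporary metadata folder Excel creates alongside it (an assumption; not verified on XP/Win7).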

  #40   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.excel
external usenet poster
 
Posts: 1,182
Default Read (and parse) file on the web

I've uploaded a new version that skips dupes, and flags missing item
info. (See the new 'Test' file)

This version also runs orders of magnitude faster!

--
Garry





