Posted to microsoft.public.excel.programming,microsoft.public.excel
Robert Baer
Read (and parse) file on the web

GS wrote:
GS wrote:
Excel macros are SO... undocumented.
Need a WORKING example for reading the HTML source of a URL (say
http://www.oil4lessllc.org/gTX.htm)

Thanks.
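
For the narrow question of reading the HTML source of a URL from a macro, here is a minimal sketch. It is not taken from ParseWebPages.xls. It assumes Windows with MSXML 6 available (the `MSXML2.XMLHTTP.6.0` object), and the names `GetHtmlSource` and `DemoGetHtmlSource` are made up for illustration.

```vba
' Sketch only: fetch the raw HTML of a URL as a string.
' Late-bound, so no extra library reference has to be set in the VBA editor.
Function GetHtmlSource(ByVal url As String) As String
    Dim http As Object
    Set http = CreateObject("MSXML2.XMLHTTP.6.0")

    http.Open "GET", url, False          ' False = synchronous request
    http.send

    If http.Status = 200 Then
        GetHtmlSource = http.responseText
    Else
        Err.Raise vbObjectError + 513, "GetHtmlSource", _
                  "HTTP " & http.Status & " " & http.statusText
    End If
End Function

' Hypothetical caller: prints the length and the first 200 characters
' of the page to the Immediate window.
Sub DemoGetHtmlSource()
    Dim html As String
    html = GetHtmlSource("http://www.oil4lessllc.org/gTX.htm")
    Debug.Print Len(html) & " characters read"
    Debug.Print Left$(html, 200)
End Sub
```

The returned string can then be searched with `InStr` and `Mid$`; nothing is written to disk unless the macro does so itself.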

Look here...

https://app.box.com/s/23yqum8auvzx17h04u4f

..for *ParseWebPages.zip*, which contains:

ParseWebPages.xls
NSN_5960.txt
(Blank data file with fieldnames only in line 1)
NSN_5960_Test.txt
(Results for 1st 20 pages)

I did not even try cURL as the explanation was just too dern complicated.
Fiddled in Excel, as it has so many different ways to do something
specific.

So, this is the skeleton of what I have:
Workbooks.Open Filename:=openFYL$ 'opens as R/O, no HD space taken

then..
With Worksheets(1)
' .Copy ''do not need; saves BOOK space
.SaveAs sav$ 'do not know how to close when done
' above creates the file described; that takes HD space, about 300K
End With

IMMEDIATELY after the "End With", a folder is created with useless
metadata info; do not know how to close when done.
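
On the "do not know how to close when done" part, a hedged sketch of one way to do it: keep a reference to the workbook that `Workbooks.Open` returns and close it without saving once the sheet has been written out. The Sub name `SavePageAsText` is hypothetical. The `xlText` format is also an assumption. If the extra folder is the supporting-files folder Excel writes when it saves in a web-page format (the skeleton does not say which format `.SaveAs` ends up using), forcing plain text should avoid it.

```vba
' Sketch only: open a page as a read-only workbook, save sheet 1 as
' tab-delimited text, then close the workbook so Excel lets go of it.
Sub SavePageAsText(ByVal openFYL As String, ByVal sav As String)
    Dim wb As Workbook
    Set wb = Workbooks.Open(Filename:=openFYL, ReadOnly:=True)

    On Error GoTo CleanUp
    Application.DisplayAlerts = False    ' suppress format/overwrite prompts
    wb.Worksheets(1).SaveAs Filename:=sav, FileFormat:=xlText

CleanUp:
    Application.DisplayAlerts = True
    wb.Close SaveChanges:=False          ' releases the workbook and its file
    If Err.Number <> 0 Then
        MsgBox "Save failed: " & Err.Description
    End If
End Sub
```

Closing each workbook inside the loop means the processed files are no longer held open by Excel, so they can be deleted without exiting the program.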

WARNING: Scheme works only in XP and Win7.
If in XP, at about 150 files, one gets a PHONY "HD is full" warning and
one must exit Excel so as to be able to delete processed (and so
unwanted) files.
I say PHONY because the system showed NO CHANGE in HD free space,
never mind those files take about 500MB.

Furthermore, in Win7, these files show up in a folder the system KNOWS
NOTHING ABOUT. Windows Explorer does not show C:\Documents, which IS
accessible; C:\<sysname>\My Documents is shown and CANNOT be accessed.
Instead of the Excel program crashing, the system is shut down and
locked.
YET another reason I hate Win7.


I don't follow what you're talking about here! What does it have to do
with the download I linked to?

In the meantime, I took a stab at a "pure" Excel program to get the data.

Whatever you do and, more explicitly, however you do the search, it yields
results that I do not see.

Manually downloading the first page for a manual search, I get:

5960 REGULATOR AND "ELECTRON TUBE"
About 922 results (1 ms)
5960-00-503-9529
5960-00-504-8401
5960-01-035-3901
5960-01-029-2766
5960-00-617-4105
5960-00-729-5602
5960-00-826-1280
5960-00-754-5316
5960-00-962-5391
5960-00-944-4671
5960-00-897-8418
and
5960 AND REGULATOR AND "ELECTRON TUBE"
About 104 results (16 ms)
5960-00-503-9529
5960-00-504-8401
5960-01-035-3901
5960-01-029-2766
5960-00-617-4105
5960-00-729-5602
5960-00-826-1280
5960-00-754-5316
5960-00-962-5391
5960-00-944-4671
5960-00-897-8418

Note they are very different, and the second search "gets" a lot less.
Also, neither search gets anything you got, and I am interested in how
you did it.