Read (and parse) file on the web CORRECTION#2
GS wrote:
You are correct: #1) we do not need that line 3, and #2) we do not need
the extended info.
Ok then, fieldnames will be: Item#,Part#,MCRL,CAGE,Source
For file names: for PageNumber=1 I would use 5960_001.TXT, ... up to
PageNumber=999 I would use 5960_999.TXT; the zero-padding preserves sort order.
*OR*
Reading & parsing from PageNumber=1 to PageNumber=999, one could append
to the same file (named NSN_5960.TXT); might as well - that makes it
easier to pour everything into a single Excel file.
Either way is fine.
Ok, then the output filename will be: NSN_5960.txt
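For what it's worth, the append-to-one-file loop could be sketched in Python roughly like this. The URL and field names are from this thread; parse_rows() is a placeholder, since the page's actual HTML layout isn't shown here, so its body is hypothetical:

```python
import urllib.request

FIELDS = "Item#,Part#,MCRL,CAGE,Source"
BASE = ("https://www.nsncenter.com/NSNSearch"
        "?q=5960%20regulator%20and%20%22ELECTRON%20TUBE%22&PageNumber=")

def page_url(n):
    """URL for result page n (1..999)."""
    return BASE + str(n)

def parse_rows(html):
    """Placeholder: extract (Item#, Part#, MCRL, CAGE, Source) tuples.
    The real parsing depends on the page's HTML, which isn't shown here."""
    return []

def scrape(out_path="NSN_5960.txt", last_page=999):
    with open(out_path, "w") as out:
        out.write(FIELDS + "\n")  # header once, at the top
        for n in range(1, last_page + 1):
            html = urllib.request.urlopen(page_url(n)).read().decode("utf-8", "replace")
            for row in parse_rows(html):
                out.write(",".join(row) + "\n")  # append, preserving page order

if __name__ == "__main__":
    scrape()
```

Since each page is visited in order 1..999 and rows are appended as found, the single output file keeps the same order the per-page files would have.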
I have found a way to get rid of items that are not strictly electron
tubes and/or regulators; that way you do not have to parse these
"unfit" items out of the first-page description. Use:
"https://www.nsncenter.com/NSNSearch?q=5960%20regulator%20and%20%22ELECTRON%2 0TUBE%22&PageNumber=1"
Naturally, PageNumber still goes from 1 to 999.
Note the implied quotes and spaces; the human-readable form is:
5960 regulator and "ELECTRON TUBE".
As far as I can tell, using that query shows no undesirable parts.
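If it helps to double-check the escapes, Python's standard urllib.parse.quote turns the human-readable query into exactly that %20/%22 form (a sketch; the parameter name q matches the URL above):

```python
from urllib.parse import quote

query = '5960 regulator and "ELECTRON TUBE"'
encoded = quote(query)  # spaces -> %20, double quotes -> %22
url = "https://www.nsncenter.com/NSNSearch?q=" + encoded + "&PageNumber=1"
print(url)
```

Going the other way, urllib.parse.unquote recovers the readable query from the escaped string.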
Works nice! Now I get 11 5960 items per parent page.
Thanks!
PS: I found WGET to be non-useful here: (a) it truncates the filename,
and (b) it mangles the rest into partial gibberish.
What is WGET?
WGET is a command-line program that copies the contents of a URL to
the hard drive; it has various options: for SSL, I think for some
processing, for giving the output file a specific name, for recursion, etc.
I was still trying to find a way to copy the online file to the hard drive.
I still do not understand what magic you used.
Now, the nitty-gritty; in exchange for that nicely parsed file, what
do i owe you?