Posted to microsoft.public.excel.programming,microsoft.public.excel
Robert Baer
Read (and parse) file on the web

GS wrote:
The result of copying and pasting a web page into a worksheet is
*entirely different* from reading the webpage source. Both my examples
read the webpage source, *not the rendered page you see in the browser*!
The fault with using copy/paste on a webpage is that different browsers
*often* don't display content the same way.

If you read the source in the tmp.txt you 'should' very quickly realize
these pages are a template wherein data is dynamically inserted from a
database via script embedded in the source HTML.

I use the last URL query *you provided* in both the worksheet approach
and the AutoParse() sub. The tmp.txt file shows the complete webpage
source, whereas txtPgSrc shows the webpage source *as rendered* in
WebBrowser1. WebBrowser1 will display whatever is in the URL cell above
it; AutoParse uses the string defined as Public Const gsUrl1$.

You need to decide what URL string you want to run with and set both the
URL cell and gsUrl1 strings to that. Scrap using the copy/paste webpage
approach altogether because it's unreliable at best and renders
inconsistent results at worst! (*Clue:* Note how WebBrowser1 wraps
content but txtPgSrc does not!)

You are collecting data here, NOT capturing webpage content as rendered.
The data displays according to the source behind the rendered webpage.
That source is structured to be dynamic in terms of what data is
rendered based on the URL string, and HOW it displays depends on the
browser being used to view the data. In this case, WebBrowser1 uses the
same engine as Internet Explorer, and what you see on your screen
*depends on* which version of that engine is running!

If you've ever used HTML to build webpages you'd know (or at the very
least *should know*) instinctively that the code source is the only
reliable element to work with.

HTH

Maybe I was not too clear.
Case one:
Using a browser, go to https://www.nsncenter.com/ and give it a
search term: 5960&REGULATOR&"ELECTRON TUBE" in the NSN box, and click on
the green WebFLIS Search button.
Then use the browser's "File" pulldown, select Save Page As, and change
the extension to .TXT
The resulting file is a bit different from what one gets with other
methods.
Case two:
Choose one method of getting the search results. A given search term
will always produce the same results (i.e., reproducible), and small
changes to the search term may give different results; THOSE
DIFFERENCES are some of what I am talking about.
Case three:
Choose a given search term and compare the results between the various
methods. The DIFFERENCES may be huge; they are also some of what I am
talking about.

In case three, with your program, whatever is happening gives a
radically different result. And that result is VERY useful.
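
One way to pin down exactly where two methods disagree is a line-by-line
diff of their saved outputs. The following is only a rough Python sketch,
not part of the Excel macro under discussion; the helper name diff_lines
and the two sample strings are made-up placeholders standing in for the
text saved from each method.

```python
import difflib

def diff_lines(a_text, b_text, a_name="method_a", b_name="method_b"):
    """Return unified-diff lines describing how b_text differs from a_text."""
    a = a_text.splitlines()
    b = b_text.splitlines()
    # lineterm="" keeps the header lines free of trailing newlines,
    # matching the newline-free content lines produced by splitlines().
    return list(difflib.unified_diff(a, b, fromfile=a_name,
                                     tofile=b_name, lineterm=""))

# Placeholder text only; real input would be the two saved result files.
saved_page = "line one\nline two\nline three"
macro_output = "line one\nline 2\nline three"

for line in diff_lines(saved_page, macro_output):
    print(line)
```

Lines starting with "-" appear only in the first text and lines starting
with "+" only in the second; identical inputs give an empty list.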

For some unknown reason, your program/macro refuses to run, and gives
the following error message: "Can't find project or library".

Would you be so kind as to modify the search term in your program to
5960&REGULATOR&"ELECTRON TUBE" and run it, and please send the results?
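
A note on that search term: if it is placed inside a URL string rather
than typed into the search box, the & and " characters and the space
generally need percent-encoding, since a bare & is normally read as a
separator between query parameters. The following is only a minimal
Python sketch of that encoding step; how the site actually expects the
term to appear in its URL is not known here and is not assumed.

```python
from urllib.parse import quote, unquote

# The search term exactly as it would be typed into the NSN box.
term = '5960&REGULATOR&"ELECTRON TUBE"'

# safe="" makes quote() percent-encode every reserved character,
# including "/" which it would otherwise leave alone.
encoded = quote(term, safe="")
print(encoded)

# Decoding the encoded form gives back the original term unchanged.
print(unquote(encoded) == term)
```

The same encoding would apply to any term pasted into a URL cell or a
URL constant, whatever language builds the string.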