#11
Posted to microsoft.public.excel.programming,microsoft.public.excel
So what you also want is the linked file (web page) that the image or part# links to!

Here's what I got from https://www.nsncenter.com/NSN/5960-00-831-8683 (pg4):

  1st occurrence of <a href="/NSN/5960 is at line 7878;
  1st occurrence of (MCRL) is at line 7931;
  1st occurrence after that of <a href="/PartNumber" is this, at line 7951:

<td align="center" style="vertical-align: middle;"><a href="/PartNumber/GV4S1400">GV4S1400</a></td>

and the next 3 lines are:

<td style="width: 125px; height: 60px; vertical-align: middle;" align="center" nowrap> <a href="/CAGE/63060">63060</a> </td>
<td align="center" style="vertical-align: middle;"> <a href="/CAGE/63060"><img class="img-thumbnail" src="https://placehold.it/90x45?text=No%0DImage%0DYet" height=45 width=90 /></a> </td>
<td text-align="center" style="vertical-align: middle;"><a title="CAGE 63060" href="/CAGE/63060">HEICO OHMITE LLC</a></td>

So you want to go to the next page linked to and repeat the process?

At this point my Excel sheet has been modified as follows:

Source | NSN Item# | Description | Part# | MCRL#
Tektronix | 5960-00-831-8683 | ELECTRON TUBE | GV4S1400 | 4932653
<a href="/CAGE/63060">63060</a>
<a href="/CAGE/63060"><img class="img-thumbnail" src="https://placehold.it/90x45?text=No%0DImage%0DYet" height=45 width=90 /></a>
<a title="CAGE 63060" href="/CAGE/63060">HEICO OHMITE LLC</a>
General Dynamics | 5960-00-853-8207 | ELECTRON TUBE | 295-29434 | 5074477
line1
line2
line3
...and so on.

So far, I'm working with text files and so I'm inclined to append each item to a file named "ElectronTube_NSN5960.txt". File contents for the 2 items above would be structured so the 1st line contains headings (data fields) so it works properly with ADODB. (Note that I use a comma as delimiter, and the file does not contain any blank lines)...
Source,NSN Item#,Description,Part#,MCRL#
Tektronix,5960-00-831-8683,ELECTRON TUBE,GV4S1400,4932653
<a href="/CAGE/63060">63060</a>
<a href="/CAGE/63060"><img class="img-thumbnail" src="https://placehold.it/90x45?text=No%0DImage%0DYet" height=45 width=90 /></a>
<a title="CAGE 63060" href="/CAGE/63060">HEICO OHMITE LLC</a>
General Dynamics,5960-00-853-8207,ELECTRON TUBE,295-29434,5074477
<a href="/CAGE/1VPW8">1VPW8</a>
<a href="/CAGE/1VPW8"><img class="img-thumbnail" src="https://az774353.vo.msecnd.net/cage/90/1vpw8.jpg" alt="CAGE 1VPW8" height=45 width=90 /></a>
<a title="CAGE 1VPW8" href="/CAGE/1VPW8">GENERAL DYNAMICS C4 SYSTEMS, INC.</a>

...where I have parsed off the CSS formatting text and HTML tags outside <a...</a> from the 3 lines. I'd likely convert the UCase to proper case as well.

The file size is 653 bytes, meaning a full page would be about 4 KB and 1000 pages about 4 MB. That's 44 lines per page after the fields line. A file this size can be easily handled via an ADO recordset or standard VB file I/O functions/methods. Loading into an array (vData) puts the fields in vData(0) and records starting at vData(1); since each record spans 4 lines, looping would Step 4.

I really don't have the time/energy (I have Lou Gehrig's) to get any more involved with your project due to current commitments. I just felt it might be worth explaining how I'd handle your task in the hopes it would be helpful to you reaching a viable solution.

I bid you good wishes going forward...

--
Garry

Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup!
  comp.lang.basic.visual.misc
  microsoft.public.vb.general.discussion
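
For anyone who wants to try the file layout described in the post (one headings line, then 4 lines per record: a comma-delimited data line followed by 3 link lines), here is a minimal sketch. It is written in Python rather than VBA purely for illustration, and it assumes the anchor tags are well formed. The names `strip_to_anchor`, `load_records` and `SAMPLE_LINES` are hypothetical and do not come from the post; the sample data is the two items quoted above.

```python
import re

# Assumption: each table cell line holds at most one <a ...>...</a> element.
ANCHOR_RE = re.compile(r"<a\b.*</a>", re.IGNORECASE | re.DOTALL)


def strip_to_anchor(cell_line):
    """Drop the <td> wrapper and its CSS, keeping only the <a ...>...</a> part."""
    match = ANCHOR_RE.search(cell_line)
    return match.group(0) if match else ""


def load_records(lines, lines_per_record=4):
    """Line 0 holds the field names; records follow in blocks of 4 lines."""
    fields = lines[0].split(",")
    records = []
    # Equivalent in spirit to looping from 1 with Step 4 over a line array.
    for i in range(1, len(lines), lines_per_record):
        block = lines[i:i + lines_per_record]
        if len(block) < lines_per_record:
            break  # skip a trailing partial record
        record = dict(zip(fields, block[0].split(",")))
        record["links"] = block[1:]  # the 3 anchor lines for this item
        records.append(record)
    return fields, records


SAMPLE_CELL = ('<td align="center" style="vertical-align: middle;">'
               '<a href="/PartNumber/GV4S1400">GV4S1400</a></td>')

SAMPLE_LINES = [
    'Source,NSN Item#,Description,Part#,MCRL#',
    'Tektronix,5960-00-831-8683,ELECTRON TUBE,GV4S1400,4932653',
    '<a href="/CAGE/63060">63060</a>',
    '<a href="/CAGE/63060"><img class="img-thumbnail" '
    'src="https://placehold.it/90x45?text=No%0DImage%0DYet" height=45 width=90 /></a>',
    '<a title="CAGE 63060" href="/CAGE/63060">HEICO OHMITE LLC</a>',
    'General Dynamics,5960-00-853-8207,ELECTRON TUBE,295-29434,5074477',
    '<a href="/CAGE/1VPW8">1VPW8</a>',
    '<a href="/CAGE/1VPW8"><img class="img-thumbnail" '
    'src="https://az774353.vo.msecnd.net/cage/90/1vpw8.jpg" alt="CAGE 1VPW8" '
    'height=45 width=90 /></a>',
    '<a title="CAGE 1VPW8" href="/CAGE/1VPW8">GENERAL DYNAMICS C4 SYSTEMS, INC.</a>',
]

fields, records = load_records(SAMPLE_LINES)
for rec in records:
    print(rec["Source"], rec["Part#"], len(rec["links"]))
```

Only the data line of each record is split on commas, so a comma inside a link line (as in "GENERAL DYNAMICS C4 SYSTEMS, INC.") does not disturb the fields. The same block-of-4 loop should carry over to a VBA array loaded with Split, though that port is not shown here.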