#1
Posted to microsoft.public.excel.programming,microsoft.public.excel
Excel macros are SO... undocumented.
Need a WORKING example for reading the HTML source of a URL (say
http://www.oil4lessllc.org/gTX.htm). Thanks.
#2
Posted to microsoft.public.excel.programming,microsoft.public.excel
Robert Baer wrote:
> Excel macros are SO... undocumented.

Sure they are. Plenty of documentation for individual keywords. You're not
looking for that, you're looking for a broader concept than individual
keywords.

> Need a WORKING example for reading the HTML source of a URL (say
> http://www.oil4lessllc.org/gTX.htm)

Downloading a web page (or any URI, really) is easy:

    Declare Function URLDownloadToFile Lib "urlmon" _
        Alias "URLDownloadToFileA" (ByVal pCaller As Long, _
        ByVal szURL As String, ByVal szFileName As String, _
        ByVal dwReserved As Long, ByVal lpfnCB As Long) As Long

    Sub foo()
        Dim tmp, result, contents
        tmp = Environ("TEMP") & Format$(Now, "yyyymmdd-hhmmss-") & "gTX.htm"
        'download
        result = URLDownloadToFile(0, "http://www.oil4lessllc.org/gTX.htm", _
                                   tmp, 0, 0)
        If result < 0 Then
            'failed to download; error handler here
        Else
            'read from file
            Open tmp For Binary As 1
            contents = Space$(LOF(1))
            Get #1, 1, contents
            Close 1
            'parse file here
            '[...]
            'cleanup
        End If
        Kill tmp
    End Sub

(Note that URLDownloadToFile must be declared PtrSafe on 64-bit versions of
Excel.)

Dealing with the data downloaded is very dependent on what the page contains
and what you want to extract from it. That page you mentioned contains 2
images and 2 Javascript arrays; assuming you want the data from the arrays,
you could search for "awls[" or "aprd[" and get your data that way.

Rather than downloading to a file, it is possible to download straight to
memory, but I find it's simpler to use a temp file. Among other things,
downloading to memory requires opening and closing the connection yourself;
URLDownloadToFile handles that for you.

--
- We lose the ones we love; we cannot change it. Put it aside.
- How? How can I do what is needed, when all I feel is... hate?
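(For reference: on 64-bit Excel the Declare above needs the PtrSafe form. A
sketch, using the VBA7 conditional so the same module compiles on older
versions too; LongPtr replaces Long for the two pointer parameters:)

    #If VBA7 Then
    Declare PtrSafe Function URLDownloadToFile Lib "urlmon" _
        Alias "URLDownloadToFileA" (ByVal pCaller As LongPtr, _
        ByVal szURL As String, ByVal szFileName As String, _
        ByVal dwReserved As Long, ByVal lpfnCB As LongPtr) As Long
    #End If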
#3
Posted to microsoft.public.excel.programming,microsoft.public.excel
I wrote:
> tmp = Environ("TEMP") & Format$(Now, "yyyymmdd-hhmmss-") & "gTX.htm"

Damn, missed a backslash.

    tmp = Environ("TEMP") & "\" & Format$(Now, "yyyymmdd-hhmmss-") & "gTX.htm"

--
I'm willing to table this discussion for now, and sneak out, if you are.
#4
Posted to microsoft.public.excel.programming,microsoft.public.excel
Auric__ wrote:
> I wrote:
>> tmp = Environ("TEMP") & Format$(Now, "yyyymmdd-hhmmss-") & "gTX.htm"
>
> Damn, missed a backslash.
>
>     tmp = Environ("TEMP") & "\" & Format$(Now, "yyyymmdd-hhmmss-") & "gTX.htm"

Thanks for the correction to the "path" of destruction.
#5
Posted to microsoft.public.excel.programming,microsoft.public.excel
Auric__ wrote:
> [snip download example]
>
> Dealing with the data downloaded is very dependent on what the page
> contains and what you want to extract from it. [...]
> Rather than downloading to a file, it is possible to download straight
> to memory, but I find it's simpler to use a temp file.

Thanks.
I know that the parsing is very dependent on the contents and what one
wants. That is why i gave an example having at least one data array,
similar to what i may be parsing.
I, too, like temp files, because i can open them in random and/or binary
mode if needed. A bit of fun to read such backwards (like BMP files).
Thanks again.
#6
Posted to microsoft.public.excel.programming,microsoft.public.excel
Auric__ wrote:
> [snip download example]

Well, i am in a pickle.
Firstly, i did a bit of experimenting, and i discovered a few things.
1) The variable "tmp" is nice, but does not have to have the date and
time; that would fill the HD since i have thousands of files to process.
Fix is easy - just have a constant name for use.
2) The file "tmp" is created ONLY if the value of "result" is zero.
3) The problem i have seems to be due to the fact that the online files
have no filetype:
"https://www.nsncenter.com/NSNSearch?q=5960%20regulator&PageNumber=5"
And i need to process page numbers from 5 to 999 (the reason for the
program). I have processed page numbers from 1 to 4 by hand... PITA.
Ideas for a fix? Thanks.
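(A minimal sketch of that page loop, assuming the URLDownloadToFile
declaration from earlier in the thread is in scope. The fixed temp-file
name and the zero-means-success test follow points 1) and 2) above, but
the names are illustrative only; note that later posts report trouble
with https URLs through this API:)

    Sub fetchPages()
        Dim url As String, tmp As String, result As Long, pg As Long
        tmp = Environ("TEMP") & "\nsnpage.tmp"  'constant name, reused each pass
        For pg = 5 To 999
            url = "https://www.nsncenter.com/NSNSearch?q=5960%20regulator" & _
                  "&PageNumber=" & pg
            result = URLDownloadToFile(0, url, tmp, 0, 0)
            If result = 0 Then      'S_OK; the file only exists on success
                'open tmp, parse, close - as in the earlier example
                Kill tmp            'delete before the next page is fetched
            End If
        Next
    End Sub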
#7
Posted to microsoft.public.excel.programming,microsoft.public.excel
Robert Baer wrote:
> Well, i am in a pickle.
> Firstly, i did a bit of experimenting, and i discovered a few things.
> 1) The variable "tmp" is nice, but does not have to have the date and
> time; that would fill the HD since i have thousands of files to process.

* Error: if i have umpteen files to process, then i must use their unique
names.

> Fix is easy - just have a constant name for use.

* Error: i can issue a KILL each time (a "KILL *.*" does not work).

> 2) The file "tmp" is created ONLY if the value of "result" is zero.
> 3) The problem i have seems to be due to the fact that the online files
> have no filetype:
> "https://www.nsncenter.com/NSNSearch?q=5960%20regulator&PageNumber=5"
> And i need to process page numbers from 5 to 999 (the reason for the
> program). I have processed page numbers from 1 to 4 by hand... PITA.
> Ideas for a fix? Thanks.
#8
Posted to microsoft.public.excel.programming,microsoft.public.excel
Robert Baer wrote:
> [snip self-corrections]
> Ideas for a fix? Thanks.

Well, it is rather bad... I created a file with no filetype and put it on
the web. The process URLDownloadToFile worked, with "tmp" and "result"
being correct. BUT.... the line

    Open tmp For Binary As 1

barfed with the error message "Bad filename or number" (remember, it worked
with my gTX.htm file). So, there is "something" about that site that seems
to prevent read access.
Is there a way to get a clue (and maybe fix it)?
And assuming a fix, what can i do about the OPEN command/syntax?

// What i did in Excel:
    S$ = "D:\Website\Send .Hot\****"
    tmp = Environ("TEMP") & "\" & S$
    result = URLDownloadToFile(0, S$, tmp, 0, 0)
    'The file is at "http://www.oil4lessllc.com/****" ; an Excel file
#9
Posted to microsoft.public.excel.programming,microsoft.public.excel
Robert Baer wrote:
> And assuming a fix, what can i do about the OPEN command/syntax?
>
> // What i did in Excel:
>     S$ = "D:\Website\Send .Hot\****"
>     tmp = Environ("TEMP") & "\" & S$

The contents of the variable tmp at this point:

    tmp = "C:\Users\auric\D:\Website\Send .Hot\****"

Do you see the problem?

Also, as Garry pointed out, cleanup should happen automatically. The "Kill"
keyword deletes files.

Try this code:

    Declare Function URLDownloadToFile Lib "urlmon" _
        Alias "URLDownloadToFileA" (ByVal pCaller As Long, _
        ByVal szURL As String, ByVal szFileName As String, _
        ByVal dwReserved As Long, ByVal lpfnCB As Long) As Long

    Function downloadFile(what As String) As String
        'returns tmp file's path on success or empty string on failure
        Dim tmp, result, contents, fnum
        tmp = Environ("TEMP") & "\" & Format$(Now, "yyyymmdd-hhmmss-") & _
              "downloaded.tmp"
        'download
        result = URLDownloadToFile(0, what, tmp, 0, 0)
        If result < 0 Then
            'failed to download; error handler here, if any
            On Error Resume Next 'can be avoided by checking if the file exists
            Kill tmp 'cleanup
            On Error GoTo 0
            downloadFile = ""
        Else
            'read from file
            fnum = FreeFile
            Open tmp For Binary As fnum
            contents = Space$(LOF(fnum))
            Get #fnum, 1, contents
            Close fnum
            downloadFile = tmp
        End If
    End Function

    Sub foo()
        Dim what As String, files_to_get As Variant, L0
        'specify files to download
        files_to_get = Array("http://www.oil4lessllc.com/****", _
                             "http://www.oil4lessllc.org/gTX.htm")
        For L0 = LBound(files_to_get) To UBound(files_to_get)
            what = downloadFile(files_to_get(L0))
            If Len(what) Then
                '****parse file here*****
                Kill what 'cleanup
            End If
        Next
    End Sub

--
THIS IS NOT AN APPROPRIATE ANSWER.
#10
Posted to microsoft.public.excel.programming,microsoft.public.excel
Robert Baer wrote:
> Well, i am in a pickle.
> Firstly, i did a bit of experimenting, and i discovered a few things.
> 1) The variable "tmp" is nice, but does not have to have the date and
> time; that would fill the HD since i have thousands of files to process.

So you did not 'catch' that the tmp file is deleted when this line...

    Kill tmp

...gets executed!

--
Garry
Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup! comp.lang.basic.visual.misc
microsoft.public.vb.general.discussion
#11
Posted to microsoft.public.excel.programming,microsoft.public.excel
GS wrote:
> So you did not 'catch' that the tmp file is deleted when this line...
>
>     Kill tmp
>
> ...gets executed!

I know about that; i have been stepping thru the execution (F8), and stop
long before that. Then i go to the CMD prompt, check the files, and DEL *.*
when appropriate for the next test.
#12
Posted to microsoft.public.excel.programming,microsoft.public.excel
GS wrote:
> So you did not 'catch' that the tmp file is deleted when this line...
>
>     Kill tmp
>
> ...gets executed!

Been thru this already... also, it makes no sense to have a name miles long.
#13
Posted to microsoft.public.excel.programming,microsoft.public.excel
> but does not have to have the date and time; that would fill the HD
> since i have thousands of files to process.

Filenames have nothing to do with storage space; -it's the file size! Given
Auric__'s suggestion creates text files, the size of 999 txt files would
hardly be more than 1MB total!

If you append each page to the 1st file then all pages could be in 1 file...

--
Garry
Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup! comp.lang.basic.visual.misc
microsoft.public.vb.general.discussion
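(A sketch of that append-to-one-file idea, assuming "contents" holds the
current page's text as in the earlier example; the output path is
illustrative only:)

    'append the current page's text to one cumulative file
    Dim fnum As Integer
    fnum = FreeFile
    Open Environ("TEMP") & "\allpages.txt" For Append As #fnum
    Print #fnum, contents   'contents = page source read earlier
    Close #fnum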
#14
Posted to microsoft.public.excel.programming,microsoft.public.excel
After looking at your link to p5, I see what you mean by the amount of
storage space, but filename is not a factor. Parsing will certainly
downsize each file considerably...

--
Garry
Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup! comp.lang.basic.visual.misc
microsoft.public.vb.general.discussion
#15
Posted to microsoft.public.excel.programming,microsoft.public.excel
GS wrote:
> After looking at your link to p5, I see what you mean by the amount of
> storage space, but filename is not a factor. Parsing will certainly
> downsize each file considerably...

One has to open the file to parse, so that is not logical.
The function URLDownloadToFile gives zero options - it copies ALL of the
source into a TEMP file; one hopes that the source is not equal to or
larger than 4GB in size! For pages on the web, that is extremely unlikely;
the max webpage size is prolly 10MB, maybe 300K worst case on average.
So, once in TEMP, it can be opened for input (text), for random (may
specify buffer size), or for binary (may specify buffer size). Here, one
can optimize read speed VS string space used.
#16
Posted to microsoft.public.excel.programming,microsoft.public.excel
GS wrote:
>> but does not have to have the date and time; that would fill the HD
>> since i have thousands of files to process.
>
> Filenames have nothing to do with storage space; -it's the file size!
> Given Auric__'s suggestion creates text files, the size of 999 txt files
> would hardly be more than 1MB total!
> If you append each page to the 1st file then all pages could be in 1
> file...

True, BUT the files can be large:

    result = URLDownloadToFile(0, S$, tmp, 0, 0)

creates a file in TEMP the size of the source - which can be
multi-megabytes; 999 of them can eat the HD space fast. Hopefully a URL
file size does not exceed the space limit allowed in Excel 2003 string
space (anyone know what that might be?).
I have found that the string value AKA TEMP filename can be fixed to
anything reasonable, and does not have to include
parts/substrings/subsets of the file one wants to download. It can be "a
good thing" (quoting Martha Stewart) to delete the file when done.
I have also found the following:
1) one does not have to use FreeFile for a file number (when all else is OK).
2) cannot use "contents" for string storage space.
3) one cannot mix use of "/" and "\" in a string for a given file name.
4) one cannot have a space in the file name, so that gives a serious
problem for some web URLs (work-around anyone?)
5) method fails for "https:" (work-around anyone?)
#17
Posted to microsoft.public.excel.programming,microsoft.public.excel
> I have also found the following:
> 1) one does not have to use FreeFile for a file number (when all else
> is OK).

True, however it's not considered 'best practice'. FreeFile() ensures a
unique ID is assigned to your var.

> 2) cannot use "contents" for string storage space.

Why not? It's not a VB[A] keyword and so qualifies for use as a var.

> 3) one cannot mix use of "/" and "\" in a string for a given file name.

Not sure why you'd use "/" in a path string! Forward slash is not a legal
filename/path character. Backslash is the default Windows path delimiter.
If choosing folders, the last backslash is not followed by a filename.

> 4) one cannot have a space in the file name, so that gives a serious
> problem for some web URLs (work-around anyone?)

Web paths 'pad' spaces so the string is contiguous. I believe the pad
string is "%20" OR "+".

> 5) method fails for "https:" (work-around anyone?)

What method?

--
Garry
Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup! comp.lang.basic.visual.misc
microsoft.public.vb.general.discussion
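(In VBA that padding can be done with the built-in Replace function; a
sketch, with a made-up helper name:)

    Function encodeSpaces(ByVal url As String) As String
        'percent-encode spaces before handing the URL to URLDownloadToFile
        encodeSpaces = Replace(url, " ", "%20")
    End Function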
#18
Posted to microsoft.public.excel.programming,microsoft.public.excel
> 5) method fails for "https:" (work-around anyone?)

These URLs usually require some kind of 'login' be done, which needs to be
included in the URL string.

--
Garry
Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup! comp.lang.basic.visual.misc
microsoft.public.vb.general.discussion
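(Another route often used for https pages is the MSXML2.XMLHTTP object,
which handles SSL itself and returns the source as a string with no temp
file. A sketch, not from the thread; the URL is the one discussed above:)

    Sub fetchHttps()
        Dim http As Object, contents As String
        Set http = CreateObject("MSXML2.XMLHTTP")
        http.Open "GET", _
            "https://www.nsncenter.com/NSNSearch?q=5960%20regulator&PageNumber=5", _
            False
        http.send
        If http.Status = 200 Then
            contents = http.responseText  'page source, ready to parse
        End If
    End Sub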
#21
Posted to microsoft.public.excel.programming
What is with this a*hole posting this sh*t here?
#22
Posted to microsoft.public.excel.programming
Robert Baer wrote:
> What is with this a*hole posting this sh*t here?

It's just spam. Ignore it.

--
If you would succeed, you must reduce your strategy to its point of
application.
#25
Posted to microsoft.public.excel.programming
everonvietnam2016 wrote:
> sản phẩm tốt, giá rẻ, chất lượng, an to*n cho người dùng, dịch vụ tuyệt
> trần, lúc n*o có điều kiện qua ủng hộ nhé. chúc c*a h*ng l*m ăn hiệu quả

No kapish.
Firstly, on my computer, all i see is a strange mix of characters from the
512 ASCII set. Secondly, i would not be able to read or understand your
language even if it was elegantly rendered.
#26
Posted to microsoft.public.excel.programming
Robert Baer wrote:
> No kapish.
> Firstly, on my computer, all i see is a strange mix of characters from
> the 512 ASCII set. Secondly, i would not be able to read or understand
> your language even if it was elegantly rendered.

It's Vietnamese. *Lots* of accented characters.

Also, it's spam.

--
Who says life is sacred? God? Hey, if you read your history, God is one of
the leading causes of death. -- George Carlin
#27
Posted to microsoft.public.excel.programming
Auric__ wrote:
> It's Vietnamese. *Lots* of accented characters.
>
> Also, it's spam.

Yes, it was easy to recognize that was Vietnamese.
How the heck did you figure out that it was spam?
#29
Posted to microsoft.public.excel.programming,microsoft.public.excel
> Excel macros are SO... undocumented.
> Need a WORKING example for reading the HTML source of a URL (say
> http://www.oil4lessllc.org/gTX.htm). Thanks.

Look here...

https://app.box.com/s/23yqum8auvzx17h04u4f

...for *ParseWebPages.zip*, which contains:

    ParseWebPages.xls
    NSN_5960.txt (blank data file with fieldnames only in line 1)
    NSN_5960_Test.txt (results for the 1st 20 pages)

--
Garry
Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup! comp.lang.basic.visual.misc
microsoft.public.vb.general.discussion
#30
Posted to microsoft.public.excel.programming,microsoft.public.excel
GS wrote:
> [snip - link to ParseWebPages.zip]

I did not even try cURL, as the explanation was just too dern complicated.
Fiddled in Excel, as it has so many different ways to do something
specific. So, this is a skeleton of what i have:

    Workbooks.Open Filename:=openFYL$  'opens as R/O, no HD space taken
    'then..
    With Worksheets(1)
    '    .Copy          ''do not need; saves BOOK space
        .SaveAs sav$    'do not know how to close when done
        'above creates the file described; that takes HD space, about 300K
    End With

IMMEDIATELY after the "End With", a folder is created with useless metadata
info; do not know how to close when done.

WARNING: Scheme works only in XP and Win7. If in XP, at about 150 files,
one gets a PHONY "HD is full" warning and one must exit Excel so as to be
able to delete processed (and so unwanted) files. I say PHONY because the
system showed NO CHANGE in HD free space, never mind those files take about
500MB.
Furthermore, in Win7, these files show up in a folder the system KNOWS
NOTHING ABOUT.. Windows Explorer does not show C:\Documents, which IS
accessible; C:\<sysname>\My Documents is shown and CANNOT be accessed.
Instead of the Excel program crashing, the system is shut down and locked.
YET other reasons I hate Win7.
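(One way to plug the "do not know how to close" gap - a sketch, assuming
openFYL$ and sav$ are set as in the skeleton above. Worksheet.SaveAs leaves
the workbook open under the new name, so holding a reference lets it be
closed afterward:)

    Dim wb As Workbook
    Set wb = Workbooks.Open(Filename:=openFYL$, ReadOnly:=True)
    wb.Worksheets(1).SaveAs Filename:=sav$
    wb.Close SaveChanges:=False   'releases the saved copy when done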
#31
Posted to microsoft.public.excel.programming,microsoft.public.excel
Robert Baer wrote:
> [snip - Workbooks.Open/SaveAs skeleton and XP/Win7 complaints]

I don't follow what you're talking about here! What does it have to do with
the download I linked to?

--
Garry
Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup! comp.lang.basic.visual.misc
microsoft.public.vb.general.discussion
#32
Posted to microsoft.public.excel.programming,microsoft.public.excel
GS wrote:
> I don't follow what you're talking about here! What does it have to do
> with the download I linked to?

In the meantime, i took a stab at a "pure" Excel program to get the data.
Whatever you do, and more eXplicitly how you do the search, it yields
results that i do not see. Manually downloading the first page for a manual
search, I get:

    5960 REGULATOR AND "ELECTRON TUBE"
    About 922 results (1 ms)
    5960-00-503-9529  5960-00-504-8401  5960-01-035-3901  5960-01-029-2766
    5960-00-617-4105  5960-00-729-5602  5960-00-826-1280  5960-00-754-5316
    5960-00-962-5391  5960-00-944-4671  5960-00-897-8418

and

    5960 AND REGULATOR AND "ELECTRON TUBE"
    About 104 results (16 ms)
    5960-00-503-9529  5960-00-504-8401  5960-01-035-3901  5960-01-029-2766
    5960-00-617-4105  5960-00-729-5602  5960-00-826-1280  5960-00-754-5316
    5960-00-962-5391  5960-00-944-4671  5960-00-897-8418

Note they are very different, and the second search "gets" a lot less.
Also, neither search gets anything you got, and i am interested in how you
did it.
#33
Posted to microsoft.public.excel.programming,microsoft.public.excel
> Also, neither search gets anything you got, and i am interested in how
> you did it.

If you study the file I gave you, you'll see how both methods work. The
worksheet implements all manual parsing so you can study each part of the
process as well as the web page source structure; the *AutoParse* macro
collects the data and writes it to the file.

--
Garry
Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup! comp.lang.basic.visual.misc
microsoft.public.vb.general.discussion
#34
Posted to microsoft.public.excel.programming,microsoft.public.excel
If you're referring to the substitute 'page error' text put in place of
missing item info, ..well, that might be misleading you. Fact is, starting
with item 7 on pg 7 there is no item info on any pages I checked manually
in the browser (up to pg 100). Perhaps you could rephrase that to "No Data
Available"!?

--
Garry
Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup! comp.lang.basic.visual.misc
microsoft.public.vb.general.discussion
#35
Posted to microsoft.public.excel.programming,microsoft.public.excel
GS wrote:
> If you're referring to the substitute 'page error' text put in place of
> missing item info, ..well, that might be misleading you. Fact is,
> starting with item 7 on pg 7 there is no item info on any pages I
> checked manually in the browser (up to pg 100). Perhaps you could
> rephrase that to "No Data Available"!?

Machs nicht. I also looked manually and you are correct.
Why the heck they have NSNs that do not relate to a part is puzzling, but,
hey, it *IS* the government.
Not useful to what i need, but still nice to know.
#36
Posted to microsoft.public.excel.programming,microsoft.public.excel
GS wrote:
> [snip - link to ParseWebPages.zip]

Holy S*! I did about 30 pages by hand; quit, as it was rather tiresome and
the total number of pages unknown (MORE than 999). Never saw the fail you
saw. The difference is that you used the word "and"; technically (i think)
that should not affect results. Also, you got items I am interested in,
and after processing 503 pages, i did NOT get those.
In both cases, there were a lot of duplicate records (government data, what
else can you expect?). In your sample, there were 73 useful records
containing 43 unique records. There may be some that i am not interested
in, but there definitely ARE those i did not find that i am interested in.
In my sample, there were 3782 unique records, and (better sit down), only
15 were interesting. Crappy odds.
Hopefully, when i call them, someone that has some experience and knowledge
of how their sort criteria works will answer the phone. Last time i called,
i got a new guy; no help other than "use Google".
You have done a masterful job! Label it !DONE! please.
Thanks a lot.
#37
Posted to microsoft.public.excel.programming,microsoft.public.excel
Robert Baer wrote:
> [snip]
> In both cases, there were a lot of duplicate records (government data,
> what else can you expect?). In your sample, there were 73 useful records
> containing 43 unique records.

You could 'dump' the file into a worksheet and filter out the dupes easily
enough.

> In my sample, there were 3782 unique records, and (better sit down),
> only 15 were interesting. Crappy odds. Hopefully, when i call them,
> someone that has some experience and knowledge of how their sort
> criteria works will answer the phone. Last time i called, i got a new
> guy; no help other than "use Google".

Those are dynamic web pages and so are database driven. Surely there's a
repository database for this info somewhere other than NSN?

> You have done a masterful job! Label it !DONE! please. Thanks a lot.

Happy to be of help; -I found the project rather interesting!

--
Garry
Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup! comp.lang.basic.visual.misc
microsoft.public.vb.general.discussion
#38
Posted to microsoft.public.excel.programming,microsoft.public.excel
GS wrote:
> You could 'dump' the file into a worksheet and filter out the dupes
> easily enough.

* Yes, i did that; getting those 43 unique records.

> Those are dynamic web pages and so are database driven. Surely there's a
> repository database for this info somewhere other than NSN?

* Prolly not dynamic, as NSNs do not change except possible additions on
rare occasion. Certainly, a SPECIFIC search (so far) always gives the same
results. Look at a previous response concerning 100% manual search results,
first page only:

    5960 AND REGULATOR AND "ELECTRON TUBE"   About 104 results
    5960 REGULATOR AND "ELECTRON TUBE"       About 922 results

Totally different, and neither match your results. And your results look
superior.
#39
Posted to microsoft.public.excel.programming,microsoft.public.excel
Robert Baer wrote:
> [snip]
> Totally different, and neither match your results. And your results look
> superior.

I am getting more confused. Search term used and response:

    5960 AND REGULATOR AND "ELECTRON TUBE"   About 104 results
    5960 REGULATOR AND "ELECTRON TUBE"       About 922 results
    5960 regulator and "ELECTRON TUBE"       About 3134377 results

Use of the second two gives exactly the same list for the first page, and
the last term is the one you used in your program. The results, like i
previously said, are completely different WRT the first term and your
program (3 different results).
Notice the 3.1 million results when lower case is used for "regulator"; i
think the database "engine" is thrashing around in what is almost a useless
attempt. BUT. That thrashing produces very useful results (after sort and
consolidate).
SO. (1) Results are dependent on the form / format of the search term used.
(2) Results depend on the (in this case) Excel procedure used that does the
access and fetch.
Now i know rather well that Excel mangles the HTML information when it is
imported, most especially the primary page. I had my Excel program working
to parse the primary HTML page AS SEEN BY THE HUMAN EYE ON THE WEB, and i
had to make a number of changes to accommodate what Excel gave me.
Therefore, on that basis, i have a rather strong suspicion that what Excel
SENDS to the web for a search is quite different than what we think.
Comments? Suggestions, primarily to get it more efficient BUT ALSO give it
all to us?

PS: How i get the data:

    Workbooks.Open Filename:=openFYL$  'YES.. opens as R/O
    With Worksheets(1)
    '    .Copy          ''do not need; saves BOOKn space
        .SaveAs sav$    'do not know how to close when do not need
    End With
#40
Posted to microsoft.public.excel.programming,microsoft.public.excel
I've uploaded a new version that skips dupes and flags missing item info.
(See the new 'Test' file.) This version also runs orders of magnitude
faster!

--
Garry
Free usenet access at http://www.eternal-september.org
Classic VB Users Regroup! comp.lang.basic.visual.misc
microsoft.public.vb.general.discussion
Similar Threads:
EOF Parse Text file (Excel Programming)
Parse a txt file and save as csv? (Excel Programming)
parse from txt file (Excel Programming)
Parse File Location (Excel Worksheet Functions)
REQ: Simplest way to parse (read) HTML formatted data in via Excel VBA (or VB6) (Excel Programming)