Home |
Search |
Today's Posts |
|
#1
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
In message of Sat, 29 Oct 2016 17:52:09
in microsoft.public.excel.programming, GS writes Not sure why you're automating IE since most things it does can be accessed directly via APIs with no wait time! Gary, I know no better. What APIs do you refer to? I send a URL, await a response and analyse ie.doc. I repeat, I know no better. ;) Have a look at my replies to the post by Robert Baer in this forum on May 26th to see if it applies to what you are trying to do... -- Garry Free usenet access at http://www.eternal-september.org Classic VB Users Regroup! comp.lang.basic.visual.misc microsoft.public.vb.general.discussion |
#2
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
In message of Sat, 29 Oct 2016
17:52:09 in microsoft.public.excel.programming, GS writes Not sure why you're automating IE since most things it does can be accessed directly via APIs with no wait time! Gary, I know no better. What APIs do you refer to? I send a URL, await a response and analyse ie.doc. I repeat, I know no better. ;) Have a look at my replies to the post by Robert Baer in this forum on May 26th to see if it applies to what you are trying to do... Subject of his post is... "Read (and parse) file on the web" -- Garry Free usenet access at http://www.eternal-september.org Classic VB Users Regroup! comp.lang.basic.visual.misc microsoft.public.vb.general.discussion |
#3
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
In message of Sun, 30 Oct 2016 17:36:45 in
microsoft.public.excel.programming, GS writes In message of Sat, 29 Oct 2016 17:52:09 in microsoft.public.excel.programming, GS writes Not sure why you're automating IE since most things it does can be accessed directly via APIs with no wait time! Gary, I know no better. What APIs do you refer to? I send a URL, await a response and analyse ie.doc. I repeat, I know no better. ;) Have a look at my replies to the post by Robert Baer in this forum on May 26th to see if it applies to what you are trying to do... Subject of his post is... "Read (and parse) file on the web" Thanks! I had a look at your contributions to this 92 member thread. I saw 2 interesting "names": URLDownloadToFile and fParseWebPages.frm. I googled URLDownloadToFile. I adapted a web example. (I later found an example from Auric in the thread.) When I opened the downloaded page in IE, it looked different - simpler. I made another example, which downloaded to memory. The download was about 120k, compared with 110k. I converted that example to write to a file. The file was similarly simpler. I would like an API technique to convert the downloaded HTML to DOM. I have a reliable - but slow - method for analysing DOM information. While Googling, I learned Regular Expressions are not recommended for HTML analysis. I also grabbed ParseWebPages.zip from https://www.nsncenter.com/NSN/. When I ran the code, I got "Method or data member not found" referring to webbrowser in Private Sub Worksheet_Change(ByVal Target As Range) If Target = Range("PgNum") Then Range("WebAddr") = gsUrl1 & Target.Value: Me.WebBrowser1.Navigate Range("WebAddr"): SetBtnState End If End Sub I found no "Me" definition. My understanding is zilch, but I am happy to learn. OOPS! I nearly forgot to quote my example code: Public Declare Function URLDownloadToFile Lib "urlmon" Alias "URLDownloadToFileA" _ (ByVal pCaller As Long, ByVal szURL As String, ByVal szFileName As String, ByVal dwReserved As Long, _ ByVal lpfnCB As Long) As Long Private Function GetWebPage(ByRef URL As String) As String Dim xml As Object ' need reference to get IXMLHTTPRequest On Error Resume Next Set xml = CreateObject("Microsoft.XMLHTTP") With xml .Open "GET", URL, False .send GetWebPage = .responseText End With Set xml = Nothing ' Is this needed? End Function Public Function DownloadFile1() As String Dim lngRetVal As Long Dim hFile As Long Dim localFilename As String Dim strCurrentURL As String Dim content As String localFilename = "C:\Users\IBM\AppData\Roaming\Microsoft\Excel\Down lo ad.htm" strCurrentURL = "https://tfl.gov.uk/bus/stop/490000235Z/new-oxford- street?" If True Then content = GetWebPage(strCurrentURL) hFile = FreeFile Open localFilename For Output As #hFile Write #hFile, content Close #hFile Else lngRetVal = URLDownloadToFile(0, strCurrentURL, localFilename, 0, 0) End If hFile = FreeFile Open localFilename For Input As #hFile DownloadFile1 = Input$(LOF(hFile), hFile) Close #hFile End Function Public Function DownloadFile0() As String Dim lngRetVal As Long Dim hFile As Long Dim localFilename As String Dim strCurrentURL As String localFilename = "C:\Users\IBM\AppData\Roaming\Microsoft\Excel\Down lo ad.htm" strCurrentURL = "https://tfl.gov.uk/bus/stop/490000235Z/new-oxford- street?" lngRetVal = URLDownloadToFile(0, strCurrentURL, localFilename, 0, 0) hFile = FreeFile Open localFilename For Input As #hFile DownloadFile = Input$(LOF(hFile), hFile) Close #hFile End Function -- Walter Briscoe |
#4
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
The latest version of ParseWebPages.xls doesn't contain any userforms.
You can download it's zip here... https://app.box.com/s/23yqum8auvzx17h04u4f ...and review what it does and how it works. Let me know if you need further assistance... -- Garry Free usenet access at http://www.eternal-september.org Classic VB Users Regroup! comp.lang.basic.visual.misc microsoft.public.vb.general.discussion |
#5
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
On 02/11/16 17:30, GS wrote:
The latest version of ParseWebPages.xls doesn't contain any userforms. You can download it's zip here... This one of the projects that my directions would be "If I wanted to get there, I wouldn't start from here". Use Pup. http://www.compciv.org/recipes/cli/p...-parsing-html/ https://github.com/EricChiang/pup Works in Windows and in everything else. -- Adrian C |
#6
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
On 02/11/16 17:30, GS wrote:
The latest version of ParseWebPages.xls doesn't contain any userforms. You can download it's zip here... This one of the projects that my directions would be "If I wanted to get there, I wouldn't start from here". Use Pup. http://www.compciv.org/recipes/cli/p...-parsing-html/ https://github.com/EricChiang/pup Works in Windows and in everything else. Very nice tool! Not sure the usefullness importing the data into Excel is from a command window without additional arduous coding. Do you have any suggestions for Walter? -- Garry Free usenet access at http://www.eternal-september.org Classic VB Users Regroup! comp.lang.basic.visual.misc microsoft.public.vb.general.discussion |
#7
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
On 03/11/16 19:33, GS wrote:
On 02/11/16 17:30, GS wrote: The latest version of ParseWebPages.xls doesn't contain any userforms. You can download it's zip here... This one of the projects that my directions would be "If I wanted to get there, I wouldn't start from here". Use Pup. http://www.compciv.org/recipes/cli/p...-parsing-html/ https://github.com/EricChiang/pup Works in Windows and in everything else. Very nice tool! Not sure the usefullness importing the data into Excel is from a command window without additional arduous coding. Do you have any suggestions for Walter? Well, for one it would enable dropping the nasty internet controls stuff from Excel, and CLI results can be got at through WScript.Shell http://stackoverflow.com/questions/1...sing-excel-vba I use pup in a bash script on linux, but the thing would be a fit here with less overall coding. -- Adrian C |
#8
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
In message of Wed, 2 Nov 2016 13:30:11 in microsoft.public.excel.programming, GS
writes The latest version of ParseWebPages.xls doesn't contain any userforms. You can download it's zip here... https://app.box.com/s/23yqum8auvzx17h04u4f ..and review what it does and how it works. Let me know if you need further assistance... Garry, This is my day for admitting stupidity. I downloaded the revised ParseWebPages.zip file and opened ParseWebPages.xls. I can't see what it does. When I try to run AutoParse, I get "Compile Error Can't find project or library". Format is highlighted in "oTimer.StopTimer: Debug.Print "AutoParse took " & Format(oTimer.Elapsed, "#.000000") & " seconds."" I am suspicious it is a problem with my own implementation. For a while, I have not been able to run Bob Bovey's VBA Code Cleaner, found in <http:www.appspro.com/Utilities/CodeCleaner.htm It installs, but quietly fails to add a menu item to the VBE Tools Menu. VBA CODE DECOMPILER AND COMPACTOR from http://orlando.mvps.org seems to have vanished since I use it on 20160524. I have had success with getting URLs into strings using "Microsoft.XMLHTTP". Thousand of successful reads had a mean of about a second and a standard deviation of about 1.5 seconds. I Googled Microsoft.XMLHTTP and saw a reference to a responseXML member. I can't again find something I read that it is only set for IE10 and upwards. (I run IE9 and IE10 won't run on Vista). I ran happily, grabbing each piece of information with a dedicated regular expression (RE). Naughtily (I did not measure timing), I decided to go for one RE for all pieces of information. I ended with Const grabpage0 As String = "^[\x00-\xFF]*([\x00-\xFF]{100}headline-info with-icon[\x00-\xFF]{500})" & _ "[\x00-\xFF]*arrival times of the next bus, text (\d{5}) to 87287[" & _ "[\x00-\xFF]*(first-last-details.*)" & _ "[\x00-\xFF]*(toId=.*)" & _ "[\x00-\xFF]*(href..*maps..*InputGeolocation=.*)[\x00-\xFF]*$" That RE is 95%+ effective. Some data are missing from some pages. I tried (:arrival times of the next bus, text (\d{5}) to 87287)?, but that always returned empty - the RE engine finds it "easier" to skip a datum than to set it, when given a choice. The code seems to run in about 6 hours, rather than 12 when I started. I aim to run it about once a week. I don't have a watchdog timer on "Microsoft.XMLHTTP". It seems less susceptible to seizing up than the IE9 browser. -- Walter Briscoe |
#9
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
Garry,
This is my day for admitting stupidity. I downloaded the revised ParseWebPages.zip file and opened ParseWebPages.xls. I can't see what it does. When I try to run AutoParse, I get "Compile Error Can't find project or library". Format is highlighted in "oTimer.StopTimer: Debug.Print "AutoParse took " & Format(oTimer.Elapsed, "#.000000") & " seconds."" The project requires 'References' to: Microsoft Internet Controls Microsoft HTML Object Library All the controls on 'Sheet1' work with WebBrowser1 and txtPgSrc. You must 1st 'View Source' for the page rendered in the browser. If the page is a 'parent' it contains links to items (child pages); Click 'Get Item Links' to parse these links into txtPgSrc; If you click a link in the browser: Click 'View Source' to load the HTML into txtPgSrc; Click 'Get Item Info' to load the search results into txtPgSrc; The 'AutoParse' proc in m_OpenClose uses 'URLDownloadToFile()'. This returns the entire web page source which, you will see if you compare, is entirely different than the source returned for the rendered page in the browser. This example searches nsncenter.com for specific product info. It demos how to parse the same info from a browser as well as directly from a downloaded file. Some links have more than 1 product item listed, which you'll see using the controls on 'Sheet1'. The same info is written to an output file using the 'AutoParse' proc. Currently, this proc is hard-coded to process 10 parent pages, each of which contain links to 11 child pages. As of this reply both approaches are working as expected. Try closing Excel and start over. Be careful to not just 'End' VBA when an error occurs because this leaves objects loaded to memory; -make sure these are set to 'Nothing'. -- Garry Free usenet access at http://www.eternal-september.org Classic VB Users Regroup! comp.lang.basic.visual.misc microsoft.public.vb.general.discussion |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
![]() |
||||
Thread | Forum | |||
timer | Excel Programming | |||
timer | Excel Programming | |||
Lap-timer | Excel Discussion (Misc queries) | |||
Stopping a Timer / Running a timer simultaneously on Excel | Excel Discussion (Misc queries) | |||
need help with a timer | Excel Programming |