sal21

find string in web page
 
I have another small question.

I have attached the web page below. I want to find the word "Trovati:" and get
the value next to it (in this case 827), then store that value in a worksheet
cell. That is all.
Note: I already have VBA code that navigates to the web page.
I use this part of the code:
.....
Dim IE As Object
Set IE = CreateObject("InternetExplorer.Application")
With IE
    ShowWindow .hwnd, SW_MAXIMIZE
    .Visible = True
    .navigate "http://xxxx"
....
Thanks.
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!-- saved from
url=(0078)http://telefoni/trovadipendenze.asp?... RCA=&TIPRIC= -->
<HTML><HEAD><TITLE></TITLE>
<META http-equiv=Content-Type content="text/html; charset=windows-1252">
<META content="Microsoft FrontPage 5.0" name=GENERATOR></HEAD>
<BODY vLink=#ffffff link=#00ffff bgColor=#344ab1>
<P style="MARGIN-TOP: 0px; MARGIN-BOTTOM: 0px; MARGIN-LEFT: 20px"
align=center><B><FONT color=#ffffff><FONT face=Arial size=1>Trovati
</FONT><FONT
face=Arial size=2>827 </FONT></FONT></B><FONT face=Tahoma color=#ffffff
size=1><B>Aggiornamento al : 06/12/2007</B></FONT> </P>
<DIV align=center>
<CENTER>
<TABLE height=1 cellSpacing=0 cellPadding=0 width=731
background=trovadipendenze_file/sfondocell3.jpg border=0>
<TBODY>
<TR>
<TD vAlign=center align=left width=232 height=4>
<P style="MARGIN-TOP: 0px; MARGIN-BOTTOM: 0px; MARGIN-LEFT: 23px"><B><FONT
face="MS Sans Serif" color=#ffff00 size=2>Dipendenze</FONT></B>
</P></TD></CENTER>
<TD vAlign=center align=left width=81 height=4> </TD>
<CENTER>
<TD vAlign=center align=left width=70 height=4>
<P align=center><FONT face="MS Sans Serif" color=#ffff00
size=2><B>Telefono</B></FONT></P></TD>
<TD vAlign=center align=left width=75 height=4> </TD>
<TD vAlign=center align=left width=43 height=4> </TD>
<TD vAlign=center align=left width=120 height=4> </TD>
<TD vAlign=center align=left width=71 height=4><FONT face="MS Sans Serif"
color=#ffff00 size=2>Cod.Sede</FONT>
</TD></TR></TBODY></TABLE></CENTER></DIV>
<P style="MARGIN-TOP: 5px; MARGIN-BOTTOM: 5px">
<DIV align=center>
<CENTER>
<TABLE height=24 cellSpacing=0 cellPadding=0 width=731
background=trovadipendenze_file/sfondocell2.gif border=0>
<TBODY>
<TR>
<TD vAlign=center align=left width=31 height=6>
<P align=center><FONT face="MS Sans Serif" size=1></FONT> </P></TD>
<TD vAlign=center align=left width=216 height=1><B><FONT
face="Microsoft Sans Serif" color=#ffffff size=1>ABBIATEGRASSO
</FONT></B></TD>
<TD vAlign=center align=left width=98 height=6>
<P align=left> </P></TD>
<TD vAlign=center align=middle width=281 height=6>
<P align=left><B><FONT face=Arial color=#ffffff size=2>02 94963900 -
94963915 fax </FONT></B></P></TD>
<TD vAlign=center align=left width=105 height=6><FONT
face="Microsoft Sans Serif" color=#ffffff size=2>
<P align=center>4375</FONT></P></TD></TR>
<TR>
<TD vAlign=center align=left width=310 colSpan=3 height=18>
<P style="MARGIN-LEFT: 30px"><FONT face=Arial color=#00ffff size=1><A
title="Risultato: Tutte le sedi"
href="http://telefoni/TrovaDipendenze.asp?START=1&NPAG=1&VPARTENZA=1&VCERCA=REGIONE-NORD OVEST&TIPRIC=5">REGIONE-NORD
OVEST </A></FONT></P></TD>
<TD vAlign=center align=left width=346 colSpan=2 height=18><FONT
face=Arial size=1><FONT color=#00ffff>20081 ABBIATEGRASSO - PIAZZA
CASTELLO, 19 </FONT></FONT></TD></TR></TBODY></TABLE></CENTER></DIV>
<P style="MARGIN-TOP: 0px; MARGIN-BOTTOM: 0px; MARGIN-LEFT: 15px"
align=center>
<DIV align=center>
<CENTER>
<TABLE height=24 cellSpacing=0 cellPadding=0 width=731
background=trovadipendenze_file/sfondocell2.gif border=0>
<TBODY>
<TR>
<TD vAlign=center align=left width=31 height=6>
<P align=center><FONT face="MS Sans Serif" size=1></FONT> </P></TD>
<TD vAlign=center align=left width=216 height=1><B><FONT
face="Microsoft Sans Serif" color=#ffffff size=1>ACIREALE
</FONT></B></TD>
<TD vAlign=center align=left width=98 height=6>
<P align=left> </P></TD>
<TD vAlign=center align=middle width=281 height=6>
<P align=left><B><FONT face=Arial color=#ffffff size=2>095 891890 -
891734 fax </FONT></B></P></TD>
<TD vAlign=center align=left width=105 height=6><FONT
face="Microsoft Sans Serif" color=#ffffff size=2>
<P align=center>1752</FONT></P></TD></TR>
<TR>
<TD vAlign=center align=left width=310 colSpan=3 height=18>
<P style="MARGIN-LEFT: 30px"><FONT face=Arial color=#00ffff size=1><A
title="Risultato: Tutte le sedi"
href="http://telefoni/TrovaDipendenze.asp?START=1&NPAG=1&VPARTENZA=1&VCERCA=REGIONE SUD&TIPRIC=5">REGIONE
SUD </A></FONT></P></TD>
<TD vAlign=center align=left width=346 colSpan=2 height=18><FONT
face=Arial size=1><FONT color=#00ffff>95024 ACIREALE - CORSO ITALIA, 2
</FONT></FONT></TD></TR></TBODY></TABLE></CENTER></DIV>
<P style="MARGIN-TOP: 0px; MARGIN-BOTTOM: 0px; MARGIN-LEFT: 15px"
align=center>
<DIV align=center>
<CENTER>
<TABLE height=24 cellSpacing=0 cellPadding=0 width=731
background=trovadipendenze_file/sfondocell2.gif border=0>
<TBODY>
<TR>
<TD vAlign=center align=left width=31 height=6>
<P align=center><FONT face="MS Sans Serif" size=1></FONT> </P></TD>
<TD vAlign=center align=left width=216 height=1><B><FONT
face="Microsoft Sans Serif" color=#ffffff size=1>ACQUI TERME
</FONT></B></TD>
<TD vAlign=center align=left width=98 height=6>
<P align=left> </P></TD>
<TD vAlign=center align=middle width=281 height=6>
<P align=left><B><FONT face=Arial color=#ffffff size=2>0144 356090 -
356088 fax </FONT></B></P></TD>
<TD vAlign=center align=left width=105 height=6><FONT
face="Microsoft Sans Serif" color=#ffffff size=2>
<P align=center>0346</FONT></P></TD></TR>
<TR>
<TD vAlign=center align=left width=310 colSpan=3 height=18>
<P style="MARGIN-LEFT: 30px"><FONT face=Arial color=#00ffff size=1><A
title="Risultato: Tutte le sedi"
href="http://telefoni/TrovaDipendenze.asp?START=1&NPAG=1&VPARTENZA=1&VCERCA=REGIONE. NORD-OVEST&TIPRIC=5">REGIONE.
NORD-OVEST </A></FONT></P></TD>
<TD vAlign=center align=left width=346 colSpan=2 height=18><FONT
face=Arial size=1><FONT color=#00ffff>15011 ACQUI TERME - PIAZZA LEVI,
11
</FONT></FONT></TD></TR></TBODY></TABLE></CENTER></DIV>
<P style="MARGIN-TOP: 0px; MARGIN-BOTTOM: 0px; MARGIN-LEFT: 15px"
align=center>
<DIV align=center>
<CENTER>
<TABLE height=24 cellSpacing=0 cellPadding=0 width=731
background=trovadipendenze_file/sfondocell2.gif border=0>
<TBODY>
<TR>
<TD vAlign=center align=left width=31 height=6>
<P align=center><FONT face="MS Sans Serif" size=1></FONT> </P></TD>
<TD vAlign=center align=left width=216 height=1><B><FONT
face="Microsoft Sans Serif" color=#ffffff size=1>AGRIGENTO
</FONT></B></TD>
<TD vAlign=center align=left width=98 height=6>
<P align=left> </P></TD>
<TD vAlign=center align=middle width=281 height=6>
<P align=left><B><FONT face=Arial color=#ffffff size=2>0922 402506 -
402335 fax </FONT></B></P></TD>
<TD vAlign=center align=left width=105 height=6><FONT
face="Microsoft Sans Serif" color=#ffffff size=2>
<P align=center>4755</FONT></P></TD></TR>
<TR>
<TD vAlign=center align=left width=310 colSpan=3 height=18>
<P style="MARGIN-LEFT: 30px"><FONT face=Arial color=#00ffff size=1><A
title="Risultato: Tutte le sedi"
href="http://telefoni/TrovaDipendenze.asp?START=1&NPAG=1&VPARTENZA=1&VCERCA=REGIONE SUD&TIPRIC=5">REGIONE
SUD </A></FONT></P></TD>
<TD vAlign=center align=left width=346 colSpan=2 height=18><FONT
face=Arial size=1><FONT color=#00ffff>92100 AGRIGENTO - VIA IMERA, 203
</FONT></FONT></TD></TR></TBODY></TABLE></CENTER></DIV>
<P style="MARGIN-TOP: 0px; MARGIN-BOTTOM: 0px; MARGIN-LEFT: 15px"
align=center>
<DIV align=center>
<CENTER>
<TABLE height=24 cellSpacing=0 cellPadding=0 width=731
background=trovadipendenze_file/sfondocell2.gif border=0>
<TBODY>
<TR>
<TD vAlign=center align=left width=31 height=6>
<P align=center><FONT face="MS Sans Serif" size=1></FONT> </P></TD>
<TD vAlign=center align=left width=216 height=1><B><FONT
face="Microsoft Sans Serif" color=#ffffff size=1>ALBA </FONT></B></TD>
<TD vAlign=center align=left width=98 height=6>
<P align=left> </P></TD>
<TD vAlign=center align=middle width=281 height=6>
<P align=left><B><FONT face=Arial color=#ffffff size=2>0173 362551 -
35461
fax</FONT></B></P></TD>
<TD vAlign=center align=left width=105 height=6><FONT
face="Microsoft Sans Serif" color=#ffffff size=2>
<P align=center>2443</FONT></P></TD></TR>
<TR>
<TD vAlign=center align=left width=310 colSpan=3 height=18>
<P style="MARGIN-LEFT: 30px"><FONT face=Arial color=#00ffff size=1><A
title="Risultato: Tutte le sedi"
href="http://telefoni/TrovaDipendenze.asp?START=1&NPAG=1&VPARTENZA=1&VCERCA=REGIONE NORD-OVEST&TIPRIC=5">REGIONE
NORD-OVEST</A></FONT> </P></TD>
<TD vAlign=center align=left width=346 colSpan=2 height=18><FONT
face=Arial size=1><FONT color=#00ffff>12051 ALBA - CORSO
LANGHE</FONT></FONT> </TD></TR></TBODY></TABLE></CENTER></DIV>
<P style="MARGIN-TOP: 0px; MARGIN-BOTTOM: 0px; MARGIN-LEFT: 15px"
align=center>
<DIV align=center>
<CENTER>
<TABLE height=24 cellSpacing=0 cellPadding=0 width=731
background=trovadipendenze_file/sfondocell2.gif border=0>
<TBODY>
<TR>
<TD vAlign=center align=left width=31 height=6>
<P align=center><FONT face="MS Sans Serif" size=1></FONT> </P></TD>
<TD vAlign=center align=left width=216 height=1><B><FONT
face="Microsoft Sans Serif" color=#ffffff size=1>ALBANO LAZIALE
</FONT></B></TD>
<TD vAlign=center align=left width=98 height=6>
<P align=left> </P></TD>
<TD vAlign=center align=middle width=281 height=6>
<P align=left><B><FONT face=Arial color=#ffffff size=2>06 9324483/4 -
9324485 fax </FONT></B></P></TD>
<TD vAlign=center align=left width=105 height=6><FONT
face="Microsoft Sans Serif" color=#ffffff size=2>
<P align=center>6337</FONT></P></TD></TR>
<TR>
<TD vAlign=center align=left width=310 colSpan=3 height=18>
<P style="MARGIN-LEFT: 30px"><FONT face=Arial color=#00ffff size=1><A
title="Risultato: Tutte le sedi"
href="http://telefoni/TrovaDipendenze.asp?START=1&NPAG=1&VPARTENZA=1&VCERCA=REGIONE LAZIO E SARDEGNA&TIPRIC=5">REGIONE
LAZIO E SARDEGNA </A></FONT></P></TD>
<TD vAlign=center align=left width=346 colSpan=2 height=18><FONT
face=Arial size=1><FONT color=#00ffff>00041 ALBANO LAZIALE - VIA
TORRIONE,
1 </FONT></FONT></TD></TR></TBODY></TABLE></CENTER></DIV>
<P style="MARGIN-TOP: 0px; MARGIN-BOTTOM: 0px; MARGIN-LEFT: 15px"
align=center>
<DIV align=center>
<CENTER>
<TABLE height=24 cellSpacing=0 cellPadding=0 width=731
background=trovadipendenze_file/sfondocell2.gif border=0>
<TBODY>
<TR>
<TD vAlign=center align=left width=31 height=6>
<P align=center><FONT face="MS Sans Serif" size=1></FONT> </P></TD>
<TD vAlign=center align=left width=216 height=1><B><FONT
face="Microsoft Sans Serif" color=#ffffff size=1>ALBENGA
</FONT></B></TD>
<TD vAlign=center align=left width=98 height=6>
<P align=left> </P></TD>
<TD vAlign=center align=middle width=281 height=6>
<P align=left><B><FONT face=Arial color=#ffffff size=2>0182 555318 -
555318 fax </FONT></B></P></TD>
<TD vAlign=center align=left width=105 height=6><FONT
face="Microsoft Sans Serif" color=#ffffff size=2>
<P align=center>6746</FONT></P></TD></TR>
<TR>
<TD vAlign=center align=left width=310 colSpan=3 height=18>
<P style="MARGIN-LEFT: 30px"><FONT face=Arial color=#00ffff size=1><A
title="Risultato: Tutte le sedi"
href="http://telefoni/TrovaDipendenze.asp?START=1&NPAG=1&VPARTENZA=1&VCERCA=REGIONE. NORD-OVEST&TIPRIC=5">REGIONE.
NORD-OVEST </A></FONT></P></TD>
<TD vAlign=center align=left width=346 colSpan=2 height=18><FONT
face=Arial size=1><FONT color=#00ffff>17031 ALBENGA - VIA TRIESTE, 49
</FONT></FONT></TD></TR></TBODY></TABLE></CENTER></DIV>
<P style="MARGIN-TOP: 0px; MARGIN-BOTTOM: 0px; MARGIN-LEFT: 15px"
align=center><FONT face=Tahoma color=#00ffff size=1><B>Pagine :
</B></FONT><FONT
face=Tahoma color=#ffffff size=2>(1) </FONT><FONT face=Tahoma color=#00ffff
size=1><B><A
href="http://telefoni/trovadipendenze.asp?start=7&npag=2&VPARTENZA=1&VCERCA=&TIPRIC="
target=_self>2</A> </B></FONT> <FONT face=Tahoma color=#00ffff size=1><B><A
href="http://telefoni/trovadipendenze.asp?start=14&npag=3&VPARTENZA=1&VCERCA=&TIPRIC="
target=_self>3</A> </B></FONT> <FONT face=Tahoma color=#00ffff size=1><B><A
href="http://telefoni/trovadipendenze.asp?start=21&npag=4&VPARTENZA=1&VCERCA=&TIPRIC="
target=_self>4</A> </B></FONT> <FONT face=Tahoma color=#00ffff size=1><B><A
href="http://telefoni/trovadipendenze.asp?start=28&npag=5&VPARTENZA=1&VCERCA=&TIPRIC="
target=_self>5</A> </B></FONT> <FONT face=Tahoma color=#00ffff size=1><B><A
href="http://telefoni/trovadipendenze.asp?start=35&npag=6&VPARTENZA=1&VCERCA=&TIPRIC="
target=_self>6</A> </B></FONT> <FONT face=Tahoma color=#00ffff size=1><B><A
href="http://telefoni/trovadipendenze.asp?start=42&npag=7&VPARTENZA=1&VCERCA=&TIPRIC="
target=_self>7</A> </B></FONT> <FONT face=Tahoma color=#00ffff size=1><B><A
href="http://telefoni/trovadipendenze.asp?start=49&npag=8&VPARTENZA=1&VCERCA=&TIPRIC="
target=_self>8</A> </B></FONT> <FONT face=Tahoma color=#00ffff size=1><B><A
href="http://telefoni/trovadipendenze.asp?start=56&npag=9&VPARTENZA=1&VCERCA=&TIPRIC="
target=_self>9</A> </B></FONT> <FONT face=Tahoma color=#00ffff size=1><B><A
href="http://telefoni/trovadipendenze.asp?start=63&npag=10&VPARTENZA=1&VCERCA=&TIPRIC="
target=_self>10</A> </B></FONT> <FONT face=Tahoma color=#00ffff
size=1><B><A
href="http://telefoni/trovadipendenze.asp?start=70&npag=11&VPARTENZA=11&VCERCA=&TIPRIC="
target=_self>Successive</A> </B></FONT> </P></BODY></HTML>

Ron Rosenfeld

On Fri, 7 Dec 2007 13:59:01 -0800, sal21 wrote:

> I have another small question.
>
> I have this web page attached. I want to find the word "Trovati:" and get
> the value next to it (in this case 827), then store 827 in a worksheet cell.
> That is all.
> Note: I already have VBA code that navigates to the web page.


I note that the string "Trovati:" is NOT in the web page you posted, although
the string "Trovati" is present.

In any event, the following function should look in Src for a phrase (str) and
return the first word following that phrase that is not enclosed in a tag (<...>).

In your example, if Src is your web page, and str is "Trovati", the function
will return "827".

I'm not sure if it will take into account all of the possible variations in
context, so be sure to test it thoroughly.

========================================
Option Explicit

Function GetDataAfter(Src As String, str As String) As String
    'Src is the page being searched
    'str is the string to be found within that page
    'returns the first word in a string following str that is not within a tag

    Dim re As Object
    Dim mc As Object
    Dim sPat As String

    'after str: skip as little text as needed, then any run of <...> tags,
    'then capture the next run of non-whitespace characters
    sPat = str & "[\s\S]*?(<[\s\S]*?>)*(\S+)"

    Set re = CreateObject("vbscript.regexp")
    With re
        .Pattern = sPat
        .Global = True
        .IgnoreCase = True
    End With

    If re.Test(Src) = True Then
        Set mc = re.Execute(Src)
        'SubMatches(1) is the second capture group, (\S+)
        GetDataAfter = mc(0).SubMatches(1)
    End If
End Function
==============================
--ron
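
A possible stricter variant, offered only as a sketch: it skips nothing but whitespace and complete tags after the search phrase, which may behave better when whitespace sits between two tags. The name GetWordAfter and the parameter sFind are hypothetical and not part of the function posted above. Note that sFind is used as a regular expression, so any special characters in it would need escaping.

========================================
Function GetWordAfter(Src As String, sFind As String) As String
    'Hypothetical alternative, not the function from the post above.
    'Returns the first run of text after sFind that is not part of a tag.

    Dim re As Object
    Dim mc As Object

    Set re = CreateObject("vbscript.regexp")
    With re
        '(?:\s|<[^>]*>)* skips whitespace and whole <...> tags;
        '([^\s<]+) captures text up to the next space or tag
        .Pattern = sFind & "(?:\s|<[^>]*>)*([^\s<]+)"
        .Global = False
        .IgnoreCase = True
    End With

    If re.Test(Src) Then
        Set mc = re.Execute(Src)
        GetWordAfter = mc(0).SubMatches(0)
    End If
End Function
==============================

As with the original function, this should be tested against the real page before relying on it.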

sal21

Hi Ron, thanks for the reply.
But I am a newbie: how do I use the function? What is the correct syntax if I
want to use it in my original code? How do I store the "827" in a variable?
Thanks from Napoli.

Note: Yes, the word is "Trovati" and not "Trovati:", sorry.


Ron Rosenfeld

On Sat, 8 Dec 2007 10:17:00 -0800, sal21 wrote:

> Hi Ron, thanks for the reply.
> But I am a newbie: how do I use the function? What is the correct syntax if I
> want to use it in my original code? How do I store the "827" in a variable?
> Thanks from Napoli.
>
> Note: Yes, the word is "Trovati" and not "Trovati:", sorry.


You can put the function in the same module as the code you are using to
manipulate the web page.
Then you can call the function from the routine that retrieves the web
page.

So in your code you might have something like this:

=======================
Dim sVar As String
sVar = GetDataAfter(Src, "Trovati")

'where Src is a String containing the source of your web page

============================
--ron
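
To connect this with the Internet Explorer code from the first post, something like the following might work. It is only a sketch: the routine name StoreTrovati, the wait loop, reading the page through documentElement.outerHTML, and the target cell A1 on the first worksheet are all assumptions, and it expects GetDataAfter to be in the same module.

=======================
Sub StoreTrovati()
    'Sketch only: URL, worksheet and cell are placeholders
    Dim IE As Object
    Dim Src As String
    Dim sVar As String

    Set IE = CreateObject("InternetExplorer.Application")
    IE.Visible = True
    IE.navigate "http://xxxx"

    'wait until the page has finished loading (4 = READYSTATE_COMPLETE)
    Do While IE.Busy Or IE.readyState <> 4
        DoEvents
    Loop

    'read the HTML of the loaded page into a string
    Src = IE.document.documentElement.outerHTML

    sVar = GetDataAfter(Src, "Trovati")

    'store the value as a number, but only if something was found
    If Len(sVar) > 0 Then
        ThisWorkbook.Worksheets(1).Range("A1").Value = Val(sVar)
    End If
End Sub
============================

Val converts the returned text to a number; if the cell should hold text instead, sVar can be assigned directly.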

