Posted to microsoft.public.excel.programming
Ron Rosenfeld
Regular Expression Help on syntax

On Sat, 9 Jan 2010 17:29:42 +0000, joel wrote:


> I tested Ron's code and it didn't return what you really need. Ron's
> string returns everything to the end of the number you are looking for.


My "code" was merely a regex which is quite different from your code.

If implemented properly, it returns only the number. If you are returning only
what you say, then you probably are not implementing it properly.
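
A minimal sketch of the difference between the whole match and the captured submatch, which is the likely source of the discrepancy. The sub name and the shortened sample string are made up for illustration, and the pattern is trimmed to the tag-plus-number part only:

```vba
Option Explicit

' Hypothetical demo: whole match versus first capture group.
Sub DemoMatchVersusSubmatch()
    Dim re As Object, mc As Object
    Dim s As String

    ' Shortened stand-in for the sample text (an assumption, not the OP's data)
    s = "<td class=""bl gb"">" & vbLf & "8" & vbLf & "</td>"

    Set re = CreateObject("vbscript.regexp")
    re.Pattern = "bl gb[\s\S]+?(-?\b\d+)\b"

    If re.Test(s) Then
        Set mc = re.Execute(s)
        ' Whole match: everything from "bl gb" through the end of the number
        Debug.Print mc(0).Value
        ' First capture group: the number only
        Debug.Print mc(0).SubMatches(0)
    End If
End Sub
```

Reading mc(0).Value gives the behaviour you describe; reading mc(0).SubMatches(0) gives only the number.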

To demonstrate, we could set up the following, with the data in an Excel
worksheet and code similar to what the OP posted, with a few minor changes to
correct his errors.

I just put MyStr into A1 for testing, not having access to the rest of the OP's
code. And, by setting c=A2 and leaving i=0, the OP's sub would place the
results in B2:D2.



A1: (the sample text given by the OP)
</tr>
</tbody>
<tbody>
<tr>
<td><a href="/history/airport/KSTP/2010/1/1/DailyHistory.html">1</a></td>
<td class="bl gb">
8
</td>
<td class="gb">
0
</td>
<td class="br gb">
-8
</td>

Then use this routine -- quite similar to that of the OP:

====================================
Option Explicit
Sub TestExtract()
    Dim i As Long
    Dim c As Range
    Dim myStr As String
    Dim sURLdate As String
    Set c = Range("A2")            ' results are written to the right of this cell
    myStr = Range("A1").Value      ' the sample HTML text
    sURLdate = Format(CDate("1/1/2010"), "/yyyy/m/d/")

    ' i stays 0 here, so the three results land in B2, C2 and D2
    c.Offset(0, i + 1).Value = RegexMid(myStr, sURLdate, "bl gb")
    c.Offset(0, i + 2).Value = RegexMid(myStr, sURLdate, "br gb")
    c.Offset(0, i + 3).Value = RegexMid(myStr, sURLdate, "class=""gb""")
End Sub

'------------------------------------------------------------------
Private Function RegexMid(s As String, sDate As String, sTempType As String) _
        As String
    Dim re As Object, mc As Object

    Set re = CreateObject("vbscript.regexp")
    re.IgnoreCase = True
    ' Date path, then the tag text, then a lazy skip to the first signed integer
    re.Pattern = sDate & "DailyHistory[\s\S]+?" & _
                 sTempType & "[\s\S]+?(-?\b\d+)\b"

    If re.Test(s) = True Then
        Set mc = re.Execute(s)
        ' Return the captured number only, not the whole match
        RegexMid = mc(0).SubMatches(0)
    End If
End Function
=================================

As expected, this places in B2, C2 and D2 the signed integers following the
sTempType variables (8, -8 and 0 for the sample text above).

Note also the extra quote marks required in the third call, compared with the
OP's.

Also note the terminal regex representing the signed integer.





> I finally figured it out. You need to put parentheses around the two sub
> parts of the search string. The first part is the tag and the second
> part is the number. Then use the submatches property to get the 2nd
> submatch.


Note that your code for a signed integer:

(\d+|([-+]\d+))

could be more simply expressed as

([-+]?\d+)

Why did you choose to use alternation?
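
A quick way to check that the two forms pick up the same text is sketched below. The sub name and the three sample strings are made up for illustration:

```vba
Option Explicit

' Hypothetical comparison of the two signed-integer patterns.
Sub CompareSignedIntegerPatterns()
    Dim re As Object, mc As Object
    Dim patterns As Variant, samples As Variant
    Dim p As Variant, t As Variant

    patterns = Array("(\d+|([-+]\d+))", "([-+]?\d+)")
    samples = Array("8", "-8", "+12")   ' assumed test inputs

    Set re = CreateObject("vbscript.regexp")
    For Each p In patterns
        re.Pattern = CStr(p)
        For Each t In samples
            If re.Test(CStr(t)) Then
                Set mc = re.Execute(CStr(t))
                ' Pattern, input, and text captured by the first group
                Debug.Print p, t, mc(0).SubMatches(0)
            End If
        Next t
    Next p
End Sub
```

Both patterns should print the same first submatch for each input. The alternation form also creates a second, nested capture group, so its SubMatches collection has two items instead of one.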


--ron