View Single Post
  #36   Report Post  
Posted to microsoft.public.excel.misc
Ron Rosenfeld Ron Rosenfeld is offline
external usenet poster
 
Posts: 5,651
Default Need formula to extract a numeric value from a free-format text

On Sat, 18 Jul 2009 19:26:09 -0400, "Rick Rothstein"
wrote:

I had the damnedest time trying to find what RegExp considered a word
boundary (basically, each sight just kept say use \w without listing what it
consider the boundary). When I finally found the one I cited, I figured it
was a universal definition. Now I'm guessing there might be version
differences between the various RegExp engines. Yes, I fix to account for
the underbar is as you have shown it. For the archives, here is the UDF with
the change you indicated...

Function First7DigitNumber(S As String) As String
Dim X As Long
For X = 1 To Len(S)
If Mid(" " & S & " ", X, 9) Like "[!a-zA-Z0-9_]#######[!a-zA-Z0-9_]"
Then
First7DigitNumber = Mid(S, X, 7)
Exit Function
End If
Next
First7DigitNumber = "*MISSING*"
End Function


And, to keep them together, here is the Regex version:

===================
Option Explicit
Function Seven(s As String) As String
Dim re As Object, mc As Object
Set re = CreateObject("vbscript.regexp")
re.Pattern = "\b\d{7}\b"
If re.test(s) = True Then
Set mc = re.Execute(s)
Seven = mc(0)
Else
Seven = "*MISSING*"
End If
End Function
====================

Here are some definitions with regard to word boundaries and some of the
differences:

There are three different positions that qualify as word boundaries:

1. Before the first character in the string, if the first character is
a word character.
2. After the last character in the string, if the last character is a
word character.
3. Between two characters in the string, where one is a word character
and the other is not a word character.

In all flavors, the characters [a-zA-Z0-9_] are word characters. And that is
the case for VBScript. However, some of the other flavors will recognize
characters from other languages, and/or unicode characters, as word characters.
And I believe there is one flavor (?Python) where you can even set flags to
change the definition of a word character.
--ron