View Single Post
  #33   Report Post  
Posted to microsoft.public.excel.misc
Ron Rosenfeld Ron Rosenfeld is offline
external usenet poster
 
Posts: 5,651
Default Need formula to extract a numeric value from a free-format text

On Sat, 18 Jul 2009 12:33:47 -0400, "Rick Rothstein"
wrote:

The reason I thought that is because
of the "dot" that followed the 7-digit number in the OP's posted example
text. In thinking about it, I'm guessing you took that to be a period at the
end of a sentence.


Yes, I did.


I found this definition of a word boundary in Regular Expressions...

"A word boundary represents the spot where a letter or
number meets a space, apostrophe, a period, or anything
else that isn't a letter or number"


Not quite what I understand it to be (but close)./

The definitions I've seen indicate that a word boundary "Matches at the
position between a word character (anything matched by \w) and a non-word
character (anything matched by [^\w] or \W) as well as at the start and/or end
of the string if the first and/or last characters in the string are word
characters."

And, at least in VBScript, a word character is a digit, letter or underscore
e.g: [A-Za-z0-9_]



Given that, this modification of my function should do what your RegExp
solution does...

Function First7DigitNumber(S As String) As String
Dim X As Long
For X = 1 To Len(S)
If Mid(" " & S & " ", X, 9) Like "[!a-zA-Z0-9]#######[!a-zA-Z0-9]" Then
First7DigitNumber = Mid(S, X, 7)
Exit Function
End If
Next
First7DigitNumber = "*MISSING*"
End Function


It comes pretty close. Just change this line to include the underscores:

If Mid(" " & S & " ", X, 9) Like "[!a-zA-Z0-9_]#######[!a-zA-Z0-9_]" Then

--ron