View Single Post
  #5   Report Post  
Posted to microsoft.public.excel.worksheet.functions
IPerlovsky IPerlovsky is offline
external usenet poster
 
Posts: 32
Default Extracting the last set of words from a text string

anyway we can work this into a formula w/o VB?

--
iperlovsky


"Gary''s Student" wrote:

Try this:

Function lastpart(r As Range) As String
lastpast = ""
s = Split(r.Value, " ")
fnd = False
For i = UBound(s) To LBound(s) Step -1
If s(i) = "N" Or IsNumeric(s(i)) Then
fnd = True
Exit For
End If
Next

If fnd = False Or i = UBound(s) Then
Exit Function
End If

lastpart = s(i + 1)
For j = i + 2 To UBound(s)
lastpart = lastpart & " " & s(j)
Next
End Function
--
Gary''s Student
gsnu200708


"IPerlovsky" wrote:

Hey Ron,

That works for most of the strings, but in the following examples it pulls a
little more info than we need:

2/26/2007 TX 106213-FH BRAZOS RIVER AUTH TE1MM+ 5.75 11/1/2011 102.244 5.20
3.59 161 Y Baa2 BBB N TXU ENERGY COMPANY

...here it pulls "BBB N TXU ENERGY COMPANY" but I want just "TXU ENERGY
COMPANY"

2/26/2007 LA 988844-CN ZACHARY LA CMNTY SC 1000000 4 3/1/2008 CIFG 100.311
3.64 3.59 5 Y AAA N

...and here "Y AAA N", but I want nothing to pull

I guess, if the last character is an "N" or a number, then it should not
pull anything.
--
iperlovsky


"Ron Rosenfeld" wrote:

On Wed, 28 Feb 2007 10:10:33 -0800, IPerlovsky
wrote:

What formula would I use (preferably using a morefunc REGEX.MID function) to
extract the last series of words/acronyms from a text string? The key to
identifiying where that last set of words begins, is that they will always
follow either an "N" or a number that may or may not have a decimal. See the
examples below:

2/26/2007 PA 23061N-BE CUMBERLAND CNTY PA 1000000 5.25 11/1/2008 AMBAC
102.575 3.64 3.59 5 P Aaa AAA N DICKINSON COLLEGE

...here I am trying to extract "DICKINSON COLLEGE"

2/26/2007 NC 230584-CS CUMBERLAND CNTY N C1MM+ 5.25 10/1/2009 104.603 3.74
3.59 15 P 10/1/2009 A3 A- Y 10/1/2009 101.1 CUMBERLAND CNTY HO

...here I am trying to extract "CUMBERLAND CNTY HO"



Try this:

=REGEX.MID(A1,"((?<=\sN\s)|(?<=\d\s))[A-Z\s]+",-1)

It does presume that the last set of words contains only capital letters. That
can be broadened, if necessary, I think.
--ron