Extracting the last set of words from a text string
On Wed, 28 Feb 2007 10:10:33 -0800, IPerlovsky
wrote:
What formula would I use (preferably using a morefunc REGEX.MID function) to
extract the last series of words/acronyms from a text string? The key to
identifiying where that last set of words begins, is that they will always
follow either an "N" or a number that may or may not have a decimal. See the
examples below:
2/26/2007 PA 23061N-BE CUMBERLAND CNTY PA 1000000 5.25 11/1/2008 AMBAC
102.575 3.64 3.59 5 P Aaa AAA N DICKINSON COLLEGE
...here I am trying to extract "DICKINSON COLLEGE"
2/26/2007 NC 230584-CS CUMBERLAND CNTY N C1MM+ 5.25 10/1/2009 104.603 3.74
3.59 15 P 10/1/2009 A3 A- Y 10/1/2009 101.1 CUMBERLAND CNTY HO
...here I am trying to extract "CUMBERLAND CNTY HO"
Try this:
=REGEX.MID(A1,"((?<=\sN\s)|(?<=\d\s))[A-Z\s]+",-1)
It does presume that the last set of words contains only capital letters. That
can be broadened, if necessary, I think.
--ron
|