View Single Post
  #23   Report Post  
Posted to microsoft.public.excel.worksheet.functions
Harlan Grove Harlan Grove is offline
external usenet poster
 
Posts: 733
Default Extracting a numbers from a text string

IPerlovsky wrote...
We are getting closer. This one would not work because the numbers
are not always going to be between the 14th and 15th spaces.

Here are some examples of where it would not work:

....

So far you've provided several examples, namely,

A 544646-BA CALIF UN1MM+ 5 7/1/2017 FSA 108.579 3.90 3.82 8 Aaa AAA

1/4/2007 CA 544646-BH LOS ANGELES CALIF UN 1000000 5 7/1/2022 FGIC
108.151 3.96 3.86 10 Aaa AAA Y 7/1/2016

1/8/2007 CA 544644-NC LOS ANGELES CALIF UN 1000000 5.75 7/1/2016 MBIA
115.907 3.74 3.67 7 Aaa AAA N 7/1/2016

1/8/2007 CA 544644-NC LOS ANGELES CALIF UN 1000000 5.75 7/1/2016 MBIA
115.907 3.74 3.67 17 Aaa AAA N 7/1/2016

12/15/2006 CA 544644-L6 LOS ANGELES CALIF UN1MM+ 5 7/1/2018 AMBAC
108.748 3.79 3.71 8 Y Aaa AAA Y 7/1/2015

2/6/2007 CA 544646-BA LOS ANGELES CALIF UN1MM+ 5 7/1/2017 FSA 108.579
3.90 3.82 8 Aaa AAA Y 7/1/2016

These don't all follow the same format, so simple parsing rules won't
work. However, it APPEARS that what MIGHT work would be to ensure that
the last token is nonnumeric, then find the last numeric token, where
tokens are any substrings containing no spaces. If that spec holds,
then define the name seq referring to

=ROW(INDEX(Sheet1!$1:$65536,1,1):INDEX(Sheet1!$1:$ 65536,255,1))

and if one of these records were in cell B1, try the array formula

=VLOOKUP(9.99999999999999E+307,--MID(TRIM(B1)&"#",
SMALL(IF(MID(" "&TRIM(B1),seq,1)=" ",seq),seq),
SMALL(IF(MID(TRIM(B1)&"# ",seq,1)=" ",seq),seq)
-SMALL(IF(MID(" "&TRIM(B1),seq,1)=" ",seq),seq)),1)

Given the 6 example records above, this formula returns

8
10
7
17
8
8

The trick here is that the TRIM(B1)&"# " terms ensure that the last
token converts to an error rather than a number. The MID call returns
an array of the separate tokens in B1 followed by a lot of #NUM!
errors after the final token. The -- converts the numeric tokens into
numbers and the other tokens into error values, and the VLOOKUP call
given its 1st arg and missing 4th arg returns the last number found in
col 1 of its 2nd arg, the --MID(...) array.

Given all this, you'd be better off following Ron Rosenfeld's advice
about using Laurent Longre's MOREFUNC.XLL add-in.