View Single Post
  #3   Report Post  
Posted to microsoft.public.excel.worksheet.functions
Richard Buttrey
 
Posts: n/a
Default String splitting for inconsistent strings

On Tue, 20 Jun 2006 13:29:40 -0400, Ron Rosenfeld
wrote:


It seems as if this can be done with "Regular Expressions". I will make use of
the fact that the value which you want to pull out is the first floating point
number that ends with a "p " (thats "p" followed by <space). If there are
any issues where the stock name includes a string of the form <numberp<space
you might want to change the formula to take that into account.

Download and install Longre's free morefunc.xll add-in from
http://xcell05.free.fr

To extract the company name:

=REGEX.MID(A1,"(\w+\s+)+(?=\d+(\.\d*)?(?=p\s))" )

To extract the price:

=REGEX.MID(A1,"\d+(\.\d*)?(?=p\s)")

To be a little more bullet-proof one can require that the price number be
identified as the first positive floating point number followed by the sequence
of p<space then optionally a "-" and another digit. So the formula would
then be:

Company Name:
=REGEX.MID(A1,"(\w+\s+)+(?=\d+(\.\d*)?(?=p\s-?\d))")

Price:
=REGEX.MID(A1,"\d+(\.\d*)?(?=p\s-?\d)")


--ron


Ron,

Many thanks for the pointer to this Add In which I've just downloaded.
The REGEX function looks very interesting - I just need to get my mind
around the syntax and terminology.

One supplementary if I may. I've just used your example on my data and
it works a treat.
However there's one other small variation I've just noticed, and
that's where the integer part of the share price is four characters
with a comma separating the thousands.

e.g. the following string

GlaxoSmithKline 1,480.00p 0.16p (1.09)

returns nothing for the name in B1, and in C1 480.00, not 1480.00

Can you suggest a modification that will overcome this please - it'll
probably take me a couple of weeks to understand the syntax!

Kind regards,

Richard Buttrey



__
Richard Buttrey
Grappenhall, Cheshire, UK
__________________________