LinkBack Thread Tools Search this Thread Display Modes
Prev Previous Post   Next Post Next
  #4   Report Post  
Posted to microsoft.public.excel.worksheet.functions
Ron Rosenfeld
 
Posts: n/a
Default String splitting for inconsistent strings

On Thu, 22 Jun 2006 00:04:37 +0100, Richard Buttrey
wrote:

On Tue, 20 Jun 2006 13:29:40 -0400, Ron Rosenfeld
wrote:


It seems as if this can be done with "Regular Expressions". I will make use of
the fact that the value which you want to pull out is the first floating point
number that ends with a "p " (thats "p" followed by <space). If there are
any issues where the stock name includes a string of the form <numberp<space
you might want to change the formula to take that into account.

Download and install Longre's free morefunc.xll add-in from
http://xcell05.free.fr

To extract the company name:

=REGEX.MID(A1,"(\w+\s+)+(?=\d+(\.\d*)?(?=p\s)) ")

To extract the price:

=REGEX.MID(A1,"\d+(\.\d*)?(?=p\s)")

To be a little more bullet-proof one can require that the price number be
identified as the first positive floating point number followed by the sequence
of p<space then optionally a "-" and another digit. So the formula would
then be:

Company Name:
=REGEX.MID(A1,"(\w+\s+)+(?=\d+(\.\d*)?(?=p\s-?\d))")

Price:
=REGEX.MID(A1,"\d+(\.\d*)?(?=p\s-?\d)")


--ron


Ron,

Many thanks for the pointer to this Add In which I've just downloaded.
The REGEX function looks very interesting - I just need to get my mind
around the syntax and terminology.

One supplementary if I may. I've just used your example on my data and
it works a treat.
However there's one other small variation I've just noticed, and
that's where the integer part of the share price is four characters
with a comma separating the thousands.

e.g. the following string

GlaxoSmithKline 1,480.00p 0.16p (1.09)

returns nothing for the name in B1, and in C1 480.00, not 1480.00

Can you suggest a modification that will overcome this please - it'll
probably take me a couple of weeks to understand the syntax!

Kind regards,

Richard Buttrey



__
Richard Buttrey
Grappenhall, Cheshire, UK
__________________________


What you need to do, I believe, is optionally look for zero or more sequences,
within the number, of a comma followed by three digits. The syntax for that
would be: (,\d{3})*



I think this modification will do that:

\d+(,\d{3})*(\.\d*)(?=p\s)

So the modified "more bulletproof" formulas would be:

Company Name:

=REGEX.MID(A1,"(\w+\s?)+(?=\s\d+(,\d{3})*(\.\d*)?( ?=p\s-?\d))")

Price:

=REGEX.MID(A1,"\d+(,\d{3})*(\.\d*)?(?=p\s-?\d)")

If you look at the Company Name formula, you will note I made a small change as
the previous formula would include a trailing <space after each name. This
one does not.

The simpler formulas would be

Name:

=REGEX.MID(A1,"(\w+\s?)+(?=\s\d+(,\d{3})*(\.\d*)?( ?=p\s))")

Price:

=REGEX.MID(A1,"\d+(,\d{3})*(\.\d*)(?=p\s)")

=========================================

http://www.regular-expressions.info/reference.html

is a place on the net to start learning a bit about syntax.

I would also recommend the book "Mastering Regular Expressions" by Jeffrey
Friedl. I got mine for just a few dollars, used, at amazon.com

Finally, be aware that a limitation of Longre's functions is that they cannot
process strings longer than 255 characters. I believe this limitation is due
to the data types used in the XLL, and also applies to the other functions in
the add-in.

If you need to process longer strings, it can be done in VBA by setting a
reference to Microsoft VBScript Regular Expressions 5.5 and writing some code.
I've got a little of that done, but nothing as flexible as what Longre
provides. And there's also some syntax that won't work in the VBScript
implementation, so I rarely bother with it.

Have fun
--ron
 
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Splitting a text string into string and number mcambrose Excel Discussion (Misc queries) 4 February 21st 06 03:47 PM
Splitting text strings along commas. Andrew Excel Discussion (Misc queries) 3 January 10th 06 04:01 PM
Colomn of strings, how to count each string with "abc" OR "def" Tommy T Excel Worksheet Functions 8 September 5th 05 04:57 PM
splitting number strings in excel Bob Excel Worksheet Functions 1 February 3rd 05 03:33 AM
Read Text File into Excel Using VBA Willie T Excel Discussion (Misc queries) 13 January 8th 05 12:37 AM


All times are GMT +1. The time now is 08:26 AM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
Copyright ©2004-2025 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"