Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.worksheet.functions
m m is offline
external usenet poster
 
Posts: 64
Default Extract text within ( )

Formula to enter to extract the text within ( ) in a text string
Example: "My Text String (MTS)" and extract or return MTS
Currently the "(XXX)" is always at the end of the text string
Using MS Excel 2002



  #2   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 964
Default Extract text within ( )

Are those the only parenthesis?

=SUBSTITUTE(MID(A1,FIND("(",A1)+1,255),")","")

replace A1 with your cell



--


Regards,


Peo Sjoblom

"M" wrote in message
...
Formula to enter to extract the text within ( ) in a text string
Example: "My Text String (MTS)" and extract or return MTS
Currently the "(XXX)" is always at the end of the text string
Using MS Excel 2002





  #3   Report Post  
Posted to microsoft.public.excel.worksheet.functions
m m is offline
external usenet poster
 
Posts: 64
Default Extract text within ( )

Thank you ever so much Peo -
WORKED GREAT!!!!!
On going through a long list it turned out that there were cases
with additional parenthesis -
But the prenethesis which I want to extract the text from is the
one on the furthest right.
For example I found one that looked like:
"My Text String (XPYZ (MTS)" in addition
"My Text String (MTS)"
and I still want to return MTS from both examples






"Peo Sjoblom" wrote:

Are those the only parenthesis?

=SUBSTITUTE(MID(A1,FIND("(",A1)+1,255),")","")

replace A1 with your cell



--


Regards,


Peo Sjoblom

"M" wrote in message
...
Formula to enter to extract the text within ( ) in a text string
Example: "My Text String (MTS)" and extract or return MTS
Currently the "(XXX)" is always at the end of the text string
Using MS Excel 2002






  #4   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 964
Default Extract text within ( )

If there always will be a space before the parenthesis
with the string you want to extract

=SUBSTITUTE(MID(A1,FIND("^^",SUBSTITUTE(A1,"
","^^",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))))+2,255),")","")



--


Regards,


Peo Sjoblom

"M" wrote in message
...
Thank you ever so much Peo -
WORKED GREAT!!!!!
On going through a long list it turned out that there were cases
with additional parenthesis -
But the prenethesis which I want to extract the text from is the
one on the furthest right.
For example I found one that looked like:
"My Text String (XPYZ (MTS)" in addition
"My Text String (MTS)"
and I still want to return MTS from both examples






"Peo Sjoblom" wrote:

Are those the only parenthesis?

=SUBSTITUTE(MID(A1,FIND("(",A1)+1,255),")","")

replace A1 with your cell



--


Regards,


Peo Sjoblom

"M" wrote in message
...
Formula to enter to extract the text within ( ) in a text string
Example: "My Text String (MTS)" and extract or return MTS
Currently the "(XXX)" is always at the end of the text string
Using MS Excel 2002








  #5   Report Post  
Posted to microsoft.public.excel.worksheet.functions
m m is offline
external usenet poster
 
Posts: 64
Default Extract text within ( )

Thanks Peo
Worked Great!!!!
But in addition to "My Text String (MTS)" I also found
"My Text String (ADT (MTS)"
For which I would still like to return "MTS"

"Peo Sjoblom" wrote:

Are those the only parenthesis?

=SUBSTITUTE(MID(A1,FIND("(",A1)+1,255),")","")

replace A1 with your cell
Regards,


Peo Sjoblom

"M" wrote in message
...
Formula to enter to extract the text within ( ) in a text string
Example: "My Text String (MTS)" and extract or return MTS
Currently the "(XXX)" is always at the end of the text string
Using MS Excel 2002




  #6   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 5,651
Default Extract text within ( )

On Wed, 22 Oct 2008 07:35:01 -0700, M wrote:

Formula to enter to extract the text within ( ) in a text string
Example: "My Text String (MTS)" and extract or return MTS
Currently the "(XXX)" is always at the end of the text string
Using MS Excel 2002



You could download and install Longre's free morefunc.xll add-in from
http://xcell05.free.fr and use this Regular Expression formula to extract the
alphanumeric text string within the last set of "(...)"

Easy to modify if that is not precisely what you want. And fairly easy to
write a UDF to mimic it.

Also, this will not work on strings longer than 255 characters -- if that might
be the case, post back and I will put up an equivalent UDF.


=REGEX.SUBSTITUTE(A1,".*\((\w+)\).*$","[1]")


--ron
  #7   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 1,231
Default Extract text within ( )

Ron Rosenfeld wrote...
....
You could download and install Longre's free morefunc.xll add-in from http://xcell05.free.fr*and use this Regular Expression formula to extract the
alphanumeric text string within the last set of "(...)"

....
=REGEX.SUBSTITUTE(A1,".*\((\w+)\).*$","[1]")


Or use assertions.

=REGEX.MID(A1,"(?<=\()[^()]*(?=\))")

Also begs the question whether to be as inclusive as possible within
the parentheses. If so, [^()]* matches more substrings than \w+.
  #8   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 5,651
Default Extract text within ( )

On Wed, 22 Oct 2008 12:26:58 -0700 (PDT), Harlan Grove
wrote:

Ron Rosenfeld wrote...
...
You could download and install Longre's free morefunc.xll add-in from http://xcell05.free.fr*and use this Regular Expression formula to extract the
alphanumeric text string within the last set of "(...)"

...
=REGEX.SUBSTITUTE(A1,".*\((\w+)\).*$","[1]")


Or use assertions.

=REGEX.MID(A1,"(?<=\()[^()]*(?=\))")



If you are going to use assertions that way, per the OP's requirements, you
need to ensure you return the last match.

Given, for example:

My Text String (XPYZ) (MTS)) in addition

your REGEX.MID returns (XPYZ),

So, I would use:

=REGEX.MID(A1,"(?<=\()[^()]*(?=\))",-1) or
=REGEX.MID(A1,"(?<=\()[^()]+(?=\)[^(]*$)")

But I've been purposely trying to avoid using lookbehind assertions since they
are not supported in VBScript. And often enough, it has happened that I've had
to switch to VBA.


Also begs the question whether to be as inclusive as possible within
the parentheses. If so, [^()]* matches more substrings than \w+.


Absolutely correct, and a good point to make explicitly. (That is why, in my
description, I wrote "alphanumeric text string".)

Now, having written all that, and thought about it some more, it appears as if
the look-behind assertion may not even be necessary!

For example:

[^()]+(?=\)[^(]*$)

or even:

\w+(?=\)[^(]*$)

or

[A-Z]+(?=\)[^(]*$)

All seem to work, and would work in either PCRE or VBScript.

Thanks.
--ron
  #9   Report Post  
Posted to microsoft.public.excel.worksheet.functions
m m is offline
external usenet poster
 
Posts: 64
Default Extract text within ( )

Thank You "ALL", Peo, Ron & Harlen
ALL super helpful: I am learnin a lot -
Back to Peo's original suggestion - I finally figured out what
everything in the formula =SUBSTITUTE(MID(A1,FIND("(",A1)+1,255),")","") and
will use this and those suggestions form Ron and Harlen in the future - BUT
what does the 255 refer to in the above example. I also just found another
useful information site: http://www.techonthenet.com/excel/formulas/

AGAIN THANKS ALL!!!!! REALLY I sometimes spend hours trying to figure this
stuff out on my own and you all are a super HELP!!!!
M

"Ron Rosenfeld" wrote:

On Wed, 22 Oct 2008 12:26:58 -0700 (PDT), Harlan Grove
wrote:

Ron Rosenfeld wrote...
...
You could download and install Longre's free morefunc.xll add-in from http://xcell05.free.fr and use this Regular Expression formula to extract the
alphanumeric text string within the last set of "(...)"

...
=REGEX.SUBSTITUTE(A1,".*\((\w+)\).*$","[1]")


Or use assertions.

=REGEX.MID(A1,"(?<=\()[^()]*(?=\))")



If you are going to use assertions that way, per the OP's requirements, you
need to ensure you return the last match.

Given, for example:

My Text String (XPYZ) (MTS)) in addition

your REGEX.MID returns (XPYZ),

So, I would use:

=REGEX.MID(A1,"(?<=\()[^()]*(?=\))",-1) or
=REGEX.MID(A1,"(?<=\()[^()]+(?=\)[^(]*$)")

But I've been purposely trying to avoid using lookbehind assertions since they
are not supported in VBScript. And often enough, it has happened that I've had
to switch to VBA.


Also begs the question whether to be as inclusive as possible within
the parentheses. If so, [^()]* matches more substrings than \w+.


Absolutely correct, and a good point to make explicitly. (That is why, in my
description, I wrote "alphanumeric text string".)

Now, having written all that, and thought about it some more, it appears as if
the look-behind assertion may not even be necessary!

For example:

[^()]+(?=\)[^(]*$)

or even:

\w+(?=\)[^(]*$)

or

[A-Z]+(?=\)[^(]*$)

All seem to work, and would work in either PCRE or VBScript.

Thanks.
--ron

  #10   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 1,231
Default Extract text within ( )

Ron Rosenfeld wrote...
....
If you are going to use assertions that way, per the OP's requirements, you
need to ensure you return the last match.

....

Fair point.

=REGEX.MID(A1,"(?<=\()[^()]*(?=\))",-1) *or


Best.

=REGEX.MID(A1,"(?<=\()[^()]+(?=\)[^(]*$)")


Less efficient.

But I've been purposely trying to avoid using lookbehind assertions since they
are not supported in VBScript. *And often enough, it has happened that I've had
to switch to VBA.

....

For long string support? You could handle them with VBScript by using
a regex to check the match pattern itself. If you find a look behind
assertion, convert it to a separate expression, e.g.,

(?<=\()[^()]*(?=\)) - \([^()]*(?=\))

use the modified regex to get the desired substring from the source
string, then trim off the bit matching the look behind assertion. It'd
make the wrapper function more complicated, but you could add support
for look behind assertions this way.

Now, having written all that, and thought about it some more, it appears as if
the look-behind assertion may not even be necessary!

For example:

[^()]+(?=\)[^(]*$)

or even:

\w+(?=\)[^(]*$)

or

[A-Z]+(?=\)[^(]*$)

All seem to work, and would work in either PCRE or VBScript.


All would fail if there were an unmatched left parenthesis to the
right of all other parentheses. Not necessarily an idle consideration
since your example string

My Text String (XPYZ) (MTS)) in addition

has unbalanced parentheses. All may fail if there were only a right
parenthesis with no left parentheses if the OP wanted "" returned in
such cases.


  #11   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 964
Default Extract text within ( )

It tells you how many characters to return after the last space, I didn't
think you would have a
string parenthesis included that would be more than that :) So it is just an
arbitrary number
and very rarely would a last string (since you told us it was always the
last word) exceed
that. I am sure putting 20 would be enough.

--


Regards,


Peo Sjoblom

"M" wrote in message
...
Thank You "ALL", Peo, Ron & Harlen
ALL super helpful: I am learnin a lot -
Back to Peo's original suggestion - I finally figured out what
everything in the formula =SUBSTITUTE(MID(A1,FIND("(",A1)+1,255),")","")
and
will use this and those suggestions form Ron and Harlen in the future -
BUT
what does the 255 refer to in the above example. I also just found another
useful information site: http://www.techonthenet.com/excel/formulas/

AGAIN THANKS ALL!!!!! REALLY I sometimes spend hours trying to figure this
stuff out on my own and you all are a super HELP!!!!
M

"Ron Rosenfeld" wrote:

On Wed, 22 Oct 2008 12:26:58 -0700 (PDT), Harlan Grove

wrote:

Ron Rosenfeld wrote...
...
You could download and install Longre's free morefunc.xll add-in from
http://xcell05.free.fr and use this Regular Expression formula to
extract the
alphanumeric text string within the last set of "(...)"
...
=REGEX.SUBSTITUTE(A1,".*\((\w+)\).*$","[1]")

Or use assertions.

=REGEX.MID(A1,"(?<=\()[^()]*(?=\))")



If you are going to use assertions that way, per the OP's requirements,
you
need to ensure you return the last match.

Given, for example:

My Text String (XPYZ) (MTS)) in addition

your REGEX.MID returns (XPYZ),

So, I would use:

=REGEX.MID(A1,"(?<=\()[^()]*(?=\))",-1) or
=REGEX.MID(A1,"(?<=\()[^()]+(?=\)[^(]*$)")

But I've been purposely trying to avoid using lookbehind assertions since
they
are not supported in VBScript. And often enough, it has happened that
I've had
to switch to VBA.


Also begs the question whether to be as inclusive as possible within
the parentheses. If so, [^()]* matches more substrings than \w+.


Absolutely correct, and a good point to make explicitly. (That is why,
in my
description, I wrote "alphanumeric text string".)

Now, having written all that, and thought about it some more, it appears
as if
the look-behind assertion may not even be necessary!

For example:

[^()]+(?=\)[^(]*$)

or even:

\w+(?=\)[^(]*$)

or

[A-Z]+(?=\)[^(]*$)

All seem to work, and would work in either PCRE or VBScript.

Thanks.
--ron



  #12   Report Post  
Posted to microsoft.public.excel.worksheet.functions
external usenet poster
 
Posts: 5,651
Default Extract text within ( )

On Thu, 23 Oct 2008 11:24:53 -0700 (PDT), Harlan Grove
wrote:

But I've been purposely trying to avoid using lookbehind assertions since they
are not supported in VBScript. *And often enough, it has happened that I've had
to switch to VBA.

...

For long string support?


Yes. That and also dealing with capturing groups (.submatches) in certain
instances.

You could handle them with VBScript by using
a regex to check the match pattern itself. If you find a look behind
assertion, convert it to a separate expression, e.g.,

(?<=\()[^()]*(?=\)) - \([^()]*(?=\))

use the modified regex to get the desired substring from the source
string, then trim off the bit matching the look behind assertion. It'd
make the wrapper function more complicated, but you could add support
for look behind assertions this way.


Interesting approach. I like the flexibility. I'll have to think about
implementing that one of these days.


Now, having written all that, and thought about it some more, it appears as if
the look-behind assertion may not even be necessary!

For example:

[^()]+(?=\)[^(]*$)

or even:

\w+(?=\)[^(]*$)

or

[A-Z]+(?=\)[^(]*$)

All seem to work, and would work in either PCRE or VBScript.


All would fail if there were an unmatched left parenthesis to the
right of all other parentheses. Not necessarily an idle consideration
since your example string

My Text String (XPYZ) (MTS)) in addition

has unbalanced parentheses. All may fail if there were only a right
parenthesis with no left parentheses if the OP wanted "" returned in
such cases.


Good point. I had not considered something like:

"My Text String XPYZ (MTS) ( in addition"

--ron
Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Extract text juanpablo Excel Discussion (Misc queries) 0 May 27th 08 10:00 PM
Need to extract certain text from text string Trista @ Pacific Excel Worksheet Functions 4 November 21st 07 07:07 PM
Extract text from large Text ldiaz Excel Discussion (Misc queries) 4 November 14th 07 01:21 AM
Text Extract Arturo Excel Worksheet Functions 1 December 12th 05 04:24 PM
EXTRACT TEXT FROM TEXT STRING carricka Excel Worksheet Functions 4 July 8th 05 11:00 AM


All times are GMT +1. The time now is 12:46 AM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"