Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Extract text within ( )
Formula to enter to extract the text within ( ) in a text string
Example: "My Text String (MTS)" and extract or return MTS Currently the "(XXX)" is always at the end of the text string Using MS Excel 2002 |
#2
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Extract text within ( )
Are those the only parenthesis?
=SUBSTITUTE(MID(A1,FIND("(",A1)+1,255),")","") replace A1 with your cell -- Regards, Peo Sjoblom "M" wrote in message ... Formula to enter to extract the text within ( ) in a text string Example: "My Text String (MTS)" and extract or return MTS Currently the "(XXX)" is always at the end of the text string Using MS Excel 2002 |
#3
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Extract text within ( )
Thank you ever so much Peo -
WORKED GREAT!!!!! On going through a long list it turned out that there were cases with additional parenthesis - But the prenethesis which I want to extract the text from is the one on the furthest right. For example I found one that looked like: "My Text String (XPYZ (MTS)" in addition "My Text String (MTS)" and I still want to return MTS from both examples "Peo Sjoblom" wrote: Are those the only parenthesis? =SUBSTITUTE(MID(A1,FIND("(",A1)+1,255),")","") replace A1 with your cell -- Regards, Peo Sjoblom "M" wrote in message ... Formula to enter to extract the text within ( ) in a text string Example: "My Text String (MTS)" and extract or return MTS Currently the "(XXX)" is always at the end of the text string Using MS Excel 2002 |
#4
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Extract text within ( )
If there always will be a space before the parenthesis
with the string you want to extract =SUBSTITUTE(MID(A1,FIND("^^",SUBSTITUTE(A1," ","^^",LEN(A1)-LEN(SUBSTITUTE(A1," ",""))))+2,255),")","") -- Regards, Peo Sjoblom "M" wrote in message ... Thank you ever so much Peo - WORKED GREAT!!!!! On going through a long list it turned out that there were cases with additional parenthesis - But the prenethesis which I want to extract the text from is the one on the furthest right. For example I found one that looked like: "My Text String (XPYZ (MTS)" in addition "My Text String (MTS)" and I still want to return MTS from both examples "Peo Sjoblom" wrote: Are those the only parenthesis? =SUBSTITUTE(MID(A1,FIND("(",A1)+1,255),")","") replace A1 with your cell -- Regards, Peo Sjoblom "M" wrote in message ... Formula to enter to extract the text within ( ) in a text string Example: "My Text String (MTS)" and extract or return MTS Currently the "(XXX)" is always at the end of the text string Using MS Excel 2002 |
#5
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Extract text within ( )
Thanks Peo
Worked Great!!!! But in addition to "My Text String (MTS)" I also found "My Text String (ADT (MTS)" For which I would still like to return "MTS" "Peo Sjoblom" wrote: Are those the only parenthesis? =SUBSTITUTE(MID(A1,FIND("(",A1)+1,255),")","") replace A1 with your cell Regards, Peo Sjoblom "M" wrote in message ... Formula to enter to extract the text within ( ) in a text string Example: "My Text String (MTS)" and extract or return MTS Currently the "(XXX)" is always at the end of the text string Using MS Excel 2002 |
#6
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Extract text within ( )
On Wed, 22 Oct 2008 07:35:01 -0700, M wrote:
Formula to enter to extract the text within ( ) in a text string Example: "My Text String (MTS)" and extract or return MTS Currently the "(XXX)" is always at the end of the text string Using MS Excel 2002 You could download and install Longre's free morefunc.xll add-in from http://xcell05.free.fr and use this Regular Expression formula to extract the alphanumeric text string within the last set of "(...)" Easy to modify if that is not precisely what you want. And fairly easy to write a UDF to mimic it. Also, this will not work on strings longer than 255 characters -- if that might be the case, post back and I will put up an equivalent UDF. =REGEX.SUBSTITUTE(A1,".*\((\w+)\).*$","[1]") --ron |
#7
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Extract text within ( )
Ron Rosenfeld wrote...
.... You could download and install Longre's free morefunc.xll add-in from http://xcell05.free.fr*and use this Regular Expression formula to extract the alphanumeric text string within the last set of "(...)" .... =REGEX.SUBSTITUTE(A1,".*\((\w+)\).*$","[1]") Or use assertions. =REGEX.MID(A1,"(?<=\()[^()]*(?=\))") Also begs the question whether to be as inclusive as possible within the parentheses. If so, [^()]* matches more substrings than \w+. |
#8
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Extract text within ( )
On Wed, 22 Oct 2008 12:26:58 -0700 (PDT), Harlan Grove
wrote: Ron Rosenfeld wrote... ... You could download and install Longre's free morefunc.xll add-in from http://xcell05.free.fr*and use this Regular Expression formula to extract the alphanumeric text string within the last set of "(...)" ... =REGEX.SUBSTITUTE(A1,".*\((\w+)\).*$","[1]") Or use assertions. =REGEX.MID(A1,"(?<=\()[^()]*(?=\))") If you are going to use assertions that way, per the OP's requirements, you need to ensure you return the last match. Given, for example: My Text String (XPYZ) (MTS)) in addition your REGEX.MID returns (XPYZ), So, I would use: =REGEX.MID(A1,"(?<=\()[^()]*(?=\))",-1) or =REGEX.MID(A1,"(?<=\()[^()]+(?=\)[^(]*$)") But I've been purposely trying to avoid using lookbehind assertions since they are not supported in VBScript. And often enough, it has happened that I've had to switch to VBA. Also begs the question whether to be as inclusive as possible within the parentheses. If so, [^()]* matches more substrings than \w+. Absolutely correct, and a good point to make explicitly. (That is why, in my description, I wrote "alphanumeric text string".) Now, having written all that, and thought about it some more, it appears as if the look-behind assertion may not even be necessary! For example: [^()]+(?=\)[^(]*$) or even: \w+(?=\)[^(]*$) or [A-Z]+(?=\)[^(]*$) All seem to work, and would work in either PCRE or VBScript. Thanks. --ron |
#9
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Extract text within ( )
Thank You "ALL", Peo, Ron & Harlen
ALL super helpful: I am learnin a lot - Back to Peo's original suggestion - I finally figured out what everything in the formula =SUBSTITUTE(MID(A1,FIND("(",A1)+1,255),")","") and will use this and those suggestions form Ron and Harlen in the future - BUT what does the 255 refer to in the above example. I also just found another useful information site: http://www.techonthenet.com/excel/formulas/ AGAIN THANKS ALL!!!!! REALLY I sometimes spend hours trying to figure this stuff out on my own and you all are a super HELP!!!! M "Ron Rosenfeld" wrote: On Wed, 22 Oct 2008 12:26:58 -0700 (PDT), Harlan Grove wrote: Ron Rosenfeld wrote... ... You could download and install Longre's free morefunc.xll add-in from http://xcell05.free.fr and use this Regular Expression formula to extract the alphanumeric text string within the last set of "(...)" ... =REGEX.SUBSTITUTE(A1,".*\((\w+)\).*$","[1]") Or use assertions. =REGEX.MID(A1,"(?<=\()[^()]*(?=\))") If you are going to use assertions that way, per the OP's requirements, you need to ensure you return the last match. Given, for example: My Text String (XPYZ) (MTS)) in addition your REGEX.MID returns (XPYZ), So, I would use: =REGEX.MID(A1,"(?<=\()[^()]*(?=\))",-1) or =REGEX.MID(A1,"(?<=\()[^()]+(?=\)[^(]*$)") But I've been purposely trying to avoid using lookbehind assertions since they are not supported in VBScript. And often enough, it has happened that I've had to switch to VBA. Also begs the question whether to be as inclusive as possible within the parentheses. If so, [^()]* matches more substrings than \w+. Absolutely correct, and a good point to make explicitly. (That is why, in my description, I wrote "alphanumeric text string".) Now, having written all that, and thought about it some more, it appears as if the look-behind assertion may not even be necessary! For example: [^()]+(?=\)[^(]*$) or even: \w+(?=\)[^(]*$) or [A-Z]+(?=\)[^(]*$) All seem to work, and would work in either PCRE or VBScript. Thanks. --ron |
#10
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Extract text within ( )
Ron Rosenfeld wrote...
.... If you are going to use assertions that way, per the OP's requirements, you need to ensure you return the last match. .... Fair point. =REGEX.MID(A1,"(?<=\()[^()]*(?=\))",-1) *or Best. =REGEX.MID(A1,"(?<=\()[^()]+(?=\)[^(]*$)") Less efficient. But I've been purposely trying to avoid using lookbehind assertions since they are not supported in VBScript. *And often enough, it has happened that I've had to switch to VBA. .... For long string support? You could handle them with VBScript by using a regex to check the match pattern itself. If you find a look behind assertion, convert it to a separate expression, e.g., (?<=\()[^()]*(?=\)) - \([^()]*(?=\)) use the modified regex to get the desired substring from the source string, then trim off the bit matching the look behind assertion. It'd make the wrapper function more complicated, but you could add support for look behind assertions this way. Now, having written all that, and thought about it some more, it appears as if the look-behind assertion may not even be necessary! For example: [^()]+(?=\)[^(]*$) or even: \w+(?=\)[^(]*$) or [A-Z]+(?=\)[^(]*$) All seem to work, and would work in either PCRE or VBScript. All would fail if there were an unmatched left parenthesis to the right of all other parentheses. Not necessarily an idle consideration since your example string My Text String (XPYZ) (MTS)) in addition has unbalanced parentheses. All may fail if there were only a right parenthesis with no left parentheses if the OP wanted "" returned in such cases. |
#11
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Extract text within ( )
It tells you how many characters to return after the last space, I didn't
think you would have a string parenthesis included that would be more than that :) So it is just an arbitrary number and very rarely would a last string (since you told us it was always the last word) exceed that. I am sure putting 20 would be enough. -- Regards, Peo Sjoblom "M" wrote in message ... Thank You "ALL", Peo, Ron & Harlen ALL super helpful: I am learnin a lot - Back to Peo's original suggestion - I finally figured out what everything in the formula =SUBSTITUTE(MID(A1,FIND("(",A1)+1,255),")","") and will use this and those suggestions form Ron and Harlen in the future - BUT what does the 255 refer to in the above example. I also just found another useful information site: http://www.techonthenet.com/excel/formulas/ AGAIN THANKS ALL!!!!! REALLY I sometimes spend hours trying to figure this stuff out on my own and you all are a super HELP!!!! M "Ron Rosenfeld" wrote: On Wed, 22 Oct 2008 12:26:58 -0700 (PDT), Harlan Grove wrote: Ron Rosenfeld wrote... ... You could download and install Longre's free morefunc.xll add-in from http://xcell05.free.fr and use this Regular Expression formula to extract the alphanumeric text string within the last set of "(...)" ... =REGEX.SUBSTITUTE(A1,".*\((\w+)\).*$","[1]") Or use assertions. =REGEX.MID(A1,"(?<=\()[^()]*(?=\))") If you are going to use assertions that way, per the OP's requirements, you need to ensure you return the last match. Given, for example: My Text String (XPYZ) (MTS)) in addition your REGEX.MID returns (XPYZ), So, I would use: =REGEX.MID(A1,"(?<=\()[^()]*(?=\))",-1) or =REGEX.MID(A1,"(?<=\()[^()]+(?=\)[^(]*$)") But I've been purposely trying to avoid using lookbehind assertions since they are not supported in VBScript. And often enough, it has happened that I've had to switch to VBA. Also begs the question whether to be as inclusive as possible within the parentheses. If so, [^()]* matches more substrings than \w+. Absolutely correct, and a good point to make explicitly. (That is why, in my description, I wrote "alphanumeric text string".) Now, having written all that, and thought about it some more, it appears as if the look-behind assertion may not even be necessary! For example: [^()]+(?=\)[^(]*$) or even: \w+(?=\)[^(]*$) or [A-Z]+(?=\)[^(]*$) All seem to work, and would work in either PCRE or VBScript. Thanks. --ron |
#12
Posted to microsoft.public.excel.worksheet.functions
|
|||
|
|||
Extract text within ( )
On Thu, 23 Oct 2008 11:24:53 -0700 (PDT), Harlan Grove
wrote: But I've been purposely trying to avoid using lookbehind assertions since they are not supported in VBScript. *And often enough, it has happened that I've had to switch to VBA. ... For long string support? Yes. That and also dealing with capturing groups (.submatches) in certain instances. You could handle them with VBScript by using a regex to check the match pattern itself. If you find a look behind assertion, convert it to a separate expression, e.g., (?<=\()[^()]*(?=\)) - \([^()]*(?=\)) use the modified regex to get the desired substring from the source string, then trim off the bit matching the look behind assertion. It'd make the wrapper function more complicated, but you could add support for look behind assertions this way. Interesting approach. I like the flexibility. I'll have to think about implementing that one of these days. Now, having written all that, and thought about it some more, it appears as if the look-behind assertion may not even be necessary! For example: [^()]+(?=\)[^(]*$) or even: \w+(?=\)[^(]*$) or [A-Z]+(?=\)[^(]*$) All seem to work, and would work in either PCRE or VBScript. All would fail if there were an unmatched left parenthesis to the right of all other parentheses. Not necessarily an idle consideration since your example string My Text String (XPYZ) (MTS)) in addition has unbalanced parentheses. All may fail if there were only a right parenthesis with no left parentheses if the OP wanted "" returned in such cases. Good point. I had not considered something like: "My Text String XPYZ (MTS) ( in addition" --ron |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
Extract text | Excel Discussion (Misc queries) | |||
Need to extract certain text from text string | Excel Worksheet Functions | |||
Extract text from large Text | Excel Discussion (Misc queries) | |||
Text Extract | Excel Worksheet Functions | |||
EXTRACT TEXT FROM TEXT STRING | Excel Worksheet Functions |