View Single Post
  #10   Report Post  
Posted to microsoft.public.excel.worksheet.functions
Harlan Grove[_2_] Harlan Grove[_2_] is offline
external usenet poster
 
Posts: 1,231
Default Extract text within ( )

Ron Rosenfeld wrote...
....
If you are going to use assertions that way, per the OP's requirements, you
need to ensure you return the last match.

....

Fair point.

=REGEX.MID(A1,"(?<=\()[^()]*(?=\))",-1) *or


Best.

=REGEX.MID(A1,"(?<=\()[^()]+(?=\)[^(]*$)")


Less efficient.

But I've been purposely trying to avoid using lookbehind assertions since they
are not supported in VBScript. *And often enough, it has happened that I've had
to switch to VBA.

....

For long string support? You could handle them with VBScript by using
a regex to check the match pattern itself. If you find a look behind
assertion, convert it to a separate expression, e.g.,

(?<=\()[^()]*(?=\)) - \([^()]*(?=\))

use the modified regex to get the desired substring from the source
string, then trim off the bit matching the look behind assertion. It'd
make the wrapper function more complicated, but you could add support
for look behind assertions this way.

Now, having written all that, and thought about it some more, it appears as if
the look-behind assertion may not even be necessary!

For example:

[^()]+(?=\)[^(]*$)

or even:

\w+(?=\)[^(]*$)

or

[A-Z]+(?=\)[^(]*$)

All seem to work, and would work in either PCRE or VBScript.


All would fail if there were an unmatched left parenthesis to the
right of all other parentheses. Not necessarily an idle consideration
since your example string

My Text String (XPYZ) (MTS)) in addition

has unbalanced parentheses. All may fail if there were only a right
parenthesis with no left parentheses if the OP wanted "" returned in
such cases.