Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 94
Default Regular Expression for cell address

Can someone explain this regular expression to me? I found it in some VBA
code for finding/altering a cell address within a string. I put spaces in
to break the expression apart into the parts (I think) I understand.

(?:[\^\])-/+*:,="[(]) (\$?) ([A-Z]{1,2}) (\$?) (\d{1,5}) ([^\d]|$)

The second and fourth part are the optional absolute/relative dollar sign,
the third part is the one or two character column, and the fifth part is the
1 to 5 digit row number.

What are the first and last parts of this expression?

Is there a more robust way to designate the row number as 1-65536?

What about the column designation? I was thinking something like
([A-Za-z]|[A-Za-z][A-Za-z]|[Ii][A-Va-v]). This allows for lower case
letters and avoids the possibility of columns greater than IV.

I am new to regular expressions, having read an online tutorial, so any
references to sources of help are also appreciated. Thanks in advance for
your help!


  #2   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 87
Default Regular Expression for cell address

I'd actually have to read up on VBA Regex, so I'll leave the
explanation to someone with a working understanding.

Your second part... that is not going to work like you want it to.

([A-Za-z]|[A-Za-z][A-Za-z]|[Ii][A-Va-v]) allows ZZ

You probably meant:
([A-Za-z]|[A-Ha-h][A-Za-z]|[Ii][A-Va-v])

Scott

M. Authement wrote:
Can someone explain this regular expression to me? I found it in some VBA
code for finding/altering a cell address within a string. I put spaces in
to break the expression apart into the parts (I think) I understand.

(?:[\^\])-/+*:,="[(]) (\$?) ([A-Z]{1,2}) (\$?) (\d{1,5}) ([^\d]|$)

The second and fourth part are the optional absolute/relative dollar sign,
the third part is the one or two character column, and the fifth part is the
1 to 5 digit row number.

What are the first and last parts of this expression?

Is there a more robust way to designate the row number as 1-65536?

What about the column designation? I was thinking something like
([A-Za-z]|[A-Za-z][A-Za-z]|[Ii][A-Va-v]). This allows for lower case
letters and avoids the possibility of columns greater than IV.

I am new to regular expressions, having read an online tutorial, so any
references to sources of help are also appreciated. Thanks in advance for
your help!


  #3   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 947
Default Regular Expression for cell address

(?:[\^\])-/+*:,="[(])

Hi believe brackets capture any of the items within [ ??? ].
In general, (?: __ ) is a subexpression that matches pattern, but does not
capture the match for use later on. (ie looking at a formula =Sum(..) )

I'm not to sure of the last on. ([^\d]|$)
$ usually matches the position at the end of the string.
| is "Or"
[^\d] is a non-digit, but could have been written as \D.
Again, not too sure on that one. ??

--
Dana DeLouis
Windows XP & Office 2003


"M. Authement" wrote in message
...
Can someone explain this regular expression to me? I found it in some VBA
code for finding/altering a cell address within a string. I put spaces in
to break the expression apart into the parts (I think) I understand.

(?:[\^\])-/+*:,="[(]) (\$?) ([A-Z]{1,2}) (\$?) (\d{1,5}) ([^\d]|$)

The second and fourth part are the optional absolute/relative dollar sign,
the third part is the one or two character column, and the fifth part is
the 1 to 5 digit row number.

What are the first and last parts of this expression?

Is there a more robust way to designate the row number as 1-65536?

What about the column designation? I was thinking something like
([A-Za-z]|[A-Za-z][A-Za-z]|[Ii][A-Va-v]). This allows for lower case
letters and avoids the possibility of columns greater than IV.

I am new to regular expressions, having read an online tutorial, so any
references to sources of help are also appreciated. Thanks in advance for
your help!



  #4   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 5,651
Default Regular Expression for cell address

On Tue, 2 Jan 2007 12:43:13 -0500, "M. Authement" wrote:

Can someone explain this regular expression to me? I found it in some VBA
code for finding/altering a cell address within a string. I put spaces in
to break the expression apart into the parts (I think) I understand.

(?:[\^\])-/+*:,="[(]) (\$?) ([A-Z]{1,2}) (\$?) (\d{1,5}) ([^\d]|$)

The second and fourth part are the optional absolute/relative dollar sign,
the third part is the one or two character column, and the fifth part is the
1 to 5 digit row number.

What are the first and last parts of this expression?

Is there a more robust way to designate the row number as 1-65536?

What about the column designation? I was thinking something like
([A-Za-z]|[A-Za-z][A-Za-z]|[Ii][A-Va-v]). This allows for lower case
letters and avoids the possibility of columns greater than IV.

I am new to regular expressions, having read an online tutorial, so any
references to sources of help are also appreciated. Thanks in advance for
your help!


The last part is relatively easy.

It says match anything that is either
NOT a digit [^\d]
or
is the end of line.

I think equivalent expressions would be:

(\D|$)

([\D$])

My guess is that the last would be the most efficient.

The first expression is more complex.

(?:pattern) is a non-capturing match.

In this case you are looking to match, but not capture, any single character
within the opening and closing brackets. The opening bracket is just after the
initial colon ":", and the closing bracket is just before the ending close
parenthesis.

The carat (^) and right bracket (]) are preceded by the forward slash (\) so
they will be interpreted as literals within the bracket expression. All of the
other characters within the bracket expression should be interpreted as
literals, if I recall correctly.


--ron
  #5   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 94
Default Regular Expression for cell address

Thanks Scott. Yes, you are right, I intended the second set to be A-H, not
A-Z.


"Scott" wrote in message
oups.com...
I'd actually have to read up on VBA Regex, so I'll leave the
explanation to someone with a working understanding.

Your second part... that is not going to work like you want it to.

([A-Za-z]|[A-Za-z][A-Za-z]|[Ii][A-Va-v]) allows ZZ

You probably meant:
([A-Za-z]|[A-Ha-h][A-Za-z]|[Ii][A-Va-v])

Scott

M. Authement wrote:
Can someone explain this regular expression to me? I found it in some
VBA
code for finding/altering a cell address within a string. I put spaces
in
to break the expression apart into the parts (I think) I understand.

(?:[\^\])-/+*:,="[(]) (\$?) ([A-Z]{1,2}) (\$?) (\d{1,5}) ([^\d]|$)

The second and fourth part are the optional absolute/relative dollar
sign,
the third part is the one or two character column, and the fifth part is
the
1 to 5 digit row number.

What are the first and last parts of this expression?

Is there a more robust way to designate the row number as 1-65536?

What about the column designation? I was thinking something like
([A-Za-z]|[A-Za-z][A-Za-z]|[Ii][A-Va-v]). This allows for lower case
letters and avoids the possibility of columns greater than IV.

I am new to regular expressions, having read an online tutorial, so any
references to sources of help are also appreciated. Thanks in advance
for
your help!






  #6   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 94
Default Regular Expression for cell address

Thanks Ron.

Does that mean that the potential cell address MUST be preceeded by one of
the characters in the first part of the expression in order for the rest of
the expression to be true (found in a string)?

Also, any thoughts on making the row number expression more robust?

"Ron Rosenfeld" wrote in message
...
On Tue, 2 Jan 2007 12:43:13 -0500, "M. Authement"
wrote:

Can someone explain this regular expression to me? I found it in some VBA
code for finding/altering a cell address within a string. I put spaces in
to break the expression apart into the parts (I think) I understand.

(?:[\^\])-/+*:,="[(]) (\$?) ([A-Z]{1,2}) (\$?) (\d{1,5}) ([^\d]|$)

The second and fourth part are the optional absolute/relative dollar sign,
the third part is the one or two character column, and the fifth part is
the
1 to 5 digit row number.

What are the first and last parts of this expression?

Is there a more robust way to designate the row number as 1-65536?

What about the column designation? I was thinking something like
([A-Za-z]|[A-Za-z][A-Za-z]|[Ii][A-Va-v]). This allows for lower case
letters and avoids the possibility of columns greater than IV.

I am new to regular expressions, having read an online tutorial, so any
references to sources of help are also appreciated. Thanks in advance for
your help!


The last part is relatively easy.

It says match anything that is either
NOT a digit [^\d]
or
is the end of line.

I think equivalent expressions would be:

(\D|$)

([\D$])

My guess is that the last would be the most efficient.

The first expression is more complex.

(?:pattern) is a non-capturing match.

In this case you are looking to match, but not capture, any single
character
within the opening and closing brackets. The opening bracket is just
after the
initial colon ":", and the closing bracket is just before the ending close
parenthesis.

The carat (^) and right bracket (]) are preceded by the forward slash (\)
so
they will be interpreted as literals within the bracket expression. All
of the
other characters within the bracket expression should be interpreted as
literals, if I recall correctly.


--ron



  #7   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 733
Default Regular Expression for cell address

M. Authement wrote...
Can someone explain this regular expression to me? I found it in some VBA
code for finding/altering a cell address within a string. I put spaces in
to break the expression apart into the parts (I think) I understand.

(?:[\^\])-/+*:,="[(]) (\$?) ([A-Z]{1,2}) (\$?) (\d{1,5}) ([^\d]|$)

....

The first, (?:...) looks like it's supposed to be any character that
could precede a valid cell address, but it doesn't include !, so this
regex won't match cell addresses that include the worksheet name. Also
it's got an innocuous bug: if hyphen, -, should be included in the
class, it should be the first character in the class. If the original
author intended it to be a character range from ) to /, then that would
have included * + , - . as well as ) and /, so the separate appearance
of those other characters is unnecessary.

The next bit matches zero or one literal dollar sign.

The next one or two letters.

The next another zero or one literal dollar sign.

The next one to five decimal numerals.

The last matches anything other than a decimal numeral or end-of-line,
but it's inclusive.

Much more efficient to use (remove the spaces)

\b (\$?) ([A-Ha-h]?[A-Za-z]|[Ii][A-Va-v]) (\$?) (\d{1,5}) \b

There's no need to parenthesize the break patterns, \b. Technically
this would match tokens like AX99999, which is NOT a valid cell address
but IS a valid name. Restricting the row number token to 1..65536 would
require something like

([1-5]?\d{1,4}|6([0-4]\d{3}|5([0-4]\d{2}|5([0-2]\d|3[0-6]))))

if you want to be a real stickler.

  #8   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 5,651
Default Regular Expression for cell address

On Tue, 2 Jan 2007 20:16:31 -0500, "M. Authement" wrote:

Thanks Ron.

Does that mean that the potential cell address MUST be preceeded by one of
the characters in the first part of the expression in order for the rest of
the expression to be true (found in a string)?


Yes



Also, any thoughts on making the row number expression more robust?


Depending on how you are implementing this, it might be simpler to capture the
digit string and then test it to see if it is in the range.

You can do it with a regex, but the one I came up with is messy:

\b(([1-9]\d{0,3})|([1-5]\d{4})|(6[0-4]\d{3})|(65[0-4]\d{2})|(655[0-2]\d)|(6553[0-6]))\b


Maybe someone has a simpler one.

Oh, you can combine some of the stuff at the end (the 65's), but you'd still
have just as much alternation and it would be more difficult to understand.

Also, you'd need to change the starting word boundary token (\b) to represent
your column letters optionally followed by the "$".


--ron
  #9   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 94
Default Regular Expression for cell address

Thanks Ron. Makes sense to test the digits separately. I'll give it a try.

"Ron Rosenfeld" wrote in message
...
On Tue, 2 Jan 2007 20:16:31 -0500, "M. Authement"
wrote:

Thanks Ron.

Does that mean that the potential cell address MUST be preceeded by one of
the characters in the first part of the expression in order for the rest
of
the expression to be true (found in a string)?


Yes



Also, any thoughts on making the row number expression more robust?


Depending on how you are implementing this, it might be simpler to capture
the
digit string and then test it to see if it is in the range.

You can do it with a regex, but the one I came up with is messy:

\b(([1-9]\d{0,3})|([1-5]\d{4})|(6[0-4]\d{3})|(65[0-4]\d{2})|(655[0-2]\d)|(6553[0-6]))\b


Maybe someone has a simpler one.

Oh, you can combine some of the stuff at the end (the 65's), but you'd
still
have just as much alternation and it would be more difficult to
understand.

Also, you'd need to change the starting word boundary token (\b) to
represent
your column letters optionally followed by the "$".


--ron



  #10   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 94
Default Regular Expression for cell address

Thanks for the explanations Harlan. If I understand your more efficient
string, the ([A-Ha-h]?[A-Za-z]) will look for a two character reference
starting with A-H, and if that fails will look for a single character
reference of A-Z...is that correct?


"Harlan Grove" wrote in message
oups.com...
M. Authement wrote...
Can someone explain this regular expression to me? I found it in some VBA
code for finding/altering a cell address within a string. I put spaces in
to break the expression apart into the parts (I think) I understand.

(?:[\^\])-/+*:,="[(]) (\$?) ([A-Z]{1,2}) (\$?) (\d{1,5}) ([^\d]|$)

...

The first, (?:...) looks like it's supposed to be any character that
could precede a valid cell address, but it doesn't include !, so this
regex won't match cell addresses that include the worksheet name. Also
it's got an innocuous bug: if hyphen, -, should be included in the
class, it should be the first character in the class. If the original
author intended it to be a character range from ) to /, then that would
have included * + , - . as well as ) and /, so the separate appearance
of those other characters is unnecessary.

The next bit matches zero or one literal dollar sign.

The next one or two letters.

The next another zero or one literal dollar sign.

The next one to five decimal numerals.

The last matches anything other than a decimal numeral or end-of-line,
but it's inclusive.

Much more efficient to use (remove the spaces)

\b (\$?) ([A-Ha-h]?[A-Za-z]|[Ii][A-Va-v]) (\$?) (\d{1,5}) \b

There's no need to parenthesize the break patterns, \b. Technically
this would match tokens like AX99999, which is NOT a valid cell address
but IS a valid name. Restricting the row number token to 1..65536 would
require something like

([1-5]?\d{1,4}|6([0-4]\d{3}|5([0-4]\d{2}|5([0-2]\d|3[0-6]))))

if you want to be a real stickler.





  #11   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 733
Default Regular Expression for cell address

M. Authement wrote...
....
string, the ([A-Ha-h]?[A-Za-z]) will look for a two character reference
starting with A-H, and if that fails will look for a single character
reference of A-Z...is that correct?


Correct.

  #12   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 87
Default Regular Expression for cell address

It's more a case of the ? representing an optional value.

A related explanation is the ? represents either 0 or 1 occurences of
whatever it follows.

Scott

M. Authement wrote:
Thanks for the explanations Harlan. If I understand your more efficient
string, the ([A-Ha-h]?[A-Za-z]) will look for a two character reference
starting with A-H, and if that fails will look for a single character
reference of A-Z...is that correct?


"Harlan Grove" wrote in message
oups.com...
M. Authement wrote...
Can someone explain this regular expression to me? I found it in some VBA
code for finding/altering a cell address within a string. I put spaces in
to break the expression apart into the parts (I think) I understand.

(?:[\^\])-/+*:,="[(]) (\$?) ([A-Z]{1,2}) (\$?) (\d{1,5}) ([^\d]|$)

...

The first, (?:...) looks like it's supposed to be any character that
could precede a valid cell address, but it doesn't include !, so this
regex won't match cell addresses that include the worksheet name. Also
it's got an innocuous bug: if hyphen, -, should be included in the
class, it should be the first character in the class. If the original
author intended it to be a character range from ) to /, then that would
have included * + , - . as well as ) and /, so the separate appearance
of those other characters is unnecessary.

The next bit matches zero or one literal dollar sign.

The next one or two letters.

The next another zero or one literal dollar sign.

The next one to five decimal numerals.

The last matches anything other than a decimal numeral or end-of-line,
but it's inclusive.

Much more efficient to use (remove the spaces)

\b (\$?) ([A-Ha-h]?[A-Za-z]|[Ii][A-Va-v]) (\$?) (\d{1,5}) \b

There's no need to parenthesize the break patterns, \b. Technically
this would match tokens like AX99999, which is NOT a valid cell address
but IS a valid name. Restricting the row number token to 1..65536 would
require something like

([1-5]?\d{1,4}|6([0-4]\d{3}|5([0-4]\d{2}|5([0-2]\d|3[0-6]))))

if you want to be a real stickler.


Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
what are the regular expression special characters while searching Howdy Excel Discussion (Misc queries) 3 January 18th 10 02:04 PM
Can someone help me with this regular expression? [email protected] Excel Discussion (Misc queries) 3 March 10th 09 07:36 PM
Regular expression search/replace in Excel Paul Excel Discussion (Misc queries) 15 August 6th 08 04:57 PM
Regular Expression sl Excel Discussion (Misc queries) 2 January 23rd 07 11:57 PM
Regular expression searching problem LarryLev Excel Programming 0 September 15th 05 07:44 PM


All times are GMT +1. The time now is 03:46 PM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"