View Single Post
  #12   Report Post  
Posted to microsoft.public.excel.programming
Scott Scott is offline
external usenet poster
 
Posts: 87
Default Regular Expression for cell address

It's more a case of the ? representing an optional value.

A related explanation is the ? represents either 0 or 1 occurences of
whatever it follows.

Scott

M. Authement wrote:
Thanks for the explanations Harlan. If I understand your more efficient
string, the ([A-Ha-h]?[A-Za-z]) will look for a two character reference
starting with A-H, and if that fails will look for a single character
reference of A-Z...is that correct?


"Harlan Grove" wrote in message
oups.com...
M. Authement wrote...
Can someone explain this regular expression to me? I found it in some VBA
code for finding/altering a cell address within a string. I put spaces in
to break the expression apart into the parts (I think) I understand.

(?:[\^\])-/+*:,="[(]) (\$?) ([A-Z]{1,2}) (\$?) (\d{1,5}) ([^\d]|$)

...

The first, (?:...) looks like it's supposed to be any character that
could precede a valid cell address, but it doesn't include !, so this
regex won't match cell addresses that include the worksheet name. Also
it's got an innocuous bug: if hyphen, -, should be included in the
class, it should be the first character in the class. If the original
author intended it to be a character range from ) to /, then that would
have included * + , - . as well as ) and /, so the separate appearance
of those other characters is unnecessary.

The next bit matches zero or one literal dollar sign.

The next one or two letters.

The next another zero or one literal dollar sign.

The next one to five decimal numerals.

The last matches anything other than a decimal numeral or end-of-line,
but it's inclusive.

Much more efficient to use (remove the spaces)

\b (\$?) ([A-Ha-h]?[A-Za-z]|[Ii][A-Va-v]) (\$?) (\d{1,5}) \b

There's no need to parenthesize the break patterns, \b. Technically
this would match tokens like AX99999, which is NOT a valid cell address
but IS a valid name. Restricting the row number token to 1..65536 would
require something like

([1-5]?\d{1,4}|6([0-4]\d{3}|5([0-4]\d{2}|5([0-2]\d|3[0-6]))))

if you want to be a real stickler.