Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 66
Default Regex Pattern

Hi
The aim is to identify anomalous strings in user data using a Regex pattern
but as a complete newbie I have got bogged down.
My test code is below but as you will see there are some "bad" results.
The criteria - a string can be:
text only = Head, Dept Head
text plus numerics = Class 8-10, Class 8, 6th form, Head KS6
a single dot
If numerics are included then:
a single or double digit must be followed by a hyphen or an ordinal =
1st, 10-12, 15th
there must be no lead zeros = 0 or 01270
ordinals must not be hyphenated = 2nd-3rd, 2nd-4

My Regex doesn't yet include a test for text only and wondered if that had
to be a separate process?

I would be grateful of any help to complete the pattern.

T.I.A.

Geoff
Sub TestPattern()

Dim objRegExp As Object
Dim i As Long

'''some test and result strings
Dim arrString() As Variant
Dim arrResult() As Variant
arrString = Array("String", "0", "class 1-", "8-", "8-10", _
"12", "12-15", "165", "12-165", "0 12", "Class 8", "1st class", _
"2nd class", "3rd class", "6th form", "40th class", "400th class", _
"Head", "Head KS6", ".")
arrResult = Array("Result", "ok", "ok", "ok", "ok", "ok", "ok", _
"ok", "bad", "bad", "bad", "ok", "ok", "ok", "ok", "ok", "ok", _
"bad", "bad", "bad")

'''set pattern
Set objRegExp = CreateObject("Vbscript.RegExp")
With objRegExp
.Global = True
.IgnoreCase = True
.MultiLine = True
.Pattern = "\b[1-9](-|st|nd|rd|th)\b|\b[1-9][0-9](|-|st|nd|rd|th)\b"
End With

With Sheets(1)

'''setup test strings
.Range("a:c").ClearContents
.Range("a:c").NumberFormat = "@"
.Cells(1, "B") = "Regex Test"
For i = LBound(arrString) To UBound(arrString)
.Cells(i + 1, "A") = arrString(i)
.Cells(i + 1, "C") = arrResult(i)
Next

'''run test pattern
For i = LBound(arrString) + 1 To UBound(arrString)
If objRegExp.Test(.Cells(i + 1, 1)) Then
.Cells(i + 1, "B") = "valid"
Else
.Cells(i + 1, "B") = "invalid"
End If
Next

End With

End Sub
  #2   Report Post  
Posted to microsoft.public.excel.programming
r r is offline
external usenet poster
 
Posts: 125
Default Regex Pattern

you must use in excel application?

Your example is not clear to me

you can make a list of cases ok and a
list of cases bad with poor reasoning

in the list example set all your text between ""
I'm sorry for my english

regards
r

Il mio ultimo lavoro ...
http://excelvba.altervista.org/blog/...ternative.html


"Geoff K" wrote:

Hi
The aim is to identify anomalous strings in user data using a Regex pattern
but as a complete newbie I have got bogged down.
My test code is below but as you will see there are some "bad" results.
The criteria - a string can be:
text only = Head, Dept Head
text plus numerics = Class 8-10, Class 8, 6th form, Head KS6
a single dot
If numerics are included then:
a single or double digit must be followed by a hyphen or an ordinal =
1st, 10-12, 15th
there must be no lead zeros = 0 or 01270
ordinals must not be hyphenated = 2nd-3rd, 2nd-4

My Regex doesn't yet include a test for text only and wondered if that had
to be a separate process?

I would be grateful of any help to complete the pattern.

T.I.A.

Geoff
Sub TestPattern()

Dim objRegExp As Object
Dim i As Long

'''some test and result strings
Dim arrString() As Variant
Dim arrResult() As Variant
arrString = Array("String", "0", "class 1-", "8-", "8-10", _
"12", "12-15", "165", "12-165", "0 12", "Class 8", "1st class", _
"2nd class", "3rd class", "6th form", "40th class", "400th class", _
"Head", "Head KS6", ".")
arrResult = Array("Result", "ok", "ok", "ok", "ok", "ok", "ok", _
"ok", "bad", "bad", "bad", "ok", "ok", "ok", "ok", "ok", "ok", _
"bad", "bad", "bad")

'''set pattern
Set objRegExp = CreateObject("Vbscript.RegExp")
With objRegExp
.Global = True
.IgnoreCase = True
.MultiLine = True
.Pattern = "\b[1-9](-|st|nd|rd|th)\b|\b[1-9][0-9](|-|st|nd|rd|th)\b"
End With

With Sheets(1)

'''setup test strings
.Range("a:c").ClearContents
.Range("a:c").NumberFormat = "@"
.Cells(1, "B") = "Regex Test"
For i = LBound(arrString) To UBound(arrString)
.Cells(i + 1, "A") = arrString(i)
.Cells(i + 1, "C") = arrResult(i)
Next

'''run test pattern
For i = LBound(arrString) + 1 To UBound(arrString)
If objRegExp.Test(.Cells(i + 1, 1)) Then
.Cells(i + 1, "B") = "valid"
Else
.Cells(i + 1, "B") = "invalid"
End If
Next

End With

End Sub

  #3   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 5,651
Default Regex Pattern

On Tue, 9 Jun 2009 06:24:01 -0700, Geoff K
wrote:

Hi
The aim is to identify anomalous strings in user data using a Regex pattern
but as a complete newbie I have got bogged down.
My test code is below but as you will see there are some "bad" results.
The criteria - a string can be:
text only = Head, Dept Head
text plus numerics = Class 8-10, Class 8, 6th form, Head KS6
a single dot
If numerics are included then:
a single or double digit must be followed by a hyphen or an ordinal =
1st, 10-12, 15th
there must be no lead zeros = 0 or 01270
ordinals must not be hyphenated = 2nd-3rd, 2nd-4

My Regex doesn't yet include a test for text only and wondered if that had
to be a separate process?

I would be grateful of any help to complete the pattern.

T.I.A.

Geoff



It shouldn't be that hard to do, but your criteria are not congruent with your
examples.

For example, you give as an example in the text plus numerics category
"Class 8" and "Head KS6". But your criteria state that if numerics are
included, a single or double digit must be followed by a hyphen or an ordinal.
That is not present in these examples.

You also give "Class 8-10" as an example. But the 10 is not followed by a
hyphen or an ordinal.

In addition, are these single entries within a cell; or are there multiple
entries within a cell as shown on the various lines you have provided?





--ron
Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Regex Pattern to extract Sheet Name from .Address(external:=True) ExcelMonkey Excel Programming 9 June 4th 08 11:17 AM
Another Problem with Regex Pattern ExcelMonkey Excel Programming 2 May 31st 08 02:53 PM
Help with a regex pattern please [email protected] Excel Programming 3 March 22nd 08 12:14 AM
Help with a Regex Pattern [email protected] Excel Programming 11 April 30th 07 01:49 AM
Regex Question William Barnes Excel Programming 5 January 2nd 04 11:57 AM


All times are GMT +1. The time now is 01:35 AM.

Powered by vBulletin® Copyright ©2000 - 2025, Jelsoft Enterprises Ltd.
Copyright ©2004-2025 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"