Home |
Search |
Today's Posts |
|
#1
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
Hi
The aim is to identify anomalous strings in user data using a Regex pattern but as a complete newbie I have got bogged down. My test code is below but as you will see there are some "bad" results. The criteria - a string can be: text only = Head, Dept Head text plus numerics = Class 8-10, Class 8, 6th form, Head KS6 a single dot If numerics are included then: a single or double digit must be followed by a hyphen or an ordinal = 1st, 10-12, 15th there must be no lead zeros = 0 or 01270 ordinals must not be hyphenated = 2nd-3rd, 2nd-4 My Regex doesn't yet include a test for text only and wondered if that had to be a separate process? I would be grateful of any help to complete the pattern. T.I.A. Geoff Sub TestPattern() Dim objRegExp As Object Dim i As Long '''some test and result strings Dim arrString() As Variant Dim arrResult() As Variant arrString = Array("String", "0", "class 1-", "8-", "8-10", _ "12", "12-15", "165", "12-165", "0 12", "Class 8", "1st class", _ "2nd class", "3rd class", "6th form", "40th class", "400th class", _ "Head", "Head KS6", ".") arrResult = Array("Result", "ok", "ok", "ok", "ok", "ok", "ok", _ "ok", "bad", "bad", "bad", "ok", "ok", "ok", "ok", "ok", "ok", _ "bad", "bad", "bad") '''set pattern Set objRegExp = CreateObject("Vbscript.RegExp") With objRegExp .Global = True .IgnoreCase = True .MultiLine = True .Pattern = "\b[1-9](-|st|nd|rd|th)\b|\b[1-9][0-9](|-|st|nd|rd|th)\b" End With With Sheets(1) '''setup test strings .Range("a:c").ClearContents .Range("a:c").NumberFormat = "@" .Cells(1, "B") = "Regex Test" For i = LBound(arrString) To UBound(arrString) .Cells(i + 1, "A") = arrString(i) .Cells(i + 1, "C") = arrResult(i) Next '''run test pattern For i = LBound(arrString) + 1 To UBound(arrString) If objRegExp.Test(.Cells(i + 1, 1)) Then .Cells(i + 1, "B") = "valid" Else .Cells(i + 1, "B") = "invalid" End If Next End With End Sub |
#2
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
you must use in excel application?
Your example is not clear to me you can make a list of cases ok and a list of cases bad with poor reasoning in the list example set all your text between "" I'm sorry for my english regards r Il mio ultimo lavoro ... http://excelvba.altervista.org/blog/...ternative.html "Geoff K" wrote: Hi The aim is to identify anomalous strings in user data using a Regex pattern but as a complete newbie I have got bogged down. My test code is below but as you will see there are some "bad" results. The criteria - a string can be: text only = Head, Dept Head text plus numerics = Class 8-10, Class 8, 6th form, Head KS6 a single dot If numerics are included then: a single or double digit must be followed by a hyphen or an ordinal = 1st, 10-12, 15th there must be no lead zeros = 0 or 01270 ordinals must not be hyphenated = 2nd-3rd, 2nd-4 My Regex doesn't yet include a test for text only and wondered if that had to be a separate process? I would be grateful of any help to complete the pattern. T.I.A. Geoff Sub TestPattern() Dim objRegExp As Object Dim i As Long '''some test and result strings Dim arrString() As Variant Dim arrResult() As Variant arrString = Array("String", "0", "class 1-", "8-", "8-10", _ "12", "12-15", "165", "12-165", "0 12", "Class 8", "1st class", _ "2nd class", "3rd class", "6th form", "40th class", "400th class", _ "Head", "Head KS6", ".") arrResult = Array("Result", "ok", "ok", "ok", "ok", "ok", "ok", _ "ok", "bad", "bad", "bad", "ok", "ok", "ok", "ok", "ok", "ok", _ "bad", "bad", "bad") '''set pattern Set objRegExp = CreateObject("Vbscript.RegExp") With objRegExp .Global = True .IgnoreCase = True .MultiLine = True .Pattern = "\b[1-9](-|st|nd|rd|th)\b|\b[1-9][0-9](|-|st|nd|rd|th)\b" End With With Sheets(1) '''setup test strings .Range("a:c").ClearContents .Range("a:c").NumberFormat = "@" .Cells(1, "B") = "Regex Test" For i = LBound(arrString) To UBound(arrString) .Cells(i + 1, "A") = arrString(i) .Cells(i + 1, "C") = arrResult(i) Next '''run test pattern For i = LBound(arrString) + 1 To UBound(arrString) If objRegExp.Test(.Cells(i + 1, 1)) Then .Cells(i + 1, "B") = "valid" Else .Cells(i + 1, "B") = "invalid" End If Next End With End Sub |
#3
![]()
Posted to microsoft.public.excel.programming
|
|||
|
|||
![]()
On Tue, 9 Jun 2009 06:24:01 -0700, Geoff K
wrote: Hi The aim is to identify anomalous strings in user data using a Regex pattern but as a complete newbie I have got bogged down. My test code is below but as you will see there are some "bad" results. The criteria - a string can be: text only = Head, Dept Head text plus numerics = Class 8-10, Class 8, 6th form, Head KS6 a single dot If numerics are included then: a single or double digit must be followed by a hyphen or an ordinal = 1st, 10-12, 15th there must be no lead zeros = 0 or 01270 ordinals must not be hyphenated = 2nd-3rd, 2nd-4 My Regex doesn't yet include a test for text only and wondered if that had to be a separate process? I would be grateful of any help to complete the pattern. T.I.A. Geoff It shouldn't be that hard to do, but your criteria are not congruent with your examples. For example, you give as an example in the text plus numerics category "Class 8" and "Head KS6". But your criteria state that if numerics are included, a single or double digit must be followed by a hyphen or an ordinal. That is not present in these examples. You also give "Class 8-10" as an example. But the 10 is not followed by a hyphen or an ordinal. In addition, are these single entries within a cell; or are there multiple entries within a cell as shown on the various lines you have provided? --ron |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
![]() |
||||
Thread | Forum | |||
Regex Pattern to extract Sheet Name from .Address(external:=True) | Excel Programming | |||
Another Problem with Regex Pattern | Excel Programming | |||
Help with a regex pattern please | Excel Programming | |||
Help with a Regex Pattern | Excel Programming | |||
Regex Question | Excel Programming |