View Single Post
  #6   Report Post  
Posted to microsoft.public.excel.programming
Ron Rosenfeld Ron Rosenfeld is offline
external usenet poster
 
Posts: 5,651
Default Extract text plus "n" characters or date on partial match

On 22 Mar 2007 04:20:21 -0700, "FabZ" wrote:

Thanks for your answers.
Well, my first post, effectively, was not so clear.

My situation is:
W2K+Excel2003
I have a worksheets with thousands of rows data, in the first column
there is a text like:

der. husky b/n f 4 aa. rmd 2161 huma - p.p. adopted 10.02.04
dobermann n/f m. steril+rmd24378 faruk - p.p. trasf. palombara 13/1/05


I need to extract (copy-paste) to a different column, the text it
matches the criteria:

-Start with RM
+ Have a third different character ("D" or "A" or "E" etc.)
+ Have a sequence of four or five numbers, sometimes preceded by a
<space.

The results sound like "rmd 2161" or "rmd24378" or "rme45758".

Having a working formula for the above first case, I tought to re-use
it, adapted, for my second "challenge":
extract dates looking for their previous word always is present and
insert it in one of two different column:

if "adopted" or "ado" or "adopt." is found with "*ado*" then the
following date must to be copy-past to column K (Exits).

if "re-entered" or "re-ent" or "reenter." is found, the following date
must to be copy-past to column J (Entrances).

Anyway I found that sometimes there are two key words and two dates,
then I suppose I need a third "L" column, but, at current time, it's
important to extract only the last date.

Now I'm trying your formulas, It seems to me that the merjet one could
suit my needed but it returns

"Run-time error '5': Invalid Procedure Call or Argument"

at this point:

char = Mid(sstr, iPos1, iPos2 - iPos1)

and I found, looking for, in the newsgroups, it could be a problem of
mid with InStr.

Thank again for your help!

FabZ


Well, your explanation still does not cover everything, and may well be
incomplete. However, the SUB below will do what you describe on the data you
posted.

It still assumes that the date is at the very end of the string.

It looks for one of your listed variants of ado or reent to determine which
column to place the date.

It places the rm number in the column adjacent to the string.

If, as in your second example, there is NO adopted or reentered, it will not
extract a date.

---------------------------------------------------
Sub ExtractTattoo()
Dim i As Long
Dim c As Range

Dim oRegExp As Object
Dim colMatches As Object
Const sPattern As String = "(rm[a-z]\s?\d+).*(\s\S+$)"
Const sDateAdopt As String = "\b((adopted)|(ado)|(adopt.))"
Const sDateReent As String = "\b((re-entered)|(re-ent)|(reenter.))"

Set oRegExp = CreateObject("VBScript.RegExp")

With oRegExp
.IgnoreCase = True
.Global = True
For Each c In Selection
i = 0
.Pattern = sDateAdopt
If oRegExp.test(c.Text) = True Then i = 11 'column K
.Pattern = sDateReent
If oRegExp.test(c.Text) = True Then i = 10 'column J
.Pattern = sPattern
If oRegExp.test(c.Text) = True Then
Set colMatches = oRegExp.Execute(c.Text)
c.Offset(0, 1) = colMatches(0).submatches(0) 'rmd adjacent
If i < 0 Then
Cells(c.Row, i).Value = colMatches(0).submatches(1)
End If
End If
Next c
End With
End Sub
==============================================


--ron