Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.word.vba.beginners,microsoft.public.word.vba.general
external usenet poster
 
Posts: 5
Default Regular expressions for replacements in Excel?

Doesn't seem like many people are using the regular expression capability in
Excel. Some reason? Apart from the fact that it's kind of hard to figure
out? Actually, I'm trying to port some JavaScript regular expressions into
Excel... I think I've enabled the required library linkage, but I can't seem
to figure out the syntax required to refer to all of the lines in the
spreadsheet (which is really just a list of text strings). Don't seem to be
any good examples anywhere I can find them on the Web, and the Microsoft
documentation has been typically unenlightening.

Are the examples and useful instructions more liable to be found on the Word
side? Some of the later manipulations would be better with the spreadsheet's
capabilities, however.

  #2   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.word.vba.beginners,microsoft.public.word.vba.general
external usenet poster
 
Posts: 3,885
Default Regular expressions for replacements in Excel?

Hi
post your current code which does not work

--
Regards
Frank Kabel
Frankfurt, Germany
"Shannon Jacobs" schrieb im Newsbeitrag
...
Doesn't seem like many people are using the regular expression capability
in
Excel. Some reason? Apart from the fact that it's kind of hard to figure
out? Actually, I'm trying to port some JavaScript regular expressions into
Excel... I think I've enabled the required library linkage, but I can't
seem
to figure out the syntax required to refer to all of the lines in the
spreadsheet (which is really just a list of text strings). Don't seem to
be
any good examples anywhere I can find them on the Web, and the Microsoft
documentation has been typically unenlightening.

Are the examples and useful instructions more liable to be found on the
Word
side? Some of the later manipulations would be better with the
spreadsheet's
capabilities, however.



  #3   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.word.vba.beginners,microsoft.public.word.vba.general
external usenet poster
 
Posts: 6
Default Regular expressions for replacements in Excel?

There's a tutorial on the MVPS site. Word's regular expressions are pretty
well the same as any other, except that it uses minimal matching where unix
regex uses maximal.


"Shannon Jacobs" wrote in message
...
Doesn't seem like many people are using the regular expression capability

in
Excel. Some reason? Apart from the fact that it's kind of hard to figure
out? Actually, I'm trying to port some JavaScript regular expressions into
Excel... I think I've enabled the required library linkage, but I can't

seem
to figure out the syntax required to refer to all of the lines in the
spreadsheet (which is really just a list of text strings). Don't seem to

be
any good examples anywhere I can find them on the Web, and the Microsoft
documentation has been typically unenlightening.

Are the examples and useful instructions more liable to be found on the

Word
side? Some of the later manipulations would be better with the

spreadsheet's
capabilities, however.



  #4   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.word.vba.beginners,microsoft.public.word.vba.general
external usenet poster
 
Posts: 5
Default Regular expressions for replacements in Excel?

Frank Kabel wrote:
Hi
post your current code which does not work

<old context snip

Well, I'd be willing to consider pasting in the code, but right now I can't
do it. I can't even tell for sure whether or not it is working. If it is
working, then Microsoft's implementation is incredibly bad and slow, which
would also explain why it seems that very few people are using the feature.
The current status is that I can't even get Excel's attention to ask it to
look at the code, but it's using 95% of the machine cycles and seems happy,
so I'm waiting to see what results come out, if any.

From memory I can say that I just defined a simple regexp, basically two
options before a short trailer string in the Pattern. Then I tried to invoke
a Replace on the Cells (just over 2000 in a single column), which did
nothing yesterday. Today I created an outer loop and tried to pass each cell
in individually--but I'm still waiting for it to finish or crash or do
something. About 10 minutes now, and still no results, and CPU still around
95%.

In yesterday's experiments I just zapped Excel at this point, and upon
restarting it would attempt to recover the file, but whether I used the
original saved file or the recovered version, in either case after that
Excel would not allow me to use any macros. The error message complained
about the security settings, but no matter what I did to the security
settings I could not get back into the file. Had to zot the file completely
each time and start from scratch.

Like they say: Microsoft's way or the highway.

  #5   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.word.vba.beginners,microsoft.public.word.vba.general
external usenet poster
 
Posts: 6
Default Regular expressions for replacements in Excel?

If you are dealing with just one column of the spreadsheet, it might be
easier to copy it into a Word document, make the changes there, and copy it
back. Word's Find and Replace has the regular expression functions built in
(check the 'Use Wildcards' checkbox), so you can see what's going on and you
don't need to screw around with coding at all.



"Shannon Jacobs" wrote in message
...
Jezebel wrote:
You could try --

Sub clean()
Dim aRegExp as object
Dim I As Long

Set aRegExp = CreateObject("vbscript.regexp")
With aRegExp
.Pattern = "(height|width)=""12"" "
.Global = True
.IgnoreCase = True
' .MultiLine = True
End With

For I = 65 To ActiveSheet.UsedRange.Rows.Count
Cells(I, 1) = aRegExp.Replace(Cells(I, 1), "")
Next

End Sub

Are there more search strings you're dealing with? Otherwise, it
would be heaps quicker to use:

with Range("A65:A" & ActiveSheet.UsedRange.Rows.Count)
.Replace What:="Height=""12""", Replacement:="",
LookAt:=xlPart, SearchOrder:=xlByRows, MatchCase:=False
.Replace What:="Width=""12""", Replacement:="",
LookAt:=xlPart, SearchOrder:=xlByRows, MatchCase:=False
end with


This sounds very much like what I've been trying to figure out, though I
won't be able to pursue it farther until later, probably Friday at

earliest.
However, right now I'm struggling with the following problem. I want to

drop
the last part of the image tags, and all of the first part including the
directory information. All I want to have left is the stem name of the

file.
The first triplet works properly, but the second one is overmatching
somehow, and the stem name is also dropped. Even worse, in this latest

form
I tried to use the parenthetic grouping for a clause and then the \1 to
return the stem, but now it just sticks a "\1" into the file.

' remove ends of image tags
aRegExp.Pattern = "\.gif[\w+ ?= ?\w+]*"
theLine = aRegExp.Replace(theLine, "")

' remove fronts of image tags, but save the last word
aRegExp.Pattern = "<img src ?= ?\./[\w+/]*(\w+)"
theLine = aRegExp.Replace(theLine, "\1")

I haven't even gotten to the really nasty part of the file transformation
yet, though so far I'm getting a moderate feel for how they work here...

Not sure if it will help, but my current primary reference is the

following
URL, and I'm working with an English version of Excel XP right now.


http://msdn.microsoft.com/library/de...ting051099.asp





  #6   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.word.vba.beginners,microsoft.public.word.vba.general
external usenet poster
 
Posts: 5
Default Regular expressions for replacements in Excel?

Jezebel wrote:
If you are dealing with just one column of the spreadsheet, it might
be easier to copy it into a Word document, make the changes there,
and copy it back. Word's Find and Replace has the regular expression
functions built in (check the 'Use Wildcards' checkbox), so you can
see what's going on and you don't need to screw around with coding at
all.


The Word wildcards seem to be much more limited than real regular
expressions, so I had already eliminated that option. I haven't listed all
of my transformations here, but a number of them are somewhat trickier, so
that won't work. Your suggestion with the loop simplification is good, but I
don't think the Call itself is really causing much of the overhead, but you
are right that I should move it to one level. The other idea about using the
range.replace doesn't seem to apply here.

(In addition, I admit that part of my motivation is to learn about macro
programming specifically within Excel, since my coworkers already use it a
lot (though none of them are into regular expressions).)

For now, I'll just note the mistake and fix for the problem I commented on
last time:

This is the incorrect code:
' remove fronts of image tags, but save the last word
aRegExp.Pattern = "<img src ?= ?\./[\w+/]*(\w+)"
theLine = aRegExp.Replace(theLine, "\1")


This is correct:
' remove fronts of image tags, but save the last word
aRegExp.Pattern = "<img src ?= ?\./(\w+/)*"
theLine = aRegExp.Replace(theLine, "")


Typically dumb typo with typical fatal results in a RegExp. Within a
bracketed class of course the \w picked up the rest of the string, whether
or not it ended in the slash that I intended to require. By ending the match
after the last slash, I didn't have to worry about explicitly preserving the
last word.

I still don't understand why the \1 didn't work as expected, but just for
grins, I tried ye olde $1, which produced really weird results... This may
be a future problem, since I think that one of the later transformations is
not just a simple deletion, but will require submatch reference in the
replacement... Right now I'm just massaging the line breaks, which is kind
of a nuisance since I still can't get the MultiLine parameter to work as I
had hoped. If that would only work, I should be able to slurp the next major
transformation in one step or two...

  #7   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.word.vba.beginners,microsoft.public.word.vba.general
external usenet poster
 
Posts: 6
Default Regular expressions for replacements in Excel?

Check some of the web tutorials (eg
http://word.mvps.org/FAQs/General/UsingWildcards.htm) on Word's Find and
Replace -- it really does support full regular expressions.

BTW, why do your posts use JIS (Japanese) encoding?




"Shannon Jacobs" wrote in message
...
Jezebel wrote:
If you are dealing with just one column of the spreadsheet, it might
be easier to copy it into a Word document, make the changes there,
and copy it back. Word's Find and Replace has the regular expression
functions built in (check the 'Use Wildcards' checkbox), so you can
see what's going on and you don't need to screw around with coding at
all.


The Word wildcards seem to be much more limited than real regular
expressions, so I had already eliminated that option. I haven't listed all
of my transformations here, but a number of them are somewhat trickier, so
that won't work. Your suggestion with the loop simplification is good, but

I
don't think the Call itself is really causing much of the overhead, but

you
are right that I should move it to one level. The other idea about using

the
range.replace doesn't seem to apply here.

(In addition, I admit that part of my motivation is to learn about macro
programming specifically within Excel, since my coworkers already use it a
lot (though none of them are into regular expressions).)

For now, I'll just note the mistake and fix for the problem I commented on
last time:

This is the incorrect code:
' remove fronts of image tags, but save the last word
aRegExp.Pattern = "<img src ?= ?\./[\w+/]*(\w+)"
theLine = aRegExp.Replace(theLine, "\1")


This is correct:
' remove fronts of image tags, but save the last word
aRegExp.Pattern = "<img src ?= ?\./(\w+/)*"
theLine = aRegExp.Replace(theLine, "")


Typically dumb typo with typical fatal results in a RegExp. Within a
bracketed class of course the \w picked up the rest of the string, whether
or not it ended in the slash that I intended to require. By ending the

match
after the last slash, I didn't have to worry about explicitly preserving

the
last word.

I still don't understand why the \1 didn't work as expected, but just for
grins, I tried ye olde $1, which produced really weird results... This may
be a future problem, since I think that one of the later transformations

is
not just a simple deletion, but will require submatch reference in the
replacement... Right now I'm just massaging the line breaks, which is kind
of a nuisance since I still can't get the MultiLine parameter to work as I
had hoped. If that would only work, I should be able to slurp the next

major
transformation in one step or two...



  #8   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.word.vba.beginners,microsoft.public.word.vba.general
external usenet poster
 
Posts: 5
Default Regular expressions for replacements in Excel?

As regards the Word regular expressions, I'm afraid the answer is "No, they
are not 'full' regular expressions." For example, my very first simple
transformation (expressed as "(height|width) ?=? ?""?\d+%?""? ?" using VB)
cannot be expressed in that system. At least I could find no way to do so,
and the URL you provided also shows no way to do the simple string
alternative. Here is another important bit of evidence from the URL you
provided:

"If you are using VBA, you might want to look at the RegExp object (from
VBScript, which you can include in your macro projects, and which offers
some features not included in Word wildcards)."

I'm never absolutely sure in dealing with Microsoft, but at this point I'm
pretty well convinced that the situation is that there are four "levels" of
search and replace available within Word: (1) literal, (2) very basic
wildcards with a few special notations, (3) extended wildcards with some
regexp notation, and (4) full regular expressions (which can only be
accessed via VBScript). (Actually, there also appear to be two levels of the
"full" regular expressions.) The advantage of being Microsoft is that you
can announce that all of these are now de facto standards.

Within Excel, I haven't found access to levels (2) or (3), but that's moot,
since they can't handle some of the transformations I'm already doing from
level (4). (Or is that really level (4)b?)

If I can just figure out some way to make the MultiLine option work, I think
I'll basically be done with the project. I feel like I need some minor trick
on the "Cells" object, but so far I haven't been able to find it. Hmm...
Have you ever made a *really* long string by just appending all of the
fields together?

And sorry about the Japanese encoding. Sometimes OE respects my "standing
request" to use a Western encoding, and sometimes it does not. It's pretty
annoying to have to check all the time trying to catch it when it slips
back. I have not been able to find the pattern of whatever is controlling
the encoding (apart from OE trying to use the original encoding when making
a reply, but that's actually a reasonable behavior and works properly (most
of the time)).

Jezebel wrote:
Check some of the web tutorials (eg
http://word.mvps.org/FAQs/General/UsingWildcards.htm) on Word's Find
and Replace -- it really does support full regular expressions.

BTW, why do your posts use JIS (Japanese) encoding?

<older stuff snipped

  #9   Report Post  
Posted to microsoft.public.excel.programming,microsoft.public.word.vba.beginners,microsoft.public.word.vba.general
external usenet poster
 
Posts: 5
Default Regular expressions for replacements in Excel?

Well, below is the current version of the code, but unless someone is
willing to help and can clarify how to make the MultiLine thing work, I'm
probably going to let it stand. This is about 5 times faster than the
original version, but basically it's just inlining the subroutine call and
reducing the overhead for object creation. This will process the 2000 lines
in around 4 seconds.

By using a "Dim theText() as String" statement the system seemed to accept
the MultiLine assignment, but in reality only the first string from Cells
was copied into my array, and all of my experiments failed. I wasn't able to
find any directly applicable source code examples, and I couldn't twist any
of the examples I did find into working as expected... My current conclusion
is that there are very few people using the MultiLine capability, and none
of them are monitoring any of these newsgroups.

The other approach of using large strings seems implausible since there's a
32 KB size limit on the strings, and the total file is something over 200
KB. However, I may still be able to pursue that approach if I can devise
some criteria for lumping things into reasonably small packages...

Basically you can take the following as an example of how to use true
regular expressions within Excel via a VB macro:

Sub FastClean()

Dim startTime
startTime = Now

Dim I As Long
Dim lastI As Long
Dim S As String

Set aRegExp = CreateObject("vbscript.regexp")
With aRegExp
.Global = True
.IgnoreCase = True
' .MultiLine = True
End With

ActiveCell.SpecialCells(xlLastCell).Select
lastI = ActiveCell.Row

For I = 1 To lastI
S = Cells(I, 1)

'kill first three parameters
aRegExp.Pattern = "(height|width|border)=""?\d+%?""? ?"
S = aRegExp.Replace(S, "")

' Eight more patterns and replacements

Cells(I, 1) = S
Next

MsgBox "Finished in " + Format(Now - startTime, "ss.s") + " seconds."
End Sub

Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Get rid of with regular expressions Howdy Excel Discussion (Misc queries) 1 January 18th 10 07:42 PM
Regular expressions in VB FiluDlidu Excel Discussion (Misc queries) 4 March 21st 08 01:10 AM
Substring in excel? How about regular expressions? Samuel Excel Discussion (Misc queries) 8 May 22nd 06 04:43 PM
Regular expressions in Excel vigi98 Excel Discussion (Misc queries) 3 November 10th 05 04:40 PM
Regular Expressions in VBA & Excel including an interactive tool Tushar Mehta Excel Programming 0 December 30th 03 03:17 AM


All times are GMT +1. The time now is 02:27 AM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"