View Single Post
  #6   Report Post  
Posted to microsoft.public.excel.programming
Ron Rosenfeld Ron Rosenfeld is offline
external usenet poster
 
Posts: 5,651
Default Deleting non-alphanumeric characters

On Tue, 29 Apr 2008 07:24:31 -0700 (PDT), Mark
wrote:

Hello, I have code that extracts certain things from a text file
containing source code for a web site. One of the things I am
extracting is the web page name. For some reason it extracts it with
a LOT of carriage returns. Like 7-8 of those rectangular characters.
I have in my code the following:

Cells.Replace What:=Chr(13), Replacement:="", LookAt:=xlPart,
SearchOrder:=xlByRows, MatchCase:=False
Cells.Replace What:=Chr(10), Replacement:="", LookAt:=xlPart,
SearchOrde:=xlByRows, MatchCase:=False

But it only seem to delete like the first and last 2 of the
characters...I can't figure out how to get rid of them. I figured if
I also put something in the code along the lines of "If not
alphanumeric then delete character (not entire cell)"

Any suggestions??
Thanks!


Is the text file coded in Unicode? If *not*, then you could do something like:

=======================
Dim c As Range
Dim re As Object
Set re = CreateObject("vbscript.regexp")
re.IgnoreCase = True
re.Global = True
re.Pattern = "[^ -~]+"
For Each c In Selection
c.Value = re.Replace(c.Value, "")
Next c
=========================

The regular expression selects any character that is not in the class (range)
of the <space character (ASCII code 32) to the tilde (ASCII code 126) and
replaces it with nothing.
--ron