View Single Post
  #25   Report Post  
Posted to microsoft.public.excel.worksheet.functions
Ron Rosenfeld Ron Rosenfeld is offline
external usenet poster
 
Posts: 5,651
Default With a Function, how can I get rid of formating codes like <di

On Mon, 1 Jun 2009 18:20:02 -0700, AFSSkier
wrote:

Ron,

You're getting closer. However, the & HTML codes need to be replaced, not
stripped.

&(amper)nbsp = " " (space)
&(amper)amp = & (and)

--
Thanks, Kevin


Kevin,

That can be done. I hope you have a typo in your information, though, because
you have substituted the word "and" for a "text" ampersand in your original
data. In certain types of coding, the "text" ampersand would be an incorrect
usage.

Also, your comments interspersed with the text string to be processed makes
things a bit more confusing to me than might ordinarily have been the case, so
if this doesn't work, you may need to provide more accurate input and desired
outputs, with comments outside of the string to be processed.

In any event, one way is to first process the string to get rid of the <html
codes and all of the & HTML codes except for &amp and &nbsp.

Then we do replacements on those two codes.

So:

===========================
Option Explicit
Function StripFormat(S As String) As String
Dim sTemp As String
Dim re As Object
Set re = CreateObject("vbscript.regexp")
re.Global = True
re.Pattern = "<[^<]+|[\r\n]|(&\s*(amp|nbsp)\s*;)|&[^;&]+;"
sTemp = re.Replace(S, "$1")
re.Pattern = "&\s*amp\s*;"
sTemp = re.Replace(sTemp, "&")
re.Pattern = "&\s*nbsp\s*;"
StripFormat = re.Replace(sTemp, " ")
End Function
===========================

In your examples, you had <space's included within the ampersand HTML coding.
If you can be certain that there will never be any spaces in those locations,
the UDF could be made a bit more efficient, by using the VBA replace method
instead of the Regular Expression replace method.
--ron