Posted to microsoft.public.excel.misc
Ron Rosenfeld
How to get a numbered list of unique words in a column?

On Tue, 23 Jun 2009 13:56:01 -0700, J741 wrote:

> Thanks Ron. That worked.


Glad to hear it. Thanks for the feedback.


> Now, how can I refine this to ignore words that are smaller than 3 letters
> in length? Words like 'and', 'the', 'not', etc.


I would do that work in the StripWord function. That's where we clean up each
word, so it is also an easy place to test it. If StripWord returns an empty
string, the calling routine already ignores that word.

So, for example, to eliminate words that are 3 or fewer characters in length:

======================
Private Function StripWord(s As String) As String
    Dim re As Object
    Set re = CreateObject("vbscript.regexp")
    re.Global = True
    ' allow only letters, digits, slashes and hyphens
    re.Pattern = "[^-/A-Za-z0-9]"
    StripWord = re.Replace(s, "")
    ' eliminate words with length of three or less
    If Len(StripWord) <= 3 Then StripWord = ""
    Set re = Nothing
End Function
=======================

Other modifications, such as rejecting specific unacceptable words, would also
be simple to make here.
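
As a rough sketch of one such modification, the variant below also rejects
words that appear in a short stop list. The name StripWordFiltered and the
contents of STOP_WORDS are hypothetical, chosen only for illustration; they are
not part of the original routine, and the list would need to be adapted to
your own data.

```vba
' Sketch only: StripWordFiltered and STOP_WORDS are hypothetical names
' used for illustration; adjust the list to suit your data.
Private Function StripWordFiltered(s As String) As String
    ' Stop words are wrapped in "|" so only whole words can match
    Const STOP_WORDS As String = "|about|from|have|that|this|with|"
    Dim re As Object
    Dim w As String

    Set re = CreateObject("vbscript.regexp")
    re.Global = True
    ' allow only letters, digits, slashes and hyphens
    re.Pattern = "[^-/A-Za-z0-9]"
    w = re.Replace(s, "")
    Set re = Nothing

    ' eliminate words with length of three or less
    If Len(w) <= 3 Then w = ""

    ' eliminate words found in the stop list (case-insensitive compare)
    If Len(w) > 0 Then
        If InStr(1, STOP_WORDS, "|" & w & "|", vbTextCompare) > 0 Then w = ""
    End If

    StripWordFiltered = w
End Function
```

Because the cleaned word can never contain "|" (the pattern strips it), the
delimiter check cannot match part of a longer stop word. As with the length
test, a rejected word comes back as an empty string, which the calling routine
already skips.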
--ron