View Single Post
  #7   Report Post  
Posted to microsoft.public.excel.misc
Otto Moehrbach[_2_] Otto Moehrbach[_2_] is offline
external usenet poster
 
Posts: 1,071
Default How to get a numbered list of unique words in a column?

James
I forgot to mention that I also assumed that the 2 columns that have the
long cells in question are columns A & B. Change that in the code to match
your actual data layout. Otto
"Otto Moehrbach" wrote in message
...
James

Here's some code that does what you want, I think. I wrote it
in several macros for ease of development. I assumed that the sheet that
you start with is named "Main". Change that in the code as needed. I
also assumed that you have a blank sheet named "Utility". The code first
clears the Utility sheet, then builds everything in that sheet. The end
product, in the "Utility" sheet is one or more columns consisting of all
the words, including all repeat words. Then there is one column with a
header of "Unique Words" and that's what is in that column. The last
column has a number for each unique word and that number is the number of
times that that word appears in all the words. I ran the code and it
works for me. The macro that you want to run is "GetList". All the other
macros run from this one macro. Place all you see below in a regular
module

If you need more help in running this, send me an email and
I'll send you the file in which I developed this code. My email address
is . Remove the "extra" from this address.
HTH Otto

Option Explicit
Dim rColA As Range
Dim rColB As Range
Dim rTheRng As Range
Dim i As Range
Dim Dest As Range
Dim Str As String



Sub GetList()
Set rColA = Range("A2", Range("A" & Rows.Count).End(xlUp))
Set rColB = Range("B2", Range("B" & Rows.Count).End(xlUp))
Set rTheRng = Union(rColA, rColB)
Call GetAllWords
Call GetUniqueWords
End Sub



Sub GetAllWords()
Dim TheArray() As String
Dim c As Long
Sheets("Main").Select
With Sheets("Utility")
.Cells.Clear
Set Dest = .Range("A1")
For Each i In rTheRng
If IsEmpty(i.Value) Then GoTo NextCell
Call CleanEntry
TheArray = Split(Str, " ")
Dest.Resize(1 + UBound(TheArray)) =
WorksheetFunction.Transpose(TheArray)
Set Dest = .Cells(Rows.Count, Dest.Column).End(xlUp).Offset(1)
If Dest.Row 60000 Then _
Set Dest = .Cells(1, Dest.Column + 1)
NextCell:
Next i
End With
End Sub



Sub CleanEntry()
Str = i.Value
Str = Application.Trim(Str)
Str = Replace(Str, ".", "")
Str = Replace(Str, ",", "")
Str = Replace(Str, "?", "")
Str = Replace(Str, "!", "")
End Sub

Sub GetUniqueWords()
Dim LastColumn As Long
Dim UW As Range
Sheets("Utility").Select
Set rTheRng = ActiveSheet.UsedRange
LastColumn = rTheRng(rTheRng.Count).Column + 1
Rows("1:1").Insert Shift:=xlDown
Range("A1") = "All The Words"
Cells(1, LastColumn) = "Unique Words"
Cells(1, LastColumn + 1) = "Qty"
Set Dest = Cells(2, LastColumn)
Set UW = Dest
For Each i In rTheRng
If UW.Find(What:=i.Value, LookAt:=xlWhole) Is Nothing Then
Dest = i.Value
Dest.Offset(, 1) = Application.CountIf(rTheRng, i.Value)
Set Dest = Dest.Offset(1)
Set UW = Range(Cells(2, Dest.Column), Cells(Rows.Count,
Dest.Column).End(xlUp).Offset(1))
End If
Next i
End Sub


"J741" wrote in message
...
Hi Otto.

1. The cells have a _lot_ of words, but I don't think it's any more than
a
few hundred per cell.

2. The range of cells containing the words is C2:D5000

3. I am using Excel 2003 SP3


- James.



"Otto Moehrbach" wrote:

James
What version of Excel are you using? Also, do you have any idea of
the
maximum number of words in these cells? Like, no more than 30. Or no
more
than 300. Or no more than 3000. What columns are these 2 columns, and
what
row has the first entry in these 2 columns. HTH Otto
"J741" wrote in message
...
Hi.

I have a spreadsheet with almost 5000 rows with about 15 columns of
data,
2
columns of which contain sentences, phrases, or paragraphs of comments
entered by users.

What I need to have as an end result is a list of all unique words
found
in
those two columns, along with the number of occurrences of each word.
I
will
then use this to further analyze the data in the spreadsheet.

Right now, I do this manually and it takes a VERY long time for me to
do
so
(over 200 man-hours). So, I really need to automate this somehow.

The problem is, I have no idea how to begin, because the words will
not be
by themselves in their own cell (so I cant use the auto filter
functions,
pivot tables, or anything else I can think of), but will be part of
groups
of
words or symbols within multiple cells.

I think this will need to be done programmatically with code, scripts,
macros, or some other method with which I am not familiar, but I am
just
not
sure.

Can anyone help me with this? Can anyone point me in the right
direction?
Is there anything already built in to excel to do this?

- James.