Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 157
Default Genarating count of unique words in a cell or cells

Hi,

Suppose in cell A1 I have a text --- "I had some bread in morning. I had
some eggs at night."

Is it possible to programmatically get a count of unique words in the above
string.
Like in column B the unique words are listed one word in each row in the
following manner..B1 -- "I"
B2 -- "I
B2 -- "had
B3 -- "some"
B4 -- "bread"
B5 -- "in"
B6 -- "morning"
B7 -- "eggs"
B8 -- "at"
B9 -- "night"

Actually I can use excel's text to columns feature ( using space as a
delimiter) but automating is a problem. Like I have 500 or so rows of data
and out of those rows I want to get count of unique words within those 500
rows ( basically a consolidated unique list..). Now, each row would be
having different number of words so, programmatically how would it be
implemented.

Please guide me if possible.

Regards,
Hari
India



  #2   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 1,588
Default Genarating count of unique words in a cell or cells

You could get the text from the cell and use split() to create an
array of words. Exactly how you did this would depend on what you'd
consider as a "word".

You might want to use replace() on the string before splitting it to
replace newlines, tabs etc with spaces

'eg (untested)
dim sTxt as string, arrWords
sTxt=activesheet.range("A1").value
sTxt=replace(sTxt,vbTab," ")
sTxt=replace(sTxt,vbcrlf," ")
arrWords=split(sTxt," ")

Once you have the array you could loop through it and count the
occurences of the words: one way might be to use a dictionary object
where the word is the key and the value would be incremented each time
the word crops up. Then loop though the dictionary and write the
contents to column B.

tim






"Hari" wrote in message
...
Hi,

Suppose in cell A1 I have a text --- "I had some bread in morning. I
had
some eggs at night."

Is it possible to programmatically get a count of unique words in
the above
string.
Like in column B the unique words are listed one word in each row in
the
following manner..B1 -- "I"
B2 -- "I
B2 -- "had
B3 -- "some"
B4 -- "bread"
B5 -- "in"
B6 -- "morning"
B7 -- "eggs"
B8 -- "at"
B9 -- "night"

Actually I can use excel's text to columns feature ( using space as
a
delimiter) but automating is a problem. Like I have 500 or so rows
of data
and out of those rows I want to get count of unique words within
those 500
rows ( basically a consolidated unique list..). Now, each row would
be
having different number of words so, programmatically how would it
be
implemented.

Please guide me if possible.

Regards,
Hari
India





  #3   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 1,588
Default Genarating count of unique words in a cell or cells

Try this. Requires a reference to the Scripting runtime library.

Tim.


Sub tester()
ActiveSheet.Range("B1").Value =
CountWords(ActiveSheet.Range("A1").Value)
End Sub

Function CountWords(sText As String) As String

Dim x As Integer
Dim arrWords As Variant
Dim arrReplace As Variant
Dim oDict As New Scripting.Dictionary
Dim tmp As String
Dim k As Variant
Dim sReturn As String

arrReplace = Array(vbTab, ":", ";", ".", Chr(10), Chr(13))
For x = LBound(arrReplace) To UBound(arrReplace)
sText = Replace(sText, arrReplace(x), " ")
Next x

arrWords = Split(sText, " ")
oDict.CompareMode = TextCompare 'case-insensitive
For x = LBound(arrWords) To UBound(arrWords)
tmp = Trim(arrWords(x))
If tmp < "" Then
oDict(tmp) = IIf(oDict.Exists(tmp), oDict(tmp) + 1, 1)
End If
Next x

sReturn = ""
For Each k In oDict.Keys
sReturn = sReturn & "{" & k & "} : " & oDict(k) & Chr(10)
Next k

CountWords = sReturn
End Function

Tim.


"Hari" wrote in message
...
Hi,

Suppose in cell A1 I have a text --- "I had some bread in morning.
I had
some eggs at night."

Is it possible to programmatically get a count of unique words in
the above
string.
Like in column B the unique words are listed one word in each row
in the
following manner..B1 -- "I"
B2 -- "I
B2 -- "had
B3 -- "some"
B4 -- "bread"
B5 -- "in"
B6 -- "morning"
B7 -- "eggs"
B8 -- "at"
B9 -- "night"

Actually I can use excel's text to columns feature ( using space as
a
delimiter) but automating is a problem. Like I have 500 or so rows
of data
and out of those rows I want to get count of unique words within
those 500
rows ( basically a consolidated unique list..). Now, each row would
be
having different number of words so, programmatically how would it
be
implemented.

Please guide me if possible.

Regards,
Hari
India







  #4   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 5,600
Default Genarating count of unique words in a cell or cells

Hi Hari,

You could use a worksheet formula:

=SUMPRODUCT((ref<"")*(LEN(TRIM(ref))+1-LEN(SUBSTITUTE(ref," ",""))))

Regards
Peter T

"Hari" wrote in message
...
Hi,

Suppose in cell A1 I have a text --- "I had some bread in morning. I had
some eggs at night."

Is it possible to programmatically get a count of unique words in the

above
string.
Like in column B the unique words are listed one word in each row in the
following manner..B1 -- "I"
B2 -- "I
B2 -- "had
B3 -- "some"
B4 -- "bread"
B5 -- "in"
B6 -- "morning"
B7 -- "eggs"
B8 -- "at"
B9 -- "night"

Actually I can use excel's text to columns feature ( using space as a
delimiter) but automating is a problem. Like I have 500 or so rows of data
and out of those rows I want to get count of unique words within those 500
rows ( basically a consolidated unique list..). Now, each row would be
having different number of words so, programmatically how would it be
implemented.

Please guide me if possible.

Regards,
Hari
India





  #5   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 157
Default Genarating count of unique words in a cell or cells

Hi Tim,

(Im sorry, my computer has some bios/date problem, so Im Re-Posting the
below message after correcting the date/time setting)

Thanks a lot for your code.

I have a small change if possible.

Presently if In A1 I have --- I have measles. I also have TB.
and if in A2 I have --- I want to go to Paris in order to cure my TB.
Then in B1I get the below result

{I} : 2
{have} : 2
{measles} : 1
{also} : 1
{TB} : 1

and for getting the below result in B2 (by running your sub again by
changing the address)

{I} : 1
{want} : 1
{to} : 3
{go} : 1
{Paris} : 1
{in} : 1
{order} : 1
{cure} : 1
{my} : 1
{TB} : 1


I want to know if it is possible to ..

a) To have a consolidated count of words in both the rows. That is the above
result to combine the words in both rows 1 and 2 for giving a total result.
b) Is it is possible to have the count of words listed in different rows
rather than within a single cell (as per help you have used a chr(10) for
linefeed character). Accordingly have the counts in Column C for the
corresponding unique word.

For example for the above example with 2 rows, I visualise the following
result :-

Column -- A
B C
Row 1 I have measles. I also have TB.
I 3
2 I want to go to Paris in order to cure my TB.
Have 2
3
Measles 2
4
also 1
5
TB 2
6
want 1
7
to 3
8
go 1
9
Paris 1
10
in 1
11
order 1
12
cure 1
13
my 1


Actually why am requesting for the above way is I want to analyse count of
unique words across 500 rows or so and a consolidated list would help me a
lot.
-
Regards,
Hari
India

"Tim Williams" <saxifrax@pacbell*dot*net wrote in message
...
Try this. Requires a reference to the Scripting runtime library.

Tim.


Sub tester()
ActiveSheet.Range("B1").Value =
CountWords(ActiveSheet.Range("A1").Value)
End Sub

Function CountWords(sText As String) As String

Dim x As Integer
Dim arrWords As Variant
Dim arrReplace As Variant
Dim oDict As New Scripting.Dictionary
Dim tmp As String
Dim k As Variant
Dim sReturn As String

arrReplace = Array(vbTab, ":", ";", ".", Chr(10), Chr(13))
For x = LBound(arrReplace) To UBound(arrReplace)
sText = Replace(sText, arrReplace(x), " ")
Next x

arrWords = Split(sText, " ")
oDict.CompareMode = TextCompare 'case-insensitive
For x = LBound(arrWords) To UBound(arrWords)
tmp = Trim(arrWords(x))
If tmp < "" Then
oDict(tmp) = IIf(oDict.Exists(tmp), oDict(tmp) + 1, 1)
End If
Next x

sReturn = ""
For Each k In oDict.Keys
sReturn = sReturn & "{" & k & "} : " & oDict(k) & Chr(10)
Next k

CountWords = sReturn
End Function

Tim.


"Hari" wrote in message
...
Hi,

Suppose in cell A1 I have a text --- "I had some bread in morning.
I had
some eggs at night."

Is it possible to programmatically get a count of unique words in
the above
string.
Like in column B the unique words are listed one word in each row
in the
following manner..B1 -- "I"
B2 -- "I
B2 -- "had
B3 -- "some"
B4 -- "bread"
B5 -- "in"
B6 -- "morning"
B7 -- "eggs"
B8 -- "at"
B9 -- "night"

Actually I can use excel's text to columns feature ( using space as
a
delimiter) but automating is a problem. Like I have 500 or so rows
of data
and out of those rows I want to get count of unique words within
those 500
rows ( basically a consolidated unique list..). Now, each row would
be
having different number of words so, programmatically how would it
be
implemented.

Please guide me if possible.

Regards,
Hari
India










--
Regards,
Hari
India
"Tim Williams" <saxifrax@pacbell*dot*net wrote in message
...
Try this. Requires a reference to the Scripting runtime library.

Tim.


Sub tester()
ActiveSheet.Range("B1").Value =
CountWords(ActiveSheet.Range("A1").Value)
End Sub

Function CountWords(sText As String) As String

Dim x As Integer
Dim arrWords As Variant
Dim arrReplace As Variant
Dim oDict As New Scripting.Dictionary
Dim tmp As String
Dim k As Variant
Dim sReturn As String

arrReplace = Array(vbTab, ":", ";", ".", Chr(10), Chr(13))
For x = LBound(arrReplace) To UBound(arrReplace)
sText = Replace(sText, arrReplace(x), " ")
Next x

arrWords = Split(sText, " ")
oDict.CompareMode = TextCompare 'case-insensitive
For x = LBound(arrWords) To UBound(arrWords)
tmp = Trim(arrWords(x))
If tmp < "" Then
oDict(tmp) = IIf(oDict.Exists(tmp), oDict(tmp) + 1, 1)
End If
Next x

sReturn = ""
For Each k In oDict.Keys
sReturn = sReturn & "{" & k & "} : " & oDict(k) & Chr(10)
Next k

CountWords = sReturn
End Function

Tim.


"Hari" wrote in message
...
Hi,

Suppose in cell A1 I have a text --- "I had some bread in morning.
I had
some eggs at night."

Is it possible to programmatically get a count of unique words in
the above
string.
Like in column B the unique words are listed one word in each row
in the
following manner..B1 -- "I"
B2 -- "I
B2 -- "had
B3 -- "some"
B4 -- "bread"
B5 -- "in"
B6 -- "morning"
B7 -- "eggs"
B8 -- "at"
B9 -- "night"

Actually I can use excel's text to columns feature ( using space as
a
delimiter) but automating is a problem. Like I have 500 or so rows
of data
and out of those rows I want to get count of unique words within
those 500
rows ( basically a consolidated unique list..). Now, each row would
be
having different number of words so, programmatically how would it
be
implemented.

Please guide me if possible.

Regards,
Hari
India











  #6   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 1,588
Default Genarating count of unique words in a cell or cells

Hari,

Try this - seems to work OK but you should test it before using.

Regards
Tim


Option Explicit

Sub tester()
AddWordCount ActiveSheet.Range("A1").Value
AddWordCount ActiveSheet.Range("A2").Value
End Sub

Sub AddWordCount(sText As String)

Const COL_WORDS As Integer = 2
Const COL_COUNTS As Integer = 3
Const ROW_START As Integer = 1
Const MAX_ROWS As Integer = 10000

Dim x As Integer
Dim arrWords As Variant
Dim arrReplace As Variant
Dim tmp As String
Dim lRow As Long
Dim lLastRow As Long
Dim rngSrch As Range, rngWord As Range

'find extent of current count
lLastRow = ActiveSheet.Cells(MAX_ROWS, COL_WORDS).End(xlUp).Row
If lLastRow = 0 Then lLastRow = 1
Set rngSrch = Range(ActiveSheet.Cells(ROW_START, COL_WORDS), _
ActiveSheet.Cells(lLastRow, COL_WORDS))

arrReplace = Array(vbTab, ":", ";", ".", ",", _
"""", Chr(10), Chr(13))
For x = LBound(arrReplace) To UBound(arrReplace)
sText = Replace(sText, arrReplace(x), " ")
Next x

arrWords = Split(sText, " ")

For x = LBound(arrWords) To UBound(arrWords)
tmp = Trim(arrWords(x))
If tmp < "" Then

On Error Resume Next
Set rngWord = rngSrch.Find(What:=tmp, MatchCase:=False)
On Error GoTo 0

If rngWord Is Nothing Then
lLastRow = lLastRow + 1
Set rngSrch = rngSrch.Resize(rngSrch.Rows.Count + 1, 1)
With ActiveSheet.Cells(lLastRow, COL_WORDS)
.Value = tmp
.Offset(0, 1).Value = 1
End With
Else
rngWord.Offset(0, 1).Value = rngWord.Offset(0, 1).Value +
1
End If

End If
Next x


End Sub

"Hari" wrote in message
...
Hi Tim,

(Im sorry, my computer has some bios/date problem, so Im Re-Posting
the
below message after correcting the date/time setting)

Thanks a lot for your code.

I have a small change if possible.

Presently if In A1 I have --- I have measles. I also have TB.
and if in A2 I have --- I want to go to Paris in order to cure my
TB.
Then in B1I get the below result

{I} : 2
{have} : 2
{measles} : 1
{also} : 1
{TB} : 1

and for getting the below result in B2 (by running your sub again by
changing the address)

{I} : 1
{want} : 1
{to} : 3
{go} : 1
{Paris} : 1
{in} : 1
{order} : 1
{cure} : 1
{my} : 1
{TB} : 1




  #7   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 157
Default Genarating count of unique words in a cell or cells

Hi Tim,

Thanx a ton for posting the codes. Just to tell you of why I needed it, I
analyse Market Research data and I needed count of unique words to analyse
open ended responses.
For example I am studying/tracking the usage of Software Development tools.
I ran your code on the following 8 responses (8 rows of data).

hot dog pro
As 400 RPG
adobe photo workshop
microfocus emulators
html
ibm web sphere
vx works
powerhouse

The results Im getting is :-

hot 1
dog 1
pro 1
As 1
400 1
RPG 1
adobe 1
photo 1
workshop 2
microfocus 1
emulators 1
html 1
ibm 1
web 1
sphere 1
vx 1
powerhouse 1

Whats happening is that the SUB is treating "works" which is in the 7th row
same as "workshop" which is in the 3rd row. Consequently the count of
"workshop" is being shown as 2 while "works" doesnt appear in the result.
Please tell me whether it would be possible to modify the code in order to
get the count for "workshop" as 1 and count of "works" as 1.

(Just in case if u have a doubt on the futility of analysing, in a 'faulty'
manner, by counting unique words where separating doesnt make sense - Like
"adobe photo workshop" being one tool and separating these 3 - I plan to
tackle that by running your sub first to get the initial stage 'frequency'
and then I will bind those individual responses "adobe photo workshop" in to
"AdobePhotoWorkshop". I will not bind those rows where 2 software tools are
mentioned in a single row.)

Regards,
Hari
India

"Tim Williams" <saxifrax@pacbell*dot*net wrote in message
...
Hari,

Try this - seems to work OK but you should test it before using.

Regards
Tim


Option Explicit

Sub tester()
AddWordCount ActiveSheet.Range("A1").Value
AddWordCount ActiveSheet.Range("A2").Value
End Sub

Sub AddWordCount(sText As String)

Const COL_WORDS As Integer = 2
Const COL_COUNTS As Integer = 3
Const ROW_START As Integer = 1
Const MAX_ROWS As Integer = 10000

Dim x As Integer
Dim arrWords As Variant
Dim arrReplace As Variant
Dim tmp As String
Dim lRow As Long
Dim lLastRow As Long
Dim rngSrch As Range, rngWord As Range

'find extent of current count
lLastRow = ActiveSheet.Cells(MAX_ROWS, COL_WORDS).End(xlUp).Row
If lLastRow = 0 Then lLastRow = 1
Set rngSrch = Range(ActiveSheet.Cells(ROW_START, COL_WORDS), _
ActiveSheet.Cells(lLastRow, COL_WORDS))

arrReplace = Array(vbTab, ":", ";", ".", ",", _
"""", Chr(10), Chr(13))
For x = LBound(arrReplace) To UBound(arrReplace)
sText = Replace(sText, arrReplace(x), " ")
Next x

arrWords = Split(sText, " ")

For x = LBound(arrWords) To UBound(arrWords)
tmp = Trim(arrWords(x))
If tmp < "" Then

On Error Resume Next
Set rngWord = rngSrch.Find(What:=tmp, MatchCase:=False)
On Error GoTo 0

If rngWord Is Nothing Then
lLastRow = lLastRow + 1
Set rngSrch = rngSrch.Resize(rngSrch.Rows.Count + 1, 1)
With ActiveSheet.Cells(lLastRow, COL_WORDS)
.Value = tmp
.Offset(0, 1).Value = 1
End With
Else
rngWord.Offset(0, 1).Value = rngWord.Offset(0, 1).Value +
1
End If

End If
Next x


End Sub

"Hari" wrote in message
...
Hi Tim,

(Im sorry, my computer has some bios/date problem, so Im Re-Posting
the
below message after correcting the date/time setting)

Thanks a lot for your code.

I have a small change if possible.

Presently if In A1 I have --- I have measles. I also have TB.
and if in A2 I have --- I want to go to Paris in order to cure my
TB.
Then in B1I get the below result

{I} : 2
{have} : 2
{measles} : 1
{also} : 1
{TB} : 1

and for getting the below result in B2 (by running your sub again by
changing the address)

{I} : 1
{want} : 1
{to} : 3
{go} : 1
{Paris} : 1
{in} : 1
{order} : 1
{cure} : 1
{my} : 1
{TB} : 1






  #8   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 1,588
Default Genarating count of unique words in a cell or cells

Hari,

Modify the Find() line:

Set rngWord = rngSrch.Find(What:=tmp, MatchCase:=False,
lookat:=xlWhole)

xlWhole will match the complete cell contents and not just a
substring.

Tim.



"Hari" < wrote in message
...
Hi Tim,

Thanx a ton for posting the codes. Just to tell you of why I needed
it, I
analyse Market Research data and I needed count of unique words to
analyse
open ended responses.
For example I am studying/tracking the usage of Software Development
tools.
I ran your code on the following 8 responses (8 rows of data).

hot dog pro
As 400 RPG
adobe photo workshop
microfocus emulators
html
ibm web sphere
vx works
powerhouse

The results Im getting is :-

hot 1
dog 1
pro 1
As 1
400 1
RPG 1
adobe 1
photo 1
workshop 2
microfocus 1
emulators 1
html 1
ibm 1
web 1
sphere 1
vx 1
powerhouse 1

Whats happening is that the SUB is treating "works" which is in the
7th row
same as "workshop" which is in the 3rd row. Consequently the count
of
"workshop" is being shown as 2 while "works" doesnt appear in the
result.
Please tell me whether it would be possible to modify the code in
order to
get the count for "workshop" as 1 and count of "works" as 1.



  #9   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 157
Default Genarating count of unique words in a cell or cells

Hi Tim,

Thanks a lot for persisting with me. It works great now.

Just wanted to SHARE this with you.

Presently in the column C am getting count of a particular unique word by
summing each instance of its occurence in the Column A.
In Column D I wanted to see the count of unique word by summing the number
of rows within column A that it appears.

What I mean is if in cell A1, A2 the response is :-
A1 - "VC++"
A2 - "I use VC++, Cobol, Fortran. But mainly I use VC++"
A3 - "I use Basic"

Then the answers I get in Column C is

VC++ 3
I 3
use 3
Cobol 1
Fortran 1
But 1
mainly 1
Basic 1


In column D I desired an output as given below.

VC++ 2
I 2
use 2
Cobol 1
Fortran 1
But 1
mainly 1
Basic 1


The difference between column C and D is that if a unique word is found more
than once within the same row then it is counted only once. Like in A2 "I",
"VC++", "Use" , appears twice within the same row so its WEIGHTAGE in the
global count is given only once. So, In effect the count of unique words is
done by summing the number of rows within column A that it appears.

For accomplishing that in cell D1 I wrote the following formula (array
formula)

{=SUM(IF(ISERROR(FIND(B2,$A$1:$A$8)),0,1))} and thankfully I am able get the
result.

I think by perusing newsgroup posts am learning (albeit slowly) to control
array formulas a little better.

Thanks to all you folks who support us.
--
Regards,
Hari
India


  #10   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 5,600
Default Genarating count of unique words in a cell or cells

You could use a worksheet formula:

=SUMPRODUCT((ref<"")*(LEN(TRIM(ref))+1-LEN(SUBSTITUTE(ref," ",""))))

Regards
Peter T


Ignore - I totally misread "unique"
Peter T




  #11   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 157
Default Genarating count of unique words in a cell or cells

Hi Peter,

No problem.

The fact that you people are out there helping problem posers ,gladdens me a
lot.

--
Thanks a lot,
Hari
India


  #12   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 5,600
Default Genarating count of unique words in a cell or cells

Hi Hari,

Thanks for that, maybe it will come in handy as a "total" word count!

Regards,
Peter T

Hi Peter,

No problem.

The fact that you people are out there helping problem posers ,gladdens me

a
lot.

--
Thanks a lot,
Hari
India



  #13   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 1
Default Genarating count of unique words in a cell or cells

Just wanted to let you know that this valuable post has helped out myself as
well. We're experimenting with it right now. What would you recommend to
exclude some common words such as "a", "the", "etc" or how would you build
such a list for it to bounce against?

"Tim Williams" wrote:

Hari,

Modify the Find() line:

Set rngWord = rngSrch.Find(What:=tmp, MatchCase:=False,
lookat:=xlWhole)

xlWhole will match the complete cell contents and not just a
substring.

Tim.



"Hari" < wrote in message
...
Hi Tim,

Thanx a ton for posting the codes. Just to tell you of why I needed
it, I
analyse Market Research data and I needed count of unique words to
analyse
open ended responses.
For example I am studying/tracking the usage of Software Development
tools.
I ran your code on the following 8 responses (8 rows of data).

hot dog pro
As 400 RPG
adobe photo workshop
microfocus emulators
html
ibm web sphere
vx works
powerhouse

The results Im getting is :-

hot 1
dog 1
pro 1
As 1
400 1
RPG 1
adobe 1
photo 1
workshop 2
microfocus 1
emulators 1
html 1
ibm 1
web 1
sphere 1
vx 1
powerhouse 1

Whats happening is that the SUB is treating "works" which is in the
7th row
same as "workshop" which is in the 3rd row. Consequently the count
of
"workshop" is being shown as 2 while "works" doesnt appear in the
result.
Please tell me whether it would be possible to modify the code in
order to
get the count for "workshop" as 1 and count of "works" as 1.




  #14   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 63
Default Genarating count of unique words in a cell or cells

Hi Jeff,

What would you recommend to exclude some common words such as "a", "the",
"etc" or how would you build

such a list for it to bounce against?

I am also facing a similar problem. Since, this wasnt related to excel I
didnt broach this topic before. As far as your question of excluding common
words is concerned, that is easy. Because Tim W has provided -- arrReplace =
Array(vbTab, ":", ";", ".", ",", """", Chr(10), Chr(13)) --so if we want to
remove articles, nouns, pronouns etc one could just add to the above list.
But for me the problem is from where to get an authoritative list of nouns,
pronouns, etc. (in soft copy format) which I could just add to the
arrReplace list. I searched Google (but not too hard) and couldnt get one.

To add another dimension to it, (though Im not sure whether it would
affect/matter in your case) if a word"beautiful" and "beauty" appears in the
target array, then for me both are one and the same. So, how to instruct the
algorithm that consider various parts of the speech as the same. I do have a
solution in the sense that in a sheet (or in a Access database - though I
dont know access) have all the words (with their parts of speech) and give
words with similar parts of speech as same code. But again just like the
previous case I would have to get a Authoritative list of all words in the
english language.(words in common usage - not the esoteric or a filed
specific word). Is such a list available over web?
--
Thanks a lot,
Hari
India

"Jeff Saathoff" wrote in message
...
Just wanted to let you know that this valuable post has helped out myself
as
well. We're experimenting with it right now. What would you recommend to
exclude some common words such as "a", "the", "etc" or how would you build
such a list for it to bounce against?

"Tim Williams" wrote:

Hari,

Modify the Find() line:

Set rngWord = rngSrch.Find(What:=tmp, MatchCase:=False,
lookat:=xlWhole)

xlWhole will match the complete cell contents and not just a
substring.

Tim.



"Hari" < wrote in message
...
Hi Tim,

Thanx a ton for posting the codes. Just to tell you of why I needed
it, I
analyse Market Research data and I needed count of unique words to
analyse
open ended responses.
For example I am studying/tracking the usage of Software Development
tools.
I ran your code on the following 8 responses (8 rows of data).

hot dog pro
As 400 RPG
adobe photo workshop
microfocus emulators
html
ibm web sphere
vx works
powerhouse

The results Im getting is :-

hot 1
dog 1
pro 1
As 1
400 1
RPG 1
adobe 1
photo 1
workshop 2
microfocus 1
emulators 1
html 1
ibm 1
web 1
sphere 1
vx 1
powerhouse 1

Whats happening is that the SUB is treating "works" which is in the
7th row
same as "workshop" which is in the 3rd row. Consequently the count
of
"workshop" is being shown as 2 while "works" doesnt appear in the
result.
Please tell me whether it would be possible to modify the code in
order to
get the count for "workshop" as 1 and count of "works" as 1.






  #15   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 863
Default Genarating count of unique words in a cell or cells

if a word"beautiful" and "beauty" appears in the
target array, then for me both are one and the same


You may not have much company in that opinion. Beautiful is an adjective,
beauty is a noun. You can't take a sentence and that uses one of these words
and replace it with the other and end up with something that is gramatically
correct.




  #16   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 63
Default Genarating count of unique words in a cell or cells

Hi Myrna,

Just to give you an idea, I analyse MR data.

So if a question is asked -- "Please describe this product" -- and if one
respondent says "It's very Beautiful" and if another respondent says "It's a
thing of beauty".
For me end meaning of both the sentences is same. So, before loading to
SPSS( statistics software) I would assign a code ( numerical equivalent to a
particular verbal response) which is equal for both the responses. (I would
like to perform analysis starting from simple like how many rated product
being a beauty etc. to more complex depending on client requirements).

So, what I meant was to have a dictionary (in soft format having nouns
adjectives of every word) against which I could check and programmatically
manipulate the same. For me in this case (programming has been fine because
of Tim W's help) but now Iam on another barrier which is to get a list of
all words in let's say English language.

Please note the above question was just a sample, a respondent could use
multiple "adjectives" or "nouns" to describe the product. Like for Apple
I-pod they may say "It's a thing of beauty, very cool, so many options...."
etc.

Thanks a lot,
Hari
India

"Myrna Larson" wrote in message
...
if a word"beautiful" and "beauty" appears in the
target array, then for me both are one and the same


You may not have much company in that opinion. Beautiful is an adjective,
beauty is a noun. You can't take a sentence and that uses one of these
words
and replace it with the other and end up with something that is
gramatically
correct.




  #17   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 863
Default Genarating count of unique words in a cell or cells

Hmmm.... I see your problem. I guess this is why questionnaires usually
consist of "multiple choice" questions rather than "essay" questions <g.

If you can get into a data file used by a dictionary program or spell-checker,
that might be a start.

To answer your question,
On Tue, 25 Jan 2005 20:21:21 +0530, "Hari Prasadh"
wrote:

Hi Myrna,

Just to give you an idea, I analyse MR data.

So if a question is asked -- "Please describe this product" -- and if one
respondent says "It's very Beautiful" and if another respondent says "It's a
thing of beauty".
For me end meaning of both the sentences is same. So, before loading to
SPSS( statistics software) I would assign a code ( numerical equivalent to a
particular verbal response) which is equal for both the responses. (I would
like to perform analysis starting from simple like how many rated product
being a beauty etc. to more complex depending on client requirements).

So, what I meant was to have a dictionary (in soft format having nouns
adjectives of every word) against which I could check and programmatically
manipulate the same. For me in this case (programming has been fine because
of Tim W's help) but now Iam on another barrier which is to get a list of
all words in let's say English language.

Please note the above question was just a sample, a respondent could use
multiple "adjectives" or "nouns" to describe the product. Like for Apple
I-pod they may say "It's a thing of beauty, very cool, so many options...."
etc.

Thanks a lot,
Hari
India

"Myrna Larson" wrote in message
.. .
if a word"beautiful" and "beauty" appears in the
target array, then for me both are one and the same


You may not have much company in that opinion. Beautiful is an adjective,
beauty is a noun. You can't take a sentence and that uses one of these
words
and replace it with the other and end up with something that is
gramatically
correct.




  #18   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 63
Default Genarating count of unique words in a cell or cells

Hi,

I guess this is why questionnaires usually
consist of "multiple choice" questions rather than "essay" questions <g.


You hit it right on the nail. If only all questionnares were Closed-end
Multiple choice questions, things would have been so much easier.
Why closed end? Because even in a multiple choice question most of the times
(a very substantial amount) we have one option called "Other" (which is open
ended kind of question as a respondent can mention anything under the sun
within the other option). And *Other* is again to be dealt in the same way
like an Essay question.

For ex. if the question is --" Which of the following reasons, if any,
describe why you did not purchase anything today? (Please check all that
apply)"

a) You didnt find a style you liked
b) You couldn find your size
c) A product didnt fit right
d) The product was too expensive
e) You werent planning to buy/ just browsing today
f) Other (Please specify below)

In the above case option f) is a candidate for Essay kind of manipulative
analysis (Please note, there were many more options between e) and f) but
couldnt type anymore. Above is just for clarifcation/demonstration).

A query. Any idea whether "data file" used by dictionary program or
spell-checker would be availbale somewhere in the web.


Thanks a lot,
Hari
India

"Myrna Larson" wrote in message
...
Hmmm.... I see your problem. I guess this is why questionnaires usually
consist of "multiple choice" questions rather than "essay" questions <g.

If you can get into a data file used by a dictionary program or
spell-checker,
that might be a start.

To answer your question,
On Tue, 25 Jan 2005 20:21:21 +0530, "Hari Prasadh"

wrote:

Hi Myrna,

Just to give you an idea, I analyse MR data.

So if a question is asked -- "Please describe this product" -- and if one
respondent says "It's very Beautiful" and if another respondent says "It's
a
thing of beauty".
For me end meaning of both the sentences is same. So, before loading to
SPSS( statistics software) I would assign a code ( numerical equivalent to
a
particular verbal response) which is equal for both the responses. (I
would
like to perform analysis starting from simple like how many rated product
being a beauty etc. to more complex depending on client requirements).

So, what I meant was to have a dictionary (in soft format having nouns
adjectives of every word) against which I could check and programmatically
manipulate the same. For me in this case (programming has been fine
because
of Tim W's help) but now Iam on another barrier which is to get a list of
all words in let's say English language.

Please note the above question was just a sample, a respondent could use
multiple "adjectives" or "nouns" to describe the product. Like for Apple
I-pod they may say "It's a thing of beauty, very cool, so many
options...."
etc.

Thanks a lot,
Hari
India

"Myrna Larson" wrote in message
. ..
if a word"beautiful" and "beauty" appears in the
target array, then for me both are one and the same

You may not have much company in that opinion. Beautiful is an
adjective,
beauty is a noun. You can't take a sentence and that uses one of these
words
and replace it with the other and end up with something that is
gramatically
correct.






  #19   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 63
Default Genarating count of unique words in a cell or cells

Hi Myrna,

Missed to add one thing.

The problem of soft-copy of a dictionary is one thing which is holding me up
in another related post of mine "Developing TEXT scrambler kind of FUNCTIONS
in Excel". Basically these 2 posts are parts of a related sequence in
automatically processing of essay kind of questions

Thanks a lot,
Hari
India

"Hari Prasadh" wrote in message
...
Hi,

I guess this is why questionnaires usually
consist of "multiple choice" questions rather than "essay" questions <g.


You hit it right on the nail. If only all questionnares were Closed-end
Multiple choice questions, things would have been so much easier.
Why closed end? Because even in a multiple choice question most of the
times (a very substantial amount) we have one option called "Other" (which
is open ended kind of question as a respondent can mention anything under
the sun within the other option). And *Other* is again to be dealt in the
same way like an Essay question.

For ex. if the question is --" Which of the following reasons, if any,
describe why you did not purchase anything today? (Please check all that
apply)"

a) You didnt find a style you liked
b) You couldn find your size
c) A product didnt fit right
d) The product was too expensive
e) You werent planning to buy/ just browsing today
f) Other (Please specify below)

In the above case option f) is a candidate for Essay kind of manipulative
analysis (Please note, there were many more options between e) and f) but
couldnt type anymore. Above is just for clarifcation/demonstration).

A query. Any idea whether "data file" used by dictionary program or
spell-checker would be availbale somewhere in the web.


Thanks a lot,
Hari
India

"Myrna Larson" wrote in message
...
Hmmm.... I see your problem. I guess this is why questionnaires usually
consist of "multiple choice" questions rather than "essay" questions <g.

If you can get into a data file used by a dictionary program or
spell-checker,
that might be a start.

To answer your question,
On Tue, 25 Jan 2005 20:21:21 +0530, "Hari Prasadh"

wrote:

Hi Myrna,

Just to give you an idea, I analyse MR data.

So if a question is asked -- "Please describe this product" -- and if one
respondent says "It's very Beautiful" and if another respondent says
"It's a
thing of beauty".
For me end meaning of both the sentences is same. So, before loading to
SPSS( statistics software) I would assign a code ( numerical equivalent
to a
particular verbal response) which is equal for both the responses. (I
would
like to perform analysis starting from simple like how many rated product
being a beauty etc. to more complex depending on client requirements).

So, what I meant was to have a dictionary (in soft format having nouns
adjectives of every word) against which I could check and
programmatically
manipulate the same. For me in this case (programming has been fine
because
of Tim W's help) but now Iam on another barrier which is to get a list of
all words in let's say English language.

Please note the above question was just a sample, a respondent could use
multiple "adjectives" or "nouns" to describe the product. Like for Apple
I-pod they may say "It's a thing of beauty, very cool, so many
options...."
etc.

Thanks a lot,
Hari
India

"Myrna Larson" wrote in message
...
if a word"beautiful" and "beauty" appears in the
target array, then for me both are one and the same

You may not have much company in that opinion. Beautiful is an
adjective,
beauty is a noun. You can't take a sentence and that uses one of these
words
and replace it with the other and end up with something that is
gramatically
correct.








  #20   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 5,600
Default Genarating count of unique words in a cell or cells

A query. Any idea whether "data file" used by dictionary program or
spell-checker would be availbale somewhere in the web.


Hi Hari,

Have a look at the on-line version of Wordnet, and also look at the extra
possibilities in the secondary dropdown.

It seems you can download the entire application and database - and
interface with your own application!

http://wordnet.princeton.edu/

I hinted in your other post you have a mighty challenge to achieve your
goal, but a fascinating one. Cutting edge "AI" stuff, I think.

Regards,
Peter T




  #21   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 63
Default Genarating count of unique words in a cell or cells

Hi Peter,

Thnx a lot for your link.
(I came across a research paper on text mining during the weekend -
http://www.sims.berkeley.edu/~hearst...acl99-tdm.html - and it
mentions Wordnet.)

Regards,
Hari
India

"Peter T" <peter_t@discussions wrote in message
...
A query. Any idea whether "data file" used by dictionary program or
spell-checker would be availbale somewhere in the web.


Hi Hari,

Have a look at the on-line version of Wordnet, and also look at the extra
possibilities in the secondary dropdown.

It seems you can download the entire application and database - and
interface with your own application!

http://wordnet.princeton.edu/

I hinted in your other post you have a mighty challenge to achieve your
goal, but a fascinating one. Cutting edge "AI" stuff, I think.

Regards,
Peter T




  #22   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 1
Default Genarating count of unique words in a cell or cells

Tim,

Is it possible to select a column vs a specific cell? I ran your program
below and it worked great. However, I need to run the program against a very
large amount of cells in column A. Is there a way to perform this task
without typing in each cell?(see example) I thought I could type a range
AddWordCount ActiveSheet.Range("A1:A1000") but I receive an error.

example:
AddWordCount ActiveSheet.Range("A1").Value
AddWordCount ActiveSheet.Range("A2").Value
AddWordCount ActiveSheet.Range("A3").Value
AddWordCount ActiveSheet.Range("A4").Value


Thank you in advance for any assistance you can provide.

Sincerley,

Todd Rein


"Tim Williams" wrote:

Hari,

Try this - seems to work OK but you should test it before using.

Regards
Tim


Option Explicit

Sub tester()
AddWordCount ActiveSheet.Range("A1").Value
AddWordCount ActiveSheet.Range("A2").Value
End Sub

Sub AddWordCount(sText As String)

Const COL_WORDS As Integer = 2
Const COL_COUNTS As Integer = 3
Const ROW_START As Integer = 1
Const MAX_ROWS As Integer = 10000

Dim x As Integer
Dim arrWords As Variant
Dim arrReplace As Variant
Dim tmp As String
Dim lRow As Long
Dim lLastRow As Long
Dim rngSrch As Range, rngWord As Range

'find extent of current count
lLastRow = ActiveSheet.Cells(MAX_ROWS, COL_WORDS).End(xlUp).Row
If lLastRow = 0 Then lLastRow = 1
Set rngSrch = Range(ActiveSheet.Cells(ROW_START, COL_WORDS), _
ActiveSheet.Cells(lLastRow, COL_WORDS))

arrReplace = Array(vbTab, ":", ";", ".", ",", _
"""", Chr(10), Chr(13))
For x = LBound(arrReplace) To UBound(arrReplace)
sText = Replace(sText, arrReplace(x), " ")
Next x

arrWords = Split(sText, " ")

For x = LBound(arrWords) To UBound(arrWords)
tmp = Trim(arrWords(x))
If tmp < "" Then

On Error Resume Next
Set rngWord = rngSrch.Find(What:=tmp, MatchCase:=False)
On Error GoTo 0

If rngWord Is Nothing Then
lLastRow = lLastRow + 1
Set rngSrch = rngSrch.Resize(rngSrch.Rows.Count + 1, 1)
With ActiveSheet.Cells(lLastRow, COL_WORDS)
.Value = tmp
.Offset(0, 1).Value = 1
End With
Else
rngWord.Offset(0, 1).Value = rngWord.Offset(0, 1).Value +
1
End If

End If
Next x


End Sub

"Hari" wrote in message
...
Hi Tim,

(Im sorry, my computer has some bios/date problem, so Im Re-Posting
the
below message after correcting the date/time setting)

Thanks a lot for your code.

I have a small change if possible.

Presently if In A1 I have --- I have measles. I also have TB.
and if in A2 I have --- I want to go to Paris in order to cure my
TB.
Then in B1I get the below result

{I} : 2
{have} : 2
{measles} : 1
{also} : 1
{TB} : 1

and for getting the below result in B2 (by running your sub again by
changing the address)

{I} : 1
{want} : 1
{to} : 3
{go} : 1
{Paris} : 1
{in} : 1
{order} : 1
{cure} : 1
{my} : 1
{TB} : 1





  #23   Report Post  
Posted to microsoft.public.excel.programming
external usenet poster
 
Posts: 63
Default Genarating count of unique words in a cell or cells


Hi Todd,

Probably you want the following. Modify it as required

dim i as integer
i= 1
Do
AddWordCount ActiveSheet.Range("A" & i).Value
i = i + 1
Loop Until i = Range("A65536").End(xlUp).Row + 1

Please note if you have more than 32,767 rows you will have to assign some
probably a long data type to i. Check out in Help.

Thanks a lot,
Hari
India

"Todd Rein" wrote in message
...
Tim,

Is it possible to select a column vs a specific cell? I ran your program
below and it worked great. However, I need to run the program against a
very
large amount of cells in column A. Is there a way to perform this task
without typing in each cell?(see example) I thought I could type a range
AddWordCount ActiveSheet.Range("A1:A1000") but I receive an error.

example:
AddWordCount ActiveSheet.Range("A1").Value
AddWordCount ActiveSheet.Range("A2").Value
AddWordCount ActiveSheet.Range("A3").Value
AddWordCount ActiveSheet.Range("A4").Value


Thank you in advance for any assistance you can provide.

Sincerley,

Todd Rein


"Tim Williams" wrote:

Hari,

Try this - seems to work OK but you should test it before using.

Regards
Tim


Option Explicit

Sub tester()
AddWordCount ActiveSheet.Range("A1").Value
AddWordCount ActiveSheet.Range("A2").Value
End Sub

Sub AddWordCount(sText As String)

Const COL_WORDS As Integer = 2
Const COL_COUNTS As Integer = 3
Const ROW_START As Integer = 1
Const MAX_ROWS As Integer = 10000

Dim x As Integer
Dim arrWords As Variant
Dim arrReplace As Variant
Dim tmp As String
Dim lRow As Long
Dim lLastRow As Long
Dim rngSrch As Range, rngWord As Range

'find extent of current count
lLastRow = ActiveSheet.Cells(MAX_ROWS, COL_WORDS).End(xlUp).Row
If lLastRow = 0 Then lLastRow = 1
Set rngSrch = Range(ActiveSheet.Cells(ROW_START, COL_WORDS), _
ActiveSheet.Cells(lLastRow, COL_WORDS))

arrReplace = Array(vbTab, ":", ";", ".", ",", _
"""", Chr(10), Chr(13))
For x = LBound(arrReplace) To UBound(arrReplace)
sText = Replace(sText, arrReplace(x), " ")
Next x

arrWords = Split(sText, " ")

For x = LBound(arrWords) To UBound(arrWords)
tmp = Trim(arrWords(x))
If tmp < "" Then

On Error Resume Next
Set rngWord = rngSrch.Find(What:=tmp, MatchCase:=False)
On Error GoTo 0

If rngWord Is Nothing Then
lLastRow = lLastRow + 1
Set rngSrch = rngSrch.Resize(rngSrch.Rows.Count + 1, 1)
With ActiveSheet.Cells(lLastRow, COL_WORDS)
.Value = tmp
.Offset(0, 1).Value = 1
End With
Else
rngWord.Offset(0, 1).Value = rngWord.Offset(0, 1).Value +
1
End If

End If
Next x


End Sub

"Hari" wrote in message
...
Hi Tim,

(Im sorry, my computer has some bios/date problem, so Im Re-Posting
the
below message after correcting the date/time setting)

Thanks a lot for your code.

I have a small change if possible.

Presently if In A1 I have --- I have measles. I also have TB.
and if in A2 I have --- I want to go to Paris in order to cure my
TB.
Then in B1I get the below result

{I} : 2
{have} : 2
{measles} : 1
{also} : 1
{TB} : 1

and for getting the below result in B2 (by running your sub again by
changing the address)

{I} : 1
{want} : 1
{to} : 3
{go} : 1
{Paris} : 1
{in} : 1
{order} : 1
{cure} : 1
{my} : 1
{TB} : 1







Reply
Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules

Smilies are On
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
Count the words in a cell Elton Law[_2_] Excel Worksheet Functions 8 May 20th 10 09:54 PM
Is it possible to count specific words in a number of cells? theboytree Excel Discussion (Misc queries) 1 June 15th 06 09:29 AM
How do I count the number of words in a cell? Phil Excel Worksheet Functions 3 May 2nd 06 07:13 PM
count cells with unique numbers Alex Excel Worksheet Functions 1 February 21st 05 07:46 PM
Count the number of words in a cell! Doom3 Excel Worksheet Functions 4 November 23rd 04 06:00 AM


All times are GMT +1. The time now is 03:25 AM.

Powered by vBulletin® Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 ExcelBanter.
The comments are property of their posters.
 

About Us

"It's about Microsoft Excel"