Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.programming
|
|||
|
|||
Genarating count of unique words in a cell or cells
Hi,
Suppose in cell A1 I have a text --- "I had some bread in morning. I had some eggs at night." Is it possible to programmatically get a count of unique words in the above string. Like in column B the unique words are listed one word in each row in the following manner..B1 -- "I" B2 -- "I B2 -- "had B3 -- "some" B4 -- "bread" B5 -- "in" B6 -- "morning" B7 -- "eggs" B8 -- "at" B9 -- "night" Actually I can use excel's text to columns feature ( using space as a delimiter) but automating is a problem. Like I have 500 or so rows of data and out of those rows I want to get count of unique words within those 500 rows ( basically a consolidated unique list..). Now, each row would be having different number of words so, programmatically how would it be implemented. Please guide me if possible. Regards, Hari India |
#2
Posted to microsoft.public.excel.programming
|
|||
|
|||
Genarating count of unique words in a cell or cells
You could get the text from the cell and use split() to create an
array of words. Exactly how you did this would depend on what you'd consider as a "word". You might want to use replace() on the string before splitting it to replace newlines, tabs etc with spaces 'eg (untested) dim sTxt as string, arrWords sTxt=activesheet.range("A1").value sTxt=replace(sTxt,vbTab," ") sTxt=replace(sTxt,vbcrlf," ") arrWords=split(sTxt," ") Once you have the array you could loop through it and count the occurences of the words: one way might be to use a dictionary object where the word is the key and the value would be incremented each time the word crops up. Then loop though the dictionary and write the contents to column B. tim "Hari" wrote in message ... Hi, Suppose in cell A1 I have a text --- "I had some bread in morning. I had some eggs at night." Is it possible to programmatically get a count of unique words in the above string. Like in column B the unique words are listed one word in each row in the following manner..B1 -- "I" B2 -- "I B2 -- "had B3 -- "some" B4 -- "bread" B5 -- "in" B6 -- "morning" B7 -- "eggs" B8 -- "at" B9 -- "night" Actually I can use excel's text to columns feature ( using space as a delimiter) but automating is a problem. Like I have 500 or so rows of data and out of those rows I want to get count of unique words within those 500 rows ( basically a consolidated unique list..). Now, each row would be having different number of words so, programmatically how would it be implemented. Please guide me if possible. Regards, Hari India |
#3
Posted to microsoft.public.excel.programming
|
|||
|
|||
Genarating count of unique words in a cell or cells
Try this. Requires a reference to the Scripting runtime library.
Tim. Sub tester() ActiveSheet.Range("B1").Value = CountWords(ActiveSheet.Range("A1").Value) End Sub Function CountWords(sText As String) As String Dim x As Integer Dim arrWords As Variant Dim arrReplace As Variant Dim oDict As New Scripting.Dictionary Dim tmp As String Dim k As Variant Dim sReturn As String arrReplace = Array(vbTab, ":", ";", ".", Chr(10), Chr(13)) For x = LBound(arrReplace) To UBound(arrReplace) sText = Replace(sText, arrReplace(x), " ") Next x arrWords = Split(sText, " ") oDict.CompareMode = TextCompare 'case-insensitive For x = LBound(arrWords) To UBound(arrWords) tmp = Trim(arrWords(x)) If tmp < "" Then oDict(tmp) = IIf(oDict.Exists(tmp), oDict(tmp) + 1, 1) End If Next x sReturn = "" For Each k In oDict.Keys sReturn = sReturn & "{" & k & "} : " & oDict(k) & Chr(10) Next k CountWords = sReturn End Function Tim. "Hari" wrote in message ... Hi, Suppose in cell A1 I have a text --- "I had some bread in morning. I had some eggs at night." Is it possible to programmatically get a count of unique words in the above string. Like in column B the unique words are listed one word in each row in the following manner..B1 -- "I" B2 -- "I B2 -- "had B3 -- "some" B4 -- "bread" B5 -- "in" B6 -- "morning" B7 -- "eggs" B8 -- "at" B9 -- "night" Actually I can use excel's text to columns feature ( using space as a delimiter) but automating is a problem. Like I have 500 or so rows of data and out of those rows I want to get count of unique words within those 500 rows ( basically a consolidated unique list..). Now, each row would be having different number of words so, programmatically how would it be implemented. Please guide me if possible. Regards, Hari India |
#4
Posted to microsoft.public.excel.programming
|
|||
|
|||
Genarating count of unique words in a cell or cells
Hi Hari,
You could use a worksheet formula: =SUMPRODUCT((ref<"")*(LEN(TRIM(ref))+1-LEN(SUBSTITUTE(ref," ","")))) Regards Peter T "Hari" wrote in message ... Hi, Suppose in cell A1 I have a text --- "I had some bread in morning. I had some eggs at night." Is it possible to programmatically get a count of unique words in the above string. Like in column B the unique words are listed one word in each row in the following manner..B1 -- "I" B2 -- "I B2 -- "had B3 -- "some" B4 -- "bread" B5 -- "in" B6 -- "morning" B7 -- "eggs" B8 -- "at" B9 -- "night" Actually I can use excel's text to columns feature ( using space as a delimiter) but automating is a problem. Like I have 500 or so rows of data and out of those rows I want to get count of unique words within those 500 rows ( basically a consolidated unique list..). Now, each row would be having different number of words so, programmatically how would it be implemented. Please guide me if possible. Regards, Hari India |
#5
Posted to microsoft.public.excel.programming
|
|||
|
|||
Genarating count of unique words in a cell or cells
Hi Tim,
(Im sorry, my computer has some bios/date problem, so Im Re-Posting the below message after correcting the date/time setting) Thanks a lot for your code. I have a small change if possible. Presently if In A1 I have --- I have measles. I also have TB. and if in A2 I have --- I want to go to Paris in order to cure my TB. Then in B1I get the below result {I} : 2 {have} : 2 {measles} : 1 {also} : 1 {TB} : 1 and for getting the below result in B2 (by running your sub again by changing the address) {I} : 1 {want} : 1 {to} : 3 {go} : 1 {Paris} : 1 {in} : 1 {order} : 1 {cure} : 1 {my} : 1 {TB} : 1 I want to know if it is possible to .. a) To have a consolidated count of words in both the rows. That is the above result to combine the words in both rows 1 and 2 for giving a total result. b) Is it is possible to have the count of words listed in different rows rather than within a single cell (as per help you have used a chr(10) for linefeed character). Accordingly have the counts in Column C for the corresponding unique word. For example for the above example with 2 rows, I visualise the following result :- Column -- A B C Row 1 I have measles. I also have TB. I 3 2 I want to go to Paris in order to cure my TB. Have 2 3 Measles 2 4 also 1 5 TB 2 6 want 1 7 to 3 8 go 1 9 Paris 1 10 in 1 11 order 1 12 cure 1 13 my 1 Actually why am requesting for the above way is I want to analyse count of unique words across 500 rows or so and a consolidated list would help me a lot. - Regards, Hari India "Tim Williams" <saxifrax@pacbell*dot*net wrote in message ... Try this. Requires a reference to the Scripting runtime library. Tim. Sub tester() ActiveSheet.Range("B1").Value = CountWords(ActiveSheet.Range("A1").Value) End Sub Function CountWords(sText As String) As String Dim x As Integer Dim arrWords As Variant Dim arrReplace As Variant Dim oDict As New Scripting.Dictionary Dim tmp As String Dim k As Variant Dim sReturn As String arrReplace = Array(vbTab, ":", ";", ".", Chr(10), Chr(13)) For x = LBound(arrReplace) To UBound(arrReplace) sText = Replace(sText, arrReplace(x), " ") Next x arrWords = Split(sText, " ") oDict.CompareMode = TextCompare 'case-insensitive For x = LBound(arrWords) To UBound(arrWords) tmp = Trim(arrWords(x)) If tmp < "" Then oDict(tmp) = IIf(oDict.Exists(tmp), oDict(tmp) + 1, 1) End If Next x sReturn = "" For Each k In oDict.Keys sReturn = sReturn & "{" & k & "} : " & oDict(k) & Chr(10) Next k CountWords = sReturn End Function Tim. "Hari" wrote in message ... Hi, Suppose in cell A1 I have a text --- "I had some bread in morning. I had some eggs at night." Is it possible to programmatically get a count of unique words in the above string. Like in column B the unique words are listed one word in each row in the following manner..B1 -- "I" B2 -- "I B2 -- "had B3 -- "some" B4 -- "bread" B5 -- "in" B6 -- "morning" B7 -- "eggs" B8 -- "at" B9 -- "night" Actually I can use excel's text to columns feature ( using space as a delimiter) but automating is a problem. Like I have 500 or so rows of data and out of those rows I want to get count of unique words within those 500 rows ( basically a consolidated unique list..). Now, each row would be having different number of words so, programmatically how would it be implemented. Please guide me if possible. Regards, Hari India -- Regards, Hari India "Tim Williams" <saxifrax@pacbell*dot*net wrote in message ... Try this. Requires a reference to the Scripting runtime library. Tim. Sub tester() ActiveSheet.Range("B1").Value = CountWords(ActiveSheet.Range("A1").Value) End Sub Function CountWords(sText As String) As String Dim x As Integer Dim arrWords As Variant Dim arrReplace As Variant Dim oDict As New Scripting.Dictionary Dim tmp As String Dim k As Variant Dim sReturn As String arrReplace = Array(vbTab, ":", ";", ".", Chr(10), Chr(13)) For x = LBound(arrReplace) To UBound(arrReplace) sText = Replace(sText, arrReplace(x), " ") Next x arrWords = Split(sText, " ") oDict.CompareMode = TextCompare 'case-insensitive For x = LBound(arrWords) To UBound(arrWords) tmp = Trim(arrWords(x)) If tmp < "" Then oDict(tmp) = IIf(oDict.Exists(tmp), oDict(tmp) + 1, 1) End If Next x sReturn = "" For Each k In oDict.Keys sReturn = sReturn & "{" & k & "} : " & oDict(k) & Chr(10) Next k CountWords = sReturn End Function Tim. "Hari" wrote in message ... Hi, Suppose in cell A1 I have a text --- "I had some bread in morning. I had some eggs at night." Is it possible to programmatically get a count of unique words in the above string. Like in column B the unique words are listed one word in each row in the following manner..B1 -- "I" B2 -- "I B2 -- "had B3 -- "some" B4 -- "bread" B5 -- "in" B6 -- "morning" B7 -- "eggs" B8 -- "at" B9 -- "night" Actually I can use excel's text to columns feature ( using space as a delimiter) but automating is a problem. Like I have 500 or so rows of data and out of those rows I want to get count of unique words within those 500 rows ( basically a consolidated unique list..). Now, each row would be having different number of words so, programmatically how would it be implemented. Please guide me if possible. Regards, Hari India |
#6
Posted to microsoft.public.excel.programming
|
|||
|
|||
Genarating count of unique words in a cell or cells
Hari,
Try this - seems to work OK but you should test it before using. Regards Tim Option Explicit Sub tester() AddWordCount ActiveSheet.Range("A1").Value AddWordCount ActiveSheet.Range("A2").Value End Sub Sub AddWordCount(sText As String) Const COL_WORDS As Integer = 2 Const COL_COUNTS As Integer = 3 Const ROW_START As Integer = 1 Const MAX_ROWS As Integer = 10000 Dim x As Integer Dim arrWords As Variant Dim arrReplace As Variant Dim tmp As String Dim lRow As Long Dim lLastRow As Long Dim rngSrch As Range, rngWord As Range 'find extent of current count lLastRow = ActiveSheet.Cells(MAX_ROWS, COL_WORDS).End(xlUp).Row If lLastRow = 0 Then lLastRow = 1 Set rngSrch = Range(ActiveSheet.Cells(ROW_START, COL_WORDS), _ ActiveSheet.Cells(lLastRow, COL_WORDS)) arrReplace = Array(vbTab, ":", ";", ".", ",", _ """", Chr(10), Chr(13)) For x = LBound(arrReplace) To UBound(arrReplace) sText = Replace(sText, arrReplace(x), " ") Next x arrWords = Split(sText, " ") For x = LBound(arrWords) To UBound(arrWords) tmp = Trim(arrWords(x)) If tmp < "" Then On Error Resume Next Set rngWord = rngSrch.Find(What:=tmp, MatchCase:=False) On Error GoTo 0 If rngWord Is Nothing Then lLastRow = lLastRow + 1 Set rngSrch = rngSrch.Resize(rngSrch.Rows.Count + 1, 1) With ActiveSheet.Cells(lLastRow, COL_WORDS) .Value = tmp .Offset(0, 1).Value = 1 End With Else rngWord.Offset(0, 1).Value = rngWord.Offset(0, 1).Value + 1 End If End If Next x End Sub "Hari" wrote in message ... Hi Tim, (Im sorry, my computer has some bios/date problem, so Im Re-Posting the below message after correcting the date/time setting) Thanks a lot for your code. I have a small change if possible. Presently if In A1 I have --- I have measles. I also have TB. and if in A2 I have --- I want to go to Paris in order to cure my TB. Then in B1I get the below result {I} : 2 {have} : 2 {measles} : 1 {also} : 1 {TB} : 1 and for getting the below result in B2 (by running your sub again by changing the address) {I} : 1 {want} : 1 {to} : 3 {go} : 1 {Paris} : 1 {in} : 1 {order} : 1 {cure} : 1 {my} : 1 {TB} : 1 |
#7
Posted to microsoft.public.excel.programming
|
|||
|
|||
Genarating count of unique words in a cell or cells
Hi Tim,
Thanx a ton for posting the codes. Just to tell you of why I needed it, I analyse Market Research data and I needed count of unique words to analyse open ended responses. For example I am studying/tracking the usage of Software Development tools. I ran your code on the following 8 responses (8 rows of data). hot dog pro As 400 RPG adobe photo workshop microfocus emulators html ibm web sphere vx works powerhouse The results Im getting is :- hot 1 dog 1 pro 1 As 1 400 1 RPG 1 adobe 1 photo 1 workshop 2 microfocus 1 emulators 1 html 1 ibm 1 web 1 sphere 1 vx 1 powerhouse 1 Whats happening is that the SUB is treating "works" which is in the 7th row same as "workshop" which is in the 3rd row. Consequently the count of "workshop" is being shown as 2 while "works" doesnt appear in the result. Please tell me whether it would be possible to modify the code in order to get the count for "workshop" as 1 and count of "works" as 1. (Just in case if u have a doubt on the futility of analysing, in a 'faulty' manner, by counting unique words where separating doesnt make sense - Like "adobe photo workshop" being one tool and separating these 3 - I plan to tackle that by running your sub first to get the initial stage 'frequency' and then I will bind those individual responses "adobe photo workshop" in to "AdobePhotoWorkshop". I will not bind those rows where 2 software tools are mentioned in a single row.) Regards, Hari India "Tim Williams" <saxifrax@pacbell*dot*net wrote in message ... Hari, Try this - seems to work OK but you should test it before using. Regards Tim Option Explicit Sub tester() AddWordCount ActiveSheet.Range("A1").Value AddWordCount ActiveSheet.Range("A2").Value End Sub Sub AddWordCount(sText As String) Const COL_WORDS As Integer = 2 Const COL_COUNTS As Integer = 3 Const ROW_START As Integer = 1 Const MAX_ROWS As Integer = 10000 Dim x As Integer Dim arrWords As Variant Dim arrReplace As Variant Dim tmp As String Dim lRow As Long Dim lLastRow As Long Dim rngSrch As Range, rngWord As Range 'find extent of current count lLastRow = ActiveSheet.Cells(MAX_ROWS, COL_WORDS).End(xlUp).Row If lLastRow = 0 Then lLastRow = 1 Set rngSrch = Range(ActiveSheet.Cells(ROW_START, COL_WORDS), _ ActiveSheet.Cells(lLastRow, COL_WORDS)) arrReplace = Array(vbTab, ":", ";", ".", ",", _ """", Chr(10), Chr(13)) For x = LBound(arrReplace) To UBound(arrReplace) sText = Replace(sText, arrReplace(x), " ") Next x arrWords = Split(sText, " ") For x = LBound(arrWords) To UBound(arrWords) tmp = Trim(arrWords(x)) If tmp < "" Then On Error Resume Next Set rngWord = rngSrch.Find(What:=tmp, MatchCase:=False) On Error GoTo 0 If rngWord Is Nothing Then lLastRow = lLastRow + 1 Set rngSrch = rngSrch.Resize(rngSrch.Rows.Count + 1, 1) With ActiveSheet.Cells(lLastRow, COL_WORDS) .Value = tmp .Offset(0, 1).Value = 1 End With Else rngWord.Offset(0, 1).Value = rngWord.Offset(0, 1).Value + 1 End If End If Next x End Sub "Hari" wrote in message ... Hi Tim, (Im sorry, my computer has some bios/date problem, so Im Re-Posting the below message after correcting the date/time setting) Thanks a lot for your code. I have a small change if possible. Presently if In A1 I have --- I have measles. I also have TB. and if in A2 I have --- I want to go to Paris in order to cure my TB. Then in B1I get the below result {I} : 2 {have} : 2 {measles} : 1 {also} : 1 {TB} : 1 and for getting the below result in B2 (by running your sub again by changing the address) {I} : 1 {want} : 1 {to} : 3 {go} : 1 {Paris} : 1 {in} : 1 {order} : 1 {cure} : 1 {my} : 1 {TB} : 1 |
#8
Posted to microsoft.public.excel.programming
|
|||
|
|||
Genarating count of unique words in a cell or cells
Hari,
Modify the Find() line: Set rngWord = rngSrch.Find(What:=tmp, MatchCase:=False, lookat:=xlWhole) xlWhole will match the complete cell contents and not just a substring. Tim. "Hari" < wrote in message ... Hi Tim, Thanx a ton for posting the codes. Just to tell you of why I needed it, I analyse Market Research data and I needed count of unique words to analyse open ended responses. For example I am studying/tracking the usage of Software Development tools. I ran your code on the following 8 responses (8 rows of data). hot dog pro As 400 RPG adobe photo workshop microfocus emulators html ibm web sphere vx works powerhouse The results Im getting is :- hot 1 dog 1 pro 1 As 1 400 1 RPG 1 adobe 1 photo 1 workshop 2 microfocus 1 emulators 1 html 1 ibm 1 web 1 sphere 1 vx 1 powerhouse 1 Whats happening is that the SUB is treating "works" which is in the 7th row same as "workshop" which is in the 3rd row. Consequently the count of "workshop" is being shown as 2 while "works" doesnt appear in the result. Please tell me whether it would be possible to modify the code in order to get the count for "workshop" as 1 and count of "works" as 1. |
#9
Posted to microsoft.public.excel.programming
|
|||
|
|||
Genarating count of unique words in a cell or cells
Hi Tim,
Thanks a lot for persisting with me. It works great now. Just wanted to SHARE this with you. Presently in the column C am getting count of a particular unique word by summing each instance of its occurence in the Column A. In Column D I wanted to see the count of unique word by summing the number of rows within column A that it appears. What I mean is if in cell A1, A2 the response is :- A1 - "VC++" A2 - "I use VC++, Cobol, Fortran. But mainly I use VC++" A3 - "I use Basic" Then the answers I get in Column C is VC++ 3 I 3 use 3 Cobol 1 Fortran 1 But 1 mainly 1 Basic 1 In column D I desired an output as given below. VC++ 2 I 2 use 2 Cobol 1 Fortran 1 But 1 mainly 1 Basic 1 The difference between column C and D is that if a unique word is found more than once within the same row then it is counted only once. Like in A2 "I", "VC++", "Use" , appears twice within the same row so its WEIGHTAGE in the global count is given only once. So, In effect the count of unique words is done by summing the number of rows within column A that it appears. For accomplishing that in cell D1 I wrote the following formula (array formula) {=SUM(IF(ISERROR(FIND(B2,$A$1:$A$8)),0,1))} and thankfully I am able get the result. I think by perusing newsgroup posts am learning (albeit slowly) to control array formulas a little better. Thanks to all you folks who support us. -- Regards, Hari India |
#10
Posted to microsoft.public.excel.programming
|
|||
|
|||
Genarating count of unique words in a cell or cells
You could use a worksheet formula:
=SUMPRODUCT((ref<"")*(LEN(TRIM(ref))+1-LEN(SUBSTITUTE(ref," ","")))) Regards Peter T Ignore - I totally misread "unique" Peter T |
#11
Posted to microsoft.public.excel.programming
|
|||
|
|||
Genarating count of unique words in a cell or cells
Hi Peter,
No problem. The fact that you people are out there helping problem posers ,gladdens me a lot. -- Thanks a lot, Hari India |
#12
Posted to microsoft.public.excel.programming
|
|||
|
|||
Genarating count of unique words in a cell or cells
Hi Hari,
Thanks for that, maybe it will come in handy as a "total" word count! Regards, Peter T Hi Peter, No problem. The fact that you people are out there helping problem posers ,gladdens me a lot. -- Thanks a lot, Hari India |
#13
Posted to microsoft.public.excel.programming
|
|||
|
|||
Genarating count of unique words in a cell or cells
Just wanted to let you know that this valuable post has helped out myself as
well. We're experimenting with it right now. What would you recommend to exclude some common words such as "a", "the", "etc" or how would you build such a list for it to bounce against? "Tim Williams" wrote: Hari, Modify the Find() line: Set rngWord = rngSrch.Find(What:=tmp, MatchCase:=False, lookat:=xlWhole) xlWhole will match the complete cell contents and not just a substring. Tim. "Hari" < wrote in message ... Hi Tim, Thanx a ton for posting the codes. Just to tell you of why I needed it, I analyse Market Research data and I needed count of unique words to analyse open ended responses. For example I am studying/tracking the usage of Software Development tools. I ran your code on the following 8 responses (8 rows of data). hot dog pro As 400 RPG adobe photo workshop microfocus emulators html ibm web sphere vx works powerhouse The results Im getting is :- hot 1 dog 1 pro 1 As 1 400 1 RPG 1 adobe 1 photo 1 workshop 2 microfocus 1 emulators 1 html 1 ibm 1 web 1 sphere 1 vx 1 powerhouse 1 Whats happening is that the SUB is treating "works" which is in the 7th row same as "workshop" which is in the 3rd row. Consequently the count of "workshop" is being shown as 2 while "works" doesnt appear in the result. Please tell me whether it would be possible to modify the code in order to get the count for "workshop" as 1 and count of "works" as 1. |
#14
Posted to microsoft.public.excel.programming
|
|||
|
|||
Genarating count of unique words in a cell or cells
Hi Jeff,
What would you recommend to exclude some common words such as "a", "the", "etc" or how would you build such a list for it to bounce against? I am also facing a similar problem. Since, this wasnt related to excel I didnt broach this topic before. As far as your question of excluding common words is concerned, that is easy. Because Tim W has provided -- arrReplace = Array(vbTab, ":", ";", ".", ",", """", Chr(10), Chr(13)) --so if we want to remove articles, nouns, pronouns etc one could just add to the above list. But for me the problem is from where to get an authoritative list of nouns, pronouns, etc. (in soft copy format) which I could just add to the arrReplace list. I searched Google (but not too hard) and couldnt get one. To add another dimension to it, (though Im not sure whether it would affect/matter in your case) if a word"beautiful" and "beauty" appears in the target array, then for me both are one and the same. So, how to instruct the algorithm that consider various parts of the speech as the same. I do have a solution in the sense that in a sheet (or in a Access database - though I dont know access) have all the words (with their parts of speech) and give words with similar parts of speech as same code. But again just like the previous case I would have to get a Authoritative list of all words in the english language.(words in common usage - not the esoteric or a filed specific word). Is such a list available over web? -- Thanks a lot, Hari India "Jeff Saathoff" wrote in message ... Just wanted to let you know that this valuable post has helped out myself as well. We're experimenting with it right now. What would you recommend to exclude some common words such as "a", "the", "etc" or how would you build such a list for it to bounce against? "Tim Williams" wrote: Hari, Modify the Find() line: Set rngWord = rngSrch.Find(What:=tmp, MatchCase:=False, lookat:=xlWhole) xlWhole will match the complete cell contents and not just a substring. Tim. "Hari" < wrote in message ... Hi Tim, Thanx a ton for posting the codes. Just to tell you of why I needed it, I analyse Market Research data and I needed count of unique words to analyse open ended responses. For example I am studying/tracking the usage of Software Development tools. I ran your code on the following 8 responses (8 rows of data). hot dog pro As 400 RPG adobe photo workshop microfocus emulators html ibm web sphere vx works powerhouse The results Im getting is :- hot 1 dog 1 pro 1 As 1 400 1 RPG 1 adobe 1 photo 1 workshop 2 microfocus 1 emulators 1 html 1 ibm 1 web 1 sphere 1 vx 1 powerhouse 1 Whats happening is that the SUB is treating "works" which is in the 7th row same as "workshop" which is in the 3rd row. Consequently the count of "workshop" is being shown as 2 while "works" doesnt appear in the result. Please tell me whether it would be possible to modify the code in order to get the count for "workshop" as 1 and count of "works" as 1. |
#15
Posted to microsoft.public.excel.programming
|
|||
|
|||
Genarating count of unique words in a cell or cells
if a word"beautiful" and "beauty" appears in the
target array, then for me both are one and the same You may not have much company in that opinion. Beautiful is an adjective, beauty is a noun. You can't take a sentence and that uses one of these words and replace it with the other and end up with something that is gramatically correct. |
#16
Posted to microsoft.public.excel.programming
|
|||
|
|||
Genarating count of unique words in a cell or cells
Hi Myrna,
Just to give you an idea, I analyse MR data. So if a question is asked -- "Please describe this product" -- and if one respondent says "It's very Beautiful" and if another respondent says "It's a thing of beauty". For me end meaning of both the sentences is same. So, before loading to SPSS( statistics software) I would assign a code ( numerical equivalent to a particular verbal response) which is equal for both the responses. (I would like to perform analysis starting from simple like how many rated product being a beauty etc. to more complex depending on client requirements). So, what I meant was to have a dictionary (in soft format having nouns adjectives of every word) against which I could check and programmatically manipulate the same. For me in this case (programming has been fine because of Tim W's help) but now Iam on another barrier which is to get a list of all words in let's say English language. Please note the above question was just a sample, a respondent could use multiple "adjectives" or "nouns" to describe the product. Like for Apple I-pod they may say "It's a thing of beauty, very cool, so many options...." etc. Thanks a lot, Hari India "Myrna Larson" wrote in message ... if a word"beautiful" and "beauty" appears in the target array, then for me both are one and the same You may not have much company in that opinion. Beautiful is an adjective, beauty is a noun. You can't take a sentence and that uses one of these words and replace it with the other and end up with something that is gramatically correct. |
#17
Posted to microsoft.public.excel.programming
|
|||
|
|||
Genarating count of unique words in a cell or cells
Hmmm.... I see your problem. I guess this is why questionnaires usually
consist of "multiple choice" questions rather than "essay" questions <g. If you can get into a data file used by a dictionary program or spell-checker, that might be a start. To answer your question, On Tue, 25 Jan 2005 20:21:21 +0530, "Hari Prasadh" wrote: Hi Myrna, Just to give you an idea, I analyse MR data. So if a question is asked -- "Please describe this product" -- and if one respondent says "It's very Beautiful" and if another respondent says "It's a thing of beauty". For me end meaning of both the sentences is same. So, before loading to SPSS( statistics software) I would assign a code ( numerical equivalent to a particular verbal response) which is equal for both the responses. (I would like to perform analysis starting from simple like how many rated product being a beauty etc. to more complex depending on client requirements). So, what I meant was to have a dictionary (in soft format having nouns adjectives of every word) against which I could check and programmatically manipulate the same. For me in this case (programming has been fine because of Tim W's help) but now Iam on another barrier which is to get a list of all words in let's say English language. Please note the above question was just a sample, a respondent could use multiple "adjectives" or "nouns" to describe the product. Like for Apple I-pod they may say "It's a thing of beauty, very cool, so many options...." etc. Thanks a lot, Hari India "Myrna Larson" wrote in message .. . if a word"beautiful" and "beauty" appears in the target array, then for me both are one and the same You may not have much company in that opinion. Beautiful is an adjective, beauty is a noun. You can't take a sentence and that uses one of these words and replace it with the other and end up with something that is gramatically correct. |
#18
Posted to microsoft.public.excel.programming
|
|||
|
|||
Genarating count of unique words in a cell or cells
Hi,
I guess this is why questionnaires usually consist of "multiple choice" questions rather than "essay" questions <g. You hit it right on the nail. If only all questionnares were Closed-end Multiple choice questions, things would have been so much easier. Why closed end? Because even in a multiple choice question most of the times (a very substantial amount) we have one option called "Other" (which is open ended kind of question as a respondent can mention anything under the sun within the other option). And *Other* is again to be dealt in the same way like an Essay question. For ex. if the question is --" Which of the following reasons, if any, describe why you did not purchase anything today? (Please check all that apply)" a) You didnt find a style you liked b) You couldn find your size c) A product didnt fit right d) The product was too expensive e) You werent planning to buy/ just browsing today f) Other (Please specify below) In the above case option f) is a candidate for Essay kind of manipulative analysis (Please note, there were many more options between e) and f) but couldnt type anymore. Above is just for clarifcation/demonstration). A query. Any idea whether "data file" used by dictionary program or spell-checker would be availbale somewhere in the web. Thanks a lot, Hari India "Myrna Larson" wrote in message ... Hmmm.... I see your problem. I guess this is why questionnaires usually consist of "multiple choice" questions rather than "essay" questions <g. If you can get into a data file used by a dictionary program or spell-checker, that might be a start. To answer your question, On Tue, 25 Jan 2005 20:21:21 +0530, "Hari Prasadh" wrote: Hi Myrna, Just to give you an idea, I analyse MR data. So if a question is asked -- "Please describe this product" -- and if one respondent says "It's very Beautiful" and if another respondent says "It's a thing of beauty". For me end meaning of both the sentences is same. So, before loading to SPSS( statistics software) I would assign a code ( numerical equivalent to a particular verbal response) which is equal for both the responses. (I would like to perform analysis starting from simple like how many rated product being a beauty etc. to more complex depending on client requirements). So, what I meant was to have a dictionary (in soft format having nouns adjectives of every word) against which I could check and programmatically manipulate the same. For me in this case (programming has been fine because of Tim W's help) but now Iam on another barrier which is to get a list of all words in let's say English language. Please note the above question was just a sample, a respondent could use multiple "adjectives" or "nouns" to describe the product. Like for Apple I-pod they may say "It's a thing of beauty, very cool, so many options...." etc. Thanks a lot, Hari India "Myrna Larson" wrote in message . .. if a word"beautiful" and "beauty" appears in the target array, then for me both are one and the same You may not have much company in that opinion. Beautiful is an adjective, beauty is a noun. You can't take a sentence and that uses one of these words and replace it with the other and end up with something that is gramatically correct. |
#19
Posted to microsoft.public.excel.programming
|
|||
|
|||
Genarating count of unique words in a cell or cells
Hi Myrna,
Missed to add one thing. The problem of soft-copy of a dictionary is one thing which is holding me up in another related post of mine "Developing TEXT scrambler kind of FUNCTIONS in Excel". Basically these 2 posts are parts of a related sequence in automatically processing of essay kind of questions Thanks a lot, Hari India "Hari Prasadh" wrote in message ... Hi, I guess this is why questionnaires usually consist of "multiple choice" questions rather than "essay" questions <g. You hit it right on the nail. If only all questionnares were Closed-end Multiple choice questions, things would have been so much easier. Why closed end? Because even in a multiple choice question most of the times (a very substantial amount) we have one option called "Other" (which is open ended kind of question as a respondent can mention anything under the sun within the other option). And *Other* is again to be dealt in the same way like an Essay question. For ex. if the question is --" Which of the following reasons, if any, describe why you did not purchase anything today? (Please check all that apply)" a) You didnt find a style you liked b) You couldn find your size c) A product didnt fit right d) The product was too expensive e) You werent planning to buy/ just browsing today f) Other (Please specify below) In the above case option f) is a candidate for Essay kind of manipulative analysis (Please note, there were many more options between e) and f) but couldnt type anymore. Above is just for clarifcation/demonstration). A query. Any idea whether "data file" used by dictionary program or spell-checker would be availbale somewhere in the web. Thanks a lot, Hari India "Myrna Larson" wrote in message ... Hmmm.... I see your problem. I guess this is why questionnaires usually consist of "multiple choice" questions rather than "essay" questions <g. If you can get into a data file used by a dictionary program or spell-checker, that might be a start. To answer your question, On Tue, 25 Jan 2005 20:21:21 +0530, "Hari Prasadh" wrote: Hi Myrna, Just to give you an idea, I analyse MR data. So if a question is asked -- "Please describe this product" -- and if one respondent says "It's very Beautiful" and if another respondent says "It's a thing of beauty". For me end meaning of both the sentences is same. So, before loading to SPSS( statistics software) I would assign a code ( numerical equivalent to a particular verbal response) which is equal for both the responses. (I would like to perform analysis starting from simple like how many rated product being a beauty etc. to more complex depending on client requirements). So, what I meant was to have a dictionary (in soft format having nouns adjectives of every word) against which I could check and programmatically manipulate the same. For me in this case (programming has been fine because of Tim W's help) but now Iam on another barrier which is to get a list of all words in let's say English language. Please note the above question was just a sample, a respondent could use multiple "adjectives" or "nouns" to describe the product. Like for Apple I-pod they may say "It's a thing of beauty, very cool, so many options...." etc. Thanks a lot, Hari India "Myrna Larson" wrote in message ... if a word"beautiful" and "beauty" appears in the target array, then for me both are one and the same You may not have much company in that opinion. Beautiful is an adjective, beauty is a noun. You can't take a sentence and that uses one of these words and replace it with the other and end up with something that is gramatically correct. |
#20
Posted to microsoft.public.excel.programming
|
|||
|
|||
Genarating count of unique words in a cell or cells
A query. Any idea whether "data file" used by dictionary program or
spell-checker would be availbale somewhere in the web. Hi Hari, Have a look at the on-line version of Wordnet, and also look at the extra possibilities in the secondary dropdown. It seems you can download the entire application and database - and interface with your own application! http://wordnet.princeton.edu/ I hinted in your other post you have a mighty challenge to achieve your goal, but a fascinating one. Cutting edge "AI" stuff, I think. Regards, Peter T |
#21
Posted to microsoft.public.excel.programming
|
|||
|
|||
Genarating count of unique words in a cell or cells
Hi Peter,
Thnx a lot for your link. (I came across a research paper on text mining during the weekend - http://www.sims.berkeley.edu/~hearst...acl99-tdm.html - and it mentions Wordnet.) Regards, Hari India "Peter T" <peter_t@discussions wrote in message ... A query. Any idea whether "data file" used by dictionary program or spell-checker would be availbale somewhere in the web. Hi Hari, Have a look at the on-line version of Wordnet, and also look at the extra possibilities in the secondary dropdown. It seems you can download the entire application and database - and interface with your own application! http://wordnet.princeton.edu/ I hinted in your other post you have a mighty challenge to achieve your goal, but a fascinating one. Cutting edge "AI" stuff, I think. Regards, Peter T |
#22
Posted to microsoft.public.excel.programming
|
|||
|
|||
Genarating count of unique words in a cell or cells
Tim,
Is it possible to select a column vs a specific cell? I ran your program below and it worked great. However, I need to run the program against a very large amount of cells in column A. Is there a way to perform this task without typing in each cell?(see example) I thought I could type a range AddWordCount ActiveSheet.Range("A1:A1000") but I receive an error. example: AddWordCount ActiveSheet.Range("A1").Value AddWordCount ActiveSheet.Range("A2").Value AddWordCount ActiveSheet.Range("A3").Value AddWordCount ActiveSheet.Range("A4").Value Thank you in advance for any assistance you can provide. Sincerley, Todd Rein "Tim Williams" wrote: Hari, Try this - seems to work OK but you should test it before using. Regards Tim Option Explicit Sub tester() AddWordCount ActiveSheet.Range("A1").Value AddWordCount ActiveSheet.Range("A2").Value End Sub Sub AddWordCount(sText As String) Const COL_WORDS As Integer = 2 Const COL_COUNTS As Integer = 3 Const ROW_START As Integer = 1 Const MAX_ROWS As Integer = 10000 Dim x As Integer Dim arrWords As Variant Dim arrReplace As Variant Dim tmp As String Dim lRow As Long Dim lLastRow As Long Dim rngSrch As Range, rngWord As Range 'find extent of current count lLastRow = ActiveSheet.Cells(MAX_ROWS, COL_WORDS).End(xlUp).Row If lLastRow = 0 Then lLastRow = 1 Set rngSrch = Range(ActiveSheet.Cells(ROW_START, COL_WORDS), _ ActiveSheet.Cells(lLastRow, COL_WORDS)) arrReplace = Array(vbTab, ":", ";", ".", ",", _ """", Chr(10), Chr(13)) For x = LBound(arrReplace) To UBound(arrReplace) sText = Replace(sText, arrReplace(x), " ") Next x arrWords = Split(sText, " ") For x = LBound(arrWords) To UBound(arrWords) tmp = Trim(arrWords(x)) If tmp < "" Then On Error Resume Next Set rngWord = rngSrch.Find(What:=tmp, MatchCase:=False) On Error GoTo 0 If rngWord Is Nothing Then lLastRow = lLastRow + 1 Set rngSrch = rngSrch.Resize(rngSrch.Rows.Count + 1, 1) With ActiveSheet.Cells(lLastRow, COL_WORDS) .Value = tmp .Offset(0, 1).Value = 1 End With Else rngWord.Offset(0, 1).Value = rngWord.Offset(0, 1).Value + 1 End If End If Next x End Sub "Hari" wrote in message ... Hi Tim, (Im sorry, my computer has some bios/date problem, so Im Re-Posting the below message after correcting the date/time setting) Thanks a lot for your code. I have a small change if possible. Presently if In A1 I have --- I have measles. I also have TB. and if in A2 I have --- I want to go to Paris in order to cure my TB. Then in B1I get the below result {I} : 2 {have} : 2 {measles} : 1 {also} : 1 {TB} : 1 and for getting the below result in B2 (by running your sub again by changing the address) {I} : 1 {want} : 1 {to} : 3 {go} : 1 {Paris} : 1 {in} : 1 {order} : 1 {cure} : 1 {my} : 1 {TB} : 1 |
#23
Posted to microsoft.public.excel.programming
|
|||
|
|||
Genarating count of unique words in a cell or cells
Hi Todd, Probably you want the following. Modify it as required dim i as integer i= 1 Do AddWordCount ActiveSheet.Range("A" & i).Value i = i + 1 Loop Until i = Range("A65536").End(xlUp).Row + 1 Please note if you have more than 32,767 rows you will have to assign some probably a long data type to i. Check out in Help. Thanks a lot, Hari India "Todd Rein" wrote in message ... Tim, Is it possible to select a column vs a specific cell? I ran your program below and it worked great. However, I need to run the program against a very large amount of cells in column A. Is there a way to perform this task without typing in each cell?(see example) I thought I could type a range AddWordCount ActiveSheet.Range("A1:A1000") but I receive an error. example: AddWordCount ActiveSheet.Range("A1").Value AddWordCount ActiveSheet.Range("A2").Value AddWordCount ActiveSheet.Range("A3").Value AddWordCount ActiveSheet.Range("A4").Value Thank you in advance for any assistance you can provide. Sincerley, Todd Rein "Tim Williams" wrote: Hari, Try this - seems to work OK but you should test it before using. Regards Tim Option Explicit Sub tester() AddWordCount ActiveSheet.Range("A1").Value AddWordCount ActiveSheet.Range("A2").Value End Sub Sub AddWordCount(sText As String) Const COL_WORDS As Integer = 2 Const COL_COUNTS As Integer = 3 Const ROW_START As Integer = 1 Const MAX_ROWS As Integer = 10000 Dim x As Integer Dim arrWords As Variant Dim arrReplace As Variant Dim tmp As String Dim lRow As Long Dim lLastRow As Long Dim rngSrch As Range, rngWord As Range 'find extent of current count lLastRow = ActiveSheet.Cells(MAX_ROWS, COL_WORDS).End(xlUp).Row If lLastRow = 0 Then lLastRow = 1 Set rngSrch = Range(ActiveSheet.Cells(ROW_START, COL_WORDS), _ ActiveSheet.Cells(lLastRow, COL_WORDS)) arrReplace = Array(vbTab, ":", ";", ".", ",", _ """", Chr(10), Chr(13)) For x = LBound(arrReplace) To UBound(arrReplace) sText = Replace(sText, arrReplace(x), " ") Next x arrWords = Split(sText, " ") For x = LBound(arrWords) To UBound(arrWords) tmp = Trim(arrWords(x)) If tmp < "" Then On Error Resume Next Set rngWord = rngSrch.Find(What:=tmp, MatchCase:=False) On Error GoTo 0 If rngWord Is Nothing Then lLastRow = lLastRow + 1 Set rngSrch = rngSrch.Resize(rngSrch.Rows.Count + 1, 1) With ActiveSheet.Cells(lLastRow, COL_WORDS) .Value = tmp .Offset(0, 1).Value = 1 End With Else rngWord.Offset(0, 1).Value = rngWord.Offset(0, 1).Value + 1 End If End If Next x End Sub "Hari" wrote in message ... Hi Tim, (Im sorry, my computer has some bios/date problem, so Im Re-Posting the below message after correcting the date/time setting) Thanks a lot for your code. I have a small change if possible. Presently if In A1 I have --- I have measles. I also have TB. and if in A2 I have --- I want to go to Paris in order to cure my TB. Then in B1I get the below result {I} : 2 {have} : 2 {measles} : 1 {also} : 1 {TB} : 1 and for getting the below result in B2 (by running your sub again by changing the address) {I} : 1 {want} : 1 {to} : 3 {go} : 1 {Paris} : 1 {in} : 1 {order} : 1 {cure} : 1 {my} : 1 {TB} : 1 |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
Count the words in a cell | Excel Worksheet Functions | |||
Is it possible to count specific words in a number of cells? | Excel Discussion (Misc queries) | |||
How do I count the number of words in a cell? | Excel Worksheet Functions | |||
count cells with unique numbers | Excel Worksheet Functions | |||
Count the number of words in a cell! | Excel Worksheet Functions |