Home |
Search |
Today's Posts |
#1
Posted to microsoft.public.excel.programming
|
|||
|
|||
Most popular word
Hi all,
I need a function to get the most frequent word in a cell. The length of the output must be at least 3. e.g. if a cell contents was "Vick rushed for 145 yards. Michael Vick handled things in regulation. established in 1881 ; Yards Gear for the Holidays car yards in crisis" , the output should be "yards". Any idea? Thank you. Umby |
#2
Posted to microsoft.public.excel.programming
|
|||
|
|||
Most popular word
Umby
Try this Function MFW(Rng As Range) As String Dim arrWords As Variant Dim RangeText As String Dim i As Long Dim CurrCount As Long Dim MaxCount As Long Dim MaxWord As String RangeText = UCase(Rng.Text) RangeText = Replace(RangeText, ".", "") RangeText = Replace(RangeText, ",", "") RangeText = Replace(RangeText, ";", "") RangeText = Replace(RangeText, ":", "") arrWords = Split(RangeText, " ") For i = LBound(arrWords) To UBound(arrWords) If Len(arrWords(i)) = 3 Then CurrCount = (Len(RangeText) - _ Len(Replace(RangeText, arrWords(i), ""))) / Len(arrWords(i)) If CurrCount MaxCount Then MaxWord = arrWords(i) MaxCount = CurrCount End If End If Next i MFW = MaxWord End Function Note that I had to strip out punctuation because "yards." is not the same as "yards". I don't make any representation that the ones I elimate will be sufficient. Also, I converted everything to caps. -- Dick Kusleika MVP - Excel www.dicks-clicks.com Post all replies to the newsgroup. "Umby" wrote in message .. . Hi all, I need a function to get the most frequent word in a cell. The length of the output must be at least 3. e.g. if a cell contents was "Vick rushed for 145 yards. Michael Vick handled things in regulation. established in 1881 ; Yards Gear for the Holidays car yards in crisis" , the output should be "yards". Any idea? Thank you. Umby |
#3
Posted to microsoft.public.excel.programming
|
|||
|
|||
Most popular word
thats a nice little function, however, if more than word appears with the same frequency, it only returns the last one found with the maxvalue ------------------------------------------------ ~~ Message posted from http://www.ExcelTip.com/ ~~View and post usenet messages directly from http://www.ExcelForum.com/ ~~Now Available: Financial Statements.xls, a step by step guide to creating financial statements |
#4
Posted to microsoft.public.excel.programming
|
|||
|
|||
Most popular word
bdcrisp
Quite right. Except I think it returns the first one, not the last. If you want a semicolon delimited list of the most frequent words, you would have to adjust the function to this Function MFW(Rng As Range) As String Dim arrWords As Variant Dim RangeText As String Dim i As Long Dim CurrCount As Long Dim MaxCount As Long Dim MaxWord As String RangeText = UCase(Rng.Text) RangeText = Replace(RangeText, ".", "") RangeText = Replace(RangeText, ",", "") RangeText = Replace(RangeText, ";", "") RangeText = Replace(RangeText, ":", "") arrWords = Split(RangeText, " ") For i = LBound(arrWords) To UBound(arrWords) If Len(arrWords(i)) = 3 Then CurrCount = (Len(RangeText) - _ Len(Replace(RangeText, arrWords(i), ""))) / Len(arrWords(i)) If CurrCount = MaxCount Then If InStr(1, MaxWord, arrWords(i) & ";") = 0 Then MaxWord = MaxWord & arrWords(i) & ";" End If MaxCount = CurrCount End If End If Next i MFW = Left(MaxWord, Len(MaxWord) - 1) End Function -- Dick Kusleika MVP - Excel www.dicks-clicks.com Post all replies to the newsgroup. "bdcrisp" wrote in message ... thats a nice little function, however, if more than word appears with the same frequency, it only returns the last one found with the maxvalue ------------------------------------------------ ~~ Message posted from http://www.ExcelTip.com/ ~~View and post usenet messages directly from http://www.ExcelForum.com/ ~~Now Available: Financial Statements.xls, a step by step guide to creating financial statements |
#5
Posted to microsoft.public.excel.programming
|
|||
|
|||
Most popular word
Thank you very much!
Umby -----Original Message----- Umby Try this Function MFW(Rng As Range) As String Dim arrWords As Variant Dim RangeText As String Dim i As Long Dim CurrCount As Long Dim MaxCount As Long Dim MaxWord As String RangeText = UCase(Rng.Text) RangeText = Replace(RangeText, ".", "") RangeText = Replace(RangeText, ",", "") RangeText = Replace(RangeText, ";", "") RangeText = Replace(RangeText, ":", "") arrWords = Split(RangeText, " ") For i = LBound(arrWords) To UBound(arrWords) If Len(arrWords(i)) = 3 Then CurrCount = (Len(RangeText) - _ Len(Replace(RangeText, arrWords(i), ""))) / Len(arrWords(i)) If CurrCount MaxCount Then MaxWord = arrWords(i) MaxCount = CurrCount End If End If Next i MFW = MaxWord End Function Note that I had to strip out punctuation because "yards." is not the same as "yards". I don't make any representation that the ones I elimate will be sufficient. Also, I converted everything to caps. -- Dick Kusleika MVP - Excel www.dicks-clicks.com Post all replies to the newsgroup. "Umby" wrote in message . .. Hi all, I need a function to get the most frequent word in a cell. The length of the output must be at least 3. e.g. if a cell contents was "Vick rushed for 145 yards. Michael Vick handled things in regulation. established in 1881 ; Yards Gear for the Holidays car yards in crisis" , the output should be "yards". Any idea? Thank you. Umby . |
#6
Posted to microsoft.public.excel.programming
|
|||
|
|||
Most popular word
"Umby" ha scritto nel messaggio .. .
| Hi all, | I need a function to get the most frequent word in a cell. The length of the | output must be at least 3. | e.g. if a cell contents was "Vick rushed for 145 yards. Michael Vick handled | things in regulation. established in 1881 ; Yards Gear for the Holidays car | yards in crisis" , the output should be "yards". | | Any idea? in microsoft.public.it.office.excel thread: cercare parole date: dic 10, 2003 Public Function MFWord(buf1 As Range) Dim buf2 As Object, buf3 As Object Dim smax As String, nmax As Long Set regexp = CreateObject("VBScript.RegExp") regexp.Global = True regexp.IgnoreCase = True regexp.Pattern = "(\w{3,})" Set buf2 = regexp.Execute(buf1.Text) For Each occ In buf2 regexp.Pattern = occ & "\b|" & occ & "$" Set buf3 = regexp.Execute(buf1.Text) If buf3.Count nmax Then smax = occ: nmax = buf3.Count Next MsgBox smax & " (" & nmax & ")" MFWord = smax End Function .f fernando cinquegrani Microsoft MVP http://www.prodomosua.it |
#7
Posted to microsoft.public.excel.programming
|
|||
|
|||
Most popular word
Thank you for your function, however it doesn't consider
some character. e.g. if cell value = "mip@it mip@it mip@it win win" , output = "mip" Regards Umby -----Original Message----- "Umby" ha scritto nel messaggio .. . | Hi all, | I need a function to get the most frequent word in a cell. The length of the | output must be at least 3. | e.g. if a cell contents was "Vick rushed for 145 yards. Michael Vick handled | things in regulation. established in 1881 ; Yards Gear for the Holidays car | yards in crisis" , the output should be "yards". | | Any idea? in microsoft.public.it.office.excel thread: cercare parole date: dic 10, 2003 Public Function MFWord(buf1 As Range) Dim buf2 As Object, buf3 As Object Dim smax As String, nmax As Long Set regexp = CreateObject("VBScript.RegExp") regexp.Global = True regexp.IgnoreCase = True regexp.Pattern = "(\w{3,})" Set buf2 = regexp.Execute(buf1.Text) For Each occ In buf2 regexp.Pattern = occ & "\b|" & occ & "$" Set buf3 = regexp.Execute(buf1.Text) If buf3.Count nmax Then smax = occ: nmax = buf3.Count Next MsgBox smax & " (" & nmax & ")" MFWord = smax End Function .f fernando cinquegrani Microsoft MVP http://www.prodomosua.it . |
#8
Posted to microsoft.public.excel.programming
|
|||
|
|||
Most popular word
"Umby" ha scritto nel messaggio ...
| Thank you for your function, however it doesn't consider | some character. | e.g. if cell value = "mip@it mip@it mip@it win win" , | output = "mip" | Public Function MFWord(buf1 As Range) Dim buf2 As Variant, n as long Dim smax As String, nmax As Long Dim occ As Variant, acc As Variant Set regexp = CreateObject("VBScript.RegExp") regexp.Global = True regexp.IgnoreCase = True regexp.Pattern = "([a-z_&@]{3,})" Set buf2 = regexp.Execute(buf1.Text) For Each occ In buf2 n = 0 For Each acc In buf2 If occ = acc Then n = n + 1 If n nmax Then smax = occ: nmax = n Next Next 'MsgBox smax & " (" & n & ")" MFWord = smax End Function the pattern now includes the characters from a to z (a-z) and the symbols _, & and @ .f fernando cinquegrani Microsoft MVP http://www.prodomosua.it |
#9
Posted to microsoft.public.excel.programming
|
|||
|
|||
Most popular word
"Umby" wrote...
Thank you for your function, however it doesn't consider some character. e.g. if cell value = "mip@it mip@it mip@it win win" , output = "mip" ... -----Original Message----- "Umby" ha scritto nel messaggio . .. |Hi all, |I need a function to get the most frequent word in a cell. The length of the |output must be at least 3. |e.g. if a cell contents was "Vick rushed for 145 yards. Michael Vick handled |things in regulation. established in 1881 ; Yards Gear for the Holidays car |yards in crisis" , the output should be "yards". | |Any idea? in microsoft.public.it.office.excel thread: cercare parole date: dic 10, 2003 ... Rewrite the function to be more flexible. Also eliminate the MsgBox call in a function - never a good idea. Function mcw( _ textstr As String, _ Optional sep As String = " ", _ Optional matchpat As String = "\b\w{3,}\b", _ Optional caseinsens As Boolean = True _ ) As String '------------------------------------------------------------- Dim wc() As Long, m As Object, mc As Object, re As Object Dim k As Long, n As Long, wcmax As Long, t As String t = textstr Set re = CreateObject("VBScript.RegExp") re.Pattern = matchpat re.IgnoreCase = caseinsens re.Global = True Set mc = re.Execute(textstr) If mc.Count = 0 Then Exit Function ReDim wc(1 To mc.Count) For Each m In mc n = n + 1 k = Len(t) re.Pattern = "\b" & m.Value & "\b" t = re.Replace(t, "") wc(n) = (k - Len(t)) / Len(m.Value) If wc(n) wcmax Then wcmax = wc(n) Next m For k = 1 To n If wc(k) = wcmax Then mcw = mcw & sep & mc.Item(k - 1).Value Next k Erase wc Set m = Nothing Set mc = Nothing Set re = Nothing mcw = Mid(mcw, Len(sep) + 1) End Function Change the second argument to use a word separator other than space. Change the third and fourth arguments to use different definitions of 'word' as specified by VBScript regular expressions. -- Never attach files. Snip unnecessary quoted text. Never multipost (though crossposting is usually OK). Don't change subject lines because it corrupts Google newsgroup archives. |
#10
Posted to microsoft.public.excel.programming
|
|||
|
|||
Most popular word
"Harlan Grove" ha scritto nel messaggio ...
| Rewrite the function to be more flexible. correct | Also eliminate the MsgBox call in a function - never a good idea. 'MsgBox smax & " (" & n & ")" see the original thread in microsoft.public.it.office.excel | Change the second argument to use a word separator other than space. how? " ", ",",";","!","?","."...... | Change the third and fourth arguments to use different definitions of 'word' as specified | by VBScript regular expressions. in VBScript regular expressions there is only one definition of 'word'. parentheses denotes a 'subexpression'. .f |
#11
Posted to microsoft.public.excel.programming
|
|||
|
|||
Most popular word
"Harlan Grove" ha scritto nel messaggio ...
| Change the second argument to use a word separator other than space. oops. sorry! now i understand. .f |
#12
Posted to microsoft.public.excel.programming
|
|||
|
|||
Most popular word
"fernando cinquegrani" wrote...
"Harlan Grove" ha scritto . . . .... | Change the third and fourth arguments to use different definitions of |'word' as specified by VBScript regular expressions. in VBScript regular expressions there is only one definition of 'word'. parentheses denotes a 'subexpression'. I don't mean a word as VBScript would define the term (\b\w+\b), I mean a word as the user wants to define it. My udf defaults to \b\w{3,}\b which is clost to the VBScript definition, but if the user wants only letters, then s/he could use mcw(somestring,,"\b[A-Za-z]{3,}\b"). |
Reply |
Thread Tools | Search this Thread |
Display Modes | |
|
|
Similar Threads | ||||
Thread | Forum | |||
most popular keyword | Excel Worksheet Functions | |||
IWC Aquatimer Watch, Popular Wristwatch | Excel Discussion (Misc queries) | |||
Most popular names in a list | Excel Discussion (Misc queries) | |||
Most popular items in a list | Excel Worksheet Functions |