View Single Post
  #8   Report Post  
Posted to microsoft.public.excel.programming
Tom Ogilvy Tom Ogilvy is offline
external usenet poster
 
Posts: 27,285
Default Developing TEXT scrambler kind of FUNCTIONS in Excel

http://www.source-code.biz/snippets/vbasic/4.htm

has some soundex code.

--
Regards,
Tom Ogilvy

"Robin Hammond" wrote in message
...
Hari,

Here's a thought on something you might have a look at, although I have

only
seen it described in SQL's books online rather than trying to use it

myself.
Not sure if it will work for you but it might be worth investigating.

There are two functions: SoundEx and Difference, defined in T-SQL that

might
help. This is from BOL:
SOUNDEX converts an alpha string to a four-character code to find
similar-sounding words or names. The first character of the code is the
first character of character_expression and the second through fourth
characters of the code are numbers. Vowels in character_expression are
ignored unless they are the first letter of the string. String functions

can
be nested.

e.g. in SQL
select soundex('visual basic'), soundex('viswl basic')
returns an identical result

select difference('visual basic','viswl basic')
returns 4, which apparently means they are as similar as possible.

There's a function on John Walkenbach's site written by Richard Yanco that
purports to convert text to a Soundex value that you might be able to test
for use in Excel.

And, strangely enough, if you do a google groups search on Soundex for the
*Excel* groups, there appear to be a few discussions of less complex, but
similar challenges to the one you are trying to crack.

HTH, and let us know if it works,

Robin Hammond
www.enhanceddatasystems.com


"Tim Williams" <saxifrax at pacbell dot net wrote in message
...

Hari,

I'd second the opinions of the other posters: it seems a better approch

to
maintain a list of common mis-spellings and use replace() on your

original
data.

Tim


"Peter T" <peter_t@discussions wrote in message
.. .
Hi Hari,

You may well eventually develop some amazing routine that appears to do

what
you want. But I find it difficult to imagine how it will ever be

foolproof,
leading to a false sense of confidence and false results.

Have you tried working with the Spell checker. Even manually I don't

suppose
it would take too long to run through 1,000 rows, particularly once it
has
been "trained" to your topic (can be automated to some extent). There

is
also AutoCorrect, I notice it does nothing with pasted cells until you
F2/enter. Not sure if it can be automated - haven't tried.

Not the answer you are looking for - just a thought.

Regards,
Peter T

"Hari Prasadh" wrote in message
...
Hi Tom,

2 aspects to it

a) While generating a unique list, I get the frequency count of each
word's
appearance also. Please note a single person / response in column A

could
mention more than 1 tool. Now suppose in column A I have responses

from
3000
people and if I see the frequency then for the software tools

question
I
might get "Visual basic" having let's say 100 appearances, which is
reasonable number for it to be added to the unique list. Now, since

there
will be typos so I might get a count of "Visul basic" being 2 and

count
of
"Viswl Basic" being 1 and similarly .... lots and lots of such FALSE
instances of "Visual Basic" which have very low counts. Now this
happens
for
Each software tool. I cannot use these false instances as part of
unique
list as it wouldnt serve my purpose. As, once am through with the

mapping
I
would assign a numeric code to "Visual Basic" and load the data in

SPSS
(Stats software) and run some statistics on it. If I upload the false
instances of visual basic also then they would play havoc with my
stats.

b) The point made in a) gets compounded because every time its a new
market
research study. Like if am tracking software tools today, tomorrow I

might
be on to tracking responses to a question like " How would you

describe
the
Denim area at this Superstore". So every time it will be preparation

of
a
new and unique list and accordingly the column A also changes. So, I
cannot
invest my time in manipulating column I by having a "messy kind of

list".
That might offset the whole point of automation.

--
Thanks a lot,
Hari
India


"Tom Ogilvy" wrote in message
...
It seems to me, if you build your list of interest from the text of

the
entries, you will already have a list of misspellings. It would be

far
easier to construct a cross walk table from that than what you are
asking
(I
would think).

--
Regards,
Tom Ogilvy


"Hari Prasadh" wrote in message
...
Hi,

This is the code which I have written for automatic mapping of
data.

If some sample data (10 rows and 2 columns ) is required, please

tell
me
and
I would be happy to paste that as well in the future post.

Option Explicit
Public basearray() As String
Public arrWords As Variant

Sub readingarrayofuniquewords()

Dim p As Integer
Dim BaseArrLength As Integer

Range("i65536").Select
Selection.End(xlUp).Select
p = Selection.Row - 2

ReDim basearray(p)
For BaseArrLength = 0 To p
basearray(BaseArrLength) = Cells(BaseArrLength + 2, "i")
Next BaseArrLength

End Sub

Sub Upcoding()

Dim R As String
Dim z As Integer
Dim g As Integer
Dim hu As Integer
Dim hg As Integer
Dim msgboxresult As String
Dim tempwithspace As String
Dim tempwithoutspace As String
Dim flag As Integer

Application.ScreenUpdating = False

msgboxresult = MsgBox("Columns B through F will be cleared" &

vbLf
&

"
Press
no if you want to exit out of the macro", vbYesNo, " Warning")
If msgboxresult = vbNo Then Exit Sub

Range("B2:G65536").Select
Selection.ClearContents


Range("A65536").Select
Selection.End(xlUp).Select
R = ActiveCell.Row

Call readingarrayofuniquewords

For z = 2 To R

flag = 0
Splitwords ActiveSheet.Range("A" & z).Value
Range("B" & z).Select
'here put the function for combining elements of array


For hu = UBound(arrWords) To LBound(arrWords) Step -1

For hg = hu To UBound(arrWords) Step 1

If ActiveCell.Column 7 Then
flag = 1
Exit For
End If

tempwithspace =
MergingElementsOfArrayWithSpace(arrWords,
hu, hg)
tempwithoutspace =
MergingElementsOfArrayWithoutSpace(arrWords, hu, hg)


For g = LBound(basearray) To UBound(basearray)


If UCase(tempwithspace) = UCase(basearray(g))

Then


ActiveCell.Value = basearray(g)
' u have to put the logic for more than

4
then
exit
loop below
ActiveCell.Offset(0,

1).Range("A1").Select
Exit For

ElseIf UCase(tempwithoutspace) =
UCase(basearray(g))
Then

ActiveCell.Value = basearray(g)
' u have to put the logic for more than

4
then
exit
loop below
ActiveCell.Offset(0,

1).Range("A1").Select
Exit For

End If

Next g

Next hg

If flag = 1 Then
Exit For
End If

Next hu

Next z

Application.ScreenUpdating = True

End Sub

Sub Splitwords(sText As String)

Dim x As Integer
Dim arrReplace As Variant

arrReplace = Array(vbTab, ":", ";", ".", ",", "-", Chr(10),
Chr(13))
For x = LBound(arrReplace) To UBound(arrReplace)
sText = Replace(sText, arrReplace(x), " ")
Next x

arrWords = Split(Application.WorksheetFunction.Trim(sText), "

")

End Sub

Function MergingElementsOfArrayWithoutSpace(concatarray As

Variant,
hi
As
Integer, ti As Integer) As Variant
Dim tmp As String
Dim f As Integer

tmp = ""

'see whether the range ti - hi to ti is correct or if it is _
to be increased by 1.

For f = ti - hi To ti
tmp = tmp & concatarray(f)
Next f

MergingElementsOfArrayWithoutSpace =
Application.WorksheetFunction.Trim(tmp)

End Function

Function MergingElementsOfArrayWithSpace(concatarray As Variant,

hi
As
Integer, ti As Integer) As Variant
Dim tmp As String
Dim f As Integer

tmp = ""

'see whether the range ti - hi to ti is correct or if it is _
to be increased by 1.

For f = ti - hi To ti
tmp = tmp & concatarray(f) & " "
Next f

MergingElementsOfArrayWithSpace =
Application.WorksheetFunction.Trim(tmp)

End Function


--
Thanks a lot,
Hari
India