Posted to microsoft.public.excel.programming
Hi,
To put things in perspective, I analyse market research data. Let's say I have string data from cell A2 to the last cell in column A, and also string data from cell I2 to the last cell in column I.

A sample cell in column A (say cell A2) would contain something like "I use C++, Visual Basic and Win2K Server at my workplace. At home I dabble with C++ and Qualcomm". Basically, column A holds complete sentences, and out of each sentence I am interested in only some of the words. For example, if I'm tracking usage of software tools (and am not interested in operating systems), then only C++ and VB are of interest to me.

This is where column I plays its part. With full help from the newsgroup (Tim Williams, "Generating count of unique words in a cell or cells") I now have a nice module which gives me a count of unique words (the frequency of each word in column A). After running the module, I scan the results and expunge the words which are not of interest in my study. In the example above, the words "I", "use", "Win2K Server", "Qualcomm" etc. would be removed. I then take the remaining list of unique words and paste them into column I (starting from row 2). Hence column I holds a list of RELEVANT words only. The part I have explained so far, I naively refer to as Text Mining.

After this I developed a macro (by copying snippets of code from a variety of sources and using the Recording feature). This macro compares the cells in column A to column I and displays the matching words in columns B through E. What I mean is that the cells in columns B through E list the words which appear in the corresponding cell of column A AND also appear in any cell of column I. Taking the example above, cell B2 would say "C++" and cell C2 would say "VB", because column I would not contain the rest of the words in cell A2.
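The matching step described above could be sketched roughly as follows. This is not the macro mentioned in the post (that one is in the follow-up); the name `ListMatchingKeywords` is made up for illustration, and the sketch assumes a worksheet is active, data starts in row 2, and a case-insensitive substring test is good enough.

```vba
Option Explicit

' Hypothetical sketch: for each sentence in column A, write the keywords
' from column I that occur in it into columns B..E of the same row.
Public Sub ListMatchingKeywords()
    Dim ws As Worksheet
    Dim lastA As Long, lastI As Long
    Dim r As Long, k As Long, outCol As Long
    Dim cellText As String, keyword As String

    Set ws = ActiveSheet   ' assumption: the data sheet is the active worksheet
    lastA = ws.Cells(ws.Rows.Count, "A").End(xlUp).Row
    lastI = ws.Cells(ws.Rows.Count, "I").End(xlUp).Row

    For r = 2 To lastA
        cellText = CStr(ws.Cells(r, "A").Value)
        ' Start from a clean row so "no match" leaves B..E blank.
        ws.Range(ws.Cells(r, "B"), ws.Cells(r, "E")).ClearContents
        outCol = 2   ' column B
        For k = 2 To lastI
            If outCol > 5 Then Exit For   ' columns B..E are already full
            keyword = CStr(ws.Cells(k, "I").Value)
            If Len(keyword) > 0 Then
                ' vbTextCompare makes the search case-insensitive.
                If InStr(1, cellText, keyword, vbTextCompare) > 0 Then
                    ws.Cells(r, outCol).Value = keyword
                    outCol = outCol + 1
                End If
            End If
        Next k
    Next r
End Sub
```

Because `InStr` looks for substrings, a short keyword such as "VB" would also match inside a longer word; whether that matters depends on the keyword list.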
(Cells D2 and E2 would be left blank. Please note that if there were no matches at all, B through E would be left blank.)

Presently the problem is that the text in column A contains TYPOS. Somebody may say in cell A2 "I use Visula Basic" and another person may say in cell A3 "I use Visul Basic". Now I won't get any matches in column B, because column I has "Visual Basic" but not "Visula Basic" or "Visul Basic".

So I want to develop TEXT Scrambler function(s) which can:

a) First function - SCRAMBLE a single pair of letters in a word from column I. That is, if column I has "Visual Basic", then any 2 adjacent non-empty letters are swapped. The function should be capable of giving results like "Visula Basic", "Visual Baisc" and similar permutations of adjacent letters only. I hope that only "one" transformation of adjacent letters at a time would be sufficient. (The first letter need not be permuted, as my understanding is that people don't make typing errors in the first letter.) I don't want to swap the "space" between 2 words; in any one transformation I would just swap 2 adjacent LETTERS of a particular WORD within the STRING.

b) Second function - MISS, or remove, a single letter of a word from column I. That is, if a particular cell in column I has "Visual Basic", then it could give me variants like "Viual Basic", "Visual Baic" etc.

c) Third function - SUBSTITUTE a single letter of a word from column I with any of the other 25 letters of the English alphabet. That is, if column I has "Visual Basic", then it would be able to give me "Vidual Basic", "Visual Nasic" and similar variants.

d) Fourth function - I am being too ambitious, but I would like a function which can combine the effects of a), b) and c) simultaneously, with each transformation applied only once. (Would doing this be disastrous from a computing resources point of view?)

I want all of the above to be FUNCTIONS and not macros.
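Functions a) to c) could be sketched along these lines. All names here (`IsLetter`, `SwapVariants`, `DeleteVariants`, `SubstituteVariants`) are hypothetical, and the sketch makes some assumptions: only the letters A-Z and a-z count as letters, only the swap function protects the first letter of each word (the "Visual Nasic" example suggests substitution may touch it), and duplicate variants are not filtered out.

```vba
Option Explicit

' True when ch (a single character) is one of A-Z or a-z.
Private Function IsLetter(ByVal ch As String) As Boolean
    Dim code As Long
    code = Asc(ch)
    IsLetter = (code >= 65 And code <= 90) Or (code >= 97 And code <= 122)
End Function

' a) One pair of adjacent letters swapped. The first letter of each word
'    stays in place and a space is never moved.
Public Function SwapVariants(ByVal keyword As String) As Collection
    Dim result As New Collection
    Dim i As Long
    Dim a As String, b As String

    For i = 2 To Len(keyword) - 1
        a = Mid$(keyword, i, 1)
        b = Mid$(keyword, i + 1, 1)
        ' The previous character must be a letter (so i is not a word start),
        ' and swapping two identical letters would only repeat the keyword.
        If IsLetter(Mid$(keyword, i - 1, 1)) And IsLetter(a) And IsLetter(b) And a <> b Then
            result.Add Left$(keyword, i - 1) & b & a & Mid$(keyword, i + 2)
        End If
    Next i
    Set SwapVariants = result
End Function

' b) One letter removed (assumption: any letter, including a first letter).
Public Function DeleteVariants(ByVal keyword As String) As Collection
    Dim result As New Collection
    Dim i As Long

    For i = 1 To Len(keyword)
        If IsLetter(Mid$(keyword, i, 1)) Then
            result.Add Left$(keyword, i - 1) & Mid$(keyword, i + 1)
        End If
    Next i
    Set DeleteVariants = result
End Function

' c) One letter replaced by each of the other 25 letters, keeping its case.
Public Function SubstituteVariants(ByVal keyword As String) As Collection
    Dim result As New Collection
    Dim i As Long, k As Long, code As Long, firstCode As Long

    For i = 1 To Len(keyword)
        If IsLetter(Mid$(keyword, i, 1)) Then
            code = Asc(Mid$(keyword, i, 1))
            firstCode = IIf(code <= 90, 65, 97)   ' "A" or "a"
            For k = firstCode To firstCode + 25
                If k <> code Then
                    result.Add Left$(keyword, i - 1) & Chr$(k) & Mid$(keyword, i + 1)
                End If
            Next k
        End If
    Next i
    Set SubstituteVariants = result
End Function
```

Each function returns a `Collection` of strings that can be walked with `For Each`. For "Visual Basic" (11 letters) the substitution function alone yields 11 x 25 = 275 variants, so combining all three transformations as in d), for example by feeding every result of one function into the next, multiplies the counts and would grow into thousands of candidates per keyword.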
I'm aware of the difference between the two only to the extent that, in the case of a function, I can write a statement like:

    If StringSubsetFromColumnA = ScrambledCellofColumnI(..,...) Then
        CellinColumnB = UnscrambledcellofColumnI
    End If

I hope I have been able to express my needs correctly. I'm posting my present unscrambled macro in the follow-up post to this one, as I didn't want to make this post too big. (Is not posting everything in one message the correct practice in newsgroups?)

Thanks a lot,
Hari
India
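The If statement in the post (test a scrambled keyword, then write back the unscrambled one) might be wrapped in a single function like the one below. It is only a sketch: `MatchWithSwap` is a made-up name, it handles just the adjacent-swap case, and it assumes a case-insensitive substring search is acceptable.

```vba
Option Explicit

' Hypothetical sketch: returns keyword when cellText contains it exactly,
' or with one pair of adjacent characters swapped; returns "" otherwise.
Public Function MatchWithSwap(ByVal cellText As String, ByVal keyword As String) As String
    Dim i As Long
    Dim a As String, b As String, candidate As String

    MatchWithSwap = ""
    If Len(keyword) = 0 Then Exit Function

    ' Exact (case-insensitive) hit: no scrambling needed.
    If InStr(1, cellText, keyword, vbTextCompare) > 0 Then
        MatchWithSwap = keyword
        Exit Function
    End If

    For i = 2 To Len(keyword) - 1
        a = Mid$(keyword, i, 1)
        b = Mid$(keyword, i + 1, 1)
        ' Never move a space, and leave the first letter of each word alone.
        If a <> " " And b <> " " And Mid$(keyword, i - 1, 1) <> " " Then
            candidate = Left$(keyword, i - 1) & b & a & Mid$(keyword, i + 2)
            If InStr(1, cellText, candidate, vbTextCompare) > 0 Then
                MatchWithSwap = keyword   ' report the unscrambled keyword
                Exit Function
            End If
        End If
    Next i
End Function
```

Placed in a standard module, it can be called from VBA or typed into a cell as `=MatchWithSwap(A2, $I$2)`. With "I use Visula Basic" in A2 and "Visual Basic" in I2 it returns "Visual Basic"; "I use Visul Basic" is not caught, because that typo is a missing letter rather than a swap.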