Home |
Search |
Today's Posts |
#1
![]()
Posted to microsoft.public.excel.misc
|
|||
|
|||
![]() I have two HUGE customer lists. The first one (DATA Sheet) is about 6000 rows And contains varies information (zip, address, name, etc) the second sheet (WORK) Is about 35000+ long and is similar to DATA sheet. I am to compare EACH of the 35000+ customer of sheet WORK individually to see if there is a match with ANY customer(s) in sheet DATA. Sheet DATA is simply a comparison sheet and WORK is the only that I highlight (the row of the respective customer) if I get a match. I could do this manually but it would take a very long time. The good news is that I know (empirically) that over 95% of all the customers in WORK will NOT match to ANY of the customers in DATA. So if I can filter out the junk, then I can quickly narrow the number of matches I need to do by hand and thus save a lot of time. My first attempt was to do a search/filter by zip code. I have a macro that goes to each row in WORK and searches for any matching zips in DATA. It makes sense that if a certain row/customer in WORK does not match with ANY of the customers in DATA then there is no match (and thus no need to check name, address, other fields etc) A macro basically eliminated all the customers that do not match by zip. Those that do match by zip in sheet WORK have entire row highlighted a certain color. I successfully finished this first step of filtering/narrowing the work to be done by hand searching …(see below for macro code)… However I am still left with PLENTY of customers! And of these I also know (empirically) that MOST (90%) of the matched by zip will NOT match by name. So if I can get a macro that can search each highlighted row (already matched by zip) And within this set see which ones match by name (1st char of name, see below..), then I can further narrow down my search significantly to the point where I can search the rest by hand very quickly. For example, John Doe in 76016 is NOT Brian Dawson in 76016. So even though they match by zip there do not match by name and are not the same customer. So a further filter (if someone can help me with writing a search by name macro) will eliminate this because it will see that John and Brian are not the same name. However many customers names are not consistent. Sometimes they are in Acronyms in one sheet and fully spelled out in another sheet. So I would actually want a search by first character of the customer name field. To do this I would have to access the substring manipulation of Excel Macro. Again I am not familiar with this and a quick search of the board has not lead to any case similar to mine. I want a search/match by first character of the customer names field in order to be on the safe side and avoid false negatives. For example, James Doe in 75063 is NOT John Doe in 75063, but this second search by name (1st char of name) will not pick up on that, BUT that’s OK, because even a filter out by A-Z beginning character will have reduced the number to search by hand by a factor of 26. There are several ways to accomplish this. One is to search within the set of customers already matched by zip code. Another is to rewrite the zip code match macro (see below..) and alter it so that when it goes around to match each individual row/customer in WORK to customers in DATA by zip, it ALSO at the same time simultaneously matches them by first characters of the customer names field. Ex: a pseudo code would be …. If(currentWorkRow.zip = current DataRow.zip && currentWorkRow.Name(substring(firstchar)) = currentDataRow.Name(substring(firstchar)) THEN MATCH ELSE NO MATCH AND GO TO NEXT ROW IN WORK UNTIL REACH END I’m not sure how to implement the substring char in Excel Macro. I asked around and got some hints but it doesn’t really help me because I’m not fluent in writing macros and its above my head (the hints are listed below) I think in pseudo code it would work as something like this: for each row in sheet2 { if currentrow is white/nonhighlighted, leave alone and go to next row if currentrow is highlighted { for each row in sheet1 { if (sheet1.currentrow.zipcode = sheet2.currentrow.zipcode && sheet1.currentrow.firstcharname = sheet2.currentrow.firstcharname ) then currentrow is highlighted light yellow } } } Not sure if the logical is intact... I realize this is a lot, but I don’t even know where to start. If someone could let me in the right direction (ie give links, threads, websites that I could read up on and learn how to implement this search that I want to code) that would be Great! Thanks, Bo -- bxc2739 ------------------------------------------------------------------------ bxc2739's Profile: http://www.excelforum.com/member.php...o&userid=32538 View this thread: http://www.excelforum.com/showthread...hreadid=536410 |
Thread Tools | Search this Thread |
Display Modes | |
|
|
![]() |
||||
Thread | Forum | |||
Need to Improve Code Copying/Pasting Between Workbooks | Excel Discussion (Misc queries) | |||
Matching Multiple Sheets | Excel Worksheet Functions | |||
matching contents | Excel Discussion (Misc queries) | |||
Matching contents | Excel Discussion (Misc queries) | |||
Vlook up for matching data in two seperate sheets | Excel Discussion (Misc queries) |