Posted to microsoft.public.excel.misc
![]() "Harlan Grove" wrote: If plant abbreviations were case-insensitive, there'd be 26^3 = 17576 different possible plant IDs. So the total number of distinct combinations of plant ID and year would be 17576 times the number of years. 3 years puts you well over 45000 possible distinct combinations. I realize that you didn't know this, but there are only 40 three-character plant abbreviations and only 25 model years. These plant abbreviations are not random three-letter combinations, either. Each stands for the city it is in, such as LAX is the abbreviation for Los Angeles Airport. Copy and paste from where? What are the designated lists? The Copy & Paste is coming from a pre-established data entry process (also on an Excel spreadsheet) where the copied values are themselves summaries. The sheets cannot be linked due to metaphysical restrictions. I realize that Copy & Paste adds a difficult twist to the task, but it is the established approved business process for the time being which cannot be changed. My task is to validate the *results* of this Copy & Paste activity. Letting users make corrections to tens of thousands of entries is one of the ways I'd define user-hostile. Say there were anomalies in 10% of rows, and it took users 10 seconds on average to correct each anomaly. For 45000 rows, that's total 45000 seconds, or 12.5 hours (not including breaks). Drop the anomaly rate to 1%, and that's still 1 hour 15 minutes. Your math is good, but the vast majority of it has already transpired prior to my context. The original spreadsheet has Data Validation and other (simpler) error checking tools in it that eliminate 99% of the issues before they ever get to my spreadsheet. The only thing I am faced with are a)simple typos, or b) the occasional invalid calulcated value due to one or more invalid input values. Believe me, the work isn't nearly as bad as your original math would seem to make it. Besides, if there were indeed so many errors to deal with, then the point about calling out the errors more efficiently for the user would be moot. That said, this sort of task (generating thousands of distinct combinations) is MUCH, MUCH better done by computers than humans. How would your users correct these anomalies? Just randomly choose combinations not found elsewhere? If so, that'd be MUCH SIMPLER for the computer to do. Exactly, and that's my task at hand. Why, in heaven's name, that Microsoft doesn't believe that results in cells need to be validated is simply beyond me. After all, the whole purpose of a spreasheet is to provide the user with the confidence of knowing that a 'computer' performed the task better than s/he ever could dream of. Duplicate entries are a wholly common issue in a vast array of human activities, ranging from holiday mailing lists to indexed database ID's. The fact that in order to check for duplicates in calculated cells a user is required to write programming code is, to me, ludicrous. IOW, both columns B and C blank should be skipped, but if either column isn't blank, blanks in the other SHOULD trigger error checking. The case of blanks, for me, is a non-issue. The data that is being represented in my context is that of a database record, not so much a numerical formula. (It's just that Excel naturally requires us to write formulas to make it 'work.') A blank in a row - a *complete* blank, thus, is not an error. It simply means that there were no criteria to provide the target calculation. How would your users be expected to 'fix' the nonunique entries? 
> How would your users be expected to 'fix' the nonunique entries? If
> random replacement of values is OK, FAR BETTER to let the computer do it
> than burden your users with the task.

The 'computer,' in this sense, is Microsoft Excel, which is a far cry from a powerful relational database. Granted, this is how the customer is attempting to use it, so a happy medium has to be reached. If the Excel routine discovers a duplicate in a cell drawn from a column of values that was originally part of an input spreadsheet nearly 300 columns wide in terms of relational attributes, only the user can determine, from the information s/he sees, how to fix the error. A duplicate could very easily mean that the *original* member of the list is incorrect and the so-called 'duplicate' is actually the authentic entry. Or the mere fact that there is a duplicate at all could tell the user that something is wrong with values entered on the original sheet which aren't even ported to my worksheet, but are still the cause of the problem. And on and on... Yes, I do wish that we hadn't gone so far down this path in Excel, but the scripting tool that was created over three years ago to import this data was written against Excel, and modifying it is, unfortunately, out of the question.

Thanks,
Greg
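PS: For flagging the duplicates themselves, a helper-column formula along these lines is the sort of thing I have in mind. It's only a sketch: it assumes the pasted plant/year keys land in column A starting at row 2, and the 1000-row range is merely illustrative:

=IF(COUNTIF($A$2:$A$1001,A2)>1,"DUPLICATE","")

Filled down beside the data, it calls out every nonunique key and leaves it to the user to decide which copy is the authentic one.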