View Single Post
  #2   Report Post  
Posted to microsoft.public.excel.programming
Dave Ring Dave Ring is offline
external usenet poster
 
Posts: 20
Default Sorting Algorithm required!

Sorry for the terse code. Radix sorts are not easy to understand. The
array base is indeed set to 1. LO and HI should be the indices of the
first and last strings to sort. The LS, RS and DS arrays make up a
stack for storing sublists, and the maximum depth of the stack is
(arbitrarily) set to 500. You could set it deeper, but if you're
usually sorting < 100,000 strings, I doubt you'll need to.

Why is there a stack? Radix sorts process keys one digit at a time. If
the keys are all the same length, you can use a backward radix sort that
sorts the whole list based on the last character, then on the next to
last, and so on. This works, because each sorting pass rearranges based
on a more significant digit. Unfortunately, working backwards is a mess
with variable length strings. But if we work forward, starting with the
most significant digit, we can't sort the whole list by the next digit
-- it's less significant and will mess up the order we've already
created. So we have to break up the list into sublists and put the
extra ones on the stack while we deal with the first one. At each
successive digit, we create and stack sublists, until the lists are
short enough (<25 keys) that it's more efficient to finish them with
insertion sort.

If you want to understand radix sorting, the classic paper is
"Engineering Radix Sort" by McIlroy, Bostic and McIlroy; it should be
easy to find on the web. If you just want to use it, give it a shot and
if it doesn't work right, let me know.

Dave

agarwaldvk wrote:
Hi Dave

I am just going through the Radix sort algorithm and I am bit stumped
here!

Could you please the following couple of points before I go ahead with
my attempts to understand how the thing works :-

1. Is this on the assumption that the option base for arrays is to be
set to 1.
2. Are the LO and HI variables the indices for the first and last
strings to sort. In other words are these supposed to be 1 and the
count of the strings to sort.
3. What is the idea of setting LS, RS and DS arrays from 1 to 500? Is
there any specific reason to set them from 1 to 500? Further, are these
always supposed to be between 1 and 500 or are these supposed to be from
1 to the count of strings to be sorted?


You got to be a genius. Gee, this code looks so small but boy is it
complicated or maybe I am probably not good enough???


One last thing, is the colon (2 vertical dots - couldn't show it here
since it converts to the smiley symbol) a valid statement delimiter in
VBA code?


Best regards


Deepak Agarwal