What is the fastest sort?
Assuming that we have to make a
comparison. Fair assumption is it necessary?
A decision tree is a binary tree in which each node represents an if statement and the children represent the branches of the from the if.
node: if (predicate)
/ \
predicate = T predicate = F
if (next predicate) if (next predicate)
The bottom layer is all the possible out comes.
The bottom layer of the sort decision tree has n! elements. Why?
Each internal node makes a single comparison
Note that log(n!) ≥ log{ n/2n/2} = (n/2) log (n/2)
Therefore comparison based searches are at least n lg n or Ω(n/2 log(n/2)
Can we beat O(n lg n)? Yes but we have to abandon comparison. Say we are sorting integers and know the Maximum key, N. Also we do not care about how much space we will use. Then we could but integers of the unsorted sequence into the array at the rank of the integer value. Then we can unload the array back into the sequence. The time of the sort is O(n+N).
Algorithm bucketSort(S)
Input: Sequence , S, of items with integer keys between 0 and N-1
Output: Sequence S sorted
Let B be and array size N initially empty
for each item x in S do // n iterations of the loop size of S
let k be the key of x
remove x form S and insert it into B[k]
for i = 0 to N-1 do // N iterations the size of B
remove x from B[i] and insert at the end of S
O(n+N)
B can be buckets of link-list for equal key items.
Preserve the original order of equal key items. These sorts are important for sorting at multiple levels for example lexicographically sorting.
Note: Heap is not stable while merge sort can be made stable. Depending how the pivot is choosen quicksort can be made stable.
Bucket-sort can be made stable called radix-sort. Say each item has two keys (k, l) we want to sort on, for example all two letter words or two digits numbers, the k key is to have preference over the l key. Which key do we sort over first, k or l? Answer: Sort most important key last, meaning k.
Radix sort is an efficient sort algorithm for lexicographically sorting. Multiple Bucket sorts on the different keys.