NEW:A note on specifications.
PK2 -- SML Assignment 2
Introduction and History
Radix sort was originally used for mechanically sorting punch
cards. The principle used is to classify the given numbers first according
to the least significant digit, then according to the second-least
significant digit, and so on. For this to work, one needs to use a
stable classification method: the order between two equal
numbers must be preserved.
The textbook by Cormen et al.,
Introduction to
Algorithms describes radix sort. For the second edition, look on
pages 170 to 173. For other editions, you'll simply have to look up
Radix Sort in the table of contents.
You need not read up anything about counting sort,
nor use a sorting algorithm to classify the numbers
into the bins. You can choose any number of bins,
corresponding to the base in which your algorithm views the given
numbers, but indicate that base in your specification. The objective
of this assignment is to assess the subtleties in efficiently
implementing radix sort for lists and properly examining its actual
efficiency.
Lab Instructions
- Read the relevant section of the textbook.
- Implement radix sort for sorting where keys are compared in
non-decreasing order. That is, any element is less than or equal to
its successor, if any. The
algorithm should take a list
L of natural numbers with maximum d digits in the
chosen base:
radixSort L d : int list -> int -> int list
such that the running time for n elements in the list is
Θ(n).
Example: radixSort [9821, 5674, 7890, 1234,
638] 4 = [638, 1234, 5674, 7890, 9821], assuming the
algorithm has 10 bins, because all the n=5 given numbers
have maximum d=4 digits in base 10.
- Compare radix sort against merge-sort or quick-sort implementation via
experiments on the same random lists of 100, 200, ..., 1000
elements. To generate a random list, use Random.rangelist
(minimalElement, maximalElement) (numberOfElements,
Random.newgen()).
The interval between the smallest and largest
possible values of a random list should be at least 100 times larger than the
number of elements of that list, otherwise there will be too many duplicates in the generated lists. Many duplicates would mean that one examines an extreme
case rather than a general, random case. To measure a running time, use the
function timeIt in the file
ranTime.sml,
which takes a function f and an argument a and returns the
running time for f applied to a..
- Do measurements for different sizes of input data and present results in
one diagram with curves. (Tip: use
gnuplot.)
- Are the results as expected? Why / Why not?
Hand-in
Your report complying with
ethics rules
of the course should be handed in latest on Friday the 20th of April
before 17:00 via the online
course manager system, and should contain:
- SML functions for question 2 from above, in an sml file, running under
Moscow ML version 2.01, and commented, complying with
coding
convention.
- Answer, in English, for question 3 from above, in a pdf, txt, or html file. Code for quick sort and merge sort can be found here