Mini-Project: Counting Sort

One of my favourite algorithms I used to teach in CSCI 3110 is Counting Sort. When using only pairwise element comparisons to sort a list of values, \(O(n \lg n)\) is the fastest running time achievable. There exists a lower bound that proves this. However, if the values we're trying to sort are of the right type, we can use other operations than comparisons to determine the correct order. Counting Sort does not use any comparisons at all to sort the values it is given, but it works only for sorting integers or, more generally, for sorting a sequence of records by integer keys. The running time of Counting Sort is proportional to the number of elements to be sorted and the integer range from which the keys are taken. Specifically, if there are \(n\) elements to be sorted and the keys they have are integers between \(l\) and \(r\), then Counting Sort takes \(O(n + r - l)\) time. In particular, it is common that \(l = 0\) and \(r = n\). In this case, Counting Sort sorts the input values in \(O(n)\) time.

Here is how it works: We start by building an array that counts how many elements have we each key. This requires an array of counts for all possible key values between \(l\) and \(r\). Building this array takes \(O(n + r - l)\) time. I'll show the details below. Now, if we want to know where an element with key \(i\) is to be stored in the sorted output, let us focus on the first element \(x\) with key \(i\) for now. Clearly, all elements with keys \(l\) through \(i - 1\) are stored before before \(x\). So the position of \(x\) in the output array is equal to the total number of elements withh keys \(l\) through \(i - 1\). We can compute this value by summing the individual counts of elements with keys \(l\) through \(i - 1\).

Another useful property of a sorting algorithm is stability: Two elements with the same key appear in the same order in the sorted array as they appeared in the input array. This is useful, for example, when using Counting Sort as a building block to implement Radix Sort, which allows us to sort integers in the range between \(0\) and \(n^c\), for any constant \(c\), in \(O(n)\) time. The right implementation of Counting Sort is stable. Here is such an implementation in Python. The function takes the array a of values to be sorted and a function key as arguments. For every value x in a, key(x) returns the key of x by which we are to sort the array elements.

def counting_sort(a, key):
    # Determine the range of keys of the values in the array to be sorted
    min_key, max_key = None, None
    for x in a:
        k = key(x)
        min_key = k if min_key is None else min(k, min_key)
        max_key = k if max_key is None else max(k, max_key)

    # We need a dummy key one less than the actual minimum key
    min_key -= 1

    # Count how often every key occurs in the input
    counts = [0] * (max_key - min_key + 1)
    for x in a:
        counts[key(x) - min_key] += 1

    # Sum the counts to determine for each key value how many elements there are
    # with smaller keys  
    for i in range(0, max_key - min_key):
        counts[i+1] += counts[i]

    # Finally construct the sorted array
    sorted = [None] * len(a)
    for x in a:
        k = key(x)
        sorted[counts[k-1]] = x
        counts[k-1] += 1

    return sorted

Again, the construction of the counts and sorted arrays seems to require many element-by-element array updates, something we cannot do efficiently using immutable arrays. The construction of the counts array could be translated into a fairly natural implementation using accumArray, because it does essentially the same as counting the number of occurrences of the different characters in an input string, which was the example we used to illustrate the use of accumArray. What is more challenging to implement is updating the counts array after placing each element into the sorted array, the line counts[k-1] += 1. This really seems to be something we cannot easily do using immutable arrays.

The point of this project is to illustrate that we can still obtain an equally efficient algorithm by simply looking at the problem differently, through the eyes of a functional programmer. We start by observing that we have lists in Haskell, so there is no need to only count the number of elements in the input with a given key. Instead, we can use accumArray to build an array whose entries are the lists of elements with each key. We then take these lists in order of increasing keys and concatenate them. This obviously gives a sorted list. Here's the code that does this:

IntegerSort.hs

import Data.Array

integerSort :: (a -> Int) -> [a] -> [a]
integerSort key xs = concat $ elems buckets
  where
    -- Determine the range of keys of the values in the array to be sorted.
    minKey = minimum [key x | x <- xs]
    maxKey = maximum [key x | x <- xs]

    -- Construct an array of buckets that group the elements in xs by their
    -- keys.
    buckets = accumArray (flip (:)) [] (minKey, maxKey) [(key x, x) | x <- xs]

Et voilà, we have ourselves a linear-time integer sorting algorithm. First, we determine the minimum and maximum key values as we did in our Python code. The real work is done in the single line that constructs the buckets array. It builds an array with index range between minKey and maxKey. Initially, each entry in this array is the empty list. We then convert each entry x in xs into a pair consisting of its key and x itself. By passing this list to accumArray, each element x is added to the bucket corresponding to its key. In this case, "added" is exactly the right term because the function flip (:) takes the current list representing this bucket and the element x and adds x to the beginning of this list:

GHCi

>>> flip (:) [2,3] 1
[1,2,3]

Once we have the array of buckets, we extract its elements using elems. Since the entries in the array are lists, this is a list of lists. We concatenate these lists in order using concat, and that is our sorted list.

Let's try it out. It seems to work:

GHCi

>>> :l IntegerSort.hs
[1 of 1] Compiling Main             ( IntegerSort.hs, interpreted )
Ok, one module loaded.
>>> import System.Random
>>> gen <- getStdGen
>>> xs = take 20 (randomRs (1,10) gen) :: [Int]
>>> xs
[9,1,4,4,2,10,6,4,10,3,10,6,8,1,2,8,3,6,2,10]
>>> integerSort id xs
[1,1,2,2,2,3,3,4,4,4,6,6,6,8,8,9,10,10,10,10]

In this GHCi session, I first loaded the IntegerSort.hs file to gain access to our integerSort function. I then imported the System.Random module to gain access to a random number generator. gen <- getStdGen assigns the standard random number generator to gen. randomRs (1,10) gen generates an infinite list of integers between 1 and 10. I take the first 20 elements from this list to get the list xs to be sorted. As the next line shows, this list is unsorted and, given that it contains 20 numbers between 1 and 10, contains duplicates. In this case, each number is its own key, so we use the identity function id as the key by which to sort the numbers. The output of integerSort id xs is the list of all values in xs in sorted order.

There's a small bug in our code though, one that the above example wasn't able to reveal because there is no way to distinguish between the first 10 in xs and the last 10 in xs. Is our sorting algorithm really stable? To check this, let's convert each element in xs into a pair that annotates each entry with its position in xs:

GHCi

>>> ps = zip xs [1..]
>>> ps
[(9,1),(1,2),(4,3),(4,4),(2,5),(10,6),(6,7),(4,8),(10,9),(3,10),(10,11),(6,12),(8,13),(1,14),
(2,15),(8,16),(3,17),(6,18),(2,19),(10,20)]

Now let's sort the elements in ps by their first components. Since the pairs in ps are sorted by their second components, a stable sorting algorithm should arrange all the pairs with the same first component by increasing second components.

GHCi

>>> integerSort fst ps
[(1,14),(1,2),(2,19),(2,15),(2,5),(3,17),(3,10),(4,8),(4,4),(4,3),(6,18),(6,12),(6,7),(8,16),
(8,13),(9,1),(10,20),(10,11),(10,9),(10,6)]

Hmm, all pairs with the same keys (the same first components) are ordered by decreasing second component. In other words, our algorithm arranges all elements with the same key in the reverse order from the order that they appeared in in the input. Why's that? Consider what the algorithm does for the four pairs with key 10. In ps, these values occur in the order [(10,6),(10,9),(10,11),(10,20)]. The construction of buckets starts with the bucket corresponding to key 10 being empty initially, and it inspects the elements in ps in order. Thus, it first sees the pair (10,6) and prepends it to the bucket corresponding to key 10. This bucket is now the list [(10,6)]. Next it encounters the pair (10,9). Prepending in to the bucket corresponding to key 10 produces the list [(10,9),(10,6)]. You see what is happening. After processing the element (10,11), we have the bucket [(10,11),(10,9),(10,6)], and finally the bucket [(10,20),(10,11),(10,9),(10,6)]. Since integerSort produces its final result by concatenating the buckets in order, the sublist of elements with key 10 is the list [(10,20),(10,11),(10,9),(10,6)]. How do we fix this? We need to be careful about this if we want to maintain the efficiency of our algorithm.

An obvious, and really bad, suggestion is to append each element to its bucket instead of prepending it. Let's try this:

IntegerSort.hs

import Data.Array

integerSort :: (a -> Int) -> [a] -> [a]
integerSort key xs = concat $ elems buckets
  where
    -- Determine the range of keys of the values in the array to be sorted.
    minKey = minimum [key x | x <- xs]
    maxKey = maximum [key x | x <- xs]

    -- Construct an array of buckets that group the elements in xs by their
    -- keys.
    buckets = accumArray (++) [] (minKey, maxKey) [(key x, [x]) | x <- xs]

The only change is in the highlighted line, where we turn each value x into a singleton list [x], and accumArray appends this singleton list to the bucket with key key x using the list concatenation operator (++). This achieves the appending of x to the end of the bucket. It surely does the right thing:

GHCi

>>> :r
[1 of 1] Compiling Main             ( IntegerSort.hs, interpreted )
Ok, one module loaded.
>>> xs = [9,1,4,4,2,10,6,4,10,3,10,6,8,1,2,8,3,6,2,10]
>>> ps = zip xs [1..]
>>> integerSort fst ps
[(1,2),(1,14),(2,5),(2,15),(2,19),(3,10),(3,17),(4,3),(4,4),(4,8),(6,7),(6,12),(6,18),(8,13),
(8,16),(9,1),(10,6),(10,9),(10,11),(10,20)]

All elements with the same key appear in the same order in the output as they appear in the input. So why is this a bad implementation? The problem is that list concatenation takes linear time in the length of the first of the two lists to be concatenated:

(++) :: [a] -> [a] -> [a]
[]     ++ ys = ys
(x:xs) ++ ys = x : (xs ++ ys)

Thus, if all elements in the input have the same key, the algorithm now takes quadratic time! For each of the last \(n/2\) elements, the bucket already contains at least \(n/2\) elements, so appending each of these elements to the bucket takes linear time.

So we really want to prepend each element to its bucket, because that's a constant-time operation. We can ensure that the elements in each bucket occur in the order in which they occur in the output by either processing the input elements from last to first, by reversing the input list:

IntegerSort.hs

import Data.Array

integerSort :: (a -> Int) -> [a] -> [a]
integerSort key xs = concat $ elems buckets
  where
    -- Determine the range of keys of the values in the array to be sorted.
    minKey = minimum [key x | x <- xs]
    maxKey = maximum [key x | x <- xs]

    -- Construct an array of buckets that group the elements in xs by their
    -- keys.
    buckets = accumArray (flip (:)) [] (minKey, maxKey) [(key x, x) | x <- reverse xs]

GHCi

>>> :r
[1 of 1] Compiling Main             ( IntegerSort.hs, interpreted )
Ok, one module loaded.
>>> xs = [9,1,4,4,2,10,6,4,10,3,10,6,8,1,2,8,3,6,2,10]
>>> ps = zip xs [1..]
>>> integerSort fst ps
[(1,2),(1,14),(2,5),(2,15),(2,19),(3,10),(3,17),(4,3),(4,4),(4,8),(6,7),(6,12),(6,18),(8,13),
(8,16),(9,1),(10,6),(10,9),(10,11),(10,20)]

Or we can construct the reversed buckets as our original implementation did and then reverse them as part of concatenating the buckets:

IntegerSort.hs

import Data.Array

integerSort :: (a -> Int) -> [a] -> [a]
integerSort key xs = concatMap reverse $ elems buckets
  where
    -- Determine the range of keys of the values in the array to be sorted.
    minKey = minimum [key x | x <- xs]
    maxKey = maximum [key x | x <- xs]

    -- Construct an array of buckets that group the elements in xs by their
    -- keys.
    buckets = accumArray (flip (:)) [] (minKey, maxKey) [(key x, x) | x <- xs]

GHCi

>>> :r
[1 of 1] Compiling Main             ( IntegerSort.hs, interpreted )
Ok, one module loaded.
>>> xs = [9,1,4,4,2,10,6,4,10,3,10,6,8,1,2,8,3,6,2,10]
>>> ps = zip xs [1..]
>>> integerSort fst ps
[(1,2),(1,14),(2,5),(2,15),(2,19),(3,10),(3,17),(4,3),(4,4),(4,8),(6,7),(6,12),(6,18),(8,13),
(8,16),(9,1),(10,6),(10,9),(10,11),(10,20)]

Either of these two implementations takes linear time because it takes constant time to add each element to its bucket and reversing a list takes linear time, as can be seen from the following implementation:

reverse :: [a] -> [a]
reverse = go []
  where
    go rev []     = rev
    go rev (x:xs) = go (x:rev) xs