# Algorithm of the Week: Radix Sort

Algorithms always depend on the input. We saw that general purpose sorting algorithms like insertion sort, bubble sort and quicksort can be very efficient in some cases and inefficient in others. Indeed, insertion and bubble sort are considered slow, with a best-case complexity of O(n^{2}),
but they are quite effective when the input is fairly sorted. So, when
you have a sorted array and you add some “new” values to the array you
can sort it quite effectively with insertion sort. On the other hand,
quicksort is considered one of the best general purpose sorting
algorithms, but while it’s a great algorithm when the data is randomized,
it’s practically as slow as bubble sort when the input is almost or
fully sorted.

Now we see that the effectiveness of algorithms depends greatly on the input. For input that is almost sorted, insertion sort may be preferred instead of quicksort, which is generally a faster algorithm.

Because the input is so important for an algorithm's efficiency, we may ask if there are any sorting algorithms that are faster than O(n.log(n)), which is the average-case complexity for merge sort and quicksort. And the answer is yes there are faster, linear complexity algorithms, that can sort data faster than quicksort, merge sort and heapsort. But there are some constraints!

Everything sounds great but we can’t sort any particular data with linear complexity, so the question is what rules must the input follow in order to be sorted in linear time?

Such an algorithm that is capable of sorting data in linear O(n) time is radix sort and the domain of the input is restricted – it must consist only of integers.

## Overview

Let’s say we have an array of integers which is not sorted. Because it consists only of integers and because array keys are integers in programming languages we can implement radix sort.

First for each value of the input array we put the value of “1” on
the key-th place of the temporary array as explained on the following
diagram.

If there are repeating values in the input array, we increment the
corresponding value in the temporary array. After “initializing” the
temporary array with one pass (with linear complexity) we can sort the
input.

## Implementation

Implementing radix sort is in fact very easy, which is great. The thing is that old-school programming languages weren’t very flexible and we needed to initialize the entire temporary array. That leads to another problem – we must know the interval of values from the input. Fortunately, modern programming languages and libraries are more flexible so we can initialize our temporary array even if we don’t know the interval of input values, as in the example bellow. Indeed, PHP is flexible enough to build-up arrays in the memory without knowing their size in advance.

$list = array(4, 3, 5, 9, 7, 2, 4, 1, 6, 5); function radix_sort($input) { $temp = $output = array(); $len = count($input); for ($i = 0; $i < $len; $i++) { $temp[$input[$i]] = ($temp[$input[$i]] > 0) ? ++$temp[$input[$i]] : 1; } ksort($temp); foreach ($temp as $key => $val) { if ($val == 1) { $output[] = $key; } else { while ($val--) { $output[] = $key; } } } return $output; } // 1, 2, 3, 4, 4, 5, 5, 6, 7, 9 print_r(radix_sort($list));The problem is that PHP needs ksort – which is completely foolish as we’re trying to sort an array using “another” sorting method, but to overcome this you must know the interval of values in advance and initialize a temporary array with 0s, as in the example bellow.

define(MIN, 1); define(MAX, 9); $list = array(4, 3, 5, 9, 7, 2, 4, 1, 6, 5); function radix_sort(&$input) { $temp = array(); $len = count($input); // initialize with 0s $temp = array_fill(MIN, MAX-MIN+1, 0); foreach ($input as $key => $val) { $temp[$val]++; } $input = array(); foreach ($temp as $key => $val) { if ($val == 1) { $input[] = $key; } else { while ($val--) { $input[] = $key; } } } } // 4, 3, 5, 9, 7, 2, 4, 1, 6, 5 var_dump($list); radix_sort(&$list); // 1, 2, 3, 4, 5, 5, 6, 7, 8, 9 var_dump($list);Here the input is modified during the sorting process and it’s used as a result.

## Complexity

The complexity of radix sort is linear, which in terms of omega means
O(n). That is a great benefit in performance compared to O(n.log(n)) or
even worse with O(n^{2}) as we can see in the following chart.

## Why Use Radix Sort

1. It’s fast

Radix sort is very fast compared to other sorting algorithms as we
saw on the diagram above. This algorithm is very useful in practice
because in practice we often sort sets of integers.

### 2. It’s easy to understand and implement

Even a beginner can understand and implement radix sort, which is great. You need no more than a few loops to implement it.

## Why NOT using radix sort

1. Works only with integers

If you’re not sure about the input, you're better off not using radix sort. We
may think that our input consists only of integers and we can go for
radix sort, but what if in the future someone passes floats or strings
to our routine.

### 2. Requires additional space

Radix sort needs additional space – at least as much as the input.

## Final Words

Radix sort is restricted by the input’s domain, but I must say that in practice there are tons of cases where only integers are sorted. This is when we get some data from the db based on primary keys – typically primary in database tables are integers as well. So practically there are lots of cases of sorting integers, so radix sort may be one very, very useful algorithm and it is so cool that it is also easy to implement.

Published at DZone with permission of {{ articles[0].authors[0].realName }}, DZone MVB. (source)

Opinions expressed by DZone contributors are their own.

{{ nComments() }}