DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Refcards Trend Reports Events Over 2 million developers have joined DZone. Join Today! Thanks for visiting DZone today,
Edit Profile Manage Email Subscriptions Moderation Admin Console How to Post to DZone Article Submission Guidelines
View Profile
Sign Out
Refcards
Trend Reports
Events
Zones
Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
  1. DZone
  2. Data Engineering
  3. AI/ML
  4. Algorithm of the Week: Radix Sort

Algorithm of the Week: Radix Sort

Stoimen Popov user avatar by
Stoimen Popov
·
Mar. 20, 12 · Interview
Like (1)
Save
Tweet
Share
37.46K Views

Join the DZone community and get the full member experience.

Join For Free

Algorithms always depend on the input. We saw that general purpose sorting algorithms like insertion sort, bubble sort and can be very efficient in some cases and inefficient in others. Indeed, and bubble sort are considered slow, with a best-case complexity of O(n2), but they are quite effective when the input is fairly sorted. So, when you have a sorted array and you add some “new” values to the array you can sort it quite effectively with insertion sort. On the other hand, quicksort is considered one of the best general purpose sorting algorithms, but while it’s a great algorithm when the data is randomized, it’s practically as slow as bubble sort when the input is almost or fully sorted.

Now we see that the effectiveness of algorithms depends greatly on the input. For input that is almost sorted, insertion sort may be preferred instead of quicksort, which is generally a faster algorithm.

Because the input is so important for an algorithm's efficiency, we may ask if there are any sorting algorithms that are faster than O(n.log(n)), which is the average-case complexity for merge sort and quicksort. And the answer is yes there are faster, linear complexity algorithms, that can sort data faster than quicksort, merge sort and heapsort. But there are some constraints!

Everything sounds great but we can’t sort any particular data with linear complexity, so the question is what rules must the input follow in order to be sorted in linear time?

Such an algorithm that is capable of sorting data in linear O(n) time is radix sort and the domain of the input is restricted – it must consist only of integers.

Overview

Let’s say we have an array of integers which is not sorted. Because it consists only of integers and because array keys are integers in programming languages we can implement radix sort.

First for each value of the input array we put the value of “1” on the key-th place of the temporary array as explained on the following diagram.





If there are repeating values in the input array, we increment the corresponding value in the temporary array. After “initializing” the temporary array with one pass (with linear complexity) we can sort the input.


Implementation

Implementing radix sort is in fact very easy, which is great. The thing is that old-school programming languages weren’t very flexible and we needed to initialize the entire temporary array. That leads to another problem – we must know the interval of values from the input. Fortunately, modern programming languages and libraries are more flexible so we can initialize our temporary array even if we don’t know the interval of input values, as in the example bellow. Indeed, PHP is flexible enough to build-up arrays in the memory without knowing their size in advance.

$list = array(4, 3, 5, 9, 7, 2, 4, 1, 6, 5);
 
function radix_sort($input)
{
    $temp = $output = array();
	$len = count($input);
 
    for ($i = 0; $i < $len; $i++) {
		$temp[$input[$i]] = ($temp[$input[$i]] > 0) 
			? ++$temp[$input[$i]]
			: 1;
    }
 
    ksort($temp);
 
    foreach ($temp as $key => $val) {
		if ($val == 1) {
			$output[] = $key; 
		} else {
			while ($val--) {
				$output[] = $key;
			}
        }
    }
 
    return $output;
}
 
// 1, 2, 3, 4, 4, 5, 5, 6, 7, 9
print_r(radix_sort($list));

The problem is that PHP needs ksort – which is completely foolish as we’re trying to sort an array using “another” sorting method, but to overcome this you must know the interval of values in advance and initialize a temporary array with 0s, as in the example bellow.

define(MIN, 1);
define(MAX, 9);
$list = array(4, 3, 5, 9, 7, 2, 4, 1, 6, 5);
 
function radix_sort(&$input)
{
    $temp = array();
	$len = count($input);
 
	// initialize with 0s
    $temp = array_fill(MIN, MAX-MIN+1, 0);
 
    foreach ($input as $key => $val) {
    	$temp[$val]++;
    }
 
    $input = array();
    foreach ($temp as $key => $val) {
	if ($val == 1) {
		$input[] = $key;
	} else {
		while ($val--) {
			$input[] = $key;
		}
	}
    }
}
 
// 4, 3, 5, 9, 7, 2, 4, 1, 6, 5
var_dump($list);
 
radix_sort(&$list);
 
// 1, 2, 3, 4, 5, 5, 6, 7, 8, 9
var_dump($list);

Here the input is modified during the sorting process and it’s used as a result.

Complexity

The complexity of radix sort is linear, which in terms of omega means O(n). That is a great benefit in performance compared to O(n.log(n)) or even worse with O(n2) as we can see in the following chart.

Why Use Radix Sort


1. It’s fast

Radix sort is very fast compared to other sorting algorithms as we saw on the diagram above. This algorithm is very useful in practice because in practice we often sort sets of integers.


2. It’s easy to understand and implement

Even a beginner can understand and implement radix sort, which is great. You need no more than a few loops to implement it.

Why NOT using radix sort


1. Works only with integers

If you’re not sure about the input, you're better off not using radix sort. We may think that our input consists only of integers and we can go for radix sort, but what if in the future someone passes floats or strings to our routine.


2. Requires additional space

Radix sort needs additional space – at least as much as the input.

Final Words

Radix sort is restricted by the input’s domain, but I must say that in practice there are tons of cases where only integers are sorted. This is when we get some data from the db based on primary keys – typically primary in database tables are integers as well. So practically there are lots of cases of sorting integers, so radix sort may be one very, very useful algorithm and it is so cool that it is also easy to implement.

Merge sort Algorithm Data structure Sorting

Published at DZone with permission of Stoimen Popov, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Popular on DZone

  • Mr. Over, the Engineer [Comic]
  • Architecture and Code Design, Pt. 2: Polyglot Persistence Insights To Use Today and in the Upcoming Years
  • Distributed Stateful Edge Platforms
  • Implementing Infinite Scroll in jOOQ

Comments

Partner Resources

X

ABOUT US

  • About DZone
  • Send feedback
  • Careers
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 600 Park Offices Drive
  • Suite 300
  • Durham, NC 27709
  • support@dzone.com
  • +1 (919) 678-0300

Let's be friends: