For the conclusion, just skip to the last section.
One of the more recent Android ADT updates added Android Lint. Lint will check your Android project for simples things that you can change to improve your app. Lint notifies you about things like:
- Unused resources
- Simple code changes to improve performance
- Inefficiencies in XML layouts such as nested weights or unused layouts
And a few other things. When I ran Lint the other day, I found one other performance tweak that was suggested by Lint. I was told that “for maps where keys are of type integer, it’s typically more efficient to use the Android SparseArray API”.
But when should A SparseArray be Used?
After checking out the optimization a bit more, I couldn’t find a ton of data that says when a SpraseArray should be used over a Hashmap. Should it be used in every case where a Hashmap could have been used? What about if the array is small? How about if the range of values is under 500?
These are the kinds of questions I tried to answer with the experiment below. But before I jump right into the experiment details, it’s good to have some background information on Hashmaps and SparsArrays.
What is a Hashmap?
A Hashmap is a data structure that holds “key value” pairs. It has a constant time, or O(1) time requirement for retrieving an element from the array. This means that it takes the same time to pull any element from the array, regardless of the size. This is possible by using a hashing function, which generates the array indices, given the input key.
A Hashmap is used (in this case) to map Integer keys to an object.
What is a SparseArray?
To understand what a SparseArray is, it is helpful to understand how a standard array works. An array is really just a continuous block of memory. The size of this block corresponds to how many elements the array can hold. For example, if a object requires 10 bytes of storage, and the array is created that can hold 5 of these objects, then the array has 50 byte memory block. Each of the objects in the array is given a 10 byte segment. By using the array index, you are actually moving through the array’s memory, 10 bytes at a time.
A problem comes into play when you want specific indices to map to specific objects, and they don’t cover every indices. For example, if you want an object at indices 4 and 5, but you don’t care about indices 1, 2, or 3, then you wind up wasting memory with an array.
Why use a SparseArray over a Hashmap?
It seems that the SparseArray is a more efficient solution than using a Hashmap to map Integers to objects. The theory is that the SparseArray can add and retrieve elements quicker than a HashMap, in this case, by removing the hashing function processing time.
As you can see from the SparseArray page, and the Lint warning, the statements here are pretty vague.
Is a SparseArray better than a Hashmap all the time? If not, when does a HashMap become more efficient than the SparseArray?
To compare the two data structures, I designed a quick and simple test (and app). The bench marks gauge how long it takes to:
- Insert x random values (with a range from 0 to y)
- Retrieve x random values (with a range from 0 to y)
To handle this experiment, I wrote a simple Android app that contains a basic UI, and a few Async Tasks to do the above experiment (available on Github soon).
For the experiment, I tested two different variables:
- The array size from 1, 10, 100,…, to 1,000,000
- The range of numbers within the array from 1, 10, 100, …, to 100,000,000
The app was run on a Samsung Galaxy 2 with Android version 2.3.4.
The SparseArray is a very quick data structure for sizes of 10,000 and less. If you are pulling data from your array considerably more than adding to it, you can get away with a size of 100,000. It starts to slow down a bit with the jump to 1,000, but nothing too surprising.
The Executive Summary
The Hashmap and the SparseArray have very comparable performance up to the 10,000 size mark. The 100,000 size mark is where some interesting results are found. For retrieving elements within a 100,000 size array, performance deteriorates quickly once the range of integers exceeds 1,000.
So it seems the Hashmap and the SparseArray are very similar for data structure sizes under 1,000, with any range you would like. The performance gains at a size of 1,000 would likely be very small, if any.
Both data structures have difference performances when the size has been increased to the 10,000 mark. At this level (in yellow in the above graphs), the Hashmap has greater performance with adding objects, while the SparseArray has greater performance when retrieving objects.
At a size of 100,000 (in red in the above graphs), the Hashmap loses performance very quickly, at the 100 range mark. However the SpraseArray is still able to retrieve elements in this range with comparable efficiency. It also breaks down after the 1,000 range limit has been hit through.
So with all the fun stuff above, I walk away with a few new theories:
- In Android (and mobile programming in general) having more than 10,000 objects in a data structure in memory seems to be an upper limit.
- For smaller arrays (under 1,000) the performance of the SparseArray and the Hashmap are very comparable
- For arrays larger than 10,000, if you are adding more than retrieving elements, try the SparseArray.
- For arrays larger than 100,000, give the SparseArray a shot… or fix your application design so you don’t have any insanely large in-memory data structures.