BigList: a Scalable High-Performance List for Java
Join the DZone community and get the full member experience.
Join For FreeAs memory gets cheaper and cheaper, our applications can keep more data readily available in main memory, or even all as in case of in-memory databases. To make real use of the growing heap memory, appropriate data structures must be used. Interesting enough, there seem to be no specialized implementations for lists - by far the most used collection.
This article introduces BigList, a list designed for handling large collections where large means that all data still fit completely in the heap memory. The article will show the special requirements for handling large collections, how BigList is implemented and how it compares to other list implementations.
1. Requirements
What are the special requirements we need to handle large collections efficiently?
- Memory:
- Sparing use of memory: The list should need little memory for its own implementation so memory can be used for storing application data.
- Specialized versions for primitives: It must be possible to store common primitives like ints in a memory saving way.
- Avoid copying large data blocks: If the list grows or shrinks, only a small part of the data must be copied around, as this operation becomes expensive and needs the same amount of memory again.
- Data sharing: copying collections is a frequent operation which should be efficiently possible even if the collection is large. An efficient implementation requires some sort of data sharing as copying all elements is per se a costly operation.
- Performance:
- Good performance for normal operations like reading, storing, adding or removing single elements.
- Great performance for bulk operations like adding or removing multiple elements.
- Predictable overhead of operations, so similar operations should need a similar amount of time without excessive worst case scenarios.
If an implementation does not offer these features, some operations will not only be slow for really large collections, but will becomse just not feasible because memory or CPU usage will be too exhaustive.
Introduction to BigList
BigList is a member of the Brownies Collections library which also includes GapList, the fastest list implementation known. GapList is a drop-in replacement for ArrayList, LinkedList, or ArrayDequeue and offers fast access by index and fast insertion/removal at the beginning and at the end at the same time.
GapList however has not been designed to cope with large collections, so adding or removing elements can make it necessary to copy a lot of elements around which will lead to performance problems.
Also copying a large collection becomes an expensive operation, both in term of time and memory consumption. It will simply not be possible to make a copy of a large collections if not the same amount of memory is available a second time. And this is a common operation as you often want to return a copy of an internal list through your API which has no reference the original list.
BigList addresses both problems. The first problem is solved by storing the collection elements in fixed size blocks. Add or remove operations are then implemented to use only data from one block. The copying problem is solved by maintaining a reference count on the fixed size blocks which allows to implement a copy-on-write approach. For efficient access to the fixed size blocks, they are maintained in a specialized tree structure.
2. BigList Details
Each BigList instance stores the following information:
- Elements are stored in in fixed-size blocks
- A single block is implemented as GapList with a reference count for sharing
- All blocks are maintained in a tree for fast access
- Access information for the current block is cached for better performance
The following illustration shows these details for two instances of BigList which share one block.
2.1 Use of Blocks
Elements are stored in in fixed-size blocks with a default block size of 1000. Where this default may look pretty small, it is most of the time a good choice because it guarantees that write operation only need to move few elements. Read operations will profit from locality of reference by using the currently cached block to be fast.
It is however possible to specify the block size for each created BigList instance. All blocks except the first are allocated with this fixed size and will not grow or shrink. The first block will grow to the specified block size to save memory for small lists.
If a block has reached its maximum size and more data must be stored within, the block needs to be split up in two blocks before more elements can be stored. If elements are added to the head or tail of the list, the block will only be filled up to a threshold of 95%. This allows inserts into the block without the immediate need for split operations.
To save memory, blocks are also merged. This happens automatically if two adjacent blocks are both filled less than 35% after a remove operation.
2.2 Locality of Reference
For each operation on BigList, the affected block must be determined first. As predicted by locality of reference, most of the time the affected block will be the same as for the last operation.
The implementation of BigList has therefore been designed to profit from locality of reference which makes common operations like iterating over a list very efficient.
Instead of always traversing the block tree to determine the block needed for an operation, lower and upper index of the last used block are cached. So if the next operation happens near to the previous one, the same block can be used again without need to traverse the tree.
2.3 Reference Counting
To support a copy-on-write approach, BigList stores a reference count for each fixed size blocks indicating whether this block is private or shared.
Initially all lists are private having a reference count of 0, so that modification are allowed. If a list is copied, the reference count is incremented which prohibits further modifications. Before a modification then can be made, the block must be copied decrementing the block's reference count and setting the reference count of the copy to 0. The reference count of a block is then decremented by the finalizer of BigList.
3. Benchmarks
To prove the excellence of BigList in time and memory consumption, we compare it with some other List implementations. And here are the nominees:
Type |
Library |
Description |
BigList |
brownie-collections |
List optimized for storing large number of elements. Elements stored in fixed size blocks which are maintained in a tree. |
GapList |
brownie-collections |
Fastest list implementation known. Fast access by index and fast insertion/removal at end and at beginning. |
ArrayList |
JDK |
Maintains elements in a single array. Fast access by index, fast insertion/removal at end, but slow at beginning. |
LinkedList |
JDK |
Elements stored using a linked list. Slow access by index. Memory overhead for each element stored. |
TreeList |
commons-collections |
Elements stored in a tree. All operations are not really fast, but there are no very slow operations. Memory overhead for each element stored. |
FastTable |
javolution |
Elements stored in a "fractal"-like data structure. Good performance and use of memory. However no bulk operations and collection does not shrink. |
3.1 Handling Objects
In the first part of the benchmark, we compare memory consumption and performance of the different list implementations.
Let's first have a look at the memory consumption. The following table shows the bytes used to hold a list with 1'000'000 null elements:
BigList |
GapList |
ArrayList |
LinkedList |
TreeList |
FastTable |
|
32 bit |
4'298'466 |
4'021'296 |
4'861'992 |
8'000'028 |
18'000'028 |
4'142'892 |
64 bit |
8'544'254 |
8'042'552 |
9'723'964 |
16'000'044 |
26'000'044 |
8'222'988 |
We can see that BigList, GapList, ArrayList, and FastTable only add small overhead to the stored elements, where as Linkedlist needs twice the memory and TreeList even more.
Now to the performance. Here are the results of 9 benchmarks which have been run for each of the 6 candidates with JDK 8 in a 32 bit Windows environment and a list of 1'000'000 elements:
The result table can be read as follows:
- the fastest candidate for each test has a relative performance indicator of 1
- the value for the other candidates indicate how many times they have been slower, so a factor of 3 means that this implementation was 3 times slower than the best one
- The different factor are colored like this:
- factor 1: green (best)
- factor <5: blue (good)
- factor <25: yellow (moderate)
- factor >25: red (poor)
If we look the benchmark result, we can see that the performance of BigList is best for all expect two benchmarks. The only moderate result is produces in getting elements in a totally random order. This could be expected as there is no locality of reference which can be exploited, so for each access, the block tree must be traversed to find the correct block.
Luckily this is a rare use case in real applications. And the benchmark "Get local" shows that performance is back to good as soon as elements next to each other must be retrieved - as it is the case if we iterate over a range.
3.2 Handling Primitives
In the second part of the benchmark, we want see how big the savings are if we use a data structure specialized for storing primitives compared to strong wrapped objects. For this reason, we compare IntBigList and BigList<Integer>.
The following table shows memory needed to store 1'000'000 integer values:
BigList<Integer> |
IntBigList |
|
32 bit |
16'298'454 |
4'534'840 |
64 bit |
28'544'234 |
4'570'432 |
Obviously it is easy to save a lot of memory. In a 32 bit environment, IntBigList just needs 25% percent of memory, in a 64 bit environment only 14%! These figures become plausible if you recall that a simple object needs 8 bytes in a 32 bit, but already 16 bytes in a 64 bit environment, where as a primitive integer value always only needs 4 bytes.
The measurable performance gain is not so impressive, it is something below 10% for simple get operations and something above 10% for add and remove operations. These numbers show that the JVM is impressively fast in creating wrapper objects and boxing and unboxing primitive values. We must however also consider that each created object will need to be garbage collected once and therefore adds to the total load of the JVM.
4. Summary
BigList is a scalable high-performance list for storing large collections. Its design guarantees that all operations will be predictable and efficient both in term of performance and memory consumption, even copying large collections is tremendous fast.
Benchmarks haven proven this and shown that BigList outperform other known list implementations.
The library also offers specialized implementations for primitive types like IntBigList which save much memory and provide superior performance.
BigList for handling objects and the specializations for handling primitives are part of the Brownies Collections library and can be downloaded from http://www.magicwerk.org/collections.
Opinions expressed by DZone contributors are their own.
Comments