This article illustrates the general time - memory tradeoff with the example of different hash table implementations in Java. The more memory a hash table takes, the faster each operation (e. g. getting a value by key or putting an entry) is performed.
Hash maps with
values were tested. Memory measure is relative usage over theoretical minimum. For example, 1000 entries of int key and value take at least (4 (size of int) + 4) * 1000 = 8000 bytes. If the hash map implementation takes 20 000 bytes, it's memory overuse is (20 000 - 8000) / 8000 = 1.5. Each implementation was benchmarked on 9 different load levels (load factors). On each load level, each map was filled with 10 numbers of entries, logariphmically evenly distributed bewteen 1000 and 10 000 000 (to study caching effects). Then, for the same implementation and load level, memory metrics and average operation throughputs are averaged independently, over 3 smallest sizes (small sizes), 3 largest sizes (large sizes) and all 10 sizes from 1000 to 10 000 000 (all sizes). Implementations:
- Higher Frequency Trading Collections (HFTC)
- High Performance Primitive Collections (HPPC)
- fastutil collections
- Goldman Sachs Collections (GS)
- Trove Collections
- Mahout Collections
java.util.HashMapas a reference
Get value by key (successful)
Only looking at these charts, you can suppose that HFTC, Trove and Mahout on the one size, fastutil, HPPC and GS on the another use the same hash table algorithm. (In fact, it is not quite true.) Sparser hash table on average performs less lookups during key search, therefore less memory reads, therefore the operation finishes earlier. Notice, that on small sizes the largest maps are the fastest for all implementations, but on large and all sizes there isn't visible progress starting from memory overuse ~4. That's because when the total memory taken by the map goes beyond CPU cache capacity, cache misses become more often when the map is getting larger. This effect compensates algorithic trend.
Update (increment) value by key
Update operation behaves pretty similar to get(). fastutil wasn't benchmarked, because there aren't fairly performant method for this task in it's API.
Put an entry (key was absent)
In this case, maps were gradually filled in with the entries from the size 0 to the target size (1000 - 10 000 000). Rehash shouldn't occur, because maps were constructed with the target size provided. For small sizes, plots still looks like hyperbolas, but I can't explain so dramatic change on large sizes and differences between HFTC and other primitive implementations.
Internal iteration (forEach)
Iteration is getting slower with memory usage growth. Interesting thing about external iteration: for all open hash table implementations throughtput depends only memory usage, not even on load factor (which differs for implementations for the same memory usage). Also, forEach throughput don't depend on open hash table size.
External iteration (via iterator or cursor)
External iteration performance is more varying than internal, because there is more freedom for optimization. HFTC and Trove employ own iteration interfaces, other libraries use standard
Raw benchmark results from which the pictures were built with a link to the benchmarking code and information about the runsite in description.