Java 8: ConcurrentHashMap Atomic Updates
Here's a great look at what Java 8 brought to ConcurrentHashMaps, including close looks at the methods at your disposal and their performance impacts.
Join the DZone community and get the full member experience.Join For Free
Whilst doing some refactoring on updates to ConcurrentHashMap values, I came across these great articles ...
... and was inspired to try to develop the theme a bit further.
Pre-Java 8, we had various ways to try to perform atomic operations on the values of Concurrent collections as described by Dima.
For example, a simple counter:
Improved Methods in Java 8
- computeIfAbsent(): If the value is threadsafe and can be safely updated outside the method, or you intend to synchronize on the value whilst updating it, or if you just want to be certain of getting a new or existing value without having to check for null.
- compute(): If the value is not threadsafeand must be updated inside the method with a remapping function to ensure the entire operation is atomic. This gives you the most control over the computation, but also the responsibility to handle the possibility that there is no existing value inside your remapping function.
- merge(): Like compute(), you provide a remapping function to be performed on the existing value — if any. You also supply an initial value to be used if there is no existing value. This is a convenience because, with compute(), you have to, in the remapping function, handle the possibility of there being no existing value. Here, though, you do not have access to the key in the remapping function, unlike with compute().
The Method Signatures
Unlike the existing methods putIfAbsent() and replace(), instead of taking objects for the values, the arguments represent a Function or BiFunction, respectively, which can be implemented by a lambda or a method reference or even an inner class. This could be useful, as it means you don't have to instantiate an object that may well not be needed. You pass in some code that will only be executed if needed.
All three methods (plus computeIfPresent) are guaranteed to be atomic in ConcurrentHashMap. The way that works is that the method synchronizes on the new or existing Entry in the HashTable whilst executing.
It's important to notice that with computeIfAbsent(), a compound operation is not necessarily atomic because updates are performed outside of the method.
I try to think of them like, say, 'volatile' and AtomicInteger. Volatile guarantees visibility, but not atomicity. You won't get data races but could still get a race condition. If atomicity is required, then you use AtomicInteger, etc. to avoid race conditions
Another consideration is — are the values truly independent, or do they depend on an existing value or an external factor like 'order of arrival'?
This warning comes in the Oracle documentation:
The entire method invocation is performed atomically. Some attempted update operations on this map by other threads may be blocked while computation is in progress, so the computation should be short and simple, and must not attempt to update any other mappings of this Map.
computeIfAbsent() takes a key and a mapping function and returns a value:
computeIfAbsent() is simplest to use.
compute() takes a key and a value and a remapping function and returns a value:
compute() is more flexible because you have access to the key and the existing value in the remapping function and can update the existing value — or create a new one — inside the method atomically.
merge() takes a key and an initialize value to be returned if the key is not present and a remapping function.
Here, the initial value is combined with the existing value to calculate the new value.
All three methods should return either the existing or a new value to the calling code —therefore, you don't have to check for null. It's important to remember that ConcurrentHashMap is actually a HashTable, not a HashMap. Therefore, your function cannot return a null value to the map.
That would throw a NullPointerException.
merge() is a bit tricky in this respect. If the remapping function is executed (because the key exists) but returns null, then the existing value is removed and null returned to the caller. So, it shouldn't throw NullPointerException unless the key does not exist and the initialize value is also null
You don't have to be intending to actually do any computation. Here, I just want to make sure I get an Exceutor on which to stream market data snapshots for dollar/loonie whether it already exists or not and without having to check for null values. (This assumes mappings are never removed by my code.):
Here is a sample of the results running five different versions of an update operation through a Monte Carlo simulator for 1 million trials. We are simulating the tossing of a fair coin. One would expect the results to be almost exactly 50:50.
- computeIfAbsent().add() with a DoubleAdder
- compute() with a Double
- computeIfAbsent() to get a Double, increment it and put it back in the map
- computeIfAbsent() to get a Double, increment it and replace it in the map - in a loop
Look at the huge difference in the variance between updates one, two, four, and five (similar) and update three. To me, that suggests that there has been interleaving of threads in update three and the operation was not atomic. Also, the probabilities no longer add up to unity.
There doesn't seem to be any significant difference in the performance regarding the time taken to perform the updates, but I wondered about memory usage. Remember, one of the advantages of passing lambdas or method references to a method whose arguments are functional interfaces is that you are passing code, not objects. Therefore, if the code is not required, then no object needs to be instantiated.
I couldn't see any real difference in memory consumption or garbage collection between merge() and compute() in VisualVM, but then Doubles are pretty small objects. There did seem to be significantly more Doubles being created using merge(), though. Perhaps 25% more.
Opinions expressed by DZone contributors are their own.