ConcurrentHashMap isn't always enough

Dima Leah

Dec. 08, 16 · Interview


When Java developers write a new class with a Map field that several threads access simultaneously, they usually try to solve the synchronization issues involved by simply making the map an instance of ConcurrentHashMap.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class Foo {

	private Map<String, Object> theMap = new ConcurrentHashMap<>();

	// the rest of the class goes here...

}

In many cases this works fine, because the contract of ConcurrentHashMap guarantees that each individual read or write on the map is thread-safe. But there are cases where that is not enough, and the developer ends up with race conditions that are hard to predict and even harder to find, debug, and fix.

Let's look at the next example:

public class Foo {

	private Map<String, Object> theMap = new ConcurrentHashMap<>();

	public Object getOrCreate(String key) {
		Object value = theMap.get(key);
		if (value == null) {
			value = new Object();
			theMap.put(key, value);
		}
		return value;
	}

}

Here we have a "simple" getter, getOrCreate(String key), which takes a key and returns the value associated with that key in theMap. If there is no mapping for the key, the method creates a new value, inserts it into theMap, and returns it.

So far so good. But what happens when two (or more) threads call the getter with the same key while theMap has no mapping for it? In that case we may hit a race condition:
Suppose thread t1 enters the method, executes theMap.get(key), and reaches the if (value == null) check. Its value is null. At this point thread t2 enters the method and also reaches the check. Since t1 has not put anything yet, t2's value is null as well. Both threads therefore enter the if block and execute value = new Object() and theMap.put(key, value), creating two different Objects. Each thread returns its own instance, and the map keeps only whichever one was put last. This violates the programmer's mistaken assumption that with ConcurrentHashMap "everything is synchronized" and that two threads will therefore get the same value for the same key.
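The interleaving described above can be forced deterministically. The following is only a sketch, not part of the original class: CheckThenActRace, racyGetOrCreate, and runTwoThreads are hypothetical names, and the CyclicBarrier is added purely to hold both threads between the get and the put.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;

class CheckThenActRace {

    // Same get / null-check / put logic as getOrCreate, with a barrier inserted
    // after the get so that both threads read the map before either one writes.
    static Object racyGetOrCreate(Map<String, Object> map, String key, CyclicBarrier bothHaveRead) {
        Object value = map.get(key);
        try {
            bothHaveRead.await(); // wait until the other thread has also called get()
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        if (value == null) {
            value = new Object();
            map.put(key, value);
        }
        return value;
    }

    // Runs two threads through the racy method and returns what each one received.
    static Object[] runTwoThreads() {
        Map<String, Object> map = new ConcurrentHashMap<>();
        CyclicBarrier barrier = new CyclicBarrier(2);
        Object[] results = new Object[2];
        Thread t1 = new Thread(() -> results[0] = racyGetOrCreate(map, "k", barrier));
        Thread t2 = new Thread(() -> results[1] = racyGetOrCreate(map, "k", barrier));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return results;
    }

    public static void main(String[] args) {
        Object[] r = runTwoThreads();
        System.out.println("same instance? " + (r[0] == r[1])); // prints "same instance? false"
    }
}
```

In real code there is no barrier, so the bad interleaving shows up only occasionally, which is exactly what makes this kind of bug hard to reproduce.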

To solve this issue we can synchronize the entire method, so that the check and the insert execute as one atomic step with respect to other callers:

public class Foo {

	private Map<String, Object> theMap = new ConcurrentHashMap<>();

	public synchronized Object getOrCreate(String key) {
		Object value = theMap.get(key);
		if (value == null) {
			value = new Object();
			theMap.put(key, value);
		}
		return value;
	}

}

But this is a bit ugly, and it uses the Foo instance's monitor, which may hurt performance if other methods in this class are also synchronized, since they all contend for the same lock. A common rule of thumb is also to avoid synchronized methods wherever possible.
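One intermediate option, shown here only as a sketch, is to guard the map with a dedicated private lock so that getOrCreate no longer shares the instance monitor with other synchronized methods. The names FooWithPrivateLock, mapLock, and unrelatedWork are hypothetical and do not appear in the original class.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class FooWithPrivateLock {

    // Hypothetical dedicated lock: only code that touches theMap uses it.
    private final Object mapLock = new Object();

    private final Map<String, Object> theMap = new ConcurrentHashMap<>();

    public Object getOrCreate(String key) {
        // The check and the insert run under one lock, so they cannot interleave
        // with another getOrCreate call on the same instance.
        synchronized (mapLock) {
            Object value = theMap.get(key);
            if (value == null) {
                value = new Object();
                theMap.put(key, value);
            }
            return value;
        }
    }

    // Locks on "this", so it does not contend with getOrCreate.
    public synchronized void unrelatedWork() {
        // other state of the class would be updated here
    }
}
```

This still serializes every caller of getOrCreate, and it is only correct if every compound operation on theMap takes the same lock.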

A much better approach is to use computeIfAbsent(K key, Function mappingFunction), added to the Map interface in Java 8, which runs atomically in ConcurrentHashMap's implementation:

public class Foo {

	private Map<String, Object> theMap = new ConcurrentHashMap<>();

	public Object getOrCreate(String key) {
		return theMap.computeIfAbsent(key, k -> new Object());
	}

}

The atomicity of computeIfAbsent(..) ensures that only one new Object is created and put into theMap for a given key, and that this exact instance is returned to every thread calling getOrCreate with that key.
Here the code is not only correct, it is also cleaner and much shorter.
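To check this behavior under contention, one can count how often the mapping function runs. The sketch below assumes a hypothetical harness (ComputeIfAbsentDemo, run, and the counters are illustrative names) that releases several threads at once on the same key.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

class ComputeIfAbsentDemo {

    // Returns { number of mapping-function calls, number of distinct instances seen }.
    static int[] run(int threadCount) {
        Map<String, Object> theMap = new ConcurrentHashMap<>();
        AtomicInteger creations = new AtomicInteger();
        Set<Object> seen = ConcurrentHashMap.newKeySet(); // Object uses identity equality
        CountDownLatch start = new CountDownLatch(1);

        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                try {
                    start.await(); // line all threads up before touching the map
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                seen.add(theMap.computeIfAbsent("k", k -> {
                    creations.incrementAndGet();
                    return new Object();
                }));
            });
            threads[i].start();
        }

        start.countDown(); // release all threads at once
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new int[] { creations.get(), seen.size() };
    }

    public static void main(String[] args) {
        int[] r = run(8);
        System.out.println("creations=" + r[0] + ", distinct instances=" + r[1]);
        // prints "creations=1, distinct instances=1"
    }
}
```

Because other updates to the same key may block while the computation runs, the mapping function should be short and should not try to modify the same map.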

The point of this example is to illustrate a common pitfall: blindly relying on ConcurrentHashMap as a magical synchronized data structure that is thread-safe and will therefore solve all our concurrency issues when multiple threads work on a shared Map. ConcurrentHashMap is indeed thread-safe, but that only means each individual read or write operation on the map is safe to call concurrently. A sequence of such operations, like a get followed by a put, is not atomic as a whole. Sometimes that is not enough for a concurrent environment, and we need something that guarantees atomic execution. A good practice is to use one of the atomic methods implemented by ConcurrentHashMap, e.g. computeIfAbsent(..), putIfAbsent(..), etc.
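For comparison, the same getter can be written with putIfAbsent(..), which returns the previously mapped value, or null if there was none. This is a minimal sketch; PutIfAbsentDemo is a hypothetical name, and the method takes the map as a parameter only to keep the example self-contained.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class PutIfAbsentDemo {

    static Object getOrCreate(ConcurrentMap<String, Object> map, String key) {
        Object candidate = new Object(); // created eagerly, possibly discarded
        Object existing = map.putIfAbsent(key, candidate);
        // null means our candidate was inserted; otherwise another value was already mapped
        return existing != null ? existing : candidate;
    }

    public static void main(String[] args) {
        ConcurrentMap<String, Object> map = new ConcurrentHashMap<>();
        Object first = getOrCreate(map, "k");
        Object second = getOrCreate(map, "k");
        System.out.println("same instance? " + (first == second)); // prints "same instance? true"
    }
}
```

Unlike computeIfAbsent(..), this variant builds the candidate value on every call even when it ends up unused, so it fits best when creating the value is cheap.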


Opinions expressed by DZone contributors are their own.
