March 13, 2018
Using three concurrent map implementations as examples, we will see that we face either low scalability, the risk of deadlocks, or unnecessary garbage creation. Locks lead to low scalability when we use only one, and to the risk of deadlock when we use multiple locks. Compare and swap operations and immutable data structures, on the other hand, lead to unnecessary object creation, which causes more garbage collection cycles.
So which technique should we choose? If we have a small data structure, we can use an immutable class. And if our data structure is not performance critical, we can use a single lock. In all other cases, I would start with multiple locks, since they are easier to implement than compare and swap operations. But let us start with the easiest case, a hash map synchronized with a single lock.
As the first example we look at a concurrent map created by Collections.synchronizedMap:
private Map<Integer,Integer> map = Collections.synchronizedMap(new HashMap<Integer,Integer>());
In our examples we use the compute and put methods to update our concurrent maps:
public void update12() {
    map.compute(1, (key, value) -> {
        map.put(2, 1);
        return value;
    });
}
When we look at the implementation of the compute method, we see that it is a wrapper which puts a synchronized block around the underlying map:
public V compute(K key, BiFunction<? super K, ? super V, ? extends V> remappingFunction) {
    synchronized (mutex) {
        return m.compute(key, remappingFunction);
    }
}
The synchronized block acts as a lock. Since we only use one monitor, the object in the mutex variable, we have only one lock. The lock makes sure that only one thread at a time accesses the underlying map. So threads never see an inconsistent state, and only one thread at a time modifies the data structure. But it also leads to low scalability, since the threads must wait for each other.
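To see this scalability limit in action, here is a minimal sketch, with a thread pool setup of my own that is not part of the original example. Four threads call update12 in a loop; every call is safe, but all of them serialize on the single mutex:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

ExecutorService pool = Executors.newFixedThreadPool(4);
for (int t = 0; t < 4; t++) {
    pool.submit(() -> {
        for (int i = 0; i < 100_000; i++) {
            update12(); // safe, but every call waits for the single lock
        }
    });
}
pool.shutdown();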
As the next example, let us look at a map implementation using one lock per hash array element, the ConcurrentHashMap.
private Map<Integer,Integer> map = new ConcurrentHashMap<Integer,Integer>();
The ConcurrentHashMap stores all its data in a hash array. When we update the map, the ConcurrentHashMap uses the nodes stored in those array elements as locks. Both the put and the compute method use these locks. Let us again look at the compute method to see how this is done:
public V compute(K key,
                 BiFunction<? super K, ? super V, ? extends V> remappingFunction) {
    // Parameter checks omitted
    int h = spread(key.hashCode()); // Spreads higher bits of hash to lower
    V val = null;
    int delta = 0;
    int binCount = 0;
    for (Node<K,V>[] tab = table;;) {
        Node<K,V> f; int n, i, fh;
        if (tab == null || (n = tab.length) == 0)
            tab = initTable();
        else if ((f = tabAt(tab, i = (n - 1) & h)) == null) {
            // Handling of missing elements omitted
        }
        else if ((fh = f.hash) == MOVED)
            tab = helpTransfer(tab, f);
        else {
            synchronized (f) {
                // Statements omitted
                // The remappingFunction gets called in this block
            }
        }
        // Incrementing of hash map size and loop exit omitted
    }
    return val;
}
Here again a synchronized block acts as the lock, this time on the variable f. But now multiple locks are used, since the synchronization happens on multiple objects held by the variable f. The variable f contains the return value of the method tabAt: the hash array element at index (n - 1) & h. Since the length n of the hash array is a power of two, (n - 1) & h is the hash code of the key modulo the size of the hash array.
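A small worked example of this index calculation, with numbers made up for illustration:

int n = 16;              // length of the hash array, always a power of two
int h = 37;              // example spread hash code
int index = (n - 1) & h; // 15 & 37 = 5, the same as 37 % 16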
So which lock is used depends on the key. If we add a new method which updates the values in reverse order, like the following update21 method, we can create a deadlock:
public void update21() {
    map.compute(2, (key, value) -> {
        map.put(1, 1);
        return value;
    });
}
update21 locks array element 2, the key of its compute call, and requests a lock on element 1, the key of its put call. But update12 holds the lock on array element 1, acquired by its compute call, and requests a lock on array element 2 for its put call. If we call both methods in parallel, each of the two threads requests a lock held by the other thread: a deadlock.
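The following sketch shows how such a deadlock can be reproduced. The setup is my own, and since the deadlock depends on thread timing, it may take several runs to occur:

Map<Integer, Integer> map = new ConcurrentHashMap<>();
map.put(1, 0); // make sure both hash array elements contain a node
map.put(2, 0);

Thread t1 = new Thread(() -> update12()); // locks element 1, then wants 2
Thread t2 = new Thread(() -> update21()); // locks element 2, then wants 1
t1.start();
t2.start();
// If t1 enters its compute while t2 enters its compute,
// each thread waits forever for the lock held by the other.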
The example gives us a hint about which rules to follow to avoid deadlocks. The first rule:
try to minimize the number of potential locking interactions, and follow and document a lock ordering protocol for locks that may be acquired together. — Java Concurrency in Practice - Brian Goetz, et al.
We violated this rule in our example; our update21 method even acquired the locks in reverse order on purpose. And the second rule:
Release all locks before invoking unknown code. — Is Parallel Programming Hard, And, If So, What Can You Do About It? - Paul E. McKenney
That the compute method of the ConcurrentHashMap does not release its locks before calling the callback function is a clear disadvantage of the compute method. At least this is documented in the javadoc of the method.
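When we manage the locks ourselves, the first rule is straightforward to apply. Here is a minimal sketch, with lock names I made up, that always acquires two ReentrantLocks in the same global order, so no circular wait can occur:

import java.util.concurrent.locks.ReentrantLock;

private final ReentrantLock lockA = new ReentrantLock();
private final ReentrantLock lockB = new ReentrantLock();

// Every code path acquires lockA before lockB, never the reverse.
public void updateBoth(Runnable action) {
    lockA.lock();
    try {
        lockB.lock();
        try {
            action.run();
        } finally {
            lockB.unlock();
        }
    } finally {
        lockA.unlock();
    }
}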
As the third example we look at a map implementation using a compare and swap operation for updating, the ConcurrentSkipListMap:
private Map<Integer,Integer> map = new ConcurrentSkipListMap<Integer,Integer>();
All modern CPUs provide instructions to atomically compare and swap or to atomically increment and get values. The JVM uses those operations internally to implement synchronized locks.
The compare and swap operation checks the value of a variable and only updates it when the value is still equal to an expected value. In Java, compare and swap operations can be called through the classes in java.util.concurrent.atomic, through sun.misc.Unsafe, or, since JDK 9, through VarHandles.
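A minimal sketch of these semantics, using AtomicInteger, with values made up for illustration:

import java.util.concurrent.atomic.AtomicInteger;

AtomicInteger value = new AtomicInteger(5);
boolean first  = value.compareAndSet(5, 6); // true: the value was 5, is now 6
boolean second = value.compareAndSet(5, 7); // false: the value is 6, not 5

Let us look at the compute method of the ConcurrentSkipListMap to see how this operation is used: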
public V compute(K key,
                 BiFunction<? super K, ? super V, ? extends V> remappingFunction) {
    // Parameter checks omitted
    for (;;) {
        Node<K,V> n; Object v; V r;
        if ((n = findNode(key)) == null) {
            // Handling of missing elements omitted
        }
        else if ((v = n.value) != null) {
            @SuppressWarnings("unchecked") V vv = (V) v;
            if ((r = remappingFunction.apply(key, vv)) != null) {
                if (n.casValue(vv, r))
                    return r;
            }
            else if (doRemove(key, vv) != null)
                break;
        }
    }
    return null;
}
The compare and swap operation is called through the method casValue. This method only updates the value of the node n to the new value r when its current value is still vv. The variable n holds the node for our key, found by findNode. The new value r is calculated by the remappingFunction, and the variable vv holds the old value of the node n. If the method casValue successfully updates the value of n, it returns true, otherwise false.
So we first remember the old value of the node we want to update. Then we create the new value we want to set. And then we update the node only if it still contains the old value. This makes sure that no other thread has updated the node in between. We repeat these steps in the surrounding for loop until the update succeeds.
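The same retry pattern can be written with an AtomicReference; a minimal sketch, assuming we want to increment a shared Integer:

import java.util.concurrent.atomic.AtomicReference;

AtomicReference<Integer> shared = new AtomicReference<>(0);
for (;;) {
    Integer old = shared.get();             // remember the old value
    Integer updated = old + 1;              // compute the new value from it
    if (shared.compareAndSet(old, updated)) // update only if still unchanged
        break;                              // otherwise another thread was
                                            // faster and we retry
}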
The casValue method internally uses sun.misc.Unsafe to call the compare and swap operation:
boolean casValue(Object cmp, Object val) {
    return UNSAFE.compareAndSwapObject(this, valueOffset, cmp, val);
}
The sun.misc.Unsafe method compareAndSwapObject is a very low-level method. It takes as parameters the object which contains the variable to update, the memory offset of that variable, the expected value cmp, and the new value val.
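Since JDK 9, the same operation can be written without sun.misc.Unsafe, using a VarHandle. A sketch of what a casValue based on VarHandles could look like; the class is my own simplification, not the JDK source:

import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

class Node {
    volatile Object value;

    private static final VarHandle VALUE;
    static {
        try {
            VALUE = MethodHandles.lookup()
                    .findVarHandle(Node.class, "value", Object.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    boolean casValue(Object cmp, Object val) {
        return VALUE.compareAndSet(this, cmp, val);
    }
}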
Compare and swap is a very performant way to make sure that we never overwrite a value written by another thread. But it leads to unnecessary object creation when the update fails. The next technique, immutable classes, creates even more garbage.
Immutable classes guarantee that their value does not change in the middle of an operation, without using synchronized blocks. By avoiding synchronized blocks you avoid deadlocks. And since you are always working with an unchangeable, consistent state, you avoid race conditions. But modifying immutable state means copying the immutable object, so each modification creates garbage. Read more here about how to use immutable classes for concurrent programming.
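A minimal sketch of this copy-on-write idea, with a class invented for illustration: every modification copies the map, and an AtomicReference publishes the latest version.

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

final class ImmutableIntMap {
    private final Map<Integer, Integer> values;

    ImmutableIntMap(Map<Integer, Integer> values) {
        this.values = Collections.unmodifiableMap(values);
    }

    // Returns a new instance; the old one stays unchanged and becomes garbage.
    ImmutableIntMap with(Integer key, Integer value) {
        Map<Integer, Integer> copy = new HashMap<>(this.values);
        copy.put(key, value);
        return new ImmutableIntMap(copy);
    }
}

// Usage: updateAndGet retries the copy until the compare and swap succeeds.
AtomicReference<ImmutableIntMap> current =
        new AtomicReference<>(new ImmutableIntMap(new HashMap<>()));
current.updateAndGet(m -> m.with(1, 2));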
Immutable data is easy to implement and use. But each modification requires a copy of the data structure. So use it only when your data structure is small or when reads vastly outnumber writes. In all other cases, you need to choose between compare and swap operations and locks. I typically prefer locks, since they are easier to use.
All maps we looked at are synchronous: when we call a compute method, it is executed in the calling thread. Therefore we need to coordinate the different threads calling the method in parallel. Each of those mechanisms, locks or compare and swap operations, has its specific tradeoffs.
In the next blog post, we will look at the tradeoffs for a map which can be updated asynchronously. I would be glad to hear from you about which synchronization mechanism you use in your application.