How to define thread-safety in Java?

May 20, 2020

Category: The package java.util.concurrent.atomic

Thread-safety is typically defined as works correctly even when used by multiple threads.

In Java Concurrency in Practice, for example, thread-safety is defined as:

A class is thread-safe if it behaves correctly when accessed from multiple threads, regardless of the scheduling or interleaving of the execution of those threads by the runtime environment, and with no additional synchronization or other coordination on the part of the calling code.

and in Effective Java:

Unconditionally thread-safe: Instances of this class are mutable, but the class has sufficient internal synchronization that its instances can be used concurrently without the need for any external synchronization. Examples include Random and ConcurrentHashMap.

But all those definitions have the disadvantage that they do not tell us how. They do not tell us how the methods behave when called from multiple threads and they do not tell us how to use the methods in a thread-safe way.

The latter point is especially problematic because of classes like java.util.concurrent.atomic.LongAccumulator.

So I rather follow the Art of Multiprocessor Programming. by using a correctness property to define thread-safety. And since there are two correctness properties used in Java, e.g. linearizable respectively atomic methods and quiescent methods, we can define thread-safety as:

A class is thread-safe if all its public methods are either atomic or quiescent.

When is a method atomic?

A method is atomic when the method call appears to take effect instantaneously, at some time between the start and the end of the method call. So other threads either see the state before or after the method call but no intermediate state. Threads reading the state see the most recently written state. Or more formally, reading method calls create a happens-before relation to previous writing method calls.

Atomic methods lead to a correctness property called linearizability. Maurice Herlihy and Jeannette Wing introduced the notion of linearizability in 1990 in the paper Linearizability: a correctness condition for concurrent objects

Types of atomic classes

Stateless

Classes without state are automatically atomic. Since they have no state, different threads can never see any intermediate state. Examples for stateless classes include java.lang.Math or java.util.Arrays. A more complicated example is the class com.google.gson.Gson, see this blog post.

Immutable

Immutable classes are similar to classes without state automatically atomic. Since their state does not change, different threads can never see an inconsistent intermediate state. Examples of immutable classes are java.lang.String or java.lang.Integer.

Using locks

The next technique to make methods is to use locks. You can either use the Java intrinsic locks, e.g. synchronized blocks or the locks from the package java.util.concurrent.locks.

The easiest way is to use a single monitor and surround each public method of the class with a synchronized block using this monitor. The methods synchronizedSet, synchronizedList and so on from the class java.util.Collections, for example, use this technique to make the methods of an underlying collection atomic.

The following shows part of the source code of the class SynchronizedList. This class is used by the method synchronizedList to create a class with atomic methods:

static class SynchronizedList<E>
        extends SynchronizedCollection<E>
        implements List<E> {
  final List<E> list;
  public E get(int index) {
     synchronized (mutex) {return list.get(index);}
  }
  public E set(int index, E element) {
     synchronized (mutex) {return list.set(index, element);}
  }
  // other methods omitted
}     
       

Besides those intrinsic locks, Java provides reentrant exclusive, read-write, and stamped locks in the package java.util.concurrent.locks.

Lock-free

Current processors provide machine code instructions to atomically compare and swap or fetch and add a variable. Those instructions are used by the JVM to implement locks. And you can use those instructions directly through the classes in the package java.util.concurrent.atomic and starting with JDK 9 through the class java.lang.invoke.VarHandle .

The class java.util.concurrent.ConcurrentSkipListMap is an example for a class using those operations to implement atomic methods. And the class java.util.concurrent.ConcurrentHashMap is an example of the combination of those machine code instructions with locks.

Other meanings of atomic

Atomic is an overloaded term. It is used to describe database transactions through the ACID property, e.g. Atomicity Consistency Isolation Durability. And it is used to describe the effect of machine code instructions. The meanings are similar but with subtle differences. In our context an atomic method means the following:

  1. all or nothing: other threads either see the state before or after the method call but no intermediate state
  2. visibility: Threads reading the state see the most recently written state. Or more formally, reading method calls create a happens-before relation to previous writing method calls.

When is a method quiescent?

The second type of method, quiescent methods, is typically used to collect statistics. Suppose you want to track how often a specific method was called. You have multiple threads incrementing a counter every time they execute a specific method. And when all threads are stopped you collect the result.

This is how a quiescent method works.

When a quiescent method gets called at a time of quiescent, e.g. no other method calls are pending, it sees the result of all previous method calls.

The classes java.util.concurrent.atomic.DoubleAccumulator, java.util.concurrent.atomic.DoubleAdder, java.util.concurrent.atomic.LongAccumulator and java.util.concurrent.atomic.LongAdder are examples of classes with quiescent methods.

Quiescent consistency was first introduced implicitly in 1994 by James Aspnes, Maurice Herlihy, and Nir Shavit in the paper Counting networks.

How do those methods behave and how to use them correctly?

In the beginning, I complained that the typical definition of thread-safety does not tell us how thread-safe classes behave and how we can use them. So what does our new definition tells us about how our class works when called from multiple threads?

Both types, atomic and quiescent allows us to map the concurrent method calls to an equivalent sequential one. And instead of reasoning about the concurrent method calls, we can reason about the equivalent sequential method flow.

Atomic methods always map to a sequential method flow. But the usage of atomic methods still can lead to a race condition. Typically errors happen when we call multiple non-commutative atomic methods from the same thread. The following method leads to a race condition since the method get and put of java.util.concurrent.ConcurrentHashMap are not commutative:

public void update(ConcurrentHashMap<Integer, Integer> map) {
  Integer result = map.get(1);
    if(result == null) {
      map.put(1, 1);
    } else {
      map.put(1, result + 1);
  }
}

To implement an update without race condition you should use the method compute:

public void update(ConcurrentHashMap<Integer,Integer>  map) {
  map.compute(1, (key, value) -> {
    if (value == null) {
       return 1;
     } 
      return value + 1;
  });
}

Quiescent methods on the other side only map to a sequential flow when they are called at a time of quiescent, e.g. no other method calls are pending. So a typical scenario for quiescent methods looks like this:

LongAdder longAdder = new LongAdder();
ExecutorService service = Executors.newCachedThreadPool();
service.submit( () -> { longAdder.increment(); } );
service.submit( () -> { longAdder.increment(); } ); 
service.shutdown();
service.awaitTermination(10, TimeUnit.SECONDS);
longAdder.longValue();

Summary

A class is thread-safe if all its public methods are either atomic or quiescent.

A method is atomic when the method call appears to take effect instantaneously, at some time between the start and the end of the method call. So other threads either see the state before or after the method call but no intermediate state. Threads reading the state see the most recently written state. Or more formally, reading method calls create a happens-before relation to previous writing method calls.

A method is quiescent when it sees the result of all previous method calls at a time of quiescent, e.g. no other method calls are pending.

© 2020 vmlens Legal Notice Privacy Policy