Java Concurrency: Copy On Write

September 01, 2019

Category: The package java.util.concurrent.atomic

Copy on write is a technique that allows you to update a data structure in a thread-safe way. The main advantage of copy on write is that reading threads are never blocked.

Why do we need this technique? And how do we use it correctly?

Why copy on write?

You can download the source code of all examples from GitHub.

In the following, I want to implement a thread-safe class representing an address. To keep the example short, the address consists only of the street, the city, and the phone number:

public class MutableAddress {
	private volatile String street;
	private volatile String city;
	private volatile String phoneNumber;
	public MutableAddress(String street, String city, String phoneNumber) {
		this.street = street;
		this.city = city;
		this.phoneNumber = phoneNumber;
	}
	public String getStreet() {
		return street;
	}
	public String getCity() {
		return city;
	}
	public void updatePostalAddress(String street ,String city ) {
		this.street = street;
		this.city = city;
	}
	@Override
	public String toString() {
		return "street=" + street + 
		",city=" + city + 
		",phoneNumber=" + phoneNumber;
	}
}

I use volatile fields, lines 2 to 4, to make sure that the threads always see the current values.

To check if this class is thread-safe I use the following test:

public class ConcurrencyTestReadWrite {
	private final MutableAddress address = new MutableAddress("E. Bonanza St.",
			"South Park", "456 77 99");
	private String readAddress;
	@Interleave(ConcurrencyTestReadWrite.class)
	private void updatePostalAddress() {
		address.updatePostalAddress("Evergreen Terrace", "Springfield");
	}
	@Interleave(ConcurrencyTestReadWrite.class)
	private void read() {
		readAddress = address.toString();
	}
	@Test
	public void test() throws InterruptedException {
		Thread first = new Thread(() -> { updatePostalAddress(); });
		Thread second = new Thread(() -> { read(); });
		first.start();
		second.start();
		first.join();
		second.join();
		assertTrue("readAddress:" + readAddress,
			readAddress.equals(
			"street=E. Bonanza St.,city=South Park,phoneNumber=456 77 99") ||
			readAddress.equals(
			"street=Evergreen Terrace,city=Springfield,phoneNumber=456 77 99"));
	}
}

I need two threads to test whether the class is thread-safe; they are created in lines 15 and 16. I start both threads, lines 17 and 18, and then wait until both have finished using Thread.join, lines 19 and 20. After both threads have stopped, I check that the read address equals either the value before or the value after the update, lines 21 to 25.

To test all thread interleavings, I use the annotation Interleave, lines 5 and 9, from vmlens. The Interleave annotation tells vmlens to test all thread interleavings for the annotated method. Running the test, we see the following error:

java.lang.AssertionError: readAddress:
	street=Evergreen Terrace,city=South Park,phoneNumber=456 77 99

We read a mixture of the initial address (the city South Park) and the updated address (the street Evergreen Terrace). To see what went wrong, let us look at the vmlens report:

First the writing thread, thread id 13, updates the street. Then the reading thread, thread id 14, reads the street, city and phone number. The reading thread therefore sees the already updated street but still the initial city.
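The interleaving from the report can also be replayed by hand in a single thread. The following is a minimal sketch under my own assumptions, not part of the original examples: the class TornReadReplay and its nested Address class are hypothetical stand-ins for MutableAddress, and the reader's step is simply placed between the writer's two assignments.

```java
// Hypothetical replay of the failing interleaving; not part of the original test suite.
class TornReadReplay {

    // Simplified stand-in for MutableAddress with two of its volatile fields.
    static class Address {
        volatile String street = "E. Bonanza St.";
        volatile String city = "South Park";
    }

    // Executes the writer's and the reader's steps in the order described above.
    static String replay() {
        Address address = new Address();
        // Writer, step 1: update the street.
        address.street = "Evergreen Terrace";
        // Reader runs between the two writes and reads both fields.
        String read = "street=" + address.street + ",city=" + address.city;
        // Writer, step 2: update the city, too late for the reader.
        address.city = "Springfield";
        return read;
    }

    public static void main(String[] args) {
        // prints street=Evergreen Terrace,city=South Park
        System.out.println(replay());
    }
}
```

Declaring the fields volatile makes each single write visible, but it cannot make the two writes appear as one step.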

Copy on write

To solve this bug, I use the copy on write technique. The idea is to create a new copy of the object when writing, set the changed values in the newly created object, and then publish the copy. Since every update works on a fresh copy and never modifies an already published object, I can make the object immutable. The address using the copy on write technique then consists of the following two classes:

First, the immutable class to represent the current address:

public class AddressValue {
	private final String street;
	private final String city;
	private final String phoneNumber;
	public AddressValue(String street, String city, 
				String phoneNumber) {
		super();
		this.street = street;
		this.city = city;
		this.phoneNumber = phoneNumber;
	}
	public String getStreet() {
		return street;
	}
	public String getCity() {
		return city;
	}
	public String getPhoneNumber() {
		return phoneNumber;
	}
}

Second, the mutable class to implement the copy on write technique:

public class AddressUsingCopyOnWrite {
	private volatile AddressValue addressValue;
	private final Object LOCK = new Object();
	@Override
	public String toString() {
		AddressValue local = addressValue;
		return "street=" + local.getStreet() +
		",city=" + local.getCity() + 
		",phoneNumber=" + local.getPhoneNumber();
	}
	public AddressUsingCopyOnWrite(String street, String city, String phone) {
		this.addressValue = new AddressValue( street,  city,  phone);
	}
	public void updatePostalAddress(String street ,String city ) {
		synchronized(LOCK){
			addressValue = new AddressValue(  
			street,  city,  addressValue.getPhoneNumber() );
		}
	}
	public void updatePhoneNumber( String phoneNumber) {
		synchronized(LOCK){
			addressValue = new AddressValue(  
			addressValue.getStreet(), addressValue.getCity(),  phoneNumber );
		}	
	}
}

An update now consists of creating a new copy of AddressValue, lines 16 and 17 for updating the postal address and lines 22 and 23 for updating the phone number.

With these two classes the test succeeds: readers now always see a consistent address.
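Without vmlens, a rough plausibility check is still possible with two plain threads. The sketch below is an assumption-laden illustration, not the original test: CopyOnWriteStressCheck, its reduced two-field AddressValue, and the method countInconsistentReads are hypothetical names. Unlike the Interleave annotation, a stress loop only samples some interleavings, so a count of zero is evidence, not proof.

```java
// Rough stress check for the copy on write technique; a sketch, not the vmlens test.
class CopyOnWriteStressCheck {

    // Immutable snapshot, reduced to two fields to keep the sketch short.
    static final class AddressValue {
        final String street;
        final String city;

        AddressValue(String street, String city) {
            this.street = street;
            this.city = city;
        }
    }

    // Copy on write holder: volatile reference, synchronized writes.
    static final class Address {
        private volatile AddressValue value =
                new AddressValue("E. Bonanza St.", "South Park");
        private final Object lock = new Object();

        void updatePostalAddress(String street, String city) {
            synchronized (lock) {
                value = new AddressValue(street, city);
            }
        }

        String read() {
            AddressValue local = value; // read the volatile field exactly once
            return local.street + "," + local.city;
        }
    }

    // Returns how many reads saw a street and a city from different updates.
    static int countInconsistentReads(int iterations) {
        Address address = new Address();
        int[] inconsistent = new int[1];
        Thread writer = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                if (i % 2 == 0) {
                    address.updatePostalAddress("Evergreen Terrace", "Springfield");
                } else {
                    address.updatePostalAddress("E. Bonanza St.", "South Park");
                }
            }
        });
        Thread reader = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                String read = address.read();
                if (!read.equals("E. Bonanza St.,South Park")
                        && !read.equals("Evergreen Terrace,Springfield")) {
                    inconsistent[0]++;
                }
            }
        });
        writer.start();
        reader.start();
        try {
            writer.join();
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return inconsistent[0];
    }

    public static void main(String[] args) {
        System.out.println("inconsistent reads: " + countInconsistentReads(100_000));
    }
}
```

Because every snapshot is immutable and the reader dereferences the volatile field only once per read, the count stays at zero no matter how the two threads are scheduled.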

Why use a local variable when reading?

As you can see, in the toString method I first store the field addressValue in the local variable local, line 6. Why?

Let us see what happens when we directly access the variable addressValue instead of using a local variable:

public String toStringNotThreadSafe() {
	return "street=" + addressValue.getStreet() + 
	",city=" + addressValue.getCity() + 
	",phoneNumber=" + addressValue.getPhoneNumber();
}

Running the test we see the following error:

java.lang.AssertionError: readAddress:
	street=E. Bonanza St.,city=Springfield,phoneNumber=456 77 99

So we again read an inconsistent address. We can again see in the vmlens report what went wrong:

The reading thread, thread id 14, first reads the variable addressValue to get the street. Then the writing thread, thread id 13, updates the variable addressValue. Now the reading thread reads the variable addressValue again to get the city and the phone number. So the reading thread reads partially the initial and partially the updated address.
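The difference between the two ways of reading can be replayed in a single thread. This is a minimal sketch, assuming a reduced two-field AddressValue; the class SnapshotReadReplay and its two methods are hypothetical and only place the writer's publish step by hand between the reader's two accesses.

```java
// Hypothetical single-threaded replay; the writer's publish is inserted by hand.
class SnapshotReadReplay {

    // Immutable value, reduced to street and city for brevity.
    static final class AddressValue {
        final String street;
        final String city;

        AddressValue(String street, String city) {
            this.street = street;
            this.city = city;
        }
    }

    static volatile AddressValue addressValue;

    // Reads the volatile field twice, as toStringNotThreadSafe does.
    static String readWithoutLocal() {
        addressValue = new AddressValue("E. Bonanza St.", "South Park");
        String street = addressValue.street;                 // first read: initial value
        addressValue = new AddressValue("Evergreen Terrace", "Springfield"); // writer publishes
        String city = addressValue.city;                     // second read: updated value
        return "street=" + street + ",city=" + city;
    }

    // Reads the volatile field once into a local variable, as toString does.
    static String readWithLocal() {
        addressValue = new AddressValue("E. Bonanza St.", "South Park");
        AddressValue local = addressValue;                   // single read of the field
        String street = local.street;
        addressValue = new AddressValue("Evergreen Terrace", "Springfield"); // writer publishes
        String city = local.city;                            // still the same snapshot
        return "street=" + street + ",city=" + city;
    }

    public static void main(String[] args) {
        System.out.println(readWithoutLocal()); // street=E. Bonanza St.,city=Springfield
        System.out.println(readWithLocal());    // street=E. Bonanza St.,city=South Park
    }
}
```

The local variable pins one immutable snapshot for the whole read, so a concurrent publish can no longer slip in between two field accesses.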

Why a synchronized block when writing?

The second ingredient that makes the copy on write technique thread-safe is the synchronized block around every write to the variable addressValue. Why?

Let us see what happens when we remove the synchronized block:

public void updatePostalAddress(String street, String city) {
	addressValue = new AddressValue(street, city,
		addressValue.getPhoneNumber());
}
public void updatePhoneNumber(String phone) {
	addressValue = new AddressValue(addressValue.getStreet(),
		addressValue.getCity(), phone);
}

Running the test we see the following:

[INFO] BUILD SUCCESS

No error. The test still succeeds.

To see why we need the synchronized block, we need a different test: one that checks what happens when we update different parts of our address from different threads. So we use the following test:

public class ConcurrencyTestTwoWrites {
	private final AddressUsingCopyOnWriteWithoutSynchronized address =
			new AddressUsingCopyOnWriteWithoutSynchronized("E. Bonanza St.",
			"South Park", "456 77 99");
	@Interleave(ConcurrencyTestTwoWrites.class)
	private void updatePostalAddress() {
		address.updatePostalAddress("Evergreen Terrace", "Springfield");
	}
	@Interleave(ConcurrencyTestTwoWrites.class)
	private void updatePhoneNumber() {
		address.updatePhoneNumber("99 55 2222");
	}
	@Test
	public void test() throws InterruptedException {
		Thread first = new Thread(() -> { updatePostalAddress(); });
		Thread second = new Thread(() -> { updatePhoneNumber(); });
		first.start();
		second.start();
		first.join();
		second.join();
		assertEquals("street=Evergreen Terrace," +
			"city=Springfield,phoneNumber=99 55 2222",
			address.toString());
	}
}

In this test, the first thread updates the postal address, line 15, and the second thread updates the phone number, line 16. After both threads have stopped, I check that the address contains both the new postal address and the new phone number, lines 21 to 23.

If we run this test we see the following error:

org.junit.ComparisonFailure: 
	expected:<...ngfield,phoneNumber=[99 55 2222]> 
	but was:<...ngfield,phoneNumber=[456 77 99]>

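The problem is that, without synchronization, one thread overwrites the update of another thread: a lost update caused by a race condition. Here the thread updating the postal address read the old phone number before the other thread published the new one, and then published its own copy containing the old phone number. By surrounding every write to the variable addressValue with a synchronized block, we avoid this race, and this test also succeeds.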
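The lost update can be traced step by step in a single thread. The following sketch is hypothetical and not taken from the original examples: LostUpdateReplay interleaves the two unsynchronized updates by hand, in the order that produces the error shown above.

```java
// Hypothetical replay of two unsynchronized writers; steps are ordered by hand.
class LostUpdateReplay {

    static final class AddressValue {
        final String street;
        final String city;
        final String phoneNumber;

        AddressValue(String street, String city, String phoneNumber) {
            this.street = street;
            this.city = city;
            this.phoneNumber = phoneNumber;
        }
    }

    static volatile AddressValue addressValue;

    static String replay() {
        addressValue = new AddressValue("E. Bonanza St.", "South Park", "456 77 99");

        // First thread (updatePostalAddress) evaluates its argument: the old phone number.
        String phoneSeenByFirst = addressValue.phoneNumber;

        // Second thread (updatePhoneNumber) runs completely and publishes the new phone number.
        addressValue = new AddressValue(
                addressValue.street, addressValue.city, "99 55 2222");

        // First thread now publishes its copy, built from the stale phone number.
        addressValue = new AddressValue("Evergreen Terrace", "Springfield", phoneSeenByFirst);

        AddressValue local = addressValue;
        return "street=" + local.street + ",city=" + local.city
                + ",phoneNumber=" + local.phoneNumber;
    }

    public static void main(String[] args) {
        // prints street=Evergreen Terrace,city=Springfield,phoneNumber=456 77 99
        System.out.println(replay());
    }
}
```

With the synchronized block, reading the current value and publishing the new copy happen as one unit, so the second step of this replay cannot occur in the middle of the first thread's update.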

Comparison to read-write locks

Using copy on write, only writing threads are blocked by other writing threads. All other combinations are non-blocking: reading threads are never blocked, and writing threads are not blocked by reading threads.

Compare this to read-write locks, where reading threads are blocked by writing threads, and where writing threads are blocked not only by other writing threads but also by reading threads.
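For comparison, this is roughly how the same address could look with a read-write lock. It is a minimal sketch using java.util.concurrent.locks.ReentrantReadWriteLock from the JDK; the class name AddressUsingReadWriteLock is hypothetical and not part of the original examples.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical read-write lock variant of the address, for comparison only.
class AddressUsingReadWriteLock {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    // Plain fields: every access happens while holding the read or the write lock.
    private String street;
    private String city;
    private String phoneNumber;

    AddressUsingReadWriteLock(String street, String city, String phoneNumber) {
        this.street = street;
        this.city = city;
        this.phoneNumber = phoneNumber;
    }

    public void updatePostalAddress(String street, String city) {
        lock.writeLock().lock(); // waits for all readers and any other writer
        try {
            this.street = street;
            this.city = city;
        } finally {
            lock.writeLock().unlock();
        }
    }

    @Override
    public String toString() {
        lock.readLock().lock(); // shared with other readers, but waits for a writer
        try {
            return "street=" + street + ",city=" + city
                    + ",phoneNumber=" + phoneNumber;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

Here every read has to acquire a lock, whereas the copy on write version only performs a single volatile read.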

Conclusion

Copy on write lets us update a class in a thread-safe way. The main advantage of this technique is that reading threads never block and that writing threads are only blocked by other writing threads.

When you use this technique, make sure that you always use a local variable when reading and a synchronized block when writing.
