Sunday, January 22, 2012

Volatile Keyword in Java

Multithreading is one of the most important software technologies for boosting the performance and scalability of all types of software. However, it comes at a price: complexity. The pain of concurrent programming can be alleviated by a framework like Hadoop. However, as a Java programmer using threads, you need to deal with three intertwined issues:
  1. Atomicity
  2. Visibility
  3. Ordering
Working Memory

The Java Memory Model defines an abstract relation between threads and main memory. Every thread is defined to have a working memory (an abstraction of caches and registers) in which to store values. The model guarantees a few properties surrounding the interactions of instruction sequences corresponding to methods and memory cells corresponding to fields. Most rules are phrased in terms of when values must be transferred between the main memory and per-thread working memory.

Common variables such as instance fields, static fields and array elements in heap memory can be shared between threads.  At any time, these variables can be kept in one of the following locations:
  • Register
  • L1-L3 caches
  • Main memory
  • Hard disk (passivation and activation of Java Beans; paging or swapping)
When multiple threads all run unsynchronized code that reads and writes common variables, the resulting execution patterns may include:
  • Arbitrary interleavings
  • Atomicity failures
  • Race conditions
  • Visibility failures
 
Atomicity

The language specification guarantees that reading or writing a single variable is atomic unless the variable is of type long or double. This includes fields serving as references to other objects. In other words, when a thread reads a field of any type except long or double, it obtains either the initial value or a value that some thread wrote in full, never a mixture of bits from two different writes. Atomicity alone does not guarantee that the value read is the most recent one; that is a visibility issue. The atomicity guarantee can be extended to longs and doubles if you declare them volatile. We'll discuss this more later.
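The long/double caveat can be illustrated with a small sketch. The class name TornReadCheck, its methods, and the two bit patterns are assumptions made for illustration. The JLS permits (but does not require) a JVM to split a non-volatile 64-bit write into two 32-bit halves; because the field below is volatile, every read should return one of the two written values in full.

```java
// Hypothetical illustration; class and method names are assumptions.
final class TornReadCheck {
    // volatile guarantees that each read and write of this long is atomic.
    private volatile long value = 0L;

    // Alternates between two patterns whose upper and lower halves differ.
    void writeAlternating(int rounds) {
        for (int i = 0; i < rounds; i++) {
            value = (i % 2 == 0) ? -1L : 0L;
        }
    }

    // Returns true if every value read was one of the two written patterns.
    boolean readsAreWhole(int rounds) {
        for (int i = 0; i < rounds; i++) {
            long seen = value;
            if (seen != 0L && seen != -1L) {
                return false; // torn read: halves from two different writes
            }
        }
        return true;
    }

    // Runs a writer thread against a reader on the calling thread.
    static boolean run(int rounds) {
        TornReadCheck check = new TornReadCheck();
        Thread writer = new Thread(() -> check.writeAlternating(rounds));
        writer.start();
        boolean whole = check.readsAreWhole(rounds);
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return whole;
    }
}
```

Calling TornReadCheck.run(100000) should return true. Removing the volatile modifier would make a torn read legal under the specification, although a given JVM may never produce one.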

Visibility

Visibility concerns the conditions under which the effects of one thread are visible to another. The effects of interest here are writes to fields, as seen via reads of those fields. While the atomicity guarantee ensures that a thread will not see a random value when reading atomic data, it does not guarantee that a value written by one thread will be visible to another:
  • Synchronization is required for reliable communication between threads as well as for mutual exclusion[8]
If a variable is declared volatile, this signals that the variable will be accessed by multiple threads. It also gives a visibility guarantee: a value written to the variable by one thread is visible to any thread that subsequently reads it.

Ordering

Ordering describes under what conditions the effects of operations can appear out of order to any given thread. The main ordering issues surround reads and writes associated with sequences of assignment statements.

If a program has no data races, then all executions of the program will appear to be sequentially consistent. If the JLS were to use sequential consistency as its memory model, many compiler and processor optimizations would be illegal. For example, the JLS allows independent assignments to different variables within one thread to be reordered.
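As a minimal sketch of the kind of code this applies to, assume a hypothetical class with two plain int fields a and b (all names are chosen for illustration only):

```java
// Hypothetical class illustrating statements a compiler or CPU may reorder.
final class ReorderingSketch {
    private int a = 0; // plain, non-volatile fields
    private int b = 0;

    // The two writes are independent of each other, so another thread
    // may observe them taking effect in either order.
    void writer() {
        a = 1;
        b = 2;
    }

    // Called from another thread without synchronization, this may observe
    // b == 2 while a is still 0. Within the writing thread itself, program
    // order is always preserved.
    int[] reader() {
        int seenB = b;
        int seenA = a;
        return new int[] { seenA, seenB };
    }
}
```

When writer() and reader() run on the same thread, reader() always returns {1, 2}. Only an unsynchronized reader on a different thread can see the surprising combination.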


This provides essential flexibility for compilers and machines. Exploitation of such opportunities (via pipelined superscalar CPUs, multilevel caches, load/store balancing, interprocedural register allocation, and so on) is responsible for a significant amount of the massive improvements in execution speed seen in computing over the past decade.

In other words, not only may concurrent executions be interleaved, but they may also be reordered and otherwise manipulated in an optimized form that bears little resemblance to their source code. As compiler and run-time technology matures and multiprocessors become more prevalent, such phenomena become more common. They can lead to surprising results for programmers with backgrounds in sequential programming who have never been exposed to the underlying execution properties of allegedly sequential code. This can be the source of subtle concurrent programming errors. In almost all cases, there is an obvious, simple way to avoid contemplation of all the complexities arising in concurrent programs due to optimized execution mechanics: use synchronization. There are multiple ways to achieve synchronization. Below we'll discuss using the volatile keyword in Java for limited cases.

Volatile

In terms of atomicity, visibility, and ordering, declaring a field as volatile is nearly identical in effect to using a little fully synchronized class protecting only that field via get/set methods, as in:
final class VFloat { 
  private float value; 

  final synchronized void set(float f) { value = f; } 
  final synchronized float get() { return value; } 
}
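For comparison, the volatile counterpart of this holder might look like the sketch below. VolatileFloat is a hypothetical name introduced here for illustration.

```java
// Hypothetical volatile counterpart of VFloat; the name is an assumption.
final class VolatileFloat {
    // Reads and writes of this field are atomic and visible across threads.
    private volatile float value;

    // No lock is acquired; the write is a single volatile store.
    void set(float f) { value = f; }

    // No lock is acquired; the read is a single volatile load.
    float get() { return value; }
}
```

The two classes behave alike only because set and get each touch the field once. A method that reads the field and then writes it back would still need a lock.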

Declaring a field volatile may involve low-level memory barrier machine instructions to keep value representations in sync across threads. However, it involves no locking. In the following sections, we will discuss when volatile is a good choice and what the dangers of misusing it are.

When to Use Volatile

Declaring fields as volatile can be useful when you do not need locking for any other reason, yet values must be accurately accessible across multiple threads. This may occur when[5]:
  • The field need not obey any invariants with respect to others.
  • Writes to the field do not depend on its current value.
  • No thread ever writes an illegal value with respect to intended semantics.
  • The actions of readers do not depend on values of other non-volatile fields.
Below we provide some examples for such usages:

Use volatile in DCL

Double-checked locking (DCL) is OK as of Java 5 provided that you make the instance reference volatile.

// Works with acquire/release semantics for volatile in Java 5
class Foo {
  private volatile Helper helper = null;
  public Helper getHelper() {
    if (helper == null) {
      synchronized(this) {
        if (helper == null)
          helper = new Helper();
      }
    }
    return helper;
  }
}

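As of Java 5, the JLS guarantees that a write to a volatile field happens-before every subsequent read of that field. A thread that reads a non-null helper without synchronization will therefore also see the correct values of all fields of the Helper that were set before the reference was written.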

Use volatile in Control Flag

// Cooperative thread termination with a volatile field
import java.util.concurrent.TimeUnit;

public class StopThread {
    private static volatile boolean stopRequested;

    public static void main(String[] args)
            throws InterruptedException {
        Thread backgroundThread = new Thread(new Runnable() {
            public void run() {
                int i = 0;
                while (!stopRequested)
                    i++;
            }
        });
        backgroundThread.start();

        TimeUnit.SECONDS.sleep(1);
        stopRequested = true;
    }
}

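Volatile declarations on control flags are needed to ensure that updated flag values are visible across threads. Without volatile, there is no guarantee that the background thread ever sees the change to stopRequested, so the loop might never terminate.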
  
Dangers of Using volatile Keyword

Composite operations such as the "++" operation on volatile variables both read and write the variable.  So, they are not atomic.

// Add the synchronized modifier to the nextSequenceNumber method.
// Once you've done that, you can and should remove the volatile modifier
// from the nextSequenceNumber field [8].
private static int nextSequenceNumber = 0;
public static synchronized int nextSequenceNumber() {
  return nextSequenceNumber++;
}
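A different technique from the synchronized method above is to use an atomic class from java.util.concurrent.atomic, which performs the read-modify-write as one atomic step without a lock. The sketch below is an illustration under assumptions: AtomicSequence and drawConcurrently are hypothetical names, and the counter is an instance field rather than a static one so that each run starts from zero.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; AtomicSequence is not from the original text.
final class AtomicSequence {
    private final AtomicInteger next = new AtomicInteger(0);

    // getAndIncrement performs the read-modify-write as one atomic step.
    int nextSequenceNumber() {
        return next.getAndIncrement();
    }

    // Starts the given number of threads, each drawing perThread numbers,
    // and returns the next unused number once all of them have finished.
    static int drawConcurrently(int threads, int perThread) {
        AtomicSequence seq = new AtomicSequence();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    seq.nextSequenceNumber();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return seq.nextSequenceNumber();
    }
}
```

With 4 threads drawing 10000 numbers each, no increment is lost, so the next number handed out is 40000. A plain volatile int incremented with ++ could return less.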

Ordering and visibility effects surround only the single access or update to the volatile field itself. A write to a volatile reference field publishes the state the object had at the time of that write, but later changes to non-volatile fields made through the reference are not guaranteed to be visible to other threads. Similarly, declaring an array field as volatile covers only the array reference, not its elements. In other words, it is unsafe to call arr[x] = y on an array (even if declared volatile) in one thread and then expect arr[x] to return y from another thread. See [1] for possible ways of fixing this issue.
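One possible approach for the array case, sketched here under assumptions, is AtomicIntegerArray from java.util.concurrent.atomic, whose get and set methods access each element with volatile semantics. SharedSlots is a hypothetical wrapper name used only for illustration.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

// Hypothetical sketch; SharedSlots is not from the original text.
final class SharedSlots {
    // Each element is read and written with volatile semantics.
    private final AtomicIntegerArray slots;

    SharedSlots(int size) {
        slots = new AtomicIntegerArray(size); // all elements start at 0
    }

    // A value stored here is visible to any thread that later calls get.
    void put(int index, int v) { slots.set(index, v); }

    int get(int index) { return slots.get(index); }
}
```

The design choice is to move the volatile guarantee from the array reference to the individual elements, which a volatile int[] declaration cannot express.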

Because no locking is involved, declaring fields as volatile is likely to be cheaper than using synchronization, or at least no more expensive. However, if volatile fields are accessed frequently inside methods, their use is likely to lead to slower performance than would locking the entire methods.

References
  1. Volatile Arrays in Java
  2. Dangers of Volatile Keyword
  3. The Volatile Keyword in Java 5
  4. The Volatile Keyword in Java
  5. Synchronization and Thread Safety in Java
  6. Double-Checked Locking and How to Fix it
  7. Synchronization and Java Memory Model
  8. Effective Java by Joshua Bloch
  9. The Java Language Specification
