Declaring a variable volatile in Java gives it the following properties:

  1. Prevents instruction reordering around reads AND writes of the variable
    • No operation before a volatile write can be reordered to after it
    • No operation after a volatile read can be reordered to before it
  2. Prevents caching of the variable’s value for cross-thread visibility
    • Forces reads from main memory
    • Forces writes to flush to main memory immediately
  3. Guarantees atomicity for 64-bit variables (long/double) on 32-bit JVMs
    • Writes to two 32-bit addresses are made to behave atomically by the JVM

Let’s explore what each of these statements means.

Prevents instruction reordering

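The code we write can be executed out of order by the JVM and the CPU, as long as the result looks the same to the thread running it.
Other threads get no such promise, which can cause issues in multithreaded environments where state variables are shared between threads.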
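To make the ordering rules concrete, here is a minimal sketch of a writer publishing a plain field through a volatile flag. The class PublishDemo and its field names are made up for illustration. The plain write to payload comes before the volatile write to ready, so a reader that sees ready == true is guaranteed to also see payload == 42.

```java
// Hypothetical demo class; names are illustrative only.
class PublishDemo {
    private static int payload = 0;                 // plain (non-volatile) field
    private static volatile boolean ready = false;  // volatile flag

    // The calling thread publishes payload; a reader thread waits for the
    // flag and records the payload value it observed.
    static int publishAndRead() {
        payload = 0;
        ready = false;
        final int[] seen = new int[1];

        Thread reader = new Thread(() -> {
            while (!ready) {
                // spin until the volatile read observes true
            }
            // This read cannot be reordered to before the volatile read above.
            seen[0] = payload;
        });
        reader.start();

        payload = 42;   // cannot be reordered to after the volatile write below
        ready = true;   // volatile write

        try {
            reader.join();  // join() makes seen[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

If ready were not volatile, the reader could legally observe ready == true with payload still 0, or spin forever.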

Let’s take a look at this implementation of a Singleton that tries to be thread safe using double-checked locking.

public class Singleton {
    private static Singleton instance;
    
    private Singleton() {}
    
    public static Singleton getInstance() {
        if (instance == null) {              // Check 1
            synchronized (Singleton.class) {
                if (instance == null) {      // Check 2
                    instance = new Singleton(); // This is not atomic
                }
            }
        }
        return instance;
    }
}

The double check avoids synchronisation overhead after the singleton is initialised. Only the first few calls to getInstance() need to contend for the lock. Once instance is fully initialised, the method becomes lock free: a thread only has to lock if it sees instance == null.

But there is a flaw in this code.
Let’s say we have two threads, A and B. Thread A calls getInstance() and executes instance = new Singleton() (line 10 of the listing), which starts the following sequence:

  1. Allocate memory on the heap for the object
  2. Initialise the object
  3. Assign the memory address to the variable

However, the JVM or CPU may reorder these instructions for performance, and the actual sequence becomes:

  1. Allocate memory on the heap for the object
  2. Assign the memory address to the variable
  3. Initialise the object

Now the memory address is assigned to the variable before the object is initialised.
Suppose there is a context switch right after the assignment and thread B takes over. Thread B calls getInstance() and at the first check sees that instance is not null! So it skips the lock, returns the partially constructed object, and bad stuff happens.

This is where volatile comes in. Since no operation before a volatile write can be reordered to after it, the Singleton object is guaranteed to be fully initialised before its address is assigned to instance.

public class Singleton {
    private static volatile Singleton instance;
// ...
}

With this, thread B can no longer see a non-null instance that points to a half-built object. It sees either null, in which case it tries to acquire the lock that thread A is holding, or a fully initialised Singleton.
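For reference, here is the full listing with the fix applied, as a minimal sketch. The only change from the earlier version is the volatile modifier; the class is package-private here only so the snippet compiles in any file.

```java
// Double-checked locking with a volatile field.
final class Singleton {
    // volatile: the constructor's writes cannot be reordered to after
    // the write that publishes the reference.
    private static volatile Singleton instance;

    private Singleton() {}

    static Singleton getInstance() {
        if (instance == null) {                  // Check 1: no lock taken
            synchronized (Singleton.class) {
                if (instance == null) {          // Check 2: under the lock
                    instance = new Singleton();  // safely published
                }
            }
        }
        return instance;
    }
}
```

Every caller, on any thread, gets the same fully constructed object, and only callers that see null at the first check ever take the lock.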

Prevents caching of the variable’s value for cross-thread visibility

Each CPU core has its own cache, and the JIT compiler is also free to keep a non-volatile variable in a register. When a thread writes to a variable, the update might not reach other threads right away, and a thread that keeps reading its own copy might never see the update!

Let’s look at a simple shutdown flag pattern:

public class TaskRunner {
    private static boolean stopRequested = false;  // NOT volatile
    
    public static void main(String[] args) throws InterruptedException {
        Thread backgroundThread = new Thread(() -> {
            int i = 0;
            while (!stopRequested) {  // Reads from this thread's cache
                i++;
            }
            System.out.println("Stopped after " + i + " iterations");
        });
        
        backgroundThread.start();
        Thread.sleep(1000);
        stopRequested = true;  // Writes to main thread's cache only!
        System.out.println("Requested stop");
    }
}

Here’s what happens:

  1. Background thread starts and caches stopRequested = false in its CPU core’s cache
  2. Main thread sleeps for 1 second, then sets stopRequested = true in its cache
  3. Background thread keeps reading from its own cache, which still shows false
  4. The program may never terminate! Nothing guarantees that the background thread ever sees the update.

Without volatile, there’s no guarantee that writes from one thread are visible to other threads. Each thread can work with its own cached copy indefinitely.

The solution:

private static volatile boolean stopRequested = false;  // Now volatile!

With volatile:

  • The main thread’s write to stopRequested is immediately flushed to main memory
  • The background thread’s read of stopRequested always fetches from main memory (bypassing its stale cache)
  • The background thread sees the update and stops correctly!
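Here is a small, self-contained sketch of the fixed pattern that you can run to convince yourself the worker stops. The class name VolatileTaskRunner, the runAndStop method and its timeout parameters are made up for this example.

```java
// Hypothetical variant of TaskRunner that reports whether the worker stopped.
class VolatileTaskRunner {
    private static volatile boolean stopRequested = false;

    // Runs a spinning worker for runMillis, requests a stop, then waits up
    // to joinTimeoutMillis. Returns true if the worker thread has exited.
    static boolean runAndStop(long runMillis, long joinTimeoutMillis) {
        stopRequested = false;
        Thread worker = new Thread(() -> {
            long i = 0;
            while (!stopRequested) {   // volatile read on every iteration
                i++;
            }
        });
        worker.setDaemon(true);        // never keep the JVM alive by accident
        worker.start();
        try {
            Thread.sleep(runMillis);
            stopRequested = true;      // volatile write, visible to the worker
            worker.join(joinTimeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive();
    }
}
```

Calling VolatileTaskRunner.runAndStop(1000, 5000) should return true. If you remove volatile, the call may return false on some JVMs because the worker is allowed to spin forever.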

Guarantees atomicity for 64-bit variables

On 32-bit JVMs, reading or writing a long or double (64-bit values) is not guaranteed to be atomic without volatile. The JVM may perform the operation as two separate 32-bit reads/writes, so another thread can observe a “torn” value: half from one write, half from another.

Let’s see what that might look like:

// Assuming this is running on a 32-bit JVM
public class Counter {
    private static long count = 0;  // NOT volatile, 64-bit value
    
    public static void main(String[] args) {
        // Thread 1: Sets count to all 1s
        Thread writer1 = new Thread(() -> {
            while (true) {
                count = 0xFFFFFFFFFFFFFFFFL;  // All bits = 1
            }
        });
        
        // Thread 2: Sets count to all 0s
        Thread writer2 = new Thread(() -> {
            while (true) {
                count = 0x0000000000000000L;  // All bits = 0
            }
        });
        
        // Thread 3: Reads count
        Thread reader = new Thread(() -> {
            while (true) {
                long value = count;
                if (value != 0 && value != 0xFFFFFFFFFFFFFFFFL) {
                    System.out.println("Torn read!");
                    break;
                }
            }
        });
        
        writer1.start();
        writer2.start();
        reader.start();
    }
}

On a 32-bit JVM, here’s what can happen:

Thread 1 writes 0xFFFFFFFFFFFFFFFF:

  1. Write to low 32 bits address: 0xFFFFFFFF
  2. Write to high 32 bits address: 0xFFFFFFFF

But Thread 2 interrupts and writes 0x0000000000000000:

  1. Write to low 32 bits address: 0x00000000
  2. Thread 3 reads here!
  3. Write to high 32 bits address: 0x00000000

Thread 3 reads a “torn” value:

  • Low 32 bits: 0x00000000 (from Thread 2)
  • High 32 bits: 0xFFFFFFFF (from Thread 1)
  • Result: 0xFFFFFFFF00000000
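The arithmetic behind that torn value can be checked with a few lines of Java. This is only a sketch of how the two 32-bit halves combine; the class TornValueDemo and its methods are made up for illustration and do not reproduce the race itself.

```java
// Hypothetical helper showing how two 32-bit halves form one 64-bit long.
class TornValueDemo {
    // Combines a high and a low 32-bit half into a single 64-bit value.
    static long combine(int high, int low) {
        // Mask the low half so its sign bit does not smear into the high half.
        return ((long) high << 32) | (low & 0xFFFFFFFFL);
    }

    // Same check the reader thread uses: neither all 0s nor all 1s.
    static boolean isTorn(long value) {
        return value != 0 && value != 0xFFFFFFFFFFFFFFFFL;
    }
}
```

combine(0xFFFFFFFF, 0x00000000) gives 0xFFFFFFFF00000000, which isTorn reports as true, while the two values the writers actually store are both reported as not torn.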

The solution:

private static volatile long count = 0;  // Now volatile!

With volatile, the JVM has to make the entire 64-bit read or write happen atomically, typically by using whatever 64-bit-wide instruction the platform offers (like LOCK CMPXCHG8B on x86). Thread 3 will only ever see 0x0000000000000000 or 0xFFFFFFFFFFFFFFFF, never a mixed value.

Note: This atomicity issue only affects long and double on 32-bit architectures. Reads and writes of other types (int, float, references) are always atomic. In practice, so are long and double on 64-bit JVMs.