5 things you didn't know about ...

Multithreaded Java programming

On the subtleties of high-performance threading


About this series

So you think you know about Java programming? The fact is, most developers scratch the surface of the Java platform, learning just enough to get the job done. In this ongoing series, Java technology sleuths dig beneath the core functionality of the Java platform, turning up tips and tricks that could help solve even your stickiest programming challenges.

While few Java™ developers can afford to ignore multithreaded programming and the Java platform libraries that support it, even fewer have time to study threads in depth. Instead, we learn about threads ad hoc, adding new tips and techniques to our toolboxes as we need them. It's possible to build and run decent applications this way, but you can do better. Understanding the threading idiosyncrasies of the Java compiler and the JVM will help you write more efficient, better performing Java code.

In this installment of the 5 things series, I introduce some of the subtler aspects of multithreaded programming with synchronized methods, volatile variables, and atomic classes. My discussion focuses especially on how some of these constructs interact with the JVM and Java compiler, and how the different interactions could affect Java application performance.

1. Synchronized method or synchronized block?

You may have occasionally pondered whether to synchronize an entire method or only the portion of that method that actually needs to be thread safe. In these situations, it is helpful to know that when the Java compiler converts your source code to byte code, it handles synchronized methods and synchronized blocks very differently.

When the JVM executes a synchronized method, the executing thread identifies that the method's method_info structure has the ACC_SYNCHRONIZED flag set, then it automatically acquires the object's lock, calls the method, and releases the lock. If an exception occurs, the thread automatically releases the lock.

Synchronizing a block within a method, on the other hand, bypasses the JVM's built-in support for acquiring an object's lock and handling exceptions, and requires that the functionality be explicitly written in byte code. If you read the byte code for a method with a synchronized block, you will see roughly a dozen additional operations to manage this functionality. Listing 1 shows both a synchronized method and a method containing a synchronized block:

Listing 1. Two approaches to synchronization

    package com.geekcap;

    public class SynchronizationExample {
        private int i;

        public synchronized int synchronizedMethodGet() {
            return i;
        }

        public int synchronizedBlockGet() {
            synchronized( this ) {
                return i;
            }
        }
    }

The synchronizedMethodGet() method generates the following byte code:

    0: aload_0
    1: getfield      #2   // Field i:I
    4: ireturn

And here's the byte code from the synchronizedBlockGet() method:

     0: aload_0
     1: dup
     2: astore_1
     3: monitorenter
     4: aload_0
     5: getfield      #2   // Field i:I
     8: aload_1
     9: monitorexit
    10: ireturn
    11: astore_2
    12: aload_1
    13: monitorexit
    14: aload_2
    15: athrow

The number before each instruction is its byte offset. The getfield instruction carries a two-byte constant-pool index, which is why the offsets skip ahead after it. Creating the synchronized block yielded 16 bytes of byte code (14 instructions), whereas synchronizing the method yielded just 5 bytes (3 instructions).
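If you would rather not read byte code, reflection offers a quick way to see the same distinction. The following is a minimal sketch that reuses the two methods from Listing 1; the class name SyncFlagDemo and the helper hasSynchronizedFlag are illustrative names, not part of the original listing.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Illustrative class: mirrors the two methods from Listing 1.
class SyncFlagDemo {
    private int i;

    public synchronized int synchronizedMethodGet() {
        return i;
    }

    public int synchronizedBlockGet() {
        synchronized (this) {
            return i;
        }
    }

    // Hypothetical helper: reports whether the named public, no-argument method
    // carries the synchronized modifier (ACC_SYNCHRONIZED in the class file).
    static boolean hasSynchronizedFlag(String methodName) {
        try {
            Method method = SyncFlagDemo.class.getMethod(methodName);
            return Modifier.isSynchronized(method.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        // The flag is set only for the method declared synchronized.
        System.out.println(hasSynchronizedFlag("synchronizedMethodGet")); // true
        System.out.println(hasSynchronizedFlag("synchronizedBlockGet"));  // false
    }
}
```

The block version carries no flag because its locking lives in the monitorenter and monitorexit instructions inside the method body.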

2. ThreadLocal variables

If you want to maintain a single instance of a variable for all instances of a class, you will use a static member variable to do it. If you want to maintain an instance of a variable on a per-thread basis, you'll use thread-local variables. ThreadLocal variables are different from normal variables in that each thread has its own individually initialized instance of the variable, which it accesses via the get() and set() methods.

Let's say you're developing a multithreaded code tracer whose goal is to uniquely identify each thread's path through your code. The challenge is that you need to coordinate multiple methods in multiple classes across multiple threads. Without ThreadLocal, this would be a complex problem. When a thread started executing, it would need to generate a unique token to identify it in the tracer and then pass that unique token to each method in the trace.

With ThreadLocal, things are simpler. The thread initializes the thread-local variable at the start of execution and then accesses it from each method in each class, with assurance that the variable will only host trace information for the currently executing thread. When it's done executing, the thread can pass its thread-specific trace to a management object responsible for maintaining all traces.

Using ThreadLocal makes sense when you need to store variable instances on a per-thread basis.
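The tracer scenario above can be sketched in a few lines. This is only an illustration under stated assumptions: the class TraceContext and its methods record, finish, and traceInNewThread are hypothetical names, and ThreadLocal.withInitial requires Java 8 or later.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical tracer: each thread accumulates its own list of steps.
class TraceContext {
    // Every thread that touches TRACE lazily receives its own empty list.
    private static final ThreadLocal<List<String>> TRACE =
            ThreadLocal.withInitial(ArrayList::new);

    // Any method in any class can call this without being handed a token.
    static void record(String step) {
        TRACE.get().add(Thread.currentThread().getName() + ":" + step);
    }

    // Returns a copy of the current thread's trace and clears its slot.
    static List<String> finish() {
        List<String> copy = new ArrayList<>(TRACE.get());
        TRACE.remove();
        return copy;
    }

    // Runs the given steps on a new thread and returns that thread's trace.
    static List<String> traceInNewThread(String threadName, String... steps) {
        List<String> result = new ArrayList<>();
        Thread thread = new Thread(() -> {
            for (String step : steps) {
                record(step);
            }
            result.addAll(finish());
        }, threadName);
        thread.start();
        try {
            thread.join(); // join makes the worker's writes visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(traceInNewThread("worker-1", "load", "save"));
        System.out.println(traceInNewThread("worker-2", "parse"));
    }
}
```

Calling remove() when a thread finishes matters most with thread pools, where a reused thread would otherwise inherit the previous task's trace.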

3. Volatile variables

I estimate that roughly half of all Java developers know that the Java language includes the keyword volatile. Of those, only about 10 percent know what it means, and even fewer know how to use it effectively. In short, identifying a variable with the volatile keyword means that the variable's value will be modified by different threads. To fully understand what the volatile keyword does, it's first helpful to understand how threads treat non-volatile variables.

In order to enhance performance, the Java language specification permits the JRE to maintain a local copy of a variable in each thread that references it. You could consider these "thread-local" copies of variables to be similar to a cache, helping the thread avoid checking main memory each time it needs to access the variable's value.

But consider what happens in the following scenario: two threads start, and the first reads variable A as 5 while the second later reads variable A as 10. If variable A changed from 5 to 10 in between, the first thread may not be aware of the change and may keep working with a stale value for A. If variable A were marked as being volatile, however, then any time a thread read the value of A, it would refer back to the master copy of A and read its current value.

If the variables in your applications are not going to change, then a thread-local cache makes sense. Otherwise, it's very helpful to know what the volatile keyword can do for you.
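A common use of this guarantee is a stop flag that one thread sets and another thread polls. The sketch below assumes a hypothetical StoppableWorker class; the helper stopsWithin exists only to make the behavior easy to check.

```java
// Hypothetical worker that spins until another thread clears its flag.
class StoppableWorker implements Runnable {
    // volatile ensures the loop below sees the write made by stop().
    private volatile boolean running = true;

    @Override
    public void run() {
        long iterations = 0;
        while (running) {
            iterations++; // stand-in for real work
        }
    }

    void stop() {
        running = false;
    }

    // Starts a worker, asks it to stop, and reports whether it exited in time.
    static boolean stopsWithin(long timeoutMillis) {
        StoppableWorker worker = new StoppableWorker();
        Thread thread = new Thread(worker, "worker");
        thread.setDaemon(true); // do not keep the JVM alive if the loop never ends
        thread.start();
        worker.stop();
        try {
            thread.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !thread.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("stopped: " + stopsWithin(5000));
    }
}
```

Without the volatile modifier, the worker would be permitted to keep using a cached value of running and might never leave the loop.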

4. Volatile versus synchronized

If a variable is declared as volatile, it means that it is expected to be modified by multiple threads. Naturally, you would expect the JRE to impose some form of synchronization for volatile variables. As luck would have it, the JRE does implicitly provide synchronization when accessing volatile variables, but with one very big caveat: reading a volatile variable is synchronized and writing to a volatile variable is synchronized, but non-atomic operations are not.

What this means is that the following code is not thread safe:

myVolatileVar++;

The previous statement could also be written as follows:

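    // Conceptual expansion only. "lock" is a hypothetical monitor object that
    // stands in for the synchronization the JRE applies to each volatile access.
    int temp;
    synchronized( lock ) {
        temp = myVolatileVar;   // the read is atomic
    }
    temp++;                     // the modification is unprotected
    synchronized( lock ) {
        myVolatileVar = temp;   // the write is atomic
    }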

In other words, if a volatile variable is updated such that, under the hood, the value is read, modified, and then assigned a new value, the result will be a non-thread-safe operation performed between two synchronized operations. You can then decide whether to use synchronization or rely on the JRE's support for automatically synchronizing volatile variables. The better approach depends on your use case: If the assigned value of the volatile variable depends on its current value (such as during an increment operation), then you must use synchronization if you want that operation to be thread safe.
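The difference can be sketched with a small counter. The class VolatileCounter and its methods are hypothetical names chosen for illustration; the thread and iteration counts are arbitrary.

```java
// Hypothetical counter contrasting a bare volatile increment with a synchronized one.
class VolatileCounter {
    private volatile int count;

    // Read-modify-write on a volatile field: updates from other threads can be lost.
    void unsafeIncrement() {
        count++;
    }

    // The whole read-modify-write runs while holding this object's lock.
    synchronized void safeIncrement() {
        count++;
    }

    int get() {
        return count;
    }

    // Runs the operation perThread times on each of the given number of threads.
    static void runConcurrently(int threads, int perThread, Runnable operation) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    operation.run();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        VolatileCounter safe = new VolatileCounter();
        runConcurrently(4, 10_000, safe::safeIncrement);
        System.out.println("synchronized:   " + safe.get());

        VolatileCounter unsafe = new VolatileCounter();
        runConcurrently(4, 10_000, unsafe::unsafeIncrement);
        // May print less than 40000, depending on how the threads interleave.
        System.out.println("unsynchronized: " + unsafe.get());
    }
}
```

The synchronized variant always reaches the full total, whereas the unsynchronized total can vary from run to run.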

5. Atomic field updaters

When incrementing or decrementing a primitive type in a multithreaded environment, you're far better off using one of the atomic classes found in the java.util.concurrent.atomic package than you would be writing your own synchronized code block. The atomic classes guarantee that certain operations will be performed in a thread-safe manner, such as incrementing and decrementing a value, updating a value, and adding a value. The list of atomic classes includes AtomicInteger, AtomicBoolean, AtomicLong, AtomicIntegerArray, and so forth. The latest additions to the atomic package, introduced in Java 8, are the DoubleAccumulator, DoubleAdder, LongAccumulator, and LongAdder classes. They maintain a set of internal variables in order to reduce contention; the accumulator classes additionally combine values using a function that you supply, typically as a lambda expression.
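As a rough illustration of these classes, the sketch below counts events, sums sizes, and tracks a maximum from several threads without any explicit locking. The class AtomicCounters and its members are hypothetical names; LongAdder and LongAccumulator require Java 8 or later.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.LongAccumulator;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical statistics holder built only from java.util.concurrent.atomic classes.
class AtomicCounters {
    final AtomicInteger hits = new AtomicInteger();
    final LongAdder bytes = new LongAdder();
    // The accumulator applies the supplied function, here Long::max, to each new value.
    final LongAccumulator largest = new LongAccumulator(Long::max, Long.MIN_VALUE);

    void record(long size) {
        hits.incrementAndGet();   // atomic read-modify-write
        bytes.add(size);          // spread across internal cells under contention
        largest.accumulate(size); // keeps the running maximum
    }

    // Each thread records the sizes 1..perThread into one shared instance.
    static AtomicCounters recordFromThreads(int threads, int perThread) {
        AtomicCounters counters = new AtomicCounters();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int size = 1; size <= perThread; size++) {
                    counters.record(size);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counters;
    }

    public static void main(String[] args) {
        AtomicCounters counters = recordFromThreads(4, 1000);
        System.out.println(counters.hits.get());    // 4000
        System.out.println(counters.bytes.sum());   // 2002000
        System.out.println(counters.largest.get()); // 1000
    }
}
```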

The challenge of using atomic classes is that all class operations, including get, set, and the family of get-set operations, are rendered atomic. This means that read and write operations that do not modify the value of an atomic variable are synchronized, not just the important read-update-write operations. The workaround, if you want more fine-grained control over the deployment of synchronized code, is to use an atomic field updater.

Using atomic updates

Atomic field updaters like AtomicIntegerFieldUpdater, AtomicLongFieldUpdater, and AtomicReferenceFieldUpdater are basically wrappers applied to a volatile field. Internally, the Java class libraries make use of them. While they are not widely used in application code, there's no reason you can't use them too.

Listing 2 presents an example of a class that uses atomic updates to change the book that someone is reading:

Listing 2. Book class

    package com.geeckap.atomicexample;

    public class Book {
        private String name;

        public Book() {
        }

        public Book( String name ) {
            this.name = name;
        }

        public String getName() {
            return name;
        }

        public void setName( String name ) {
            this.name = name;
        }
    }

The Book class is just a POJO (plain old Java object) that has a single field: name.

Listing 3. MyObject class

    package com.geeckap.atomicexample;

    import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

    /**
     *
     * @author shaines
     */
    public class MyObject {
        // The updater requires the target field to be volatile.
        private volatile Book whatImReading;

        private static final AtomicReferenceFieldUpdater<MyObject,Book> updater =
                AtomicReferenceFieldUpdater.newUpdater(
                        MyObject.class, Book.class, "whatImReading" );

        public Book getWhatImReading() {
            return whatImReading;
        }

        public void setWhatImReading( Book whatImReading ) {
            //this.whatImReading = whatImReading;
            // compareAndSet returns false, leaving the field unchanged, if another
            // thread replaced the book after this.whatImReading was read.
            updater.compareAndSet( this, this.whatImReading, whatImReading );
        }
    }

The MyObject class in Listing 3 exposes its whatImReading property as you would expect, with get and set methods, but the set method does something a little different. Instead of simply assigning its internal Book reference to the specified Book (which would be accomplished using the code that is commented out in Listing 3), it uses an AtomicReferenceFieldUpdater.

AtomicReferenceFieldUpdater

The Javadoc for AtomicReferenceFieldUpdater defines it as follows:

A reflection-based utility that enables atomic updates to designated volatile reference fields of designated classes. This class is designed for use in atomic data structures in which several reference fields of the same node are independently subject to atomic updates.

In Listing 3, the AtomicReferenceFieldUpdater is created by a call to its static newUpdater method, which accepts three parameters:

The class of the object containing the field (in this case, MyObject)

The class of the field that will be updated atomically (in this case, Book)

The name of the field to be updated atomically (in this case, "whatImReading")

The real value here is that the getWhatImReading method is executed without synchronization of any kind, whereas the update inside setWhatImReading is executed as an atomic operation.

Listing 4 illustrates how to use the setWhatImReading() method and asserts that the value changes correctly:

Listing 4. Test case that exercises the atomic update

    package com.geeckap.atomicexample;

    import org.junit.Assert;
    import org.junit.Before;
    import org.junit.Test;

    public class AtomicExampleTest {
        private MyObject obj;

        @Before
        public void setUp() {
            obj = new MyObject();
            obj.setWhatImReading( new Book( "Java 2 From Scratch" ) );
        }

        @Test
        public void testUpdate() {
            obj.setWhatImReading(
                    new Book( "Pro Java EE 5 Performance Management and Optimization" ) );
            Assert.assertEquals( "Incorrect book name",
                    "Pro Java EE 5 Performance Management and Optimization",
                    obj.getWhatImReading().getName() );
        }
    }
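Because compareAndSet succeeds only when the field still holds the expected reference, its boolean result is worth inspecting. The following sketch is a self-contained variation on Listing 3 rather than the article's own code: the class ReadingList, its String field, and its method names are assumptions made for illustration.

```java
import java.util.concurrent.atomic.AtomicReferenceFieldUpdater;

// Hypothetical variation on MyObject that uses a String title instead of a Book.
class ReadingList {
    // Field updaters only work on volatile, non-static fields.
    private volatile String current;

    private static final AtomicReferenceFieldUpdater<ReadingList, String> UPDATER =
            AtomicReferenceFieldUpdater.newUpdater(
                    ReadingList.class, String.class, "current");

    ReadingList(String initial) {
        this.current = initial;
    }

    // Plain volatile read: no locking and no updater involved.
    String getCurrent() {
        return current;
    }

    // Succeeds only if the field still refers to the expected object.
    boolean switchIfStillReading(String expected, String next) {
        return UPDATER.compareAndSet(this, expected, next);
    }

    // Unconditionally swaps in the new value and returns the previous one.
    String replace(String next) {
        return UPDATER.getAndSet(this, next);
    }

    public static void main(String[] args) {
        String first = "Java 2 From Scratch";
        String second = "Pro Java EE 5 Performance Management and Optimization";
        ReadingList list = new ReadingList(first);
        System.out.println(list.switchIfStillReading(first, second)); // true
        System.out.println(list.switchIfStillReading(first, first));  // false: field now holds second
        System.out.println(list.replace(first) == second);            // true
    }
}
```

Note that compareAndSet compares references, not equals(), so the expected argument must be the same object that the field currently holds.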

See Related topics to learn more about atomic classes.

Conclusion

Multithreaded programming is always challenging, but as the Java platform has evolved, it has gained support that simplifies some multithreaded programming tasks. In this article, I discussed five things that you may not have known about writing multithreaded applications on the Java platform, including the difference between synchronizing methods versus synchronizing code blocks, the value of employing ThreadLocal variables for per-thread storage, the widely misunderstood volatile keyword (including the dangers of relying on volatile for your synchronization needs), and a brief look at the intricacies of atomic classes.

Related topics