In this scenario, we’ve rescued a few hours of productive work from the first worker, and the second worker was able to get the box shipped a few hours earlier than would have happened with the first inefficient scheduling algorithm.

This is more or less how work stealing functions in Java. Work stealing is a solution for coordination and scheduling inefficiencies with plain old thread pools. Work stealing is a huge improvement when you consider that scheduling in most servlet-based MVC frameworks involves nothing more elegant than a thread grabbing requests from a FIFO queue and spinning on the request — even when blocked — until it has enough data to send a response.

It’s the difference in elegance between painting with an artist’s airbrush and painting with a sledgehammer dipped in paint.

Concurrency in Java

Play concurrency is built on top of Java concurrency, so it’s important to understand the basics of Java concurrency before you start production tuning Play applications.

ForkJoin

Under the covers of Play, work stealing is implemented with the ForkJoin framework (JSR-166). ForkJoin was first added in Java 7 and refined in Java 8, making it easier for Java developers to write efficient concurrent and parallel code while avoiding the complexities of locks, monitors, and synchronization.

Work stealing can be considered a subset of scheduling, and ForkJoin implements its own work-stealing technique when scheduling tasks, which can be broadly described as follows:

The fork/join framework is distinct because it uses a work-stealing algorithm. Worker threads that run out of things to do can steal tasks from other threads that are still busy. — The Java™ Tutorials

The parallelism building blocks in ForkJoin are:

ForkJoinPool — An instance of this class is used to run all of your fork-join tasks

RecursiveTask — Run a subclass of this in a pool and have it return a result

RecursiveAction — Run a subclass of this in a pool but without returning a result

ForkJoinTask — Superclass of RecursiveTask and RecursiveAction; fork and join are methods defined in this class
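These building blocks fit together in a few lines. A minimal sketch of a RecursiveTask that sums a range of integers by splitting the work in half (the class name and threshold are illustrative):

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Sums the integers in [lo, hi] by recursively forking subtasks.
class RangeSum extends RecursiveTask<Long> {
    private final long lo, hi;

    RangeSum(long lo, long hi) { this.lo = lo; this.hi = hi; }

    @Override
    protected Long compute() {
        if (hi - lo <= 1_000) {               // small enough: compute directly
            long sum = 0;
            for (long i = lo; i <= hi; i++) sum += i;
            return sum;
        }
        long mid = (lo + hi) / 2;
        RangeSum left = new RangeSum(lo, mid);
        RangeSum right = new RangeSum(mid + 1, hi);
        left.fork();                          // schedule left half; an idle worker may steal it
        return right.compute() + left.join(); // compute right half here, then combine
    }
}

public class ForkJoinDemo {
    public static void main(String[] args) {
        long total = ForkJoinPool.commonPool().invoke(new RangeSum(1, 1_000_000));
        System.out.println(total); // 500000500000
    }
}
```

The forked left half sits in the current worker's deque; if another worker runs dry, it steals that subtask rather than sitting idle.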

Traditionally, web frameworks assign a new request to a thread, and that thread processes the request until it sends a response back to the client. This is inefficient for web systems because servicing a typical request requires blocking the thread for a number of tasks, such as database transactions and external web service calls. Legacy web frameworks were not built for the sheer volume of interactions that modern web-facing systems demand, such as executing tens (or hundreds!) of back-end transactions to service a single request.

With work stealing, rather than hogging a thread even while it's blocked (e.g., waiting for a response from a database), we have a much more sophisticated way to schedule the tasks involved in building a response, one that accounts for the fact that many of those tasks spend their time waiting around rather than doing productive work.

Concurrency in Play

Now that we have a general understanding of what ForkJoin is, let’s dive into how Play builds on top of ForkJoin to implement its own concurrency model.

Execution contexts

An ExecutionContext is the main concurrency abstraction in Play, representing either a thread pool or a ForkJoin pool.

In Java, the equivalent abstraction is called Executor. Keep in mind that Play itself is written in Scala, so when you're configuring concurrency in Play you'll use the ExecutionContext term.
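The role an Executor plays can be sketched with a plain thread pool from java.util.concurrent (a minimal sketch; the pool size and task are arbitrary):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ExecutorDemo {
    public static void main(String[] args) {
        // An ExecutorService is one concrete Executor; like Play's
        // ExecutionContext, it decides which thread runs a given task.
        ExecutorService pool = Executors.newFixedThreadPool(4);

        // Submit work to the pool instead of running it on the caller's thread.
        int answer = CompletableFuture.supplyAsync(() -> 21 * 2, pool).join();

        System.out.println(answer); // 42
        pool.shutdown();
    }
}
```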

This is one area in the Lightbend platform where Scala bleeds into the API for Java developers. It’s not an issue as long as you understand why and where the naming conventions are different. Ultimately, understand that Scala is the underpinning of all frameworks in the Lightbend platform, so Scala will inevitably bleed into the Java APIs in small ways.

Futures

A Future[T] in Scala — or CompletionStage<T> in Java — can be considered a box that represents a task to be completed and an eventual value.

The tasks can be considered the operations that gather items for our box before shipping it to the receiver.

In Play, the T represents the type of item that will be placed inside the box. The tasks would be each individual step defined within a Play action. Instead of going to a store, a task would involve going to a database.
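The box analogy can be sketched with a plain CompletableFuture (the item name is illustrative; in a real Play action the supplier would be a database or web service call):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;

public class FutureBoxDemo {
    public static void main(String[] args) {
        // The "box": a CompletionStage<String> that will eventually hold an item.
        CompletionStage<String> box = CompletableFuture
            .supplyAsync(() -> "widget")            // the task that gathers the item
            .thenApply(item -> "shipped: " + item); // transform before shipping

        // Holding the reference is like holding the tracking number;
        // joining waits for the item to actually arrive.
        System.out.println(box.toCompletableFuture().join()); // shipped: widget
    }
}
```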

Actions

The core of the developer experience in Play is the action: a function that is executed when an incoming request is routed to it based on URI matching. An action can be defined as synchronous or asynchronous.

Because Play is built on top of Scala, a hybrid object/functional programming language, actions themselves are simply anonymous functions.

Actions have a return type as follows:

Scala — Future[Result] (asynchronous)

Scala — Result (synchronous)

Java — CompletionStage<Result> (asynchronous)

Java — Result (synchronous)

As you can see, without futures you simply have synchronous actions. When you combine actions with futures, you have asynchronous actions.

Scala

The following Scala action is synchronous:
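(A minimal sketch; the method name and response body are illustrative.)

```scala
def index = Action { implicit request =>
  // The request thread computes and returns the Result directly.
  Ok("item list rendered synchronously")
}
```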

The following Scala action is asynchronous:
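(A minimal sketch; `fetchItems` is an illustrative method returning a `Future`, and an implicit ExecutionContext is assumed to be in scope.)

```scala
def index = Action.async { implicit request =>
  // Return a Future immediately; the request thread is freed while
  // the database call completes, and map runs when the items arrive.
  fetchItems().map { items =>
    Ok(s"fetched ${items.size} items")
  }
}
```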

All that separates a synchronous action from an asynchronous action in Play is .async. One of the biggest mistakes I find during Play code reviews is a fundamental misunderstanding of what async means; without the proper use of .async, Play can suffer worse performance than a synchronous, servlet-based framework, because it's simply not tuned to allocate a single thread for every single request/response.

Java

In Java, an asynchronous action has a return type of CompletionStage<Result>. In this Java example, we process a simple form from the client and grab the items of a shopping cart asynchronously:
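(A minimal sketch; `CheckoutForm`, `cartService`, and its `getItems` method, which returns a `CompletionStage<List<Item>>`, are all illustrative names.)

```java
public CompletionStage<Result> cart(Http.Request request) {
    // Bind the submitted form on the request thread (cheap), then fetch
    // the cart without blocking; thenApply runs once the items arrive.
    CheckoutForm form = formFactory.form(CheckoutForm.class)
                                   .bindFromRequest(request)
                                   .get();
    return cartService.getItems(form.getCustomerId())
                      .thenApply(items -> ok(Json.toJson(items)));
}
```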

You’ll notice that the method we call to get the client’s shopping cart is also asynchronous. Within an action, if all of our dependent method calls return CompletionStage<T>, we can compose the entire action to be asynchronous all the way down.

Putting the pieces together

Let’s review our original concurrency example and consider the following:

The box represents the instance of Future[T] (in Scala) or CompletionStage<T> (in Java)

The tracking number represents the reference to the future

The task represents the lambda code that produces the result

The item(s) placed inside the box represent the result

The syntactic difference of implementing efficient concurrency and scheduling is subtle, just like the inefficiencies in our anecdotal example.

Modern web applications rely on many blocking calls to produce results, such as making requests from external web services and databases.

Wherever possible it’s best to eliminate blocking calls altogether, but blocking is not the enemy — synchronous blocking is the enemy.

The four modes of concurrency, from best to worst, are:

1. Asynchronous and non-blocking (excellent!)

2. Asynchronous and blocking (acceptable)

3. Synchronous and non-blocking (meh)

4. Synchronous and blocking (awful!)
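The gap between the best and worst modes can be sketched with CompletableFuture (the `fetchPrice` method stands in for a remote call and is illustrative):

```java
import java.util.concurrent.CompletableFuture;

public class ModesDemo {
    // Pretend remote call that eventually produces a price.
    static CompletableFuture<Integer> fetchPrice() {
        return CompletableFuture.supplyAsync(() -> 100);
    }

    public static void main(String[] args) {
        // Synchronous and blocking (awful!): the calling thread parks on join()
        // and does nothing useful until the value arrives.
        int blocked = fetchPrice().join();

        // Asynchronous and non-blocking (excellent!): register a continuation
        // and free the thread; thenApply runs only when the value arrives.
        CompletableFuture<Integer> withTax = fetchPrice().thenApply(p -> p + p / 5);

        System.out.println(blocked + " " + withTax.join()); // 100 120
    }
}
```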

Traditional web frameworks were simply not built for the age of API-first integration. Today, web applications are composed of other web applications which are composed of other web applications. If we look at a typical Play action, it makes calls to caches, databases, web services, identity services, secret stores, and the list goes on. Allocating a single thread for all of that work is completely inefficient given how limited and precious threads are.

Using ForkJoin, on a machine with 48 cores the Akka team was able to saturate those cores with 20 million messages per second. In comparison, using thread pools, they were unable to push more than 2 million messages per second.

Ultimately, getting the most out of Play requires an understanding of how it implements concurrency. Once you do, the benefits of a properly tuned Play application are immense!

Further Reading