It is now about four and a half years since the Servlet 3.0 specification was released in December 2009, together with Java EE 6. One feature that came in Servlet 3.0 was the possibility to decouple an HTTP request from any container threads, which is often referred to as Async Servlets, but perhaps more correctly called Asynchronous Processing in Servlets. I believe this is still an underutilized feature.

Motivation

By making a servlet asynchronous, you can return the thread calling the servlet to the container and continue the remainder of the request on another thread. In many cases just a few threads can handle lots of requests simultaneously, for example if an asynchronous HTTP client is used to call an external REST endpoint. This causes the number of threads needed to handle requests in the container to go down significantly. By using a lot fewer threads we save memory (normally hundreds of kilobytes per thread, although this is tweakable), but not only that – we also improve performance, because of reduced thread context switching.

Asynchronous processing should be used for requests that take a “long” time to process, especially when the request needs to wait for something. Often this is IO but it could also be some kind of event that is triggered independently of the request. Even if the called service mainly computes something you might want to hand the computation over to a thread pool that takes care of it (perhaps in parallel) and thereby let go of the container thread.

For async to be a viable choice, the client should also be interested in some result coming from the service; otherwise you can simply respond synchronously and continue processing afterwards on another thread.

Another benefit that comes with async processing is that non-blocking IO can be used for the request and response bodies. This is a feature that came with Servlet 3.1 in Java EE 7 that was released in May 2013. By using this, which is especially appropriate if the request or response body is large, you can keep even more threads from being blocked.

When should it not be used, then? Perhaps not by default, since it adds some complexity and requires a bit more code. Error handling and debugging also become a little harder. When the request and response are small, the external IO is low and the processing is short, an asynchronous servlet offers the least benefit.

How it works

How does it work then? Actually it is quite simple to use. The basic flow for a servlet that calls an external REST service using an async HTTP client (for example AsyncHttpClient) looks like this:

1. The container calls the servlet.
2. The servlet tells the container that the request should be asynchronous by calling ServletRequest.startAsync, which returns an AsyncContext.
3. The asynchronous call to the REST service is made.
4. The service method of the servlet returns; at this point no thread is associated specifically with this request.
5. Some time passes, and then the response from the external REST service comes back.
6. That response is processed, and a response from this servlet is created and sent.
7. AsyncContext.complete is called to tell the container that this request has been processed by the servlet.

Let us take a look at how that might look in code. Working code examples related to this post are available on GitHub.

protected void doGet(final HttpServletRequest req, HttpServletResponse resp) {
    // Initialize async processing.
    final AsyncContext context = req.startAsync();
    // This call does not block.
    client.callExternalService(
            // This callback is invoked after the external service responds.
            new Callback() {
                public void callback(String result) {
                    ServletResponse response = context.getResponse();
                    response.setContentType("text/plain");
                    response.setCharacterEncoding("UTF-8");
                    byte[] entity = ("Result: " + result + ".\n")
                            .getBytes(Charset.forName("UTF-8"));
                    response.setContentLength(entity.length);
                    try {
                        response.getOutputStream().write(entity);
                    } catch (IOException e) {
                        // Ignored.
                    }
                    context.complete();
                }
            });
}

Timeouts

One thing you might want to take care of is timeouts. You can set the timeout of a specific request through AsyncContext.setTimeout; otherwise a container default value will be used. You can then “listen” for timeouts by attaching an AsyncListener through the method AsyncContext.addListener. The AsyncListener interface looks like this:

public interface AsyncListener extends EventListener {
    void onComplete(AsyncEvent event) throws IOException;
    void onError(AsyncEvent event) throws IOException;
    void onStartAsync(AsyncEvent event) throws IOException;
    void onTimeout(AsyncEvent event) throws IOException;
}

And the AsyncEvent like this:

public class AsyncEvent {
    public AsyncContext getAsyncContext() { ... }
    public ServletRequest getSuppliedRequest() { ... }
    public ServletResponse getSuppliedResponse() { ... }
    public Throwable getThrowable() { ... }
}

The code that sends the response (in the callback in the example above) will get an IllegalStateException if the request timed out, since the AsyncContext will be completed. This needs to be handled and the easiest might be to just catch the exception.
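If you want to handle the timeout yourself, you can register a listener that responds with an error status before completing the context. A minimal sketch, assuming the AsyncContext from startAsync as above; the timeout value and the 503 status code are illustrative choices, not prescribed by the spec:

```java
AsyncContext context = req.startAsync();
context.setTimeout(5000); // Milliseconds; the value is illustrative.
context.addListener(new AsyncListener() {
    public void onTimeout(AsyncEvent event) throws IOException {
        // Respond with an error before the container completes the request.
        HttpServletResponse response =
                (HttpServletResponse) event.getSuppliedResponse();
        response.sendError(503);
        event.getAsyncContext().complete();
    }
    public void onComplete(AsyncEvent event) {}
    public void onError(AsyncEvent event) {}
    public void onStartAsync(AsyncEvent event) {}
});
```

Note that after onTimeout has completed the context, a late-arriving callback that tries to complete it again will get the IllegalStateException described above.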

One thing I found odd is that, at least when using Jetty, when you don’t have any custom timeout handling the servlet is called again after the timeout occurs. The second time the response has status code 500, so you can handle this by checking the status code at the top of the service method and returning immediately if it is 500.

Non-blocking IO

If your service is expected to receive large request or response bodies, especially if the clients write or read slowly, you would benefit from using the non-blocking IO feature introduced in Servlet 3.1, as mentioned earlier. On the ServletInputStream there is the method setReadListener, where you can set a ReadListener. The ReadListener interface looks like this:

public interface ReadListener extends EventListener {
    void onAllDataRead() throws IOException;
    void onDataAvailable() throws IOException;
    void onError(Throwable t);
}

onDataAvailable will be called whenever it is possible to read data without blocking. Inside that method you should read as long as ServletInputStream.isReady returns true. Here is an example of how this can be used.
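As a sketch, a listener that collects the whole request body might look like this. It assumes the request (req) and an AsyncContext named context from the surrounding service method; buffering everything in memory and the empty error handling are simplifications for illustration:

```java
final ServletInputStream input = req.getInputStream();
input.setReadListener(new ReadListener() {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    public void onDataAvailable() throws IOException {
        byte[] chunk = new byte[4096];
        // Read only as long as it will not block; the container calls this
        // method again when more data arrives.
        while (input.isReady()) {
            int length = input.read(chunk);
            if (length == -1) {
                return;
            }
            buffer.write(chunk, 0, length);
        }
    }

    public void onAllDataRead() {
        // The whole request body is now in buffer; process it and respond.
        context.complete();
    }

    public void onError(Throwable t) {
        context.complete();
    }
});
```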

Similarly, there is a method setWriteListener on ServletOutputStream to set a WriteListener. The WriteListener interface looks like this:

public interface WriteListener extends EventListener {
    void onError(Throwable t);
    void onWritePossible() throws IOException;
}

Writing works analogously to reading.
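A corresponding write sketch, assuming the response (resp) and an AsyncContext named context from the surrounding service method; the chunk queue holding the response body is an illustrative device:

```java
// The response body, split into chunks; how it is produced is illustrative.
final Queue<byte[]> chunks = new ArrayDeque<>();
chunks.add("The response body".getBytes(StandardCharsets.UTF_8));

final ServletOutputStream output = resp.getOutputStream();
output.setWriteListener(new WriteListener() {
    public void onWritePossible() throws IOException {
        // Write only as long as it will not block; the container calls this
        // method again when the stream becomes writable.
        while (output.isReady()) {
            byte[] chunk = chunks.poll();
            if (chunk == null) {
                context.complete();
                return;
            }
            output.write(chunk);
        }
    }

    public void onError(Throwable t) {
        context.complete();
    }
});
```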

Jersey

JAX-RS 2.0 and Jersey 2.x also support asynchronous processing. The example above might look like this with Jersey:

@GET
@Produces(MediaType.TEXT_PLAIN)
public void get(@Suspended final AsyncResponse response) {
    // This call does not block.
    client.callExternalService(
            // This callback is invoked after the external service responds.
            new Callback() {
                public void callback(String result) {
                    response.resume("Result: " + result + ".\n");
                }
            });
}

Here an AsyncResponse is injected using the @Suspended annotation. Later, in the callback, it is used to respond by invoking the resume method. resume takes the value we would normally return from the resource method, which is instead declared with return type void.

Spring

Spring has server-side async support as well. The same example again using a Spring controller:

@RequestMapping(value = "", method = GET, produces = "text/plain")
@ResponseBody
public DeferredResult<String> get() {
    final DeferredResult<String> deferredResult = new DeferredResult<>();
    // This call does not block.
    client.callExternalService(
            // This callback is invoked after the external service responds.
            new Callback() {
                public void callback(String result) {
                    deferredResult.setResult("Result: " + result + ".\n");
                }
            });
    return deferredResult;
}

Returning a DeferredResult signals to Spring that the request should be treated asynchronously. After invoking setResult the response will be sent back to the client.

Async clients

To take full advantage of asynchronous processing in servlets we need asynchronous non-blocking APIs, which do not block a thread while waiting for the response. Such an API most likely uses a thread pool with just a few threads, which are able to handle a large number of simultaneous outstanding requests. This can be achieved either by using non-blocking IO or using a message-based protocol. Here follows an incomplete overview of such client APIs for Java.

HTTP

AsyncHttpClient, which uses Netty by default, but can also use Grizzly or Apache.

Jersey client uses Apache, Grizzly or Jetty. Unfortunately this bug currently causes Jersey client to put each request in its own thread, which then blocks waiting for a response. Additionally, when the response comes back yet another thread is started for each request, so there are two threads per simultaneous request plus the thread pool of the underlying http client. Fail…

Spring AsyncRestTemplate, which makes use of Apache.
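As an illustration of what the client.callExternalService placeholder in the examples above could be backed by, a non-blocking GET with AsyncHttpClient 1.x might look roughly like this; the URL and what is done with the body are illustrative:

```java
// Sketch using AsyncHttpClient 1.x (package com.ning.http.client).
AsyncHttpClient httpClient = new AsyncHttpClient();
httpClient.prepareGet("http://example.com/resource")
        .execute(new AsyncCompletionHandler<String>() {
            @Override
            public String onCompleted(Response response) throws Exception {
                String body = response.getResponseBody();
                // Hand the body over to the waiting AsyncContext here.
                return body;
            }

            @Override
            public void onThrowable(Throwable t) {
                // Complete the AsyncContext with an error response here.
            }
        });
```

The execute call returns immediately; the handler runs on one of the client's few IO threads once the response has arrived.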

RDBMS

ADBCJ, Asynchronous Database Connectivity in Java, looks like an attempt to make a non-blocking API, but it seems abandoned now.

File
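For file IO, Java 7's NIO.2 provides AsynchronousFileChannel in the standard library. A minimal, self-contained read sketch (the class and method names are mine, and a CountDownLatch stands in for whatever completion mechanism your application uses):

```java
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CountDownLatch;

public class AsyncFileRead {

    public static String readAsync(Path path) throws Exception {
        final ByteBuffer buffer = ByteBuffer.allocate(1024);
        final CountDownLatch done = new CountDownLatch(1);
        final String[] result = new String[1];
        AsynchronousFileChannel channel =
                AsynchronousFileChannel.open(path, StandardOpenOption.READ);
        // read returns immediately; the handler runs on a pool thread
        // once the data has been read.
        channel.read(buffer, 0, null, new CompletionHandler<Integer, Void>() {
            public void completed(Integer bytesRead, Void attachment) {
                result[0] = new String(buffer.array(), 0, bytesRead,
                        StandardCharsets.UTF_8);
                done.countDown();
            }

            public void failed(Throwable t, Void attachment) {
                done.countDown();
            }
        });
        done.await();
        channel.close();
        return result[0];
    }

    public static void main(String[] args) throws Exception {
        Path path = Files.createTempFile("async", ".txt");
        Files.write(path, "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(readAsync(path)); // Prints: hello
        Files.delete(path);
    }
}
```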

NoSql

For data stores that have an HTTP API you can use an async HTTP client.

For MongoDB there is MongoDB Asynchronous Java Driver.

Cassandra: Java Driver 1.0 for Apache Cassandra.

Lettuce for Redis.

Summary

Using asynchronous processing can take you a long way in making your web application more scalable. Both latency and throughput can be improved. To take full advantage you should use non-blocking IO for the request and response and use asynchronous APIs that use non-blocking IO for the external services you call.

Comments are most welcome. What are your experiences with asynchronous processing?