> draft

Comparing Java HTTP Servers' Latencies

Zhong Yu, 2014-07-28

In this experiment, we try to measure differences in latencies among several Java HTTP servers running very simple HelloWorld applications. The result likely reflects how much work each server does in a basic request-response cycle.

First, our results, for the impatient:

    server        round-trip (ns)   latency diff (ns)
    .................................................
    dummy                  44,400                   0
    undertow               52,600               8,200
    vert.x                 55,900              11,500
    sun                    58,100              13,700
    bayou                  63,300              18,900
    jetty                  78,100              33,700
    tomcat-bio             93,500              49,100
    tomcat-nio             95,200              50,800

The test machine is an Amazon EC2 m3.xlarge instance with 4 CPUs, running 64-bit Amazon Linux AMI. Don't let the name fool you: m3.xlarge is very modest hardware. We use Amazon so that others can easily replicate the test setup.

We test the servers one at a time; for each server, we run ApacheBench on the same machine to establish a single keep-alive connection that sends 10 million requests to the server:

ab -k -c1 -n10000000 http://localhost:8080/

There is no concurrency; the communication is strictly half-duplex; both ApacheBench and the server have one thread dedicated to the connection; the two threads are likely pinned to two separate CPUs most of the time. The average round-trip time for a request-response cycle is therefore the sum of

the time ApacheBench spends on the request/response,

the time the TCP stack (local loopback) spends,

the time the server spends on the request/response.

We postulate that the first two items are constant; therefore, the differences in round-trip time reflect the differences in the third item, i.e. how much time each server spends on the request-response cycle.
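Written out, with symbols that are our own shorthand (they do not appear in the measurements), the postulate says that for any server s:

```latex
\begin{align*}
T_{\text{round-trip}}(s) &= T_{\text{ab}} + T_{\text{tcp}} + T_{\text{server}}(s) \\
T_{\text{round-trip}}(s) - T_{\text{round-trip}}(\text{dummy})
  &= T_{\text{server}}(s) - T_{\text{server}}(\text{dummy})
\end{align*}
```

The ApacheBench and TCP terms cancel only if they really are the same for every server; if a server changes how the loopback traffic is segmented, for example, part of that effect would show up in the difference as well.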

The following servers are tested:

dummy, a phony server that does almost nothing. It's probably as fast as we can get in Java. We use it as a baseline in comparisons. Though it is not a legitimate HTTP server, it'll fool ApacheBench in this simple experiment.
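To give an idea of what such a phony server might look like, here is a minimal sketch. It is not the dummy server used in the test (that one is in the test sources); the class name DummyServerSketch, the trimmed-down response headers, and the blocking-socket design are all assumptions made for illustration. It never parses the request: it only watches for the blank line that ends a request head and answers with the same canned bytes every time.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a "dummy" server; not the code used in the benchmark.
class DummyServerSketch {

    // Canned response, sent verbatim for every request (headers are illustrative).
    static final byte[] RESPONSE = (
        "HTTP/1.1 200 OK\r\n" +
        "Content-Type: text/plain;charset=UTF-8\r\n" +
        "Content-Length: 10\r\n" +
        "Connection: keep-alive\r\n" +
        "\r\n" +
        "HelloWorld").getBytes(StandardCharsets.US_ASCII);

    private static final byte[] END_OF_HEAD = {'\r', '\n', '\r', '\n'};

    // Advances a tiny matcher for CR LF CR LF. 'matched' is the number of
    // terminator bytes seen so far (0..3); a return value of 4 means a
    // request head has just ended.
    static int advance(int matched, byte b) {
        if (b == END_OF_HEAD[matched]) {
            return matched + 1;
        }
        return b == '\r' ? 1 : 0; // a stray CR may start a new terminator
    }

    // Serves one connection until the client closes it.
    // Assumes requests carry no body, as with plain GET requests.
    static void serve(Socket socket) throws IOException {
        InputStream in = socket.getInputStream();
        OutputStream out = socket.getOutputStream();
        byte[] buf = new byte[4096];
        int matched = 0;
        int n;
        while ((n = in.read(buf)) != -1) {
            for (int i = 0; i < n; i++) {
                matched = advance(matched, buf[i]);
                if (matched == 4) {      // end of one request head
                    out.write(RESPONSE);
                    matched = 0;
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = new ServerSocket(8080)) {
            while (true) {
                try (Socket socket = server.accept()) {
                    socket.setTcpNoDelay(true);
                    serve(socket);
                } catch (IOException e) {
                    // Connection failed or was reset; wait for the next one.
                }
            }
        }
    }
}
```

Because the matcher keeps its state across reads, a request head that arrives split over several TCP segments is still answered exactly once.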

undertow, version 1.0.15.

vert.x, version 2.1.2.

sun - com.sun.net.httpserver.HttpServer shipped with the JDK.
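For readers who have not used this server, a HelloWorld app on it can be sketched roughly as follows. This is an illustration, not the app from the test sources: the class name SunHelloWorld is made up, and only one of the response headers is set here, whereas the test apps pad their responses to a comparable size.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical HelloWorld app on the JDK's built-in HTTP server.
class SunHelloWorld {

    static final byte[] BODY = "HelloWorld".getBytes(StandardCharsets.UTF_8);

    // Starts a server on the given port (0 picks a free port) and returns it.
    static HttpServer start(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/", exchange -> {
            exchange.getResponseHeaders().set("Content-Type", "text/plain;charset=UTF-8");
            // A fixed length lets the server send Content-Length instead of chunking.
            exchange.sendResponseHeaders(200, BODY.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(BODY);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        start(8080);
    }
}
```

With the app running, the ab command shown earlier can be pointed at it unchanged.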

bayou, version 0.9.4.

Note: the author of this article is also the author of bayou.

jetty, version 9.2.1.v20140609.

tomcat, version 8.0.9. Both BIO (blocking IO) and NIO options are tested. In conf/server.xml, maxKeepAliveRequests="10000000" is added to <Connector>, and the access log is disabled.

The test sources and dependent libraries can be found at https://github.com/zhong-j-yu/latency-diff. We create a HelloWorld app on each server that responds with a 10-byte "HelloWorld" text/plain message to any request. To be fair, responses generated by all servers should be about the same size and contain the same number of headers, as in the following example:

    HTTP/1.1 200 OK
    Accept-Ranges: bytes
    Content-Type: text/plain;charset=UTF-8
    ETag: "t-53d5df40-18519600"
    Last-Modified: Mon, 28 Jul 2014 05:27:28 GMT
    Cache-Control: private, no-cache
    Content-Length: 10
    Connection: keep-alive
    Date: Mon, 28 Jul 2014 05:27:28 GMT
    Server: Bayou

    HelloWorld

All servers run on JDK 8 u20 b23 with default JVM settings.

As previously mentioned, we run the HelloWorld app of each server and use ApacheBench to send 10 million requests over a single keep-alive connection. The average requests/second measured:

                  requests/second
    dummy                  22,500
    undertow               19,000
    vert.x                 17,900
    sun                    17,200
    bayou                  15,800
    jetty                  12,800
    tomcat-bio             10,700
    tomcat-nio             10,500

The inverse, ns/request, is the round-trip time for a request-response cycle:

                  round-trip (ns)
    dummy                  44,400
    undertow               52,600
    vert.x                 55,900
    sun                    58,100
    bayou                  63,300
    jetty                  78,100
    tomcat-bio             93,500
    tomcat-nio             95,200
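The conversion is simple enough to check by hand, but a short sketch makes the arithmetic explicit. The class and method names are ours, and rounding to the nearest 100 ns is an assumption about how the published figures were produced (it does reproduce them from the requests/second figures). Taking the inverse is valid only because requests are strictly sequential on one connection; with concurrent requests, throughput and latency would no longer be reciprocal.

```java
import java.util.Locale;

// Hypothetical helper: converts requests/second to nanoseconds per request.
class RoundTripTime {

    // Average round-trip time in nanoseconds for a sequential request stream.
    static double nanosPerRequest(double requestsPerSecond) {
        return 1_000_000_000.0 / requestsPerSecond;
    }

    // Rounds to the nearest 100 ns (assumed precision of the published numbers).
    static long roundToHundreds(double nanos) {
        return Math.round(nanos / 100.0) * 100;
    }

    public static void main(String[] args) {
        String[] servers = {"dummy", "undertow", "vert.x", "sun",
                            "bayou", "jetty", "tomcat-bio", "tomcat-nio"};
        int[] requestsPerSecond = {22_500, 19_000, 17_900, 17_200,
                                   15_800, 12_800, 10_700, 10_500};
        for (int i = 0; i < servers.length; i++) {
            long ns = roundToHundreds(nanosPerRequest(requestsPerSecond[i]));
            System.out.printf(Locale.US, "%-12s %,8d%n", servers[i], ns);
        }
    }
}
```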

We are interested in how much time each server spends on its own code. Since the code of the dummy server does almost nothing, we use it as the baseline for comparisons, by subtracting dummy's number from the others:

                  latency diff (ns)
    dummy                         0
    undertow                  8,200
    vert.x                   11,500
    sun                      13,700
    bayou                    18,900
    jetty                    33,700
    tomcat-bio               49,100
    tomcat-nio               50,800
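The subtraction can be sketched the same way. The class name LatencyDiff and its method are hypothetical; the input numbers are the round-trip times reported above. The sketch also prints each difference in milliseconds, which is a more familiar unit when judging per-request overhead.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: subtracts the baseline server's round-trip time.
class LatencyDiff {

    // Returns, for each server, its round-trip time minus the baseline's.
    static Map<String, Integer> diffFromBaseline(Map<String, Integer> roundTripNs,
                                                 String baseline) {
        int base = roundTripNs.get(baseline);
        Map<String, Integer> diff = new LinkedHashMap<>();
        roundTripNs.forEach((server, ns) -> diff.put(server, ns - base));
        return diff;
    }

    public static void main(String[] args) {
        // Round-trip times in nanoseconds, as measured in this experiment.
        Map<String, Integer> roundTripNs = new LinkedHashMap<>();
        roundTripNs.put("dummy", 44_400);
        roundTripNs.put("undertow", 52_600);
        roundTripNs.put("vert.x", 55_900);
        roundTripNs.put("sun", 58_100);
        roundTripNs.put("bayou", 63_300);
        roundTripNs.put("jetty", 78_100);
        roundTripNs.put("tomcat-bio", 93_500);
        roundTripNs.put("tomcat-nio", 95_200);

        diffFromBaseline(roundTripNs, "dummy").forEach((server, d) ->
            System.out.printf(Locale.US, "%-12s %,8d ns (%.3f ms)%n",
                              server, d, d / 1_000_000.0));
    }
}
```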

The remaining numbers likely reflect how much work each server does for the request-response cycle. Part of the differences may be due to the number of features and abstractions that different servers have to support.

We see that the last-place finisher, tomcat-nio, adds only 0.05 ms to the latency compared to the dummy server. Very few applications need to worry about a per-request overhead at that level.

Any of the Java servers listed here is probably way faster than you'll ever need it to be.

The author of this article has an obvious conflict of interest.

The benchmark numbers are measured in a very specific environment for a very specific purpose. It's risky to extrapolate the numbers to other situations and purposes.

Contact: bayou-io@googlegroups.com