Performance Comparison Between Node.js and Java EE For Reading JSON Data from CouchDB

Node.js has impressed me several times with high performance right out of the box. In my last Node.js project it was the same: we beat the given performance targets without having to tweak the application at all. I never really experienced this in Java EE projects. Granted, the project was perfectly suited to Node.js: a small project centered around fetching JSON documents from CouchDB. Still, I wanted to know: how would Java EE compare to Node.js in this particular case?

TL;DR: I wanted to know how the performance of the same application built on a vanilla Java stack would compare. I ran some simple performance tests. It turns out the Node.js application was actually 20% faster than a similar Java servlet application running on Tomcat 7. Not bad. You cannot generalise these results, though.

The Original Project

The Node.js project had to meet the following performance targets: 150 requests/second at 200 ms average response time. I am not a performance guru, but 200 ms response time sounded pretty fast, and my feeling was that we would have to tweak the application to reach those goals.

A separate team ran performance tests against our application, and when the results came back the application actually had exceeded all performance targets: 200 requests/second at 100ms average response time. That was much better than the targets. I was quite amazed that Node.js was outperforming the requirements by such a margin, and all of this without any performance optimisation.

I asked myself: Is this really good performance given the functionality of the application? Is Node.js just magically fast? What would the performance have been if we had gone with the established platform of Java EE?

I really couldn’t answer that question. Many Java EE applications I have worked on had response times that felt more like 1000 ms, but they had more complex functionality than our Node.js application did. The core of our application simply fetched JSON documents by ID from a single CouchDB database. No complex SQL, no table joins, and no data manipulation. I didn’t know how a Java EE application would perform given those requirements, so I set out to answer the question: Can the perceived performance advantage of Node.js over a traditional Java EE system be backed up by hard performance tests?

To answer this question I designed a set of performance tests to be run against both a Java EE application and a Node.js application, both backed by the same CouchDB database, and looked at how the two systems compared.

Preparation

I ran the same performance tests against both a Node.js application and a Java servlet application. Both applications used the same backend as our original Node.js application: CouchDB. I used Couchbase Single Server version 1.1.3 and created 10,000 sample documents of 4 KB each, filled with random text. The test machine was an iMac with a 2.4 GHz Intel Core 2 Duo, 4 GB RAM, and Mac OS X.
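The original data-loading script isn't shown. As a minimal sketch (the helper names are mine), sample documents like these could be generated in memory and then POSTed to CouchDB's /testdb/_bulk_docs endpoint:

```javascript
// Sketch: build n sample documents of `size` bytes of random text each.
// The resulting array is the shape expected by CouchDB's _bulk_docs API
// ({ "docs": [...] } in the request body).
function randomText(size) {
  var chars = 'abcdefghijklmnopqrstuvwxyz ';
  var out = '';
  for (var i = 0; i < size; i++) {
    out += chars.charAt(Math.floor(Math.random() * chars.length));
  }
  return out;
}

function makeDocs(n, size) {
  var docs = [];
  for (var i = 0; i < n; i++) {
    docs.push({ _id: 'doc-' + i, text: randomText(size) });
  }
  return docs;
}

// makeDocs(10000, 4096) would produce the 10,000 4 KB documents used in the test.
```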

I used Apache JMeter running on a separate machine as a test driver. The JMeter scripts fetched random documents from each application at various levels of concurrency.

Java EE

The Java servlet ran on Apache Tomcat version 7.0.21 in its default configuration, on Java 1.6. The database driver was CouchDB4J version 0.30. The driver offers no caching options, so no configuration was done.

The following Java code is a servlet that fetches a document from CouchDB by ID and returns it as a JSON object.

package com.shinetech.couchDB;

import java.io.IOException;
import java.io.PrintWriter;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.log4j.Logger;

import com.fourspaces.couchdb.Database;
import com.fourspaces.couchdb.Document;
import com.fourspaces.couchdb.Session;

@SuppressWarnings("serial")
public class MyServlet extends HttpServlet {
    Logger logger = Logger.getLogger(this.getClass());
    Session s = new Session("localhost", 5984);
    Database db = s.getDatabase("testdb");

    public void doGet(HttpServletRequest req, HttpServletResponse res)
            throws IOException {
        String id = req.getPathInfo().substring(1);
        PrintWriter out = res.getWriter();
        Document doc = db.getDocument(id);
        if (doc == null) {
            res.setContentType("text/plain");
            out.println("Error: no document with id " + id + " found.");
        } else {
            res.setContentType("application/json");
            out.println(doc.getJSONObject());
        }
        out.close();
    }
}
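The deployment descriptor isn't shown in the original. For req.getPathInfo() to contain the document ID, the servlet needs a wildcard URL mapping; a minimal web.xml sketch (the servlet name and path prefix are my assumptions) might look like this:

```xml
<web-app>
  <servlet>
    <servlet-name>myServlet</servlet-name>
    <servlet-class>com.shinetech.couchDB.MyServlet</servlet-class>
  </servlet>
  <servlet-mapping>
    <servlet-name>myServlet</servlet-name>
    <!-- wildcard mapping: a request to /docs/abc123 yields getPathInfo() == "/abc123" -->
    <url-pattern>/docs/*</url-pattern>
  </servlet-mapping>
</web-app>
```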

I ran the JMeter tests against this servlet at various levels of concurrency. The following table shows the number of concurrent requests, the average response time, and the requests that were served per second.

Concurrent Requests   Average Response Time (ms)   Requests/second
10                    23                           422
50                    119                          416
100                   243                          408
150                   363                          411

What can be seen is that the response time deteriorates as the number of concurrent requests increases. The response time was 23 ms on average at 10 concurrent requests, and 243 ms on average at 100 concurrent requests.

The interesting part is that the average response time correlates almost linearly with the number of concurrent requests: a tenfold increase in concurrent requests leads to a tenfold increase in response time per request. As a result, the number of requests that can be handled per second stays pretty constant, regardless of whether there are 10 concurrent requests or 150 concurrent requests. At all observed concurrency levels the number of requests served per second was roughly 420.
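This relationship is just Little's Law: throughput ≈ concurrency / average response time. A quick check against the measured Java numbers from the table above:

```javascript
// Little's Law sketch: throughput ≈ concurrency / average response time.
// The numbers are the measured Java servlet results from the table above.
var results = [
  { concurrent: 10,  avgMs: 23 },
  { concurrent: 50,  avgMs: 119 },
  { concurrent: 100, avgMs: 243 },
  { concurrent: 150, avgMs: 363 }
];

var throughput = results.map(function (r) {
  return Math.round(r.concurrent / (r.avgMs / 1000));
});

console.log(throughput); // [ 435, 420, 412, 413 ] — roughly constant, close to the measured ~420
```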

Node

The Node.js application ran on Node.js 0.10.20 with the Cradle CouchDB driver version 0.57. Caching was turned off in the driver to create equal conditions.

The following shows the Node.js program that delivers the same JSON document from CouchDB for a given ID:

var http = require('http'),
    url = require('url'),
    cradle = require('cradle'),
    c = new (cradle.Connection)('127.0.0.1', 5984, { cache: false, raw: false }),
    db = c.database('testdb'),
    port = 8081;

process.on('uncaughtException', function (err) {
  console.log('Caught exception: ' + err);
});

http.createServer(function (req, res) {
  var id = url.parse(req.url).pathname.substring(1);
  db.get(id, function (err, doc) {
    if (err) {
      console.log('Error ' + err.message);
      res.writeHead(500, { 'Content-Type': 'text/plain' });
      res.write('Error ' + err.message);
      res.end();
    } else {
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.write(JSON.stringify(doc));
      res.end();
    }
  });
}).listen(port);

The numbers for the Node.js system were as follows:

Concurrent Requests   Average Response Time (ms)   Requests/second
10                    19                           509
50                    109                          453
100                   196                          507
150                   294                          506

As before, the average response time correlates linearly with the number of concurrent requests, keeping the number of requests served per second pretty constant. Node.js is roughly 20% faster, e.g. 509 requests/second vs. 422 requests/second at ten concurrent requests.
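The roughly-20% figure can be checked row by row against the two tables above:

```javascript
// Requests/second from the two tables above (Node.js vs. Java servlet),
// at 10, 50, 100, and 150 concurrent requests respectively.
var nodeRps = [509, 453, 507, 506];
var javaRps = [422, 416, 408, 411];

// Percentage advantage of Node.js at each concurrency level.
var speedupPct = nodeRps.map(function (n, i) {
  return Math.round((n / javaRps[i] - 1) * 100);
});

console.log(speedupPct); // [ 21, 9, 24, 23 ] — averaging about 19%, in line with "roughly 20%"
```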

Conclusion

The Node.js solution is 20% faster than the Java EE solution for the problem at hand. That amazed me: an interpreted language as fast as, or faster than, a compiled language running on a VM into which years of optimisation have gone. Not bad at all.

It is important to take this with a grain of salt: this type of application is perfectly suited for Node.js. I would be wary of extending these findings to other applications. I believe that, because of the interpreted nature of JavaScript and the lack of established patterns for programming in the large, Node.js applications are best kept small.

Both Node.js and Java EE scale beyond what a normal server needs. 400-500 requests per second is quite a lot. Google, the largest website in the world, handles about 5 billion requests per day. Divided by 24 hours, 60 minutes, and 60 seconds, that comes out to roughly 57,870 requests/second. That is the number of requests across all Google domains worldwide, so if your website serves 400 requests per second on one machine, it is already pretty big. 1 million requests per day on average means about 11.6 requests per second. Keep that in mind.
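The back-of-the-envelope arithmetic above, spelled out:

```javascript
// Back-of-the-envelope figures from the paragraph above.
var secondsPerDay = 24 * 60 * 60;           // 86400 seconds in a day

var googleRps = 5e9 / secondsPerDay;        // 5 billion requests/day, worldwide
var millionPerDayRps = 1e6 / secondsPerDay; // a "big" 1-million-requests/day site

console.log(Math.round(googleRps));         // 57870 requests/second
console.log(millionPerDayRps.toFixed(1));   // 11.6 requests/second
```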

In this test the different concurrency models between single-threaded Node.js and multi-threaded Java EE made no difference. To test Node.js at higher concurrency levels – where it is supposed to outshine multi-threading – other problems like the number of open files need to be considered. I was not able to run these tests beyond 150 concurrent users because the OS complained about too many open files. This could have been solved through configuration, but is beyond the scope of this article.

For a general comparison of Node.js and Java EE see my blog Node.js From the Enterprise Java Perspective.