Node.js From the Enterprise Java Perspective


Node.js is currently getting a lot of attention because it uses a concurrency model that shows great promise for scalability: event-driven asynchronous Input/Output. This model can handle thousands of concurrent user-requests with a tiny memory footprint, something the traditional multi-threaded concurrency model of Enterprise Java cannot do. This article explains this new approach from the viewpoint of an Enterprise Java developer.

Node.js Basics

Node.js is a server-side JavaScript framework that sits on top of Google’s V8 JavaScript engine. V8 allows JavaScript to be executed outside the browser, for example from the console. Node.js provides the operations that are needed on a server: protocol support such as HTTP, DNS and TCP, as well as disk, socket, and database operations (some through 3rd-party modules).

The HelloWorld example on the Node.js website is a complete HTTP server in six lines of code. It answers an HTTP request with “Hello World”. This example says a lot about Node.js: it works at a much lower level than a typical enterprise developer normally does. Enterprise application developers write applications on top of an application server; they don’t write them on top of an HTTP server, and they certainly don’t write an HTTP server. Even though frameworks on top of Node.js are being written at breakneck speed, they are many layers away from the things we call frameworks in the JEE world.

In that respect one could easily dismiss Node.js, but there is another aspect that is very intriguing: it scales to proportions unheard of for enterprise developers. It scales so well because it does not use multi-threading for concurrency, but event-driven asynchronous Input/Output. Let’s compare the Enterprise Java way with the Node.js way to see where that leaves the enterprise developer.

The Enterprise Java Way of Concurrency

Support for multiple users in Enterprise Java applications is provided by the application server in the form of multi-threading. Each user-request is mapped to a thread, which takes care of producing the response. The application developer writes the application as if only a single user were using it (with a sprinkle of thread safety), and the server takes care of the rest by running that code in multiple threads and mapping user-requests to them. What works for one user suddenly also works for thousands of users. This is a great advantage: the developer has to do very little to accommodate a large number of users.

There are also disadvantages to this multi-threading concurrency model:

There is computing overhead in constantly switching between threads (context switching).

There is a linear relationship between the number of threads and the memory used, because each thread gets, at minimum, its own stack allocated to it, and a stack uses memory.

The disadvantages only really show at a very large number of threads. Traditional JEE applications before Ajax and Comet (server-side push events) needed to manage maybe 100 concurrent threads (not to be confused with concurrent sessions, which is often a much larger number). Modern applications that rely heavily on Ajax – which increases the number of user-requests – and Comet – which drastically increases the time a user-request blocks a single thread – need a much higher number of concurrent threads. At numbers like 10,000 concurrent threads the disadvantages of multi-threading do start to show.
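A back-of-envelope calculation makes the memory point concrete. The per-thread stack size below is an assumed typical JVM default (configurable via -Xss; actual defaults vary by platform and JVM version):

```javascript
// Rough stack-memory estimate for a thread-per-request server.
// 1 MB per thread is an assumed common JVM default (-Xss1m).
const stackSizePerThreadMB = 1;
const concurrentThreads = 10000;

// Linear relationship: total stack memory grows with thread count.
const totalStackGB = (stackSizePerThreadMB * concurrentThreads) / 1024;
console.log(totalStackGB.toFixed(1) + ' GB'); // prints "9.8 GB"
```

Close to ten gigabytes reserved for stacks alone, before the application has allocated a single object.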

The Node.js Way of Concurrency

Node.js is built on the belief that a new way of building web applications is necessary, one that accommodates more user-requests and longer-running user-requests by providing better CPU and memory utilisation. Node.js addresses these requirements with asynchronous, event-driven, non-blocking I/O: it uses a single thread that handles all user-requests, but outsources all I/O operations so they do not block that thread.

I/O operations are all those operations that access resources outside the CPU and main memory: disk operations, database operations, communication over the internet, etc. I/O operations are by far the slowest operations a computer performs, and that is the reason they are handled outside of the main thread. They are started from the main thread, run asynchronously, and send an event to the main thread when they complete. With this model Node.js can easily handle thousands of concurrent user-requests with very little memory.

Do We Need a New Concurrency Model in Enterprise Java?

Node.js is in a totally different space than Enterprise Java, so in general comparing the two is comparing apples to oranges. Aside from that, you could still raise the question whether JEE should consider the concurrency model of Node.js if it scales better than multi-threading.

To answer that question you first have to ask yourself: do I have a concurrency problem with my enterprise applications? If you are writing anything but the most modern Ajax/Comet applications and do not have a very large user base, the answer is most likely no. (If the answer is yes even though you don’t have a large user base, you should first check your application for common anti-patterns such as the I-keep-half-of-the-database-in-the-user-session pattern.) There are only a few applications that need this kind of high concurrency at the moment. This changes a little as applications use Ajax and Comet together extensively, but on top of these two technologies you also need a very large user base to get to 10,000 concurrent user-requests. This is similar to the conclusion Ant Kutschera drew from his experiments.

Second, asynchronous event-driven frameworks do exist in the Java world. Examples are the jboss.org project Netty and Apache MINA, which are both based on the Java NIO (New I/O) API. According to a test done by Bruno Fernandez-Ruiz, the NIO API performs very well against Node.js. The problem is that these frameworks are outside of the JEE specifications. In the Enterprise Java world it is the JEE specification that lays out the way applications deal with concurrency, and so far it sticks to the multi-threaded model. If you want to be JEE-conformant you have to use multi-threading for concurrency.

Third, the current JEE 6 specification already addresses the need to break the one-user-request-one-thread paradigm. Servlet 3.0 introduces asynchronous processing of requests, so that a thread can be returned to the server while a request is just waiting. In that respect the JEE 6 specification is in step with current trends, and if the need for high concurrency arises, the Enterprise Java developer can use a JEE 6-conformant application server to do the heavy lifting.

My answer to this question is therefore: chances are you don’t have a concurrency problem in the first place. If you do, you can look at Java options around the NIO API, but be aware that you are then no longer writing JEE applications. If you want to stay within the JEE space you can use the asynchronous capabilities of the current JEE 6 specification. Since the JEE 6 specification is relatively new it remains to be seen whether this will be fast enough, but I suspect it will.