Taking Baby Steps with Node.js – Don’t Block The Event Loop

< The list of previous installments can be found here. >

The basic premise of Node.js is that all I/O operations are expensive. That is why, according to the philosophy behind Node.js, all I/O should be carried out asynchronously. In practice, this means that we specify a callback function, or bind to one or more events, in order to get the outcome of an I/O operation that we want to execute. As a result, things like file access, database operations, and communication over HTTP, TCP or UDP don’t block the main execution of a Node.js application.
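To make the two styles concrete, here is a minimal sketch that reads the same file once with a callback and once by binding to stream events. The helper names `readWithCallback` and `readWithEvents`, as well as the scratch file in the temp directory, are made up for this illustration; only the `fs`, `os` and `path` calls come from the Node.js built-in API.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Callback style: Node.js calls `done(err, text)` once the file has been read.
function readWithCallback(file, done) {
  fs.readFile(file, 'utf8', done);
}

// Event style: bind to the events that a readable stream emits.
function readWithEvents(file, done) {
  let text = '';
  const stream = fs.createReadStream(file, { encoding: 'utf8' });
  stream.on('data', (chunk) => { text += chunk; });
  stream.on('error', (err) => done(err));
  stream.on('end', () => done(null, text));
}

// Demo against a hypothetical scratch file in the temp directory.
const demoFile = path.join(os.tmpdir(), `event-loop-demo-${process.pid}.txt`);

fs.writeFile(demoFile, 'hello from disk', (writeErr) => {
  if (writeErr) throw writeErr;
  readWithCallback(demoFile, (readErr, text) => {
    if (readErr) throw readErr;
    console.log('callback result:', text);
    readWithEvents(demoFile, (streamErr, streamed) => {
      if (streamErr) throw streamErr;
      console.log('event result:', streamed);
      fs.unlink(demoFile, () => {}); // best-effort cleanup
    });
  });
});

// Runs before any of the callbacks above: the I/O has only been started.
console.log('main script keeps running');
```

Note that the last `console.log` is printed first. The calls to `fs.writeFile` and `fs.readFile` merely start the work and return immediately; the results are delivered later through the callbacks.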

One of the components that lies at the heart of Node.js is the event loop. It processes a queue of events and invokes the associated callback for each of these events. Tom Hughes-Croucher provides a very nice explanation in this article, where he draws an analogy between the event loop in Node.js and a mailman:

To our event-loop postman, each letter is an event. He has a stack of events to deliver in order. For each letter (event) the postman gets, he walks to the route to deliver the letter. The route is the callback function assigned to that event (sometimes more than one). However, critically, since our mailman only has a single set of legs, he can only walk a single code path at once.

Since the event loop runs on a single thread, it is very important that we do not block its execution by doing heavy computations or synchronous I/O in callback functions. Iterating over a large collection of values or objects, or performing a time-consuming computation, in a callback function prevents the event loop from processing the other events in the queue until that callback returns.
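One common way to keep a long-running loop from monopolizing the thread is to split the work into chunks and hand control back to the event loop between chunks. The sketch below illustrates that idea; it is not the only option. The function names `sumBlocking` and `sumInChunks` and the chunk size are assumptions chosen for the example, while `setImmediate` is the built-in Node.js function for scheduling a callback on a later turn of the event loop.

```javascript
// Blocking version: the event loop can do nothing else until the loop ends.
function sumBlocking(values) {
  let total = 0;
  for (const value of values) total += value;
  return total;
}

// Cooperative version: add up `chunkSize` items, then yield to the event loop.
function sumInChunks(values, chunkSize, done) {
  let total = 0;
  let index = 0;

  function processChunk() {
    const end = Math.min(index + chunkSize, values.length);
    for (; index < end; index++) total += values[index];

    if (index < values.length) {
      setImmediate(processChunk); // let other queued events run first
    } else {
      done(total);
    }
  }

  // Always start asynchronously so `done` is never called synchronously.
  setImmediate(processChunk);
}

// Demo: another event gets its turn while the chunked sum is still running.
const values = Array.from({ length: 1000000 }, (_, i) => i);
console.log('blocking total:', sumBlocking(values));
setImmediate(() => console.log('another event got its turn'));
sumInChunks(values, 100000, (total) => console.log('chunked total:', total));
```

The chunked version takes a little longer overall, but other callbacks are no longer stuck behind it. For really heavy computations, moving the work out of the main thread altogether is usually the better choice.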

Doing synchronous I/O is a bad thing for the same reason: it blocks the event loop. There are a couple of synchronous I/O functions available in the Node.js built-in API. A few of these are exported by the file system (fs) module (the ones whose names end with “Sync”). I strongly advise you to stay away from these functions.
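As a small illustration of the difference, here is the same task written with `fs.readFileSync` and with its asynchronous counterpart `fs.readFile`. The function names `countLinesSync` and `countLines` are hypothetical helpers invented for this sketch.

```javascript
const fs = require('fs');

// Blocking: nothing else runs in this process until the file has been read.
function countLinesSync(file) {
  return fs.readFileSync(file, 'utf8').split('\n').length;
}

// Non-blocking: the read is started and `done(err, count)` is called later.
function countLines(file, done) {
  fs.readFile(file, 'utf8', (err, text) => {
    if (err) return done(err);
    done(null, text.split('\n').length);
  });
}
```

Calling `countLines(somePath, callback)` inside a request handler lets the event loop serve other requests while the file is being read. `countLinesSync(somePath)` makes every other request wait.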

There’s one synchronous, pseudo-global function that you probably have to use in any Node.js application beyond the typical “hello world” example, and that is the require function. By executing the require function we can load another module into the process of our application. This means that the content of the corresponding JavaScript file for the requested module is read from disk and evaluated. By caching the loaded module, Node.js ensures that such an expensive synchronous read doesn’t happen more than once when the require function is called multiple times for the same module. Because we typically call the require function at the beginning of a module, executing this synchronous function only affects the startup time of our application. Make sure that you do not call the require function inside a callback. In that scenario, the event loop is blocked until the requested JavaScript file is loaded from disk (unless the requested module is already in the cache, but don’t count on it).
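The caching behaviour is easy to observe. The following sketch writes a throwaway module to the temp directory (the file name `greeter.js` and the global counter `greeterLoads` are invented for this demo), requires it twice at the top level, and checks how often the module body actually ran. The synchronous `fs` calls are used here only because this is one-off setup code that runs at startup.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical throwaway module, created only for this demo.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'require-demo-'));
const modulePath = path.join(dir, 'greeter.js');
fs.writeFileSync(
  modulePath,
  'global.greeterLoads = (global.greeterLoads || 0) + 1;\n' +
  'module.exports = { greet: (name) => "Hello, " + name };\n'
);

// The first call reads and evaluates the file; the second is served from the cache.
const first = require(modulePath);
const second = require(modulePath);

console.log('same exports object:', first === second);
console.log('times the module body ran:', global.greeterLoads);
console.log(first.greet('Node'));

// Remove the scratch directory; the loaded module stays in memory.
fs.rmSync(dir, { recursive: true, force: true });
```

Keeping all require calls at the top of a module, as done here, means the synchronous disk reads happen once during startup and never while the application is handling events.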

The event loop that is baked into Node.js is a wonderful thing when building real-time applications. But it can turn into a nightmare when holding on to the paradigms of synchronous I/O. Node.js stands for asynchronous I/O and the event loop is the pumping heart that makes it happen.

Until next time.