The JavaScript language has a concurrency model based on the concepts of an “event loop” and a “message queue”. Since JavaScript executes on a single thread, running a CPU-bound task blocks all other operations. This makes JavaScript, and Node.js, well suited for I/O-bound workloads, as long as the blocking portion of each I/O request is offloaded to the operating system whenever possible. To achieve that, JavaScript uses the callback mechanism: asynchronous operations are implemented by functions that accept callbacks – functions that get executed when the original operation completes. This technique allows programs written in JavaScript to run without blocking, executing other code while waiting for callbacks to fire. Callbacks are stored in a message queue: each time an asynchronous operation completes, Node.js places the corresponding callback in the queue to be executed. The queue is processed by the “event loop” engine – a loop that polls for ready messages and runs their callbacks.

   ┌───────────────────────────┐
┌─>│           timers          │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │     pending callbacks     │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
│  │       idle, prepare       │
│  └─────────────┬─────────────┘      ┌───────────────┐
│  ┌─────────────┴─────────────┐      │   incoming:   │
│  │           poll            │<─────┤  connections, │
│  └─────────────┬─────────────┘      │   data, etc.  │
│  ┌─────────────┴─────────────┐      └───────────────┘
│  │           check           │
│  └─────────────┬─────────────┘
│  ┌─────────────┴─────────────┐
└──┤      close callbacks      │
   └───────────────────────────┘

Simplified Node.js event-loop diagram, from the Node.js documentation project.

This combination of event messages and polling allows JavaScript to process asynchronous code without blocking, all on a single execution thread. Every concurrent JavaScript action executes in the same shared environment. Since the event loop is the engine responsible for processing our asynchronous code, including I/O operations, overwhelming it will cause all kinds of problems. Given the single-threaded nature of the event loop, no task can complete until the previous one does. Hence, increased latency will, at some point, cause all tasks to queue up waiting for execution. If latency continues to increase, we will be unable to serve clients in time and their requests will time out. This phenomenon is called event loop saturation, or starvation.
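A tiny sketch of this effect: a timer scheduled for 10 ms cannot fire until a synchronous busy loop releases the single JavaScript thread.

```javascript
// sketch: a pending timer cannot fire while a synchronous loop
// occupies the single JavaScript thread
const start = Date.now()

setTimeout(() => {
  // scheduled to fire after ~10 ms, but it must wait for the busy
  // loop below to finish first
  console.log(`timer fired after ${Date.now() - start} ms`)
}, 10)

// block the only thread for roughly 200 ms
while (Date.now() - start < 200) { /* spin */ }
```

The timer reports a delay of roughly 200 ms rather than 10 ms – the event loop could not process its callback while the thread was busy.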

Running CPU-bound tasks in JavaScript, and in Node.js in particular, requires either offloading CPU-hungry work to worker pool threads or frequently yielding execution back to the JavaScript runtime. Executing CPU-intensive tasks on the main thread of a concurrent Node.js application, such as a web service, will bring the application to a halt.
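One common way to yield is to split the computation into chunks and schedule each chunk with setImmediate, so that I/O callbacks queued between chunks still get a chance to run. A minimal sketch (the function name and chunk size here are illustrative):

```javascript
// sketch: splitting a CPU-bound computation into chunks, yielding to
// the event loop between chunks via setImmediate so that pending I/O
// callbacks can still run
function sumRange(n, callback) {
  const chunkSize = 100000
  let total = 0
  let i = 0

  function processChunk() {
    const end = Math.min(i + chunkSize, n)
    for (; i < end; i++) total += i
    if (i < n) {
      setImmediate(processChunk) // yield; other callbacks may run here
    } else {
      callback(total)
    }
  }

  processChunk()
}

sumRange(1000000, total => console.log('sum:', total)) // → sum: 499999500000
```

The trade-off is overhead: each setImmediate round trip costs a pass through the event loop, so chunks should be large enough to amortize it but small enough to keep latency low.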

For more information on the Node.js event loop, refer to the official Node.js documentation.

Handling Overload from within Node.js Code

Now, let’s consider a hypothetical Node.js web application that exposes a REST API to clients. We need to build into our server a mechanism that detects high server load and notifies the load-balancer that we are unlikely to be able to handle additional requests correctly. The load-balancer would then route traffic to a different server instance, while our nearly overloaded one processes its outstanding requests.



Since the JavaScript concurrency model is based on events, everything that happens in Node.js is a reaction to an event: a transaction passing through Node traverses a cascade of callbacks. The library that provides the event loop service inside Node.js is called libuv. The event loop code runs on the same single JavaScript thread as the user code. Since everything in Node.js passes through the event loop, for practical purposes event-loop saturation, which manifests itself as increased event-processing latency, is the best available signal that the server is overloaded.

Let’s use the Express.js web application framework, with the help of the npm library “overload-protection”, and build our server as below:

const protectCfg = {
  production: process.env.NODE_ENV === 'production',
  clientRetrySecs: 1,
  sampleInterval: 5,
  maxEventLoopDelay: 42, // max delay between event loop ticks
  maxHeapUsedBytes: 0,   // max used heap threshold (0 to disable)
  maxRssBytes: 0,
  errorPropagationMode: false
}

const app = require('express')()

// middleware which blocks requests when we're too busy
const protect = require('overload-protection')('express', protectCfg)
app.use(protect)

The maxEventLoopDelay parameter represents the maximum amount of time in milliseconds between event loop ticks, before we consider the process too busy.

The more modern Hapi.js web framework has a built-in maxEventLoopDelay option that lets us specify, at the server configuration level, the maximum delay duration before requests are rejected with a 503.

A sample Hapi.js connection configuration might look like:

{
  "load": {
    "maxHeapUsedBytes": 1073741824,
    "maxRssBytes": 1610612736,
    "maxEventLoopDelay": 1000
  }
}

In the above example, the maximum allowed event loop latency is capped at 1,000 milliseconds. Whenever the event loop delay exceeds that limit, our Hapi.js-based application will automatically send 503 response codes.

In order to implement event loop overload protection for the Fastify framework, use the “under-pressure” Fastify plugin, registering it as below:

fastify.register(require('under-pressure'), {
  maxEventLoopDelay: 1000,
  maxHeapUsedBytes: 100000000,
  maxRssBytes: 100000000
})

Much like the overload-protection module and the Hapi.js built-in overload-prevention mechanism, under-pressure allows specifying additional limits.

Handling 503 errors

Let’s assume that we have a Node.js web application, hello.js.

$ node hello.js

This will start our web application. Now that we have the app running locally, we need to expose it on the network. We will do this using the nginx web server as a reverse proxy. The typical location of the nginx configuration file is /etc/nginx/sites-available/default. For our sample configuration, we could use the settings below:

location / {
    proxy_pass http://localhost:8080;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection 'upgrade';
    proxy_set_header Host $host;
    proxy_cache_bypass $http_upgrade;
}

Adjust location and other properties as necessary. To validate the configuration, use:

$ nginx -t

For more information on nginx, please refer to its documentation. At this point, we have configured our sample Node.js application to run on a single node, as a system process.

We could also use nginx as the load-balancer. In order to do that, we need to specify:

location / {
    # set this to your upstream module
    proxy_pass http://app_nodejs;

    # proxy headers
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header Host $host;
    proxy_set_header X-NginX-Proxy true;

    # handle websockets
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection 'upgrade';
    proxy_cache_bypass $http_upgrade;
    proxy_http_version 1.1;
    proxy_redirect off;

    # select next upstream if server is down
    proxy_next_upstream error timeout http_500 http_502 http_503 http_504;
    proxy_connect_timeout 5s;

    # gateway timeout
    proxy_read_timeout 10s;
    proxy_send_timeout 10s;
}
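The proxy_pass http://app_nodejs directive refers to an upstream group that must be defined elsewhere in the nginx configuration, listing the backend instances to balance across. For example (the server addresses are placeholders):

```nginx
upstream app_nodejs {
    server 127.0.0.1:8080;
    server 127.0.0.1:8081;
}
```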

When an application is running behind a load-balancer, as in the above example, an overloaded instance will send back the appropriate response code – HTTP 503. The HTTP load-balancer will then try the next available server, until one of them sends a 2xx–4xx response. Only if all servers are unavailable will it send the 503 code to the client.

In order for load-balancers to understand how long a server might be down after a 503 error, system administrators configure vendor-specific settings, such as an idle timeout.

Conclusion

This simple “event loop latency” policy, combined with load-balancing, allows us to create scalable Node.js web applications that resist resource overload. By properly handling 503 HTTP responses and correctly configuring load-balancers, we are able to build Node.js REST endpoints with virtually limitless horizontal scalability.