One of the most interesting topics in back-end development is server scaling and distribution.

There are many ways to scale your app and to handle a lot of requests and connections. In this article, we will explain one of the most popular ways of scaling Node.js applications, especially Socket connections.

Imagine you have a Node application which receives 300 requests per second. It works fine, but one day the request count grows 10 or 100 times. Now you have a big problem: a single Node process isn't meant to handle 30k requests per second (in some cases it can, but only with enough CPU and RAM behind it).

As we know, Node runs your JavaScript on a single thread, so one process doesn't use much of your machine's resources (CPU, RAM). Either way, a single instance will be ineffective.

You have no guarantee that your application will not crash, and you can't update your server without stopping it. In any case, if you have only one instance, your application will most likely experience some downtime.

How can we decrease downtime? How can we use RAM and CPU effectively? How can we update the application without stopping the whole system?

NGINX Load Balancer

One of the solutions is a Load Balancer. In some cases you can also use Cluster, but our suggestion is not to use Node Cluster, because Load Balancers are more effective and offer more features.

In this article, we will use only the Load Balancer. In our case, it will be Nginx. Here is an article which will explain to you how to install Nginx.

So, let's go ahead.

We can run multiple instances of a Node application and use an Nginx server to proxy all requests/connections to a Node server. By default, Nginx will use round robin logic to send requests to different servers in sequence.

As you can see, we have an Nginx server which receives all requests sent by the client and forwards them to different Node servers. As we have said, Nginx uses round robin logic by default, which is why the first request reaches server:8000, the second 8001, the third 8002, and so on.
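To make the round robin idea concrete, here is a minimal sketch of the selection logic in plain JavaScript. It is only an illustration of the idea, not how Nginx implements it, and the createRoundRobin helper is a hypothetical name; the addresses are the ones used in this article.

```javascript
// Round-robin selection: a simplified sketch, not Nginx's implementation.
function createRoundRobin(servers) {
  let next = 0; // index of the server that gets the next request
  return function pick() {
    const server = servers[next];
    next = (next + 1) % servers.length; // wrap around after the last server
    return server;
  };
}

const pick = createRoundRobin(['127.0.0.1:8000', '127.0.0.1:8001', '127.0.0.1:8002']);

// Four consecutive requests: the fourth one goes back to the first server.
const order = [pick(), pick(), pick(), pick()];
console.log(order);
```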

Nginx also has more features (e.g. backup servers: when a server crashes, Nginx automatically moves all requests to the backup server), but in this article we will only use the proxy.

Here is a basic Express.js server, which we will use with Nginx.

// server.js
const express = require('express');
const app = express();
app.get('/', (req, res) => {
  res.end(`Hi, PID: ${process.pid}`);
});
app.listen(process.env.PORT);
console.log(`Server running on ${process.env.PORT} port, PID: ${process.pid}`);

Using env we can send the port number from the terminal and the express app will listen to that port number.

Let's run and see what happens.

PORT=8000 node server.js

In the console and in the browser, we can see the server PID, which will help to identify which server received our call.

Let's run two more servers in the 8001 and 8002 ports.

PORT=8001 node server.js
PORT=8002 node server.js

Now we have three Node servers in different ports.

http://localhost:8000/

http://localhost:8001/

http://localhost:8002/

Let's run the Nginx server.

upstream nodes {
  server 127.0.0.1:8000;
  server 127.0.0.1:8001;
  server 127.0.0.1:8002;
}
server {
  listen 3000;
  location / {
    proxy_pass http://nodes;
  }
}

Our Nginx server listens on port 3000 and proxies requests to the upstream Node servers.

Restart the Nginx server and go to http://127.0.0.1:3000/

Refresh multiple times, and you will see different PID numbers. We have just created a basic Load Balancer server which forwards the requests to different Node servers. This way, you can handle a large number of requests and make full use of your CPU and RAM.

Let's see how the Socket works and how we can balance the Socket server in this way.

Socket Server Load Balancing

First of all, let's see how the Socket works in browsers.

There are two ways in which Socket opens connections and listens for events: Long Polling and WebSocket. These are called transports.

By default, the Socket client starts every connection with Polling and then, if the browser supports WebSocket, switches to the WebSocket transport. We can also pass the optional transports option to specify which transport or transports to use for the connection. That way we can open the connection with WebSocket right away or, conversely, use only Polling.

Let's see what the difference between Polling and WebSocket is.

Long Polling

Sockets allow receiving events from the server without requesting anything, which is useful for games and messengers, among others. You don't know when your friend will send you a message, so you can't know when to ask the server for it. Using Sockets, the server automatically sends an event to you and you receive that data.

How can we implement a functionality which will use only HTTP requests and provide a layer which will be able to receive some data from the server without requesting for that data? In other words, how do we implement Sockets using only HTTP requests?

Imagine you have a layer which sends requests to the server, but the server doesn't respond at once; in other words, you just wait. When the server has something to send, it sends that data over the same HTTP connection you opened a little while ago.

As soon as you receive the response, your layer automatically sends a new request to the server and waits again for another response, regardless of what the previous response contained.

This way, your application can receive data/events from the server at any time, because you always have an open request which is waiting for the server response.

This is how the Polling works. Here is a visualization of its work.
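The server side of that idea can be sketched with nothing but the Node.js http module. This is a simplified illustration, not how Socket.IO implements its polling transport: LongPollHub and createLongPollServer are hypothetical names, and timeouts and dropped connections are left out on purpose.

```javascript
const http = require('http');

// Holds the responses of clients that are currently waiting for data.
class LongPollHub {
  constructor() {
    this.waiting = [];
  }

  // Called when a poll request arrives: keep it open instead of answering.
  wait(respond) {
    this.waiting.push(respond);
  }

  // Called when the server has something to say: answer every open poll.
  // Returns how many waiting clients received the data.
  publish(data) {
    const pending = this.waiting;
    this.waiting = []; // clients must poll again to receive the next event
    pending.forEach((respond) => respond(data));
    return pending.length;
  }
}

// Wires the hub to HTTP. Call .listen(port) on the result to start it.
function createLongPollServer(hub) {
  return http.createServer((req, res) => {
    hub.wait((data) => {
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(data));
    });
  });
}
```

On the client, the matching loop is simply: send a request, wait for the response, handle it, and send the next request right away.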

WebSocket

WebSocket is a protocol which opens a single TCP connection and keeps it open for a long time. Here is another visualization image which shows how WebSocket works.

As we said, by default the client connects to the Socket server using the Polling transport (with XHR requests). The server then offers an upgrade to WebSocket, and the client switches transports. If the browser doesn't support WebSocket, as is the case for some old browsers, the application simply keeps using Polling and never upgrades the transport layer.

Let's create a basic socket server and see how it works in Chrome's Inspect Network.

index.html

<!DOCTYPE html>
<html>
<head>
  <title>Hello World</title>
  <script src="https://cdnjs.cloudflare.com/ajax/libs/socket.io/2.1.1/socket.io.js"></script>
  <script>
    const socket = io('http://localhost:3000', {
      transports: ['polling']
      // transports: ['websocket']
    });
    socket.on('connect', () => {
      console.log(`Socket connected id: ${socket.id}`);
    });
  </script>
</head>
<body>
  <h1>Basic Socket connection</h1>
</body>
</html>

server.js

const io = require('socket.io')(process.env.PORT);
io.on('connection', (socket) => {
  console.log(`${socket.id} connected`);
});
console.log(`Socket Server running on ${process.env.PORT} port, PID: ${process.pid}`);

Run the Node.js server on port 3000 (PORT=3000 node server.js) and open index.html in Chrome.

As you can see, we're connecting to the socket using polling, so under the hood it makes HTTP requests. That means that if we filter by XHR in the Inspect Network page, we will see it sending requests to the server.

Open the Network tab from the browser's Inspect mode and look at the last XHR request. It stays pending, with no response. After some time that request ends and a new one is sent, because a request left unanswered for too long would hit a timeout error. So, if there is no response from the server, the client renews the request by sending a new one.

Also, pay attention to the response of the first request which the server sends to the client. The response data looks like this:

96:0{"sid":"EHCmtLmTsm_H8u3bAAAC","upgrades":["websocket"],"pingInterval":25000,"pingTimeout":5000}2:40

As we have said, the server tells the client that it can upgrade the transport from "polling" to "websocket". However, as we allowed only the "polling" transport in the options, the client will not switch.
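The response above is easier to read once you know its layout: each packet is prefixed with its length and a colon, and the first character of a packet is its type. Below is a small sketch that splits such a string payload into packets. It assumes the polling payload format used by the socket.io 2.x client from this article, and parsePayload is a hypothetical helper written for illustration, not part of the library.

```javascript
// Splits a polling payload of the form "<length>:<packet>" (repeated)
// into its packets. Sketch for string payloads only.
function parsePayload(payload) {
  const packets = [];
  let i = 0;
  while (i < payload.length) {
    const colon = payload.indexOf(':', i);
    if (colon === -1) throw new Error('Malformed payload: missing length prefix');
    const length = Number(payload.slice(i, colon));
    if (!Number.isInteger(length) || length < 0) throw new Error('Malformed payload: bad length');
    const packet = payload.slice(colon + 1, colon + 1 + length);
    // The first character is the packet type, the rest is its data.
    packets.push({ type: packet[0], data: packet.slice(1) });
    i = colon + 1 + length;
  }
  return packets;
}

const payload = '96:0{"sid":"EHCmtLmTsm_H8u3bAAAC","upgrades":["websocket"],"pingInterval":25000,"pingTimeout":5000}2:40';
const [openPacket, messagePacket] = parsePayload(payload);
const handshake = JSON.parse(openPacket.data);

console.log(handshake.sid, handshake.upgrades);
console.log(messagePacket);
```

The first packet (type 0) is the handshake carrying the session ID and the list of possible upgrades; the second one (type 4) is a regular message whose content, 0, is the Socket connect packet.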

Try to replace the connection line with this:

const socket = io('http://localhost:3000');

Open the console and choose "All" from the Inspect Network page.

When you refresh the page, you will notice that, after some XHR requests, the client upgrades to "websocket". Pay attention to the type of the items in the Network console: "polling" requests are plain XHR, while the WebSocket connection has the type "websocket". When you click on it, you will see Frames. Whenever the server emits a new event, you receive a new frame. There are also some frames that contain just a number (e.g. 2 and 3): these are the ping and pong packets which the client and server send to each other to keep the connection alive; without them, we would get a timeout error.

Now you have basic knowledge of how Socket works. But what kind of problems can we have when we try to balance a socket server using Nginx, as in the previous example?

Problems

There are two major problems.

First, there is an issue when we have an Nginx load balancer with multiple Node servers and the client uses polling.

As you may remember, Nginx uses round-robin logic to balance requests, so each request which the client sends to Nginx is forwarded to the next Node server in turn, not necessarily the one that handled the previous request.

Imagine you have three Node servers and an Nginx load balancer. A user connects using Polling (XHR requests), Nginx sends that request to Node:8000, and the server registers the client's Session ID so it knows the client is connected to it. The next time the user does anything, the client sends a new request, which Nginx forwards to Node:8001.

What should the second server do? It receives an event from a client which isn't connected to it, so it returns an error with a "Session ID unknown" message.

So balancing becomes a problem for clients who use polling. With WebSocket you will not get this error, because you connect once and then send and receive frames over that connection.
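The failure is easy to reproduce in a small simulation. The sketch below involves no real sockets: NodeServer is a hypothetical class that only remembers which session IDs it created, and the roundRobin function stands in for Nginx.

```javascript
// Each Node server only knows the sessions it created itself.
class NodeServer {
  constructor(name) {
    this.name = name;
    this.sessions = new Set();
  }

  // First request of a client: register a new session on this server.
  handshake(sid) {
    this.sessions.add(sid);
    return `${this.name}: session ${sid} opened`;
  }

  // Any later polling request must carry a session this server knows.
  poll(sid) {
    return this.sessions.has(sid)
      ? `${this.name}: ok`
      : `${this.name}: Session ID unknown`;
  }
}

const servers = [new NodeServer('Node:8000'), new NodeServer('Node:8001'), new NodeServer('Node:8002')];
let turn = 0;
const roundRobin = () => servers[turn++ % servers.length]; // stand-in for Nginx

const log = [
  roundRobin().handshake('abc'), // lands on Node:8000
  roundRobin().poll('abc'),      // lands on Node:8001, which never saw 'abc'
];
console.log(log);
```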

Where should this problem be fixed: on the client side or on the server?

Definitely in the server! More specifically in Nginx.

We should change the logic Nginx uses to balance the load. Another logic we can use is ip_hash.

upstream nodes {
  ip_hash;
  server 127.0.0.1:8000;
  server 127.0.0.1:8001;
  server 127.0.0.1:8002;
}

Every client has an IP address, so Nginx creates a hash using the IP address and forwards the client request to a Node server, which means that every request from the same IP address will always be forwarded to the same server.
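Here is a rough sketch of what such a mapping looks like in JavaScript. Like ip_hash, it uses only the first three octets of an IPv4 address as the key, so clients from the same /24 network end up on the same server. The SHA-256 based hash is just a stand-in chosen for this example and is not the function Nginx uses; pickServer is a hypothetical name.

```javascript
const crypto = require('crypto');

const servers = ['127.0.0.1:8000', '127.0.0.1:8001', '127.0.0.1:8002'];

// Maps a client IPv4 address to one upstream server.
// The same key always produces the same index, so the choice is sticky.
function pickServer(ip) {
  const key = ip.split('.').slice(0, 3).join('.'); // first three octets only
  const digest = crypto.createHash('sha256').update(key).digest();
  return servers[digest.readUInt32BE(0) % servers.length];
}

// Repeated requests from one address always reach the same server.
console.log(pickServer('203.0.113.7') === pickServer('203.0.113.7'));
```

Note that the mapping depends on the number of servers: adding or removing one changes where many clients land.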

Actually, this is the minimal solution to the problem, and it sometimes falls short: for example, many clients behind the same NAT share one IP address and all land on the same server. If you wish to go deeper, you can research other balancing logics for Nginx/Nginx Plus or use other Load Balancers (e.g. HAProxy).
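Whichever balancing logic you pick, the WebSocket transport needs one more thing from Nginx: the proxy has to pass the HTTP Upgrade handshake on to the Node server, which it does not do by default. A sketch of the full configuration, assuming the same upstream name and ports as before:

```nginx
upstream nodes {
  ip_hash;
  server 127.0.0.1:8000;
  server 127.0.0.1:8001;
  server 127.0.0.1:8002;
}
server {
  listen 3000;
  location / {
    proxy_pass http://nodes;
    # The WebSocket handshake needs HTTP/1.1 and the Upgrade headers passed on
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
  }
}
```

Without these three directives, clients typically stay on Polling because the upgrade request never reaches Node as an upgrade.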

Moving on to the second problem: each user is connected to only one server.

Imagine a situation where you are connected to Node:8000, a friend of yours is connected to Node:8001, and you want to send them a message. You send it over the socket, the server receives the event and wants to pass your message on to your friend. You have probably guessed the problem: the server wants to send data to a user who is connected not to it but to another server in the system.

There is only one solution, which can be implemented in many ways.

Create an internal communication layer for servers.

It means each server will be able to send requests to other servers.

This way, Node:8000 sends a request to Node:8001 and Node:8002, and each of them checks whether user2 is connected to it. If user2 is connected, that server emits the data provided by Node:8000.

Let's discuss one of the most popular technologies which provides a communication layer and which we can use in our Node servers.

Redis

As is written in the official documentation:

Redis is an in-memory data structure store used as a database

So, it allows you to create key-value pairs in memory. Redis also provides some other useful features. Let's talk about one of the popular ones.

PUB/SUB

This is a messaging system which provides us a Subscriber and a Publisher.

Using Subscriber in a Redis client, you can subscribe to a channel and listen for messages. With Publisher, you can send messages to a specific channel which will be received by the Subscriber.

It's like the Node.js EventEmitter. But EventEmitter will not help when one Node application needs to send data to another Node application.

Let's see how it works with Node.js

Subscriber

// subscriber.js
const redis = require('redis');
const subscriber = redis.createClient();
subscriber.on('message', (channel, message) => {
  console.log(`Message "${message}" on channel "${channel}" arrived!`)
});
subscriber.subscribe('my channel');

Publisher

// publisher.js
const redis = require('redis');
const publisher = redis.createClient();
publisher.publish('my channel', 'hi');
publisher.publish('my channel', 'hello world');

Now, to be able to run this code, we need to install the redis module. The code above uses the callback-style API of the redis module up to version 3; version 4 and later changed the client API, so we pin the version here. Also, don't forget to install Redis on your local machine.

npm i -S redis@3

Let's run and see the results.

node subscriber.js

and

node publisher.js

In the subscriber window, you will see this output:

Message "hi" on channel "my channel" arrived!
Message "hello world" on channel "my channel" arrived!

Congratulations! You have just established communication between different Node applications.

This way, we can have subscribers and publishers in each application, so that the applications can send data to and receive data from one another.

You can read more about Redis PUB/SUB in the official documentation. Also, you can check the node-redis-pubsub module which provides a simple way to use Redis PUB/SUB.

Connecting All These Pieces

Finally, we have come to one of the most interesting parts.

For handling lots of connections, we run multiple instances of the Node.js application and then we balance the load to those servers using Nginx.

Using Redis PUB/SUB, we establish communication between the Node.js servers. Every time a server wants to send data to a client which is not connected to it, the server publishes the data. Every server then receives it and checks whether the user is connected to it. In the end, the server that holds the connection sends the data to the client.
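The flow just described can be sketched without a running Redis. In the example below an in-process EventEmitter stands in for the Redis channel, so everything happens inside one process; in a real setup each server would be a separate Node.js process with its own Redis subscriber and publisher. SocketServer and its methods are hypothetical names used only for this illustration.

```javascript
const { EventEmitter } = require('events');

// Stand-in for a Redis channel so the sketch runs without a Redis server.
const channel = new EventEmitter();

class SocketServer {
  constructor(name) {
    this.name = name;
    this.clients = new Map(); // userId -> function that delivers to that user's socket
    // Every server subscribes, but delivers only to users connected to it.
    channel.on('message', ({ to, text }) => {
      const deliver = this.clients.get(to);
      if (deliver) deliver(text, this.name);
    });
  }

  connect(userId, deliver) {
    this.clients.set(userId, deliver);
  }

  // Publish instead of looking the user up locally; the owning server delivers.
  send(to, text) {
    channel.emit('message', { to, text });
  }
}

const delivered = [];
const node8000 = new SocketServer('Node:8000');
const node8001 = new SocketServer('Node:8001');
node8000.connect('you', () => {});
node8001.connect('friend', (text, via) => delivered.push(`${via} -> friend: ${text}`));

// You are on Node:8000, your friend is on Node:8001.
node8000.send('friend', 'hello');
console.log(delivered);
```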

Here is the big picture of how the Back-end architecture is organized:

Let's see how we can implement this in Node.js. You won't need to create all this from scratch. There are already packages which do some of that work.

These two packages, socket.io-redis and socket.io-emitter, work with socket.io.

One interesting thing which the socket.io-emitter dependency provides is emitting events to users from outside of the socket server. If you can publish valid data to Redis which the servers will receive, then one of the servers can send the socket event to the client. This means that it isn't important to have a Socket server to be able to send an event to the user. You can run a custom server which will connect to the same Redis and will send socket events to the clients using PUBLISH.
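A minimal sketch of how the two packages are typically wired up is shown below. It assumes the socket.io-redis and socket.io-emitter releases that match socket.io 2.x and a Redis server on localhost:6379; attachRedisAdapter and createExternalEmitter are hypothetical helper names, and their second parameters exist only so the sketch can be tried without the packages installed.

```javascript
// Plugs the Redis adapter into an existing socket.io server so that
// broadcasts also reach clients connected to the other instances.
function attachRedisAdapter(io, createAdapter = require('socket.io-redis')) {
  io.adapter(createAdapter({ host: 'localhost', port: 6379 }));
  return io;
}

// Returns an emitter for a process that runs no socket server at all.
function createExternalEmitter(createEmitter = require('socket.io-emitter')) {
  return createEmitter({ host: 'localhost', port: 6379 });
}

// Typical wiring inside each socket server
// (needs socket.io and socket.io-redis installed):
//   const io = require('socket.io')(process.env.PORT);
//   attachRedisAdapter(io);
//
// And in a separate worker or script (needs socket.io-emitter installed):
//   createExternalEmitter().emit('news', 'hello from outside');
```

With the adapter attached, the servers handle the publish and subscribe steps themselves, so the application code keeps calling emit as usual.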

Also, there is another package named SocketCluster. SocketCluster is more advanced: it uses a cluster made up of brokers and workers. Brokers help us with the Redis PUB/SUB part, and workers are our Node.js applications.

There is also Pusher, which helps to build big scalable apps. It provides an API to their hosted PUB/SUB messaging system, and it also has an SDK for some platforms (e.g., Android, iOS, Web). However, note that this is a paid service.

Conclusion

In this article, we explained how you can balance the socket server, what kind of problems may occur, and how you can solve them.

We used Nginx to balance server load to multiple nodes. There are many load balancers out there, but we recommend Nginx/Nginx Plus or HAProxy. Also, we saw how the socket works and the difference between polling and websocket transport layers. Finally, we saw how we can establish communication between Node.js instances and use all of them together.

As a result, we have a load balancer which forwards requests to multiple Node.js servers. Note that you must configure the load balancer logic to avoid any problems. We also have a communication layer for the Node.js server. In our case, we used Redis PUB/SUB, but you can use other communication tools.

I have worked with Socket.io (with Redis) and SocketCluster, and advise you to use them in both small and big projects. With these strategies, it is possible to have a game with SocketCluster which can handle 30k socket connections. Actually, the SocketCluster library is a little bit old and its community isn't so big, but that likely won't cause you any issues.

There are many tools which will help you balance your load or distribute your system. We advise that you also learn about Docker and Kubernetes. Start researching them ASAP!

Thank you for reading this article. Feel free to ask any questions or tweet @nairihar.
