We all have a default choice when it comes to libraries (for better or worse). For network management, my default choice was Boost::Asio. It is simple to use and can read and write on sockets asynchronously. And, like a lot of developers, I like to control my code end to end, which makes me prefer light libraries to big frameworks.

But after I discovered and tested ZeroMQ, I decided to make it my default networking library for FyS MMORPG, and most likely for any future project I work on.

Reinventing the wheel

Developers tend to reinvent the wheel on every new project they work on, and this is particularly true of networking. We usually rewrite from scratch the “Connection manager” classes that handle our sockets, applying new knowledge and/or new features of the language.

When working with multiple servers that communicate and synchronize information with each other, we quickly encounter interesting challenges. For example, we have to handle a server that disconnects in the middle of a conversation: is it better to implement a reconnection mechanism or to simply drop the connection and return an error?

If you are managing multiple servers that depend on each other, the order in which you start them matters: they will try to initialize connections with each other, and a socket can only connect to another socket that is already present and listening.

In this case, is it better to have a script that starts all your backends in the correct order so that they connect to each other correctly (but then, again, what should happen in case of a startup error?), or is it better to implement a “retry connection” mechanism?

As you can see, reinventing the wheel comes with a lot of fun work. And it is usually harder than it looks at first sight if you want to do things the right way.

But at the end of the day, the code does the same thing as the ten other versions you wrote before, and it is usually not more efficient.

But what to use if we don’t want to?

The new architecture of FyS MMORPG takes a micro-services approach, which greatly complicates the interactions between the server instances. The number of connections varies, and it would be very hard to handle the sockets manually. The hardest part is handling the disconnection and reconnection of a server instance.

All those error cases are really time-consuming to handle correctly. If you want to focus on problems other than networking, there is one library that can help you with this complicated task: ZeroMQ.

What is ZeroMQ?

ZeroMQ is a lightweight networking library whose core (libzmq) is written in C++ and exposed through a C API. Its main focus is speed and usability. It can be used for network communication as well as for inter-thread and inter-process communication.



To be more exact, ZMQ is a messaging library (using message queuing) based on sockets. This library is special for multiple reasons: it is fast, it is easy to use, and it contains very powerful built-in features and patterns that greatly reduce the complexity of development. It manages the sockets for the users, who only have to focus on the patterns they want to use.

It has complete and very good documentation with examples, and it provides wide language support (language bindings), which makes it usable in almost any project.

Very light, very fast

As previously said, this library is very fast. To back this up, ZMQ ships performance tests that you can run to measure its speed yourself. Bindings also exist for a lot of different languages.

“Originally the zero in ZeroMQ was meant as ‘zero broker’ and (as close to) ‘zero latency’ (as possible). Since then, it has come to encompass different goals: zero administration, zero cost, zero waste. More generally, ‘zero’ refers to the culture of minimalism that permeates the project. We add power by removing complexity rather than by exposing new functionality.” (ZeroMQ documentation)

The graph below shows messages per second, for send and receive, during the transmission of one million 1K messages. The tests were run on a machine running Windows Vista.

As explained in the blog this graph was taken from, the comparison is a bit unfair because the other technologies are really different in nature. But it makes it obvious that if speed is what you want from your messaging system, ZMQ is by far the better choice.

The cost of speed

ZMQ is a messaging system, but it is not reliable as a messaging tool. Unlike RabbitMQ or other products of the same kind, it doesn’t store the messages it receives, so it cannot re-send a message if an error occurred while sending it (or if your application crashes). ZMQ does everything in memory, which is one of the reasons for its speed. We could see this as a big flaw, but everything depends on the design of your application.

Depending on how your application works, it may not require a reliable messaging system. If reliability is not a requirement, you simply don’t implement anything and send the messages in a “fire and forget” mode. If it is a requirement, the application can detect that a request it sent didn’t reach its destination and resend it itself if needed.

Where should reliability reside? This mechanism can live either in the messaging system (the middleware part) or in the application itself.

There is no definitive answer to this question. If reliability lives on the application side, you have the advantage of implementing it only where it matters, whereas if it lives in the middleware, you pay its cost even when it is not needed. For example, there is no need to keep a notification of a player movement that was not sent correctly, as it will be too late to send this information by the time the retry occurs. So there is no reason to implement a retry mechanism for it in the middleware.

A good point of ZMQ is that it gives you the choice between a very efficient “fire and forget” mode and reliability. Of course, when messages are very important (a bank transfer, a player obtaining a legendary item), it would be sad for the message to disappear. In this case, reliability is required, and with ZMQ you need to implement it yourself, for example by sending acknowledgement notifications (if the acknowledgement is not received by the requester, it can re-send the message).
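The acknowledgement idea can be sketched without any networking at all. The snippet below is only an illustration under stated assumptions: `sendWithRetry` and `FlakyTransport` are hypothetical names, not part of ZeroMQ, and the `trySend` callback stands in for "send the message over a socket and wait for the acknowledgement".

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical helper (not a ZeroMQ API): sends `message` through `trySend`
// until it is acknowledged or `maxAttempts` is reached.
// `trySend` returns true when the acknowledgement arrived.
// Returns the number of attempts used, or 0 if the message was never acknowledged.
int sendWithRetry(const std::string& message,
                  const std::function<bool(const std::string&)>& trySend,
                  int maxAttempts) {
    assert(maxAttempts > 0);
    for (int attempt = 1; attempt <= maxAttempts; ++attempt) {
        if (trySend(message)) {
            return attempt;  // acknowledged, stop retrying
        }
    }
    return 0;  // the caller decides what to do with an undelivered message
}

// Fake transport used only for demonstration: it loses the first
// `failures` messages and acknowledges every later one.
struct FlakyTransport {
    explicit FlakyTransport(int failuresToSimulate) : failures(failuresToSimulate) {}

    bool operator()(const std::string&) {
        ++calls;
        return calls > failures;
    }

    int failures;
    int calls = 0;
};
```

With a transport that loses the first two messages, `sendWithRetry("legendary item", FlakyTransport(2), 5)` succeeds on the third attempt. A movement notification would simply skip this helper and be sent once.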

Keep in mind that most of the time everything will work fine and will go as fast as ZMQ can go (faster than other products). Only in case of error does the message go through your own safeguard implementation (for reliability), which, depending on how you write it, can be slower than the one built into another product (like RabbitMQ).

Asynchronous socket connection

One of the key features of ZMQ sockets is that they connect asynchronously, which means that a socket can connect to a server that is not up yet. The socket stays in a “pending connection” state, and it is even possible to send messages into this socket while it has no established connection. Those pending messages are queued until the connection is established.
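To make the "pending connection" behaviour concrete, here is a toy model of it in plain C++. It is not ZeroMQ code and does not reflect ZeroMQ internals; `PendingSocket` and its members are hypothetical names chosen for this sketch, and the real behaviour also depends on the socket type and on queue size limits.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

// Toy model (not ZeroMQ code): a socket that accepts messages before the
// connection exists and flushes them once the peer becomes reachable.
class PendingSocket {
public:
    // Queue the message if not connected yet, otherwise deliver it immediately.
    void send(const std::string& message) {
        if (connected_) {
            delivered_.push_back(message);
        } else {
            pending_.push(message);
        }
    }

    // Called when the peer finally comes up: flush the backlog in order.
    void onConnected() {
        connected_ = true;
        while (!pending_.empty()) {
            delivered_.push_back(pending_.front());
            pending_.pop();
        }
        assert(pending_.empty());  // nothing may stay behind after the flush
    }

    std::size_t pendingCount() const { return pending_.size(); }
    const std::vector<std::string>& delivered() const { return delivered_; }

private:
    bool connected_ = false;
    std::queue<std::string> pending_;
    std::vector<std::string> delivered_;
};
```

Two calls to `send` before `onConnected()` leave two pending messages and nothing delivered. After `onConnected()` both are delivered in their original order, and later messages go straight through.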

This makes it possible to start servers in any order, which is very important when working with a large and expandable cluster of servers, as is the case in a micro-service oriented architecture.

A lot of built-in patterns

But what makes ZMQ stand out is the range of socket types you can create. ZMQ reasons about a socket the way a classical messaging system reasons about a queue. A single socket can handle multiple connections and can use different dispatching strategies.

Without going into too much detail (the official documentation does that smoothly and very well), I am going to try to convince you by presenting some of those strategies.

The Publisher/Subscriber: if you are familiar with messaging systems, you should already know this pattern. Here is how it basically works: a sender (called the publisher) sends messages without knowing or even specifying who will receive them. The publisher can categorize those messages (often thanks to a header section). Another server then connects to the publisher and “subscribes” to a specific category of messages (or to all messages); this server is called the subscriber.

ZMQ makes it very easy to implement this mechanism: the only thing needed is to create a PUB socket on one side and a SUB socket on the other. The socket connection, the registration of categories and the dispatch mechanism are all abstracted away by ZeroMQ. The C++ snippet below shows how this can be implemented.

    // Code example on how to do a pub/sub socket connection with ZMQ.
    // Requires <zmq.hpp> and <string>; `context` is assumed to be a
    // zmq::context_t created elsewhere.

    // Server side
    {
        //...
        // This is our public endpoint for subscribers
        zmq::socket_t publisher(context, ZMQ_PUB);
        publisher.bind("tcp://*:8100");
        //...
    }

    // Client side (many of them can be instantiated)
    {
        //...
        // Connect to the publisher and subscribe to one message category
        std::string filterCode = "FILT01";
        zmq::socket_t subscriber(context, ZMQ_SUB);
        subscriber.connect("tcp://localhost:8100");
        subscriber.setsockopt(ZMQ_SUBSCRIBE, filterCode.c_str(), filterCode.size());
        //...
    }
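A subscription filter such as "FILT01" is matched against the beginning of each message: a message is delivered when it starts with the subscribed prefix, and an empty filter matches everything. The plain C++ sketch below reproduces only that matching rule. `matchesSubscription` and `demoSubscriptionFilter` are hypothetical helper names for illustration, not ZeroMQ functions, and the message payloads are made up.

```cpp
#include <cassert>
#include <string>

// Sketch of the prefix rule applied to subscriptions: the message passes
// when its first bytes equal the filter. An empty filter matches everything.
bool matchesSubscription(const std::string& message, const std::string& filter) {
    return message.compare(0, filter.size(), filter) == 0;
}

// Small usage check; "FILT01" is the filter code used in the snippet above.
void demoSubscriptionFilter() {
    assert(matchesSubscription("FILT01 player moved", "FILT01"));   // same category
    assert(!matchesSubscription("FILT02 chat message", "FILT01"));  // other category
    assert(matchesSubscription("anything at all", ""));             // subscribe to all
    assert(!matchesSubscription("FIL", "FILT01"));                  // shorter than filter
}
```

This is why the category is usually placed at the very start of the message, or in its first part.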

ROUTER/DEALER sockets: using those sockets requires understanding how the ZMQ envelope works (the official documentation explains it in detail). Basically, ZMQ sends messages in multiple parts, and some of those parts are used to route the message. ZMQ sockets usually handle those envelopes without you noticing, but with router and dealer sockets you have full control over the envelope of the message.

A Router socket tracks the clients connected to it and, thanks to the identity part of the envelope, can send a message to a specified client. The identity can be set manually; if it is not, one is generated automatically. In other words, a router socket does the same thing as the classical “Connection manager” that we like to rewrite from scratch: it registers the clients and gives us a way to send a message to a specific connected client.
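The bookkeeping a router socket spares us looks roughly like the hand-written table below. This is a toy model in plain C++, not ZeroMQ code: `IdentityRouter`, its methods and the "peer-N" naming of generated identities are all assumptions made for this sketch.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model (not ZeroMQ code) of the classical "Connection manager":
// remember each peer under an identity and address messages by identity.
class IdentityRouter {
public:
    // Register a peer. If no identity is supplied, one is generated.
    std::string registerPeer(const std::string& identity = "") {
        std::string id =
            identity.empty() ? "peer-" + std::to_string(nextId_++) : identity;
        inboxes_[id];  // creates an empty inbox for a new peer
        return id;
    }

    // Deliver to one specific peer; returns false for an unknown identity.
    bool sendTo(const std::string& identity, const std::string& message) {
        auto it = inboxes_.find(identity);
        if (it == inboxes_.end()) {
            return false;
        }
        it->second.push_back(message);
        return true;
    }

    // Messages received so far by a registered peer.
    const std::vector<std::string>& inbox(const std::string& identity) const {
        auto it = inboxes_.find(identity);
        assert(it != inboxes_.end());  // only valid for registered peers
        return it->second;
    }

private:
    int nextId_ = 0;
    std::map<std::string, std::vector<std::string>> inboxes_;
};
```

Registering "alice" explicitly and a second peer without an identity gives two addressable clients. `sendTo("alice", ...)` reaches only the first, and sending to an unknown identity reports failure in this model.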

A Dealer socket is simply a socket that does not touch the envelope of the message, so you can add your own metadata to the envelope or use the data already present in it (often to retrieve the identity of a client given by a router socket). Multiple clients can connect to this socket, but when you write to a dealer socket, the message is dispatched fairly (basic load balancing) and only one connected client receives it. If more intelligent load balancing is needed, more work is required, but still less than if everything had to be implemented manually.
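The fair dispatch described above amounts to handing each outgoing message to the next connected peer in turn. A minimal sketch of that rotation in plain C++ follows; `RoundRobinDispatcher` is a hypothetical name, and the model ignores peers connecting or disconnecting, which the real socket handles for you.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy model (not ZeroMQ code) of basic load balancing: each message
// goes to exactly one peer, and peers are served in turn.
class RoundRobinDispatcher {
public:
    explicit RoundRobinDispatcher(std::vector<std::string> peers)
        : peers_(std::move(peers)) {
        assert(!peers_.empty());  // dispatching needs at least one peer
    }

    // Returns the peer that receives the next message.
    const std::string& next() {
        const std::string& peer = peers_[cursor_];
        cursor_ = (cursor_ + 1) % peers_.size();
        return peer;
    }

private:
    std::vector<std::string> peers_;
    std::size_t cursor_ = 0;
};
```

With three peers, four consecutive calls to `next()` return the first, second and third peer, then the first again. Smarter strategies (for example, choosing the least loaded worker) would replace only the body of `next()`.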

You can find below a more practical example of how this socket can be used (a REQ socket is a ZMQ socket that sends a request and then waits for the reply, in strict alternation).

As you can see, it can be very simple to implement a server that registers incoming clients and sends messages to them, or one that sends messages without having to know exactly who the clients are thanks to a PUB/SUB socket connection.

It is easy to see how much boilerplate code and wasted time are avoided by using a Router socket or a Publisher/Subscriber socket instead of implementing (and debugging) everything by hand.

Why ZerøMQ in FyS MMORPG?

When developing a lot of different backends that interact and synchronize with each other, it becomes really hard to handle everything manually with plain sockets. This is why enterprises usually use message-oriented middleware (MOM) based on queuing technologies (RabbitMQ, IBM MQ etc…) with an ESB (Enterprise Service Bus) in order to implement an SOA (Service Oriented Architecture). This works perfectly fine. This kind of technology has the advantage of letting the developer focus on the business code, and the pipelining between the applications is basically guaranteed. It is reliable when done correctly and can handle a very large number of messages quickly.

But when it comes to video games (or any other performance-sensitive application), speed has a new definition. There, an enterprise messaging system may be too slow because of the persistence management and all the intermediaries between components (Brokers, Queues, Bus etc…).

In this case, ZMQ is a good choice. As a messaging library, it is as easy to use as an enterprise MOM, except that everything occurs in memory and no broker or other intermediary needs to be installed. It is designed to be fast and easy to use but, as previously said, it doesn’t provide the persistence that would make it reliable.

As the FyS architecture tends to be split into more and more backends, the communication and connections between them need to be ensured, and ZMQ provides this assurance very well thanks to its asynchronous connection. The different socket types it provides are incredibly powerful, giving you the choice between broadcasting messages and fair queuing (basic load balancing) without any effort.