updated 2019-12-17

Briefly: doing sync RPC calls over a classic MQ is not efficient: it gives lower performance and side effects which you have to handle.

Inverted Json is a lightweight job server which lets you do RPC calls without those side effects (a client and a worker are connected via Inverted Json to transfer data). It gives higher performance (7 times faster than RabbitMQ), and it works over HTTP, so you can use any HTTP client, even curl from the console.

Inverted Json supports RPC, MQ, PubSub and more.

1. Benchmark

All servers/tools fall into 3 categories:

- "Direct connection" — a client connects directly to a worker. With many services this is the most costly way to configure: every client has to know about every required worker (IP and port), but it usually produces minimal network overhead.
- "Proxy connection" — a single point of access. The client side can be simple, but the worker side is still not easy to configure: allocating and forwarding ports, registering with the proxy, more complex firewall rules, so sometimes you need extra tools to control all of this.
- "Inverted connection" — a single point of access for both clients and workers (can act as an ESB); this is the easiest way to configure.

CPU and memory usage are taken from `docker stats`.

The "2-core test" puts the server and the clients/workers on different cores to reduce their effect on each other, so the server was limited to 2 cores using taskset (the multi-core test has no limits).

Some thoughts about benchmarks are below.

2. MQ vs RPC

Although these two approaches are different, sometimes the former is used instead of the latter and vice versa. A rough rule for which one to use when could look like this:

- RPC (sync call) — when a client requires a response immediately (within a short time), when a worker should respond while the client is waiting for it, and if the client has left (by timeout) the response is not needed anymore (which is why there is no need to preserve the request, as MQ usually does). For example, when you query a database you are doing RPC, and you don't want to use MQ for that.
- MQ (async call) — when a response is not needed (at least not immediately), when you want a task to be completed eventually, or just to transfer data. For example, it can be used for mailing.
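The distinction above can be sketched in a few lines of Python. This is a toy in-process model (the names `rpc_call` and `mq_enqueue` are mine, not Inverted Json's API): an RPC caller blocks and discards the answer after a timeout, while an MQ producer just hands the task over and returns.

```python
import queue
import threading
import time

def rpc_call(handler, payload, timeout):
    """Sync RPC: block for the result; past the timeout the answer is useless."""
    result = {}
    done = threading.Event()

    def run():
        result["value"] = handler(payload)
        done.set()

    threading.Thread(target=run, daemon=True).start()
    if not done.wait(timeout):
        raise TimeoutError("client gave up; the response is no longer needed")
    return result["value"]

def mq_enqueue(q, payload):
    """Async MQ: just enqueue the task; it will be processed eventually."""
    q.put(payload)

# RPC: the caller blocks and gets the answer (or a timeout).
assert rpc_call(lambda x: x + 1, 41, timeout=1.0) == 42

# MQ: the caller returns immediately; a worker drains the queue later.
tasks = queue.Queue()
mq_enqueue(tasks, "send welcome email")
assert tasks.get() == "send welcome email"
```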

3. RPC over RabbitMQ

RabbitMQ is popular for RPC, but like other MQ systems it has some overhead, so the result is not the best performance.

If you use MQ for RPC you need to "clean" queues: if a worker becomes unavailable, it can receive a lot of expired tasks when it is active again, and when a client has gone by timeout its task is still sitting in the queue.

The same applies to queues for clients' responses: a queue will keep responses if a client is gone before the worker responds, and you need to "clean" it too. In RabbitMQ you can instead close a client's queue, but in that case performance drops drastically (10-20x slower).

You also need to ping a worker to be sure it is alive. Besides that, MQ spends resources handling queues and messages, whereas in RPC systems the data is just transferred to a worker and back.
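The "stale tasks" problem described above can be shown with a toy stdlib model (this is an illustration, not RabbitMQ's API; real brokers express it via per-message TTLs and queue expiry): requests carry a deadline, and a worker that comes back online must skip tasks whose client has already timed out.

```python
from collections import deque

def enqueue(q, payload, ttl, now):
    """Enqueue a request with a deadline (the client waits until now + ttl)."""
    q.append({"payload": payload, "expires_at": now + ttl})

def next_live_task(q, now):
    """Drain expired tasks (the 'cleaning' MQ forces on you) and return a live one."""
    while q:
        task = q.popleft()
        if task["expires_at"] > now:
            return task
    return None

q = deque()
enqueue(q, "old request", ttl=1, now=0)     # this client timed out long ago
enqueue(q, "fresh request", ttl=10, now=95)
# The worker comes back online at t=100: the first task is expired and skipped.
task = next_live_task(q, now=100)
assert task["payload"] == "fresh request"
```

An RPC-style server like Inverted Json avoids this entirely: there is no queue to clean, because the request only exists while the client is connected.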

4. Inverted Json

There are a lot of MQ systems, but not that many RPC/job servers like Gearman or Crossbar. The choice is small, which is why developers pick MQ for RPC.

That is why Inverted Json was created. It is built with C/C++ and epoll, a finite-state automaton for routing, a streaming JSON parser, slices instead of strings*, etc., for better performance.

Advantages of Inverted Json over RabbitMQ for RPC (sync calls):

- No need to clean queues of expired messages.
- You don't have to ping a worker: a client receives a 502 error immediately if the worker goes down (in keep-alive mode).
- The API is simpler and more compact — just an HTTP request (and it's supported by all popular languages and frameworks).
- It works faster and uses fewer resources.
- It is easier to send commands to a specific worker (e.g. if there are a few similar workers, you can set "worker-id" in the headers).

Other info about Inverted Json:

- Inverted Json supports RPC, MQ, Pub-Sub, priorities and custom workers.
- You can send binary data (not only JSON, as the name might suggest).
- An id is not required for a request if the worker works in keep-alive mode; Inverted Json just connects a client and a worker to each other.
- A worker can register multiple tasks, and register patterns (like `/command/*`) without losing performance.
- The Docker image is just 2.6 MB (slim version).
- The core of Inverted Json is just ~1400 lines of code (v0.3); less code, fewer bugs ;)
- Inverted Json never changes the body of a request; it is transferred as is.

5. Try Inverted Json in 3 minutes

You can try Inverted Json right now, if you have Docker and curl:

1. Run Inverted Json via Docker on port 8001 (you can choose any port); `--log 47` is a mask for logging incoming requests and some important events:

$ docker run -it -p 8001:8001 lega911/ijson --log 47

2. Register a worker for the task "calc/sum" and get a task (request type "get"):

$ curl localhost:8001/calc/sum -H 'type: get'

3. A client calls “calc/sum”:

$ curl localhost:8001/calc/sum -d '{"id": 15, "data": "2+3"}'

4. The worker receives the task `{"id": 15, "data": "2+3"}`; now send a response for the same id (request type "result"):

$ curl localhost:8001 -H 'type: result' -d '{"id": 15, "result": 5}'

… and the client receives the response as is: `{"id": 15, "result": 5}`

5.1. JsonRPC

JsonRPC 2.0 is not fully supported, but a client can send requests like

`{"jsonrpc": "2.0", "method": "calc/sum", "params": [42, 23], "id": 1}` on `/rpc/call`

and can receive errors like

`{"jsonrpc": "2.0", "error": {"code": -32601, "message": "Method not found"}, "id": null}`

I don't know how popular it is, but it can be improved in the future.
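A request like the one above can be sent from Python with nothing but the standard library. This is a sketch (the helper names are mine; the `/rpc/call` endpoint and the payload shape are from the text), and the actual POST is guarded because it needs a running Inverted Json instance:

```python
import json
import urllib.request

def jsonrpc_request(method, params, req_id):
    """Build a JSON-RPC 2.0 request body like the one shown above."""
    return {"jsonrpc": "2.0", "method": method, "params": params, "id": req_id}

def rpc_call(url, method, params, req_id=1):
    """POST the request and decode the JSON response."""
    body = json.dumps(jsonrpc_request(method, params, req_id)).encode()
    req = urllib.request.Request(url, data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

if __name__ == "__main__":
    # Requires a running Inverted Json instance (see section 5).
    print(rpc_call("http://localhost:8001/rpc/call", "calc/sum", [42, 23]))
```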

5.2. Python example of client and worker

Here is an example of "worker mode", which is more performant and compact.
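The embedded example is not reproduced here, so below is a minimal sketch reconstructed from the curl calls in section 5 (a GET with the `type: get` header takes a task; a POST with `type: result` answers it). The function names are mine, and `eval` is used only because the demo task is "2+3"; the network calls are guarded since they need a running instance:

```python
import json
import urllib.request

BASE = "http://localhost:8001"

def make_task(req_id, data):
    """Request body in the format from section 5: {"id": ..., "data": ...}."""
    return json.dumps({"id": req_id, "data": data})

def http_call(path, data=None, headers=None):
    """GET when data is None, POST otherwise (urllib picks the method)."""
    req = urllib.request.Request(BASE + path, data=data, headers=headers or {})
    with urllib.request.urlopen(req) as resp:
        return resp.read()

def client_call(task, req_id, data):
    """Client: POST a task and block until the worker's result comes back."""
    return json.loads(http_call("/" + task, data=make_task(req_id, data).encode()))

def worker_once(task):
    """Worker: take one task ('type: get'), compute, send the result back."""
    job = json.loads(http_call("/" + task, headers={"type": "get"}))
    answer = {"id": job["id"], "result": eval(job["data"])}  # eval: demo only!
    http_call("/", data=json.dumps(answer).encode(), headers={"type": "result"})

if __name__ == "__main__":
    # With ijson running on port 8001: run worker_once("calc/sum") in one
    # process and client_call("calc/sum", 15, "2+3") in another.
    pass
```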

6. Some thoughts about benchmark

- Crossbar.io: based on Python, so it is not very fast and can't use multiple cores (because of the GIL).
- RabbitMQ: RPC over MQ, so it has the overhead described above; I also noticed a rapid decline in performance when RabbitMQ is overloaded (e.g. if you start 2 extra test clients).
- Nats: gives good performance, though less than Inverted Json; it can also have the same problems, like "cleaning" channels, etc.
- Inverted Json: reached the network limit of this server (starting 2 copies of the test case on separate cores doesn't give a better result in total); it has the lowest memory and CPU usage (relative to performance) among the proxy systems.
- Nginx proxy-pass: performance declines rapidly when there are a lot of small requests (not shown in the test); apparently Linux doesn't let you open/close so many sockets in a short time, so it is better to use keep-alive (which is not the default mode).
- Traefik: uses a lot of CPU (600% in the multi-core test) and is a little slower than nginx.
- uvloop (for asyncio): gives very good performance because the main part is written in C/C++; it is better than ZeroMQ for RPC.
- ZeroMQ: the worker itself is written in Python (GIL), so it can't use more than 1 core, though in the test it uses more than 100% because the zeromq library is written in C/C++ (and doesn't use the GIL). It gives good performance, but on the other hand, if a worker is more complex than "a+b", any extra code will reduce performance because the 1-core limit is reached earlier.
- ZeroRPC: declared as a lightweight wrapper over ZeroMQ, but in reality 95% of ZeroMQ's performance was lost, so it doesn't look so lightweight.
- GRPC: produces a lot of boilerplate Python code, which reaches the 1-core limit too fast. So GRPC is probably good for the network, but not for Python.
- 2-core vs multi-core test: in the multi-core test the server, clients and workers share common resources (CPU), so some results are lower than in the 2-core test. On the other hand, servers that used more than 2 cores got better results, e.g. Traefik, which used 600% CPU.

7. Conclusion

If you work in a big company with many developers/devops, it can be fine to maintain different complex systems and organize direct connections to get maximum performance.

But for small teams that need to solve different tasks with microservices, Inverted Json can save you time and resources.

To improve Inverted Json I want to add support for pub-sub, Kubernetes and other interesting ideas.

If you think this could be an interesting project, or just want to help the author, you can star the project on GitHub (https://github.com/lega911/ijson). Thank you!

What do you use for RPC (sync calls)?

PS: