Scaling RabbitMQ on a CoreOS cluster through Docker

2017-03-22 by Gabriele Santomaggio

Erlang Solutions offers world-leading RabbitMQ consultancy, support & tuning solutions. Learn more >

Introduction

RabbitMQ provides, among other features, clustering capabilities. With clustering, a group of properly configured hosts behaves the same as a single broker instance.

All the nodes of a RabbitMQ cluster share the definition of vhosts, users, and exchanges, but not queues. By default a queue physically resides on the node where it was created; however, as of version 3.6.1, queue node ownership can be configured using Queue Master Location policies. Queues are globally defined and reachable by establishing a connection to any node of the cluster.

Modern architectures often involve container-based ways of scaling, such as Docker. In this post we will see how to create a dynamically scaling RabbitMQ cluster using CoreOS and Docker.

We will take you on a step by step journey from zero to the cluster.

Get ready

We are going to use several technologies, although we will not get into the details of all of them. For instance, deep CoreOS/Docker knowledge is not required for the purpose of executing this test.

It can be executed on your PC, and what you need is:

What we will do:

Configure CoreOS cluster machines

First we have to configure the CoreOS cluster:

1. Clone the vagrant repository:

$ git clone https://github.com/coreos/coreos-vagrant
$ cd coreos-vagrant

2. Use the user-data example file:

$ cp user-data.sample user-data

3. Configure the cluster parameters:

$ cp config.rb.sample config.rb

4. Open the file, uncomment num_instances and change it to 3, or execute:

sed -i.bk 's/$num_instances=1/$num_instances=3/' config.rb

5. Start the machines using vagrant up :

$ vagrant up
Bringing machine 'core-01' up with 'virtualbox' provider...
Bringing machine 'core-02' up with 'virtualbox' provider...
Bringing machine 'core-03' up with 'virtualbox' provider...

6. Add the ssh key:

ssh-add ~/.vagrant.d/insecure_private_key

7. Use vagrant ssh core-XX -- -A to log in, e.g.:

$ vagrant ssh core-01 -- -A
$ vagrant ssh core-02 -- -A
$ vagrant ssh core-03 -- -A

8. Test your CoreOS cluster; log in to the machine core-01:

$ vagrant ssh core-01 -- -A

Then

core@core-01 ~ $ fleetctl list-machines
MACHINE     IP            METADATA
5f676932... 172.17.8.103  -
995875fc... 172.17.8.102  -
e4ae7225... 172.17.8.101  -

9. Test the etcd service:

core@core-01 ~ $ etcdctl set /my-message "I love Italy"
I love Italy

10. Log in to core-02:

$ vagrant ssh core-02 -- -A
core@core-02 ~ $ etcdctl get /my-message
I love Italy

11. Log in to core-03:

$ vagrant ssh core-03 -- -A
core@core-03 ~ $ etcdctl get /my-message
I love Italy

As a result you should have:

12. Test Docker installation using docker -v :

core@core-01 ~ $ docker -v
Docker version 1.12.3, build 34a2ead

13. (Optional step) Run the first image with docker run :

core@core-01 ~ $ docker run ubuntu /bin/echo 'Hello world'
...
Hello world

The CoreOS cluster is ready, and we are able to run Docker inside CoreOS. Let's test our first RabbitMQ Docker instance:

14. Execute the official RabbitMQ docker image:

core@core-01 ~ $ docker run -d --hostname my-rabbit --name first_rabbit -p 15672:15672 rabbitmq:3-management

15. Check your eth1 Vagrant IP (used to access the machine):

core@core-01 ~ $ ifconfig | grep -A1 eth1
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 172.17.8.101  netmask 255.255.255.0  broadcast 172.17.8.255

Go to http://<your_ip>:15672/#/, in this case http://172.17.8.101:15672/#/.

You should see the RabbitMQ management UI (login: guest / guest):

In order to scale up from the node above, we would run another container with the --link parameter and execute rabbitmqctl join_cluster rabbit@<docker_host_name>. In order to scale down, we would stop the second container and execute rabbitmqctl forget_cluster_node rabbit@<docker_host_name>.
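As a rough sketch of this manual approach (assuming the first container named first_rabbit with hostname my-rabbit from the previous step; the second container's name and hostname are made up for illustration), scaling up by hand would look like:

```shell
# Start a second container linked to the first one; both containers
# must share the same Erlang cookie to be able to cluster.
docker run -d --hostname my-rabbit-2 --name second_rabbit \
    --link first_rabbit:my-rabbit \
    -e RABBITMQ_ERLANG_COOKIE='ilovebeam' rabbitmq:3-management

# Join the second node to the first node's cluster.
docker exec second_rabbit rabbitmqctl stop_app
docker exec second_rabbit rabbitmqctl join_cluster rabbit@my-rabbit
docker exec second_rabbit rabbitmqctl start_app
```

This requires a running Docker daemon and is exactly the kind of repetitive, per-node work we want to avoid.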

Doing this by hand for every node does not scale; this is one of the areas where further automation is helpful.

We need Docker orchestration to configure and manage the Docker cluster. Among the available orchestration tools, we have chosen Docker Swarm.

Before going ahead, we should remove all the running containers:

core@core-01 ~ $ docker rm -f $(docker ps -a -q)

And the images:

core@core-01 ~ $ docker rmi -f $(docker images -q)

Configure Docker swarm

Docker Swarm is the native clustering mechanism for Docker. We need to initialize one node and then join the other nodes to it:

1. Swarm initialization: on core-01 execute docker swarm init --advertise-addr 172.17.8.101.

docker swarm init automatically generates the command (with the token) to join other nodes to the cluster, as:

core@core-01 ~ $ docker swarm init --advertise-addr 172.17.8.101
Swarm initialized: current node (2fyocfwfwy9o3akuf6a7mg19o) is now a manager.
To add a worker to this swarm, run the following command:

    docker swarm join \
    --token SWMTKN-1-3xq8o0yc7h74agna72u2dhqv8blaw40zs1oow9io24u229y22z-4bysfgwdijzutfl6ydguqdu1s \
    172.17.8.101:2377

A Docker Swarm cluster is composed of a leader node and worker nodes.

2. Join core-02 to the cluster with docker swarm join --token <token> <ip>:<port> (you can copy and paste the command generated in step 1):

In this case:

core@core-02 ~ $ docker swarm join \
    --token SWMTKN-1-3xq8o0yc7h74agna72u2dhqv8blaw40zs1oow9io24u229y22z-4bysfgwdijzutfl6ydguqdu1s \
    172.17.8.101:2377
This node joined a swarm as a worker.

3. Join core-03 to the cluster with docker swarm join --token <token> <ip>:<port>:

core@core-03 ~ $ docker swarm join \
    --token SWMTKN-1-3xq8o0yc7h74agna72u2dhqv8blaw40zs1oow9io24u229y22z-4bysfgwdijzutfl6ydguqdu1s \
    172.17.8.101:2377
This node joined a swarm as a worker.

4. Check the swarm cluster using docker node ls :

core@core-01 ~ $ docker node ls
ID                          HOSTNAME  STATUS  AVAILABILITY  MANAGER STATUS
07m3d8ipj2kgdiv9jptv9k18a   core-02   Ready   Active
2fyocfwfwy9o3akuf6a7mg19o * core-01   Ready   Active        Leader
8cicxxpn5f86u3roembijanig   core-03   Ready   Active

Configure RabbitMQ docker cluster

There are different ways to create a RabbitMQ cluster:

- Manually with rabbitmqctl
- Declaratively by listing cluster nodes in a config file
- Declaratively with rabbitmq-autocluster (a plugin)
- Declaratively with rabbitmq-clusterer (a plugin)
To create the cluster we use the rabbitmq-autocluster plugin, since it supports different service discovery backends such as Consul, etcd2, DNS, AWS EC2 tags, and AWS Autoscaling Groups.

We decided to use etcd2, which is why we tested it while configuring the CoreOS cluster machines (steps 9-11 above).

We are ready for the final round: creating the RabbitMQ cluster.

1. Create a Docker network:

core@core-01~$ docker network create --driver overlay rabbitmq-network

The swarm makes the overlay network available only to the nodes in the swarm that require it for a service.

2. Create a Docker service:

core@core-01 ~ $ docker service create --name rabbitmq-docker-service \
    -p 15672:15672 -p 5672:5672 --network rabbitmq-network \
    -e AUTOCLUSTER_TYPE=etcd -e ETCD_HOST=${COREOS_PRIVATE_IPV4} -e ETCD_TTL=30 \
    -e RABBITMQ_ERLANG_COOKIE='ilovebeam' \
    -e AUTOCLUSTER_CLEANUP=true -e CLEANUP_WARN_ONLY=false \
    gsantomaggio/rabbitmq-autocluster

Note: the first time you will have to wait a few seconds while the image is downloaded.

3. Check the service list using docker service ls.

4. Check the RabbitMQ instance running at http://<your_vagrant_ip>:15672/#/, most likely http://172.17.8.101:15672/#/.

5. Scale your cluster using docker service scale:

core@core-01 ~ $ docker service scale rabbitmq-docker-service=5
rabbitmq-docker-service scaled to 5

Congratulations!! You just scaled your cluster to 5 nodes!

Since the 3 CoreOS machines are clustered, you can use any of the 3 machines to access the cluster:

6. Check the cluster status on the machine:

core@core-01 ~ $ docker ps
CONTAINER ID  IMAGE                                     COMMAND                 CREATED        STATUS                 PORTS                                                NAMES
b480a09ea6e2  gsantomaggio/rabbitmq-autocluster:latest  "docker-entrypoint.sh"  1 second ago   Up Less than a second  4369/tcp, 5671-5672/tcp, 15671-15672/tcp, 25672/tcp  rabbitmq-docker-service.3.1vp3o2w1eelzbpjngxncb9wur
aabb62882b1b  gsantomaggio/rabbitmq-autocluster:latest  "docker-entrypoint.sh"  6 seconds ago  Up 5 seconds           4369/tcp, 5671-5672/tcp, 15671-15672/tcp, 25672/tcp  rabbitmq-docker-service.1.f2larueov9lk33rwzael6oore

The other nodes look similar: each one runs more or less the same number of containers.
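To confirm that the containers actually formed a single RabbitMQ cluster, you can run rabbitmqctl inside any of them. As a sketch, using one of the container names printed by docker ps above (your container names will differ):

```shell
# Ask any RabbitMQ container for the cluster membership;
# the running_nodes list should grow as the service is scaled.
docker exec rabbitmq-docker-service.3.1vp3o2w1eelzbpjngxncb9wur \
    rabbitmqctl cluster_status
```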

Let's now look at the docker service parameters in detail:

docker service create
    Create a Docker service.

--name rabbitmq-docker-service
    Set the service name; you can check the service list using docker service ls.

-p 15672:15672 -p 5672:5672
    Map the standard RabbitMQ ports: 5672 is the AMQP port and 15672 is the management UI port.

--network rabbitmq-network
    Choose the Docker network.

-e RABBITMQ_ERLANG_COOKIE='ilovebeam'
    Set the same erlang.cookie value for all the containers; RabbitMQ needs it to create a cluster. With different erlang.cookie values it is not possible to create a cluster.

Next are the autocluster parameters:

-e AUTOCLUSTER_TYPE=etcd
    Set the service discovery backend to etcd.

-e ETCD_HOST=${COREOS_PRIVATE_IPV4}
    The containers need to know the etcd2 IP. After executing the service you can query the database from the command line, e.g. etcdctl ls /rabbitmq --recursive, or through the HTTP API, e.g. curl -L http://127.0.0.1:2379/v2/keys/rabbitmq.

-e ETCD_TTL=30
    Specify how long a node can be down before it is removed from etcd's list of RabbitMQ nodes in the cluster.

-e AUTOCLUSTER_CLEANUP=true
    Enable a periodic check that removes any nodes that are not alive in the cluster and no longer listed in the service discovery list. Scaling down removes one or more containers, and the corresponding nodes are removed from the etcd database; see, for example: docker service scale rabbitmq-docker-service=4.

-e CLEANUP_WARN_ONLY=false
    If CLEANUP_WARN_ONLY is set, the plugin will only warn about the nodes it would clean up; AUTOCLUSTER_CLEANUP requires CLEANUP_WARN_ONLY=false to actually remove them.

gsantomaggio/rabbitmq-autocluster
    The official Docker image does not support the autocluster plugin (in my personal opinion it should), so I created a Docker image and registered it on Docker Hub.

With AUTOCLUSTER_CLEANUP set to true the node is removed automatically; if AUTOCLUSTER_CLEANUP is false you need to remove the node manually.
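As a sketch of that manual removal (the container name and node name here are placeholders; the node name depends on the stopped container's hostname), from any surviving node:

```shell
# Run from any container that is still part of the cluster,
# telling it to forget the node whose container was removed.
docker exec <some_running_container> \
    rabbitmqctl forget_cluster_node rabbit@<stopped_container_hostname>
```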

Scaling down with AUTOCLUSTER_CLEANUP can be very dangerous: without HA policies, all the queues and messages stored on the removed node will be lost. To enable an HA policy you can use the command line or the HTTP API; in this case the easiest way is the HTTP API:

curl -u guest:guest -H "Content-Type: application/json" -X PUT \
    -d '{"pattern":"","definition":{"ha-mode":"exactly","ha-params":3,"ha-sync-mode":"automatic"}}' \
    http://172.17.8.101:15672/api/policies/%2f/ha-3-nodes

Note: enabling mirrored queues across all the nodes can impact performance, especially when the number of nodes is not fixed. Using "ha-mode":"exactly","ha-params":3 we mirror each queue to exactly 3 nodes. Scaling down should therefore be done one node at a time, so that RabbitMQ can move the mirrors to other nodes.
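You can verify that the policy was applied by reading it back through the same management HTTP API endpoint with a GET instead of a PUT:

```shell
# List the policies on the default vhost (%2f is the url-encoded "/");
# the ha-3-nodes policy created above should appear in the output.
curl -s -u guest:guest http://172.17.8.101:15672/api/policies/%2f
```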

Conclusions

RabbitMQ scales easily inside Docker: each RabbitMQ node has its own files and does not need to share anything through the file system. It fits perfectly with containers.

This architecture implements important features such as:

Round-Robin connections

Failover cluster machines/images

Portability

Scaling in terms of CoreOS nodes and RabbitMQ nodes

Scaling RabbitMQ on Docker and CoreOS is easy and powerful. We are testing and implementing the same environment with different orchestration and service discovery tools, such as Kubernetes and Consul. That said, we still consider this architecture experimental.

Here you can see the final result:

Enjoy!

Erlang Solutions is the world leader in RabbitMQ consultancy, development, and support.

We can help you design, set up, operate and optimise a system with RabbitMQ. Got a system with more than the typical requirements? We also offer RabbitMQ customisation and bespoke support.

Learn more about our work with RabbitMQ >
