Service discovery and load balancing

Services in most distributed systems come and go frequently: they start and stop, scale up and down, and occasionally fail. Static load balancer configuration works for servers with well-known IP addresses and host names, but with Marathon, load balancing requires much more sophistication: services have to be registered with and deregistered from the balancer on the fly. For that, we need some kind of service registry that holds information about registered services and provides it to clients. This concept is known as service discovery, and it is a key component of most distributed systems.

Here Consul comes to the rescue. As its website states, “Consul makes it simple for services to register themselves and to discover other services via a DNS or HTTP interface”. In addition, it has other useful features we’ll make use of later.

Now that we have a service registry, we need to let it know which services are started and where they are located (hostname and port), and optionally provide other useful meta information about them. One way of doing that is to make the services use the Consul API directly, but that would require every service to implement its own registration logic; while that is trivial for one or two services, it becomes a burden when you have many of them, especially when they are written in different programming languages. Another way is to have a third-party utility that monitors services and reports them to Consul. We use a program called marathon-registrator: it tightly integrates with Marathon and can register any kind of service Marathon runs. Another option is Gliderlabs registrator, if all of your services run in Docker containers. You just need to run an instance of such a utility on each Mesos slave host.

Once services are registered, other services should be able to locate them. Again, they could talk to the Consul API or DNS directly and get this information (client-side discovery), but there is an alternative: putting a load balancer such as HAProxy in front of them (server-side discovery).

Server-side service discovery with HAProxy has many benefits over client-side discovery:

Load balancing for free.

Immediate propagation of changes from the service registry to consumers. HAProxy is reconfigured and ready to route requests to new instances as soon as a change occurs.

Extremely flexible configuration. To name a few features: load balancing strategies, healthchecks, ACLs, A/B testing, logging, and statistics.

No need for services to implement additional discovery logic.

But how does HAProxy keep track of the services registered in Consul? Normally its configuration is static, with all backends known in advance, but it can also be built dynamically with an external tool such as consul-template. This tool watches Consul for changes and generates arbitrary text files from the provided Go templates, so it is not limited to configuring HAProxy and can be used with anything that is configured through text files (nginx, Varnish, Apache, etc.). Its templating language is documented comprehensively in the project’s README.
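In its simplest form, consul-template is pointed at a template, an output file, and a command to run after each render; a minimal invocation might look like this (paths and the reload command are illustrative):

```
# Watch the local Consul agent, re-render the HAProxy configuration on every
# change, and reload HAProxy afterwards (paths and reload command are examples).
consul-template \
  -template "/etc/haproxy/haproxy.ctmpl:/etc/haproxy/haproxy.cfg:service haproxy reload"
```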

As you might have noticed from the overview chart, we run two different HAProxy configurations: one for internal and one for external load balancing. Internal instances provide the actual service discovery and balance traffic across back-end services. External instances, in addition to service discovery, expose TCP port 80 and accept requests from the outside, balancing the load across front-end services.

For these two kinds of HAProxy instances we manage two separate consul-template templates, which are themselves built by another templating engine (Jinja2) during machine provisioning performed by SaltStack. This is done mainly to keep everything DRY and to populate some parts with data from the configuration management software. Let’s look at the external balancer configuration template. Note the raw/endraw markers, which make the Jinja engine disregard Go template curly braces and render the enclosed content as is.
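In outline, the external template looks something like this (a simplified sketch; the composition of the real template may differ in details):

```
{# Sketch: external HAProxy balancer, assembled from the shared Jinja parts. #}
{% include 'haproxy-defaults.ctmpl.jinja' %}
{% include 'haproxy-wellknown-services.ctmpl.jinja' %}
{% include 'haproxy-internal-frontend.ctmpl.jinja' %}
{% include 'haproxy-external-frontend.ctmpl.jinja' %}
{% include 'haproxy-backends.ctmpl.jinja' %}

{# Go template fragments are wrapped in raw/endraw so Jinja passes them through: #}
{% raw %}
# consul-template expressions such as {{timestamp}} survive Jinja rendering untouched.
{% endraw %}
```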

It includes several dependencies. haproxy-defaults.ctmpl.jinja is the regular static part found in many HAProxy configuration examples; haproxy-internal-frontend.ctmpl.jinja is more interesting: this is where the internal service discovery configuration is done.

The idea is to come up with a well-known port number for every discoverable service and create an HAProxy front-end that listens on this port. We’ll make use of the meta information stored along with every registered service. Consul allows a list of tags to be associated with a service, and marathon-registrator reads them from the service’s SERVICE_TAGS environment variable. See the service.json template of test-server: it contains two comma-separated tags, $environment and internal-listen-http-3000. The latter is used in a consul-template template to mark services that expose a port (3000 in our case) for service discovery. A snippet in the internal front-end template automatically generates the necessary HTTP front-ends.
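A simplified sketch of what it can look like, built from consul-template’s standard regexMatch and replaceAll helpers (section names and the environment-tag filtering are illustrative):

```
{% raw %}
# For every tag of the form internal-listen-http-<port>, emit a production
# front-end on <port> and a staging front-end on 1<port>, each pointing at a
# back-end filtered by the environment tag.
{{range services}}{{$name := .Name}}{{range .Tags}}
{{if . | regexMatch "^internal-listen-http-[0-9]+$"}}
{{$port := . | replaceAll "internal-listen-http-" ""}}
frontend {{$name}}-{{$port}}-production
    bind 0.0.0.0:{{$port}}
    mode http
    default_backend {{$name}}-{{$port}}-production

frontend {{$name}}-{{$port}}-staging
    bind 0.0.0.0:1{{$port}}
    mode http
    default_backend {{$name}}-{{$port}}-staging

backend {{$name}}-{{$port}}-production
    mode http
    balance roundrobin{{range service (printf "production.%s" $name)}}
    server {{.Node}}_{{.Port}} {{.Address}}:{{.Port}} check{{end}}

backend {{$name}}-{{$port}}-staging
    mode http
    balance roundrobin{{range service (printf "staging.%s" $name)}}
    server {{.Node}}_{{.Port}} {{.Address}}:{{.Port}} check{{end}}
{{end}}{{end}}{{end}}
{% endraw %}
```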

Its outer loop lists all Consul services, while the inner loop goes over each service’s tags and tries to match internal-listen-http-<port>. For every match, an HTTP front-end section is created. Every service here has two hardcoded environments, production and staging; to differentiate them, the staging port number is prefixed with “1”, so the production front-end listens on 3000 and the staging one on 13000.

Additional if statements allow specifying multiple discoverable ports on a single service. For that, just place additional internal-listen-http-<port> markers in the tags list, like

$environment,internal-listen-http-3000,internal-listen-http-3010

You’ll also need to add the newly exposed port to the container.docker.portMappings array of the service definition file so that Marathon properly configures your container’s network. Note that in this case marathon-registrator will register two separate services, test-server-3000 and test-server-3010, so that they can be resolved independently and name ambiguity is avoided.
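For illustration, the relevant parts of such a service definition might look like this (the image name is a placeholder):

```
{
  "id": "/test-server",
  "container": {
    "type": "DOCKER",
    "docker": {
      "image": "example/test-server:latest",
      "network": "BRIDGE",
      "portMappings": [
        { "containerPort": 3000, "hostPort": 0, "protocol": "tcp" },
        { "containerPort": 3010, "hostPort": 0, "protocol": "tcp" }
      ]
    }
  },
  "env": {
    "SERVICE_TAGS": "$environment,internal-listen-http-3000,internal-listen-http-3010"
  }
}
```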

You may come up with other predefined markers to implement other kinds of logic in the templates: for example, introduce internal-listen-tcp-<port> to generate TCP front-ends, or control the balancing strategy with something like balance-roundrobin or balance-leastconn.

This template configures HAProxy in such a way that every machine has access to every service known to Consul by connecting to localhost:<well-known-port>, thus solving the service discovery problem.

In haproxy-wellknown-services.ctmpl.jinja we specify more or less statically managed services like Marathon, Consul, and Chronos so that they are easy to discover. These are started by systemd/upstart/etc. during machine provisioning. For example, the following snippet provides very convenient access to the Marathon instances by simply contacting localhost:18080 from any machine in the cluster, and to Chronos and Consul via localhost:14400 and localhost:18500 respectively (the master_nodes collection comes from the configuration management software in this case).
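A simplified sketch, assuming master_nodes is a list of objects with name and ip fields (the exact structure depends on your pillar data):

```
{# Marathon masters reachable as localhost:18080 from any machine in the cluster. #}
listen marathon
    bind 127.0.0.1:18080
    mode http
    balance roundrobin
    {% for node in master_nodes %}
    server {{ node.name }} {{ node.ip }}:8080 check
    {% endfor %}

{# Chronos (localhost:14400) and Consul (localhost:18500) follow the same pattern. #}
listen chronos
    bind 127.0.0.1:14400
    mode http
    {% for node in master_nodes %}
    server {{ node.name }} {{ node.ip }}:4400 check
    {% endfor %}

listen consul
    bind 127.0.0.1:18500
    mode http
    {% for node in master_nodes %}
    server {{ node.name }} {{ node.ip }}:8500 check
    {% endfor %}
```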

haproxy-external-frontend.ctmpl.jinja describes the HTTP and HTTPS front-ends. It contains several Jinja macros that define ACL rules for domain name matching and bind back-ends to those rules.
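A simplified sketch of such macros (macro names, domains, and certificate paths are illustrative):

```
{% macro acl_host(name, domain) -%}
    acl host_{{ name }} hdr(host) -i {{ domain }}
{%- endmacro %}

{% macro route(name) -%}
    use_backend {{ name }} if host_{{ name }}
{%- endmacro %}

frontend http-in
    bind *:80
    mode http
    {{ acl_host('test-server', 'test.example.com') }}
    {{ route('test-server') }}

frontend https-in
    bind *:443 ssl crt /etc/haproxy/certs/site.pem
    mode http
    {{ acl_host('test-server', 'test.example.com') }}
    {{ route('test-server') }}
```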

And finally, there is the haproxy-backends.ctmpl.jinja file. It lists the available service instances referred to by the previous sections. All backends here are crafted manually, since they might have very specific requirements in terms of health checking or load balancing configuration.
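For example, a hand-crafted backend for test-server might look roughly like this (the health check endpoint and the environment tag filter are illustrative):

```
{% raw %}
backend test-server
    mode http
    balance leastconn
    option httpchk GET /health
    {{range service "production.test-server-3000"}}
    server {{.Node}}_{{.Port}} {{.Address}}:{{.Port}} check
    {{end}}
{% endraw %}
```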

The internal balancer configuration file is a bit simpler: it only needs to route connections to internally accessible services.
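Assuming it is assembled from the same shared pieces, it might be as small as:

```
{# Sketch: internal HAProxy balancer, reusing the shared Jinja parts. #}
{% include 'haproxy-defaults.ctmpl.jinja' %}
{% include 'haproxy-wellknown-services.ctmpl.jinja' %}
{% include 'haproxy-internal-frontend.ctmpl.jinja' %}
```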