We recently migrated our production applications from Amazon Elastic Container Service (ECS) to Amazon Elastic Kubernetes Service (EKS). We wrote previously about log shipping to CloudWatch. This post is about utilising our service mesh, provided by Linkerd, for service discovery and inter-cluster communication during the migration.

Amazon are a cloud compute company, but many of their original products and features are geared towards making it easier to buy more EC2 instances; instances are the core inventory of their platform. With this in mind, it doesn’t really matter which container scheduler you use: the common denominator is EC2 instances. Wrap the instances up in a few autoscaling and security groups and you can mesh them into a unified set of services (in our case using Linkerd). Below are the options we considered and tested; each one ends with a verdict on what we did and didn’t like.

When we set out on the migration from ECS to EKS we had a handful of requirements:

- No big-bang rewrites
- Change one thing at a time
- Any changes we make should exploit native Kubernetes features, not home-grown hacks

This meant that we didn’t want to replace our service discovery system (Consul) and service mesh (Linkerd) as part of this process.

Linkerd provides the service mesh and, using dtabs, defines the rules for how services can communicate with each other. It also implements retries and timeouts for inter-service communication. Crucially, it enabled the migration itself, allowing us to easily mesh the two clusters (ECS and EKS): services on either cluster only needed to know about Linkerd and nothing else.

Existing ECS setup

We were running Linkerd on our ECS nodes in linker-to-linker mode. Having Linkerd on both ends of the communication between instances gives us transparent TLS and a bunch of other benefits (timeouts, retries, circuit breaking, etc.), which means we don’t have to implement that logic in every new service we create. The entire Linkerd setup is out of scope for this article, but we’ve written about it before.

The diagram on the left shows an example: the login service talking to the user service has a request flow that goes through Linkerd when leaving and entering each node. This way you get TLS-encrypted communication between nodes for free. Linkerd in this linker-to-linker mode provides load balancing and service discovery by integrating with a service discovery backend; for us, that backend is Consul.
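From the application's point of view, this flow is very simple: it only ever talks to the Linkerd on its own node and names the logical service it wants. Here is a minimal sketch of that, assuming Linkerd's conventional "outgoing" port of 4140 and an illustrative service name; the helper and its signature are our own, not part of Linkerd.

```python
LINKERD_HOST = "localhost"  # the Linkerd running on this node
LINKERD_PORT = 4140         # conventional Linkerd "outgoing" port (assumption)

def call_via_linkerd(service_name: str, path: str):
    """Build the connection target and headers for a request routed by
    the local Linkerd rather than addressed directly to an instance."""
    # The application only knows the local Linkerd and the logical
    # service name; Linkerd resolves the name (via Consul), picks a
    # concrete instance, and handles TLS between nodes.
    headers = {"Host": service_name}
    return (LINKERD_HOST, LINKERD_PORT, path, headers)

# Example: the login service calling the user service.
host, port, path, headers = call_via_linkerd("user", "/profiles/42")
```

The point is that service discovery, load balancing, and encryption all live in the proxy pair, not in application code.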

To migrate from ECS to EKS while keeping Linkerd in its current service mesh format, we needed Consul (the service discovery backend) to be updated with the addresses of services regardless of which container scheduler was running them.

Updating Consul — Options:

Option 1: replicate our ECS setup in EKS

Our ECS setup exploited the user data in our EC2 instances’ launch configuration to run three Docker containers: Linkerd, a Consul agent, and Registrator. This setup was already capable of registering (and de-registering) services with Consul as they started and stopped, using the Registrator (gliderlabs/registrator) container on every node. Registrator listens on the Docker socket for container start and stop events, then updates the local Consul agent (which is responsible for updating the Consul cluster). The setup, for any given node, looks something like this:

containers running on a node for service discovery

Here we see Linkerd and Registrator using the local Consul agent for service discovery.
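The Registrator pattern boils down to translating a container-start event into a registration against the local Consul agent's HTTP API (`PUT /v1/agent/service/register`, agent on its default port 8500). A sketch of that translation, with an illustrative event shape of our own invention:

```python
import json

CONSUL_AGENT = "http://localhost:8500"  # local Consul agent, default port

def registration_for(event: dict) -> dict:
    """Translate a Docker container-start event into a Consul agent
    service registration payload (PUT /v1/agent/service/register)."""
    return {
        "ID": event["container_id"][:12],  # short container ID as service ID
        "Name": event["service_name"],
        "Address": event["host_ip"],       # node IP, not the container IP
        "Port": event["host_port"],        # host port the container maps to
    }

# A container-start event roughly as Registrator might see it (illustrative).
event = {
    "container_id": "f2d4a9c1e8b7a6c5",
    "service_name": "login",
    "host_ip": "10.0.1.23",
    "host_port": 32768,
}
payload = json.dumps(registration_for(event))
```

De-registration is the mirror image: a container-stop event becomes a `PUT /v1/agent/service/deregister/<id>` using the same ID.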

Option 1 consisted of attempting to replicate this registrator → consul-agent setup on the EKS nodes, just as it ran on the ECS nodes. The problem is that pods run by Kubernetes are not as simple as tasks in ECS. Each pod in Kubernetes has a “pause container” that creates and maintains the shared namespace that all of the rest of the containers in the pod will use. It is the first to start, and all other containers are spawned from it. This allows the application containers to restart without losing the namespace.

Because the pause container starts first, Registrator registers the IP address of the pause container with the consul-agent, not the IP address of the application container that starts afterwards. We could run consul-agent and Linkerd as DaemonSets in Kubernetes, mirroring the once-per-node setup we have in ECS, but we would need to replace gliderlabs/registrator.
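The fix that registrator forks typically apply is to skip pause containers by name: under the Docker runtime, Kubernetes names each pod's pause container with a `k8s_POD_` prefix. A sketch of that filter, with container names that are illustrative only:

```python
def is_pause_container(container_name: str) -> bool:
    """Under the Docker runtime, Kubernetes names each pod's pause
    container with a 'k8s_POD_' prefix, so it can be skipped."""
    return container_name.startswith("k8s_POD_")

# Illustrative container names on a Kubernetes node.
names = [
    "k8s_POD_login-7d9f_default",    # pause container: skip
    "k8s_login_login-7d9f_default",  # application container: register
]
to_register = [n for n in names if not is_pause_container(n)]
```

Even with this filter, the registered IP is the pod IP rather than the node IP, which matters for the linker-to-linker flow discussed below.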

The goal is to eventually have a native Kubernetes setup, so this was an opportunity to take a step in that direction.

Verdict: we could have made one of the open source forks of Registrator work, but we wanted to take a step towards Kubernetes-native service discovery, and this wouldn’t have been moving in the right direction. Rejected due to its complexity.

Option 2: Kubernetes Service with consul-k8s

A Service in Kubernetes is how a set of pods is exposed to consumers, and it translates almost directly into a Consul service: each pod backing a Kubernetes Service should be registered as an address of its corresponding Consul service. Kubernetes Services also have the benefit of respecting the running state of a container. A pod, and the containers that make it up, have probes (essentially health checks) that determine whether a container is ready to receive traffic. The naive Registrator approach of registering containers as soon as they have started doesn’t respect whether a container is ready to accept traffic; Kubernetes Services do, so we investigated how to sync Kubernetes Services into Consul.
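This is why syncing from a Service's Endpoints is attractive: Kubernetes only lists an address under a Service's ready endpoints once its pod's readiness probe passes. A minimal sketch of the mapping, using simplified endpoint dicts rather than real API objects:

```python
def consul_services(service_name: str, ready_addresses: list) -> list:
    """Map the ready addresses of a Kubernetes Service's Endpoints to
    Consul registrations. Pods that fail their readiness probe never
    appear in the ready address list, so they are never registered."""
    return [
        {
            "Name": service_name,
            "ID": f"{service_name}-{i}",  # synthetic per-address ID
            "Address": addr["ip"],
            "Port": addr["port"],
        }
        for i, addr in enumerate(ready_addresses)
    ]

# Simplified endpoint data (illustrative; real data comes from the API).
ready = [{"ip": "10.32.0.4", "port": 8080}, {"ip": "10.32.1.7", "port": 8080}]
regs = consul_services("user", ready)
```

Note the sketch registers pod IPs, which is exactly the behaviour that becomes a problem for us below.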

The week we started investigating, HashiCorp announced consul-k8s. This project reads annotations on Kubernetes Services and registers the backend pods of each service with Consul: pretty much exactly what we needed. But there were some problems. Firstly, the project was about a week old and a bit buggy (pretty scary to put into production). Secondly, the fact that we run Linkerd in a linker-to-linker setup became a problem: the first iterations of consul-k8s registered pod IPs in Consul, which do not match the node IPs.

The linker-to-linker setup that we use follows a flow similar to this:

This means that, for each of the pods that make up a service, service discovery needs to hold the host IP and port. We need this so that we can take the target address of a downstream service and change the port to get the address of the Linkerd instance running on the target service’s node.

We can get a static port for pods on each node by exposing them as a NodePort Service in Kubernetes. But at the time, consul-k8s was not capable of registering the host IP of a pod in service discovery, meaning that this port-transformation approach would produce an address that didn’t match that of the Linkerd container.
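The port transformation itself is trivial, which is what makes the missing host IP the sticking point. A sketch, assuming Linkerd's conventional "incoming" port of 4141:

```python
LINKERD_INCOMING_PORT = 4141  # Linkerd's per-node "incoming" port (assumption)

def linkerd_address(host_ip: str, service_port: int) -> tuple:
    """Rewrite a discovered service address so traffic targets the
    Linkerd on the same node: keep the host IP, swap in Linkerd's port.
    This only works if service discovery holds the *host* IP; with the
    pod IP that early consul-k8s registered, there is no Linkerd there."""
    return (host_ip, LINKERD_INCOMING_PORT)

# A discovered instance: host IP plus its NodePort (illustrative values).
addr = linkerd_address("10.0.1.23", 30080)
```

With a pod IP in Consul instead of the host IP, the rewritten address points at a pod network address where no Linkerd is listening, and the mesh breaks.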

Verdict: it would have been great to use a tool backed by a huge open source contributor like HashiCorp, but it wasn’t mature enough at the time and didn’t have the features we needed. Rejected because of missing features (and a few launch-week teething issues).

Option 3: polling pod

We opted to create a single pod containing an app that uses the Kubernetes API to poll for Services that should be registered (based on their annotations), and then registers those services with a consul-agent running as a sidecar.
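The selection logic for one polling pass can be sketched as follows. The annotation name is hypothetical (ours is an internal convention, not shown here), and the Service objects are simplified dicts; in the real pod they come from the Kubernetes API and each selected service is PUT to the consul-agent sidecar on localhost.

```python
ANNOTATION = "consul.register/enabled"  # hypothetical annotation name

def services_to_register(services: list) -> list:
    """Filter Kubernetes Service objects (simplified to dicts) down to
    those annotated as wanting Consul registration."""
    return [
        s for s in services
        if s.get("annotations", {}).get(ANNOTATION) == "true"
    ]

# One polling pass over illustrative Service data.
services = [
    {"name": "user", "annotations": {ANNOTATION: "true"}},
    {"name": "internal-job", "annotations": {}},
]
selected = services_to_register(services)
```

Running this in a loop against the API, and diffing the result against what the consul-agent sidecar already holds, gives register and de-register actions for each pass.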