Lida Li, Rodrigo Menezes & Yongwen Xu | Pinterest Engineers, Cloud Management Platform & SRE

Over the last year the cloud management platform and site reliability engineering teams at Pinterest have been focused on moving our workload from EC2 instances to Docker containers. To date, we’ve migrated more than half our stateless services, including 100 percent of our API fleet. This blog post shares our learnings and adventures throughout the process.

In recent years, Docker and container orchestration technologies like Mesos and Kubernetes have become hot topics in our industry, and they have demonstrated clear advantages to building a container-based platform. For us, implementing a container platform could yield the following benefits:

Improve developer velocity by eliminating the need to learn tools like Puppet.

Provide an immutable infrastructure for better reliability.

Increase our agility to upgrade our underlying infrastructure.

Improve the efficiency of our infrastructure.

We started our investigation, evaluation and testing in early 2016. Initially, we planned not only to migrate our workload to Docker containers but also to run them in a multi-tenant way by adopting container orchestration technology at the same time. As the process evolved, we decided to take a phased approach. We first moved our services to Docker to free up engineering time spent on Puppet and to have an immutable infrastructure. We’re now in the process of adopting open source container orchestration technology to make better use of our resources.

Before the migration

Before the containerization work, services at Pinterest were authored and launched as follows:

There was one base Amazon Machine Image (AMI) with an operating system, common shared packages and installed tools. For some services, mostly the large and complex ones, there was also a service-specific AMI built on the base AMI containing the service dependency packages.

Above that, we had two deployment tools: Puppet and Teletraan. Puppet was used to provision cron jobs, infra components like service discovery, metrics and logging and their config files. Teletraan deployed our production service code and some machine learning models.

These services were run by various process managers including Monit, Supervisor and Upstart.

However, this process had several pain points:

Engineers needed to be involved in the AMI build and also had to learn several configuration languages: Puppet’s as well as those of the various process managers.

Puppet changes could be applied to hosts at any time without fine-grained control.

As time went by, environments on the hosts diverged and caused operational issues.

The migration

Migrating infrastructure is challenging for companies like Pinterest because of the scale and complexity. The variety of programming languages, the dependencies across the technology stack, strict requirements for performance and availability, and tech debt all need to be factored in.

Our containerization started with a few small, non-critical applications owned by the CMP and SRE teams. Meanwhile, we started to put critical services into containers in their development and testing environments. Building on this, we started a broader containerization effort to drive adoption across applications.

One of the first applications we migrated was our main API fleet. We thought that migrating one of our largest and most complex systems early on would help us catch and solve quite a few issues right from the start. This process helped us create and solidify a more complete set of tools required by our wider infrastructure. It also lowered the risk of having to spend time rewriting tools to include other features in the future.

A big concern for us early on was around performance. While the Docker network is great for development, it’s widely known to negatively impact performance when running in production. This drove our decision to use the host’s network and not rely on Docker networking. We ran several tests to ensure there weren’t issues that would prevent our API fleet from running in a container.

For the API migration, we ensured we had a comprehensive set of metrics to compare running in a container with running outside of one. This process helped us find gaps in our monitoring that made what was running in Docker indistinguishable from what was running on the host itself. It also helped us uncover issues caused by the conversion process in the early stages of testing and eventually gave us the confidence we needed to push forward into production. The migration was completed in less than a month while we monitored for regressions or issues. Thankfully this was a fairly uneventful process and caused no disruptions.

After the migration

After building the primitives using our API fleet as a basis, we’ve continued our containerization effort. We’ve been able to leverage the tools created to support one of our most complex applications to support the migration of other applications.

Our Docker platform now looks like this:

There’s only one AMI for all container services. All service-specific dependencies are put into the service container. In addition to the service container, all supporting components (service discovery, metrics, logging, service proxy, etc.) run in Docker containers working with the service containers.

Engineers still use the Teletraan interface to deploy their containers, but it now works with Telefig, a tool we built to launch and stop containers as well as manage the dependencies among containers.

The Docker container engine now plays the role of process manager, monitoring and restarting containers. The engine’s restart policy is set to unless-stopped and the --live-restore switch is turned on.
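To make the launch model above a little more concrete, here is a minimal sketch in Python. It is not Telefig's implementation, which this post doesn't show. It only illustrates the three ideas mentioned: starting containers in dependency order, passing the unless-stopped restart policy to docker run, and enabling live restore in the daemon configuration. The function names, the container names and the abc1234 tag are all hypothetical.

```python
import json
from graphlib import TopologicalSorter  # Python 3.9+

# Daemon-level setting that keeps containers running while the Docker
# daemon restarts. This is the daemon.json form of the --live-restore flag.
DAEMON_CONFIG = {"live-restore": True}


def start_order(depends_on):
    """Return container names so that each starts after its dependencies.

    depends_on maps a container name to the set of names it needs first.
    """
    return list(TopologicalSorter(depends_on).static_order())


def run_command(name, image):
    """Build a `docker run` argv that uses the unless-stopped restart policy."""
    return [
        "docker", "run", "--detach",
        "--name", name,
        "--restart", "unless-stopped",
        image,
    ]


if __name__ == "__main__":
    # Hypothetical service with two supporting containers.
    deps = {
        "api": {"service-discovery", "metrics"},
        "metrics": {"service-discovery"},
        "service-discovery": set(),
    }
    print(json.dumps(DAEMON_CONFIG, indent=2))
    for name in start_order(deps):
        print(" ".join(run_command(name, f"{name}:abc1234")))
```

With unless-stopped, Docker restarts a container when it exits or when the daemon comes back, unless someone stopped the container explicitly, which is roughly the job Monit, Supervisor or Upstart did before.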

Here’s a more detailed look at how we’re running these Docker containers:

All our containers now run with --net=host, which gives us native network performance. Our AWS dependencies like IAM roles and security groups also work under this model without any code changes.

Amazon ECR is our primary Docker registry. We host a secondary Docker registry replicated from ECR which provides high availability for production.

Engineers describe the service in a YAML file with a syntax very similar to that of a Docker Compose file. It contains a set of Pinterest-specific primitives so that developers don’t need to write long configurations.

Each container image is tagged in the format [Name]:[git commit hash], which identifies a unique build.

We’re running on Ubuntu with our own customized build of the latest Linux kernel.

We started from Docker 1.11 and are now running our fleet on Docker 17.03.1-ce.
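A few of the details above can be sketched in Python: building an image reference in the [Name]:[git commit hash] format, trying a primary registry before a secondary one, and running on the host network. This is an illustration under assumptions rather than our actual tooling. The post only says the secondary registry provides high availability, so the try-in-order fallback is one plausible reading, and the registry hostnames and function names are made up.

```python
import subprocess


def image_ref(registry, name, commit):
    """Return registry/name:commit, following the [Name]:[git commit hash] format."""
    return f"{registry}/{name}:{commit}"


def docker_pull(ref):
    """Pull an image with the Docker CLI; raises CalledProcessError on failure."""
    subprocess.run(["docker", "pull", ref], check=True)


def pull_with_fallback(registries, name, commit, pull=docker_pull):
    """Try each registry in order and return the first reference that pulls.

    `pull` is any callable that takes an image reference and raises on failure.
    """
    errors = []
    for registry in registries:
        ref = image_ref(registry, name, commit)
        try:
            pull(ref)
            return ref
        except Exception as exc:  # remember the failure, try the next registry
            errors.append(f"{ref}: {exc}")
    raise RuntimeError("all registries failed: " + "; ".join(errors))


def run_on_host_network(ref):
    """Build a `docker run` argv that shares the host's network stack."""
    return ["docker", "run", "--detach", "--net=host", ref]
```

A caller could pass a list such as ["primary.example.com", "secondary.example.com"] (hypothetical hostnames) and hand the returned reference to run_on_host_network. Because the tag is a commit hash, the same reference resolves to the same build in either registry.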

What’s next

We’ve completed our first-phase goal of containerizing our services. For the next phase, we’re going to adopt container orchestration and build a multi-tenant cluster to provide one unified interface for long-running services and batch jobs.