Why Kubernetes?

Kubernetes and the surrounding ecosystem have experienced a massive surge in popularity over the last few years — but this alone is not a sufficient reason for a company to embark on a major infrastructure overhaul. Our foray into Kubernetes was dictated by the problems we were seeking to solve rather than any one technology solution.

Historically, we have used a combination of Terraform and SaltStack to manage our AWS infrastructure. While this combination of technologies has carried us quite far (from our early days to over six million Robinhood accounts and dozens of microservices), we ran into some technical challenges along the way. Most notably, deployments could be non-deterministic depending on how the Salt states were written, and applying the Salt states across the hosts for our larger microservices could be time-intensive. It also gradually became clear that the interface we had for provisioning microservice infrastructure could be improved to better serve and streamline workflows for application owners. In particular, we wanted to create a user-centered interface that best serves application developers and the abstractions they’re familiar with.

Switching to Kubernetes seemed like a no-brainer. Moving toward containerization and container orchestration not only aligned with a company focus on building for the long-term, but also enabled us to solve the technical challenges we were facing around deployment. Furthermore, Kubernetes supports promising application-oriented abstractions such as Deployments, and provides a solid structure for extending these abstractions through CustomResourceDefinitions and custom controllers. Additionally, its API-first approach makes it much easier to interact with dynamically, compared to Salt.

While all these factors created a sense of optimism around Kubernetes, we still needed to vet it in a disciplined way to see if it would be a magical, out-of-the-box solution to the challenges we identified (Spoiler Alert: It wasn’t).

Where did we start?

Conducting high-reliability infrastructure migrations is no easy feat, so our first objective was defining a restrained plan for our Kubernetes investigation. This involved descoping GKE (we didn’t want our foray into Kubernetes to require going multicloud), assessing EKS and kops, conducting internal experimentation and proofs of concept, and more (this initial work could be another whole post on its own).

Ultimately, we decided to gradually migrate a single application’s microservices from Salt and EC2 to Kubernetes. This effort ended up being a multi-month process (which could be yet another blog post). Once the migration was complete, we had to evaluate whether we had actually moved the needle on the problems we were seeking to solve. We saw roughly 2x improvements in our deployment speeds, and our servers could automatically scale out much more quickly. We also gained confidence in the consistency and immutability of our deployments, with the application container image as our source of truth.

On the other hand, we had unwittingly replaced thousands of lines of Salt config YAML with thousands of lines of Kubernetes manifest YAML. The complexity from Salt remained, though in a slightly different form. Salt states for setting up common tooling — Consul agents, Vault integrations, nginx configs, Prometheus exporters, and more — had morphed into cryptic annotations, init containers, and sidecars. Raw Kubernetes manifests on their own, while functional, failed to sufficiently simplify the interface for provisioning microservice infrastructure.

How do we manage complexity?

After running our first application natively on Kubernetes, we were excited by the improvements we saw, but also surprised to find nearly the same amount of YAML configuration as in our previous stack. Upon further investigation, we realized that much of this complexity was being housed in the manifests; the Kubernetes abstractions, while generally applicable, lacked specific context on how to run typical applications at Robinhood. Taking a step back, we mapped our current microservice stack onto declarative models and defined them clearly with three key concepts:

Archetype: An archetype defines the standardized structure for an application, from the cookie-cutters and CI jobs used for development, through the infrastructure patterns used for credential management, service registration, and more.

Application: An application refers to a microservice in our ecosystem.

Component: An application consists of multiple components that work cohesively to offer a service level agreement to other services in the ecosystem. Web servers, Airflow workers, and Kafka daemons are examples of components.

After conducting this exercise and defining the declarative models that came out of it, we explored means to achieve our overarching goal of abstracting away the complexity of provisioning and operating infrastructure behind these key models. We wanted a solution that would enable us to:

Empower application developers to manage the entire lifecycle of their applications through clear, application-centric abstractions that removed the need for significant expertise in Kubernetes or aspects of Robinhood’s infrastructure.

Enable transparent upgrades and rearchitecting of sidecars and supporting infrastructure with minimal impact (e.g., switching our service mesh should be transparent to application developers).

Create a simple, standardized deployment process, with built-in support for ordered rollouts, canaries, and application-level health checks.

Make replicating applications across environments easy with minimal overhead to application developers, paving the way for more sophisticated CI/CD pipelines.

Contribute back to the Kubernetes community.

We started by surveying the wealth of amazing open source solutions that try to achieve these goals. While there were existing solutions that achieved some of these goals, none of them achieved them all. Powerful client-side templating tools such as Kustomize provided ways to simplify manifest files, but didn’t allow for new application-centric abstractions. Helm had additional powerful server-side “templating” using Charts and the Tiller, but lacked support for orchestrating updates to the generated resources and raised concerns about how the Tiller’s required privileges would mesh with a multi-tenant cluster. Jenkins X had some really interesting capabilities around scaffolding and orchestration, but we wanted Robinhood-specific customizations to be represented as first-class objects as opposed to just new commands.

Though we drew inspiration from many of the projects mentioned above, we opted to build our own platform, the Archetype Framework, to best achieve our goals.

How does it work?

There are four key components to our Archetype Framework.

1. Custom Resource Definitions (CRDs)

Kubernetes CRDs are a powerful way to extend the Kubernetes APIs, providing a way to define new API groups and resources while still being able to leverage the same API machinery and tooling (AuthN, AuthZ, admission, kubectl, SDKs) that is available to native Kubernetes resources. We created four new abstractions: Archetype, Application, Component, and VersionedArchetype (an immutable point-in-time snapshot of an Archetype). We also used Kubernetes’ codegen ability to generate Golang client libraries for these new APIs.
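For illustration, registering one of these resources amounts to applying a CRD manifest. The sketch below is hypothetical — the group and kind follow the apps.robinhood.com/v1alpha1 examples later in this post, but the exact definition and shortname are assumptions:

```yaml
# Hypothetical sketch of a CRD for the Application resource,
# not the framework's actual definition.
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  name: applications.apps.robinhood.com
spec:
  group: apps.robinhood.com
  versions:
  - name: v1alpha1
    served: true
    storage: true
  scope: Namespaced
  names:
    plural: applications
    singular: application
    kind: Application
    shortNames:
    - apps  # assumed shortname; would let `kubectl get apps` resolve to Applications
```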

2. Admission webhooks

Kubernetes admission webhooks provide a way to perform custom validations and mutations on API requests, prior to objects being persisted in etcd. We built a single admission webhook server consisting of multiple admission plugins that work together to validate and mutate our custom resources and the relationships between them.
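As a rough sketch, wiring such a server into the API request path is done with a webhook configuration object along these lines — the webhook name, service name, namespace, and path here are all assumptions, not the framework's real registration:

```yaml
# Hypothetical registration of a validating webhook for the custom
# resources; names and paths are illustrative only.
apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
  name: archetype-framework-admission
webhooks:
- name: validate.apps.robinhood.com
  rules:
  - apiGroups: ["apps.robinhood.com"]
    apiVersions: ["v1alpha1"]
    operations: ["CREATE", "UPDATE"]
    resources: ["applications", "components"]
  clientConfig:
    service:
      namespace: archetype-system
      name: admission-server
      path: /validate
```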

3. Custom controllers

Controllers are arguably the lifeblood of Kubernetes, responsible for moving the current state of the world to the desired state of the world. We built a custom controller, spinning off multiple control loops to realize the Application and Component objects using native Kubernetes resources.
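The essence of such a control loop can be sketched in a few lines of Go. This toy version — the types and action strings are invented for illustration, not the framework's real code — simply diffs a desired Component-like spec against an observed Deployment and emits the actions needed to converge:

```go
package main

import "fmt"

// Hypothetical, simplified state for illustration only; the real
// framework reconciles full Application/Component objects.
type DesiredComponent struct {
	Name     string
	Replicas int
	Image    string
}

type ObservedDeployment struct {
	Replicas int
	Image    string
}

// reconcile is one toy control-loop step: compare desired vs. observed
// state and return the actions needed to converge them.
func reconcile(desired DesiredComponent, observed *ObservedDeployment) []string {
	if observed == nil {
		return []string{fmt.Sprintf("create deployment %s (image=%s, replicas=%d)",
			desired.Name, desired.Image, desired.Replicas)}
	}
	var actions []string
	if observed.Image != desired.Image {
		actions = append(actions, fmt.Sprintf("roll deployment %s to image %s",
			desired.Name, desired.Image))
	}
	if observed.Replicas != desired.Replicas {
		actions = append(actions, fmt.Sprintf("scale deployment %s to %d replicas",
			desired.Name, desired.Replicas))
	}
	return actions
}

func main() {
	actions := reconcile(
		DesiredComponent{Name: "api-server", Replicas: 120, Image: "myapp:1.2.3"},
		&ObservedDeployment{Replicas: 20, Image: "myapp:1.2.2"},
	)
	for _, a := range actions {
		fmt.Println(a)
	}
}
```

A real controller runs this comparison continuously against the API server via watches, but the desired-vs.-observed diff is the core idea.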

4. Template rendering engine

Perhaps the most important component in the Archetype Framework, the template rendering engine translates user-created Application and Component objects to Kubernetes Deployments, Network Policies, ConfigMaps, Jobs, ServiceAccounts, AWS resources, and more. By capturing the templates themselves in the Archetype and VersionedArchetype objects in the API server, our custom control loops require no logic to be aware of the underlying Kubernetes objects used to realize Applications and Components — from their perspective, they simply render and apply templated objects.

Let’s look at an example of what our custom resources look like and how they come to life.

Archetypes and VersionedArchetypes are created and managed by framework administrators (application developers should not need to know how they work). These objects live in the Kubernetes API server and hold the templates that define how to realize a particular Component for an Application. Application developers can browse the list of supported Archetypes using kubectl:

➜ ~ kubectl get archetypes
NAME      AGE
django    30d
golang    30d
generic   30d

An Archetype looks something like this:

apiVersion: apps.robinhood.com/v1alpha1
kind: Archetype
metadata:
  name: django
spec:
  currentVersion: django-0.1.1
  description: Robinhood's Django stack
  owner: platform@robinhood.com

While most of the Archetype fields are metadata, it also references a VersionedArchetype, where the actual templates are stored. Just as before, users (mostly framework administrators) can discover all the available VersionedArchetypes using kubectl:

➜ ~ kubectl get vat
NAME           AGE
django-0.1.0   30d
django-0.1.1   17d
django-0.1.2   10d

The VersionedArchetypes help us roll out changes to Archetypes gradually: we can move a few Applications to the new version before making it the default for the Archetype. A sample VersionedArchetype looks something like the following:

kind: VersionedArchetype
apiVersion: apps.robinhood.com/v1alpha1
metadata:
  name: django-0.1.2
spec:
  componentTypes:
  - name: server
    templates:
    - name: serviceaccount
      kind: ServiceAccount
      apiGroup: v1
      template: |
        apiVersion: v1
        kind: ServiceAccount
        metadata:
          name: [[ .Application.Name ]]-[[ .Component.Spec.Type ]]
          namespace: [[ .Component.Namespace ]]
    - name: deployment
      kind: Deployment
      apiGroup: apps/v1
      template: |
        ...
  - name: daemon
    templates:
    - ...
  ...

These templates can contain any objects that can be applied to the API Server. The template engine is designed to work with Golang templates by default, but is extensible with other templating engines like Helm and Kustomize.
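As a rough illustration of that default path, Go's text/template package can be configured with the [[ ]] delimiters seen above, so a VersionedArchetype template renders against Application and Component fields. The struct shapes here are simplified assumptions, not the framework's actual types:

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// Simplified stand-ins for the custom resources; assumed shapes.
type Application struct{ Name string }
type ComponentSpec struct{ Type string }
type Component struct {
	Namespace string
	Spec      ComponentSpec
}

// renderTemplate expands a manifest template using [[ ]] delimiters,
// leaving any literal {{ }} in the output YAML untouched.
func renderTemplate(tmpl string, app Application, comp Component) (string, error) {
	t, err := template.New("manifest").Delims("[[", "]]").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var b strings.Builder
	data := struct {
		Application Application
		Component   Component
	}{app, comp}
	if err := t.Execute(&b, data); err != nil {
		return "", err
	}
	return b.String(), nil
}

func main() {
	manifest := `apiVersion: v1
kind: ServiceAccount
metadata:
  name: [[ .Application.Name ]]-[[ .Component.Spec.Type ]]
  namespace: [[ .Component.Namespace ]]`
	out, err := renderTemplate(manifest,
		Application{Name: "myapp"},
		Component{Namespace: "myapp", Spec: ComponentSpec{Type: "server"}})
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // renders name: myapp-server in namespace: myapp
}
```

Using custom delimiters is what lets the template body stay valid YAML to the eye while still being a Go template underneath.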

Once an Archetype and a VersionedArchetype object exist, application developers can start onboarding their microservices to the framework by creating Application and Component objects. These look somewhat like the following:



apiVersion: apps.robinhood.com/v1alpha1
kind: Application
metadata:
  name: myapp
  namespace: myapp
spec:
  owners: myapp@robinhood.com
  archetype:
    name: django
    version: django-0.1.2
  version: 1.2.3 # This is the application version
  containerImageRepo: amazon.ecr.url/myapp
  componentRolloutOrder: # This defines the order of rolling out new app versions
  - canary
  - '...' # Wild card indicating all remaining Components can be deployed
          # after the canary is deployed and passing health checks
  alertConfig:
    slackNotify: "myapp-slack"
    opsgenieNotify: "myapp-pager"
---
kind: Component
apiVersion: apps.robinhood.com/v1alpha1
metadata:
  name: api-server
  namespace: myapp
spec:
  application: myapp
  type: server
  serverConfig:
    allowedHosts:
    - ...
  autoscalingPolicy:
    minReplicas: 120
    maxReplicas: 200
    targetCPUUtilizationPercentage: 60
    schedules:
    - name: "market-open"
      schedule: "00 12 * * 1,2,3,4,5"
      minReplicas: 120
    - name: "market-close"
      schedule: "05 23 * * 1,2,3,4,5"
      minReplicas: 20

The Archetype, Application, and Component objects come together to translate our custom application-centric abstractions into a set of native Kubernetes objects, realized through our admission webhooks, custom controllers, and template rendering engine. Application developers can interact with our custom objects through kubectl (and soon a UI), or they can look under the hood (pun intended) to see the native Kubernetes objects created on their behalf.

➜ ~ kubectl get apps -n myapp
NAME    COMPONENTS   READY
myapp   1            1

➜ ~ kubectl get components -n myapp
NAME         READY
api-server   True

# Looking under the hood
➜ ~ kubectl get deployments -n myapp
NAME         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
api-server   200       200       200          200         3d

➜ ~ kubectl get hpa -n myapp
NAME         REFERENCE               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
api-server   deployment/api-server   6%/60%    1         5         1          3d

➜ ~ kubectl get pods -n myapp -l apps.robinhood.com/component-type=server
NAME                                       READY   STATUS      RESTARTS   AGE
api-server-5749655f95-58tdt                8/8     Running     0          3d
api-server-market-close-1570835100-txkb8   0/2     Completed   0          3d
...

Transitioning to this mechanism has been incredibly valuable to our team. We’ve abstracted away platform- and infrastructure-level complexities for application developers with a streamlined and minimalistic application-centric interface focused specifically on the Applications and Components they’re working with. This simplicity helps us achieve greater application developer velocity and ownership, enabling application developers to manage their infrastructure without needing to become experts on Kubernetes or every other aspect of Robinhood’s various infrastructure systems.

Here’s a diagram that summarizes how the various parts of the Archetype Framework work together: