18th August, 2015

Yesterday we added CloudWatch metrics support for the EC2 Container Service (hat tip to my long-time partner in crime, Deepak, and his team), and I thought this would be a good time to talk about considerations for containers.

Containerized application components

There are four key components to building and deploying a long running, container-based application in production:

1. One or more application components, encapsulated into containers.
2. A description of the resource requirements for each application component (memory, CPU, IO, networking, storage).
3. A pool of computational capacity.
4. A method of placing the application components to most efficiently take advantage of the pool of resources while meeting the requirements of each component.

On AWS today, this equates to 1) applications encapsulated into Docker containers, 2) ECS task definitions, 3) instances provisioned through EC2, potentially using multiple instance types, and 4) the ECS Service Scheduler.

Consider a simple e-commerce application with three components: inventory (product details, pricing, stock, etc.), ordering (processing, payment), and a web front end. From a customer's perspective: through the web site (served by the web front end component), they view product details (from the inventory component) and then buy them (through the ordering component).

Three task definitions capture the resource requirements for each component: the web front end needs a small amount of CPU and memory; inventory needs lots of memory; ordering needs lots of IO.
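As a sketch, the three task definitions might look like the following, expressed as the JSON documents ECS accepts. The container image names and the specific cpu and memory values here are illustrative assumptions, not taken from the application above:

```python
# Illustrative ECS task definitions for the three components.
# Image names and resource values are assumptions for the sketch.
# "cpu" is in ECS CPU units (1024 = one vCPU); "memory" is in MiB.
task_definitions = {
    "web-front-end": {
        "family": "web-front-end",
        "containerDefinitions": [
            {"name": "web", "image": "example/web:latest",
             "cpu": 256, "memory": 512,     # small CPU and memory
             "portMappings": [{"containerPort": 80, "hostPort": 80}]},
        ],
    },
    "inventory": {
        "family": "inventory",
        "containerDefinitions": [
            {"name": "inventory", "image": "example/inventory:latest",
             "cpu": 512, "memory": 15000},  # memory-heavy: a fit for R3
        ],
    },
    "ordering": {
        "family": "ordering",
        "containerDefinitions": [
            {"name": "ordering", "image": "example/ordering:latest",
             "cpu": 512, "memory": 2048},   # IO-heavy: a fit for I2
        ],
    },
}

# Each document would then be registered with ECS, e.g. via the AWS CLI:
#   aws ecs register-task-definition --cli-input-json file://inventory.json
```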

We provision a collection of EC2 instances: five T2, three R3, and two I2 instances (ten in total). ECS treats these instances as a single cluster, onto which application containers can be placed.

Scheduling Resources

The question is, what is the most efficient way to place the application components across the cluster, based on the individual requirements of each component?

ECS uses the service scheduler to answer this, by running the containers in the most efficient way across the resources available on the cluster. In this example, it would place containers with the front end component on the T2 instances, containers with the inventory component on the R3 instances and containers with the ordering component on the I2 instances.
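The idea behind that placement can be sketched as a first-fit matching of task requirements against the remaining capacity of each instance. This is a toy model, not the actual ECS scheduling algorithm, and the capacity numbers and the `io_optimized` flag are illustrative assumptions:

```python
# Toy first-fit placement: NOT the real ECS scheduler, just a sketch.
# Capacities are illustrative (CPU units, memory in MiB); IO requirements
# are modeled as a simple boolean flag for the sketch.
instances = (
    [{"type": "t2", "cpu": 1024, "mem": 2048, "io_optimized": False} for _ in range(5)]
    + [{"type": "r3", "cpu": 2048, "mem": 31000, "io_optimized": False} for _ in range(3)]
    + [{"type": "i2", "cpu": 4096, "mem": 30500, "io_optimized": True} for _ in range(2)]
)

tasks = (
    [{"name": "web", "cpu": 256, "mem": 512, "needs_io": False}] * 5        # small CPU/memory
    + [{"name": "inventory", "cpu": 512, "mem": 15000, "needs_io": False}] * 3  # memory-heavy
    + [{"name": "ordering", "cpu": 512, "mem": 2048, "needs_io": True}] * 2     # IO-heavy
)

def place(tasks, instances):
    """Assign each task to the first instance with enough spare capacity."""
    placement = []
    for task in tasks:
        for inst in instances:
            fits = (inst["cpu"] >= task["cpu"]
                    and inst["mem"] >= task["mem"]
                    and (not task["needs_io"] or inst["io_optimized"]))
            if fits:
                inst["cpu"] -= task["cpu"]  # reserve the capacity
                inst["mem"] -= task["mem"]
                placement.append((task["name"], inst["type"]))
                break
        else:
            placement.append((task["name"], None))  # no capacity left
    return placement

placement = place(tasks, instances)
```

Even this crude model reproduces the outcome described above: the web front ends land on the T2 instances, the memory-hungry inventory containers on the R3 instances, and the IO-bound ordering containers on the I2 instances.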

Since this is a long running application, the service scheduler manages the containers as they handle requests and process orders, with the following features:

- Distribute traffic across multiple containers (the five web front ends, in this example) via ELB, by automatically registering and de-registering containers with the load balancer. Customers can direct traffic to the load balancer and have that traffic routed to the containers.
- Auto-recover containers in an unhealthy state. If one of the inventory containers fails, for example, the service scheduler removes the unhealthy container from the cluster and places a fresh container with that component back on the cluster.
- Add and remove capacity for each application component, by placing and removing containers from the cluster dynamically in response.
- Update application components in a running application without interruption. Let's say we add a new product description field to the inventory component: the service scheduler puts this into production by replacing containers one by one across the running application component containers on the cluster.
- Easily adjust application component resource definitions for a running application. If the new inventory component needed additional CPU, we could adjust its requirements and the service scheduler would adhere to them without interrupting the application.
- Change the cluster specification in real time. The instance families used in the cluster can be adjusted to reflect new application requirements (adding D2 instances, for example), and the size of the cluster can change dynamically (adding more T2 instances, for example) using EC2 auto-scaling based on custom application metrics. This is where today's announcement fits in: metrics from your application components and the cluster they are running on are now available in CloudWatch and, importantly, via the CloudWatch API, allowing you to programmatically respond to changes in both resource utilization and application-level metrics (page load time, query time, order load, etc.).
Auto Scaling groups let you scale each cluster of EC2 instance types independently.
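As a sketch of how that wiring might look, the parameters below describe a CloudWatch alarm on a cluster's CPU reservation that could trigger an Auto Scaling policy. The alarm name, cluster name, threshold, and policy ARN are illustrative placeholders; `CPUReservation` in the `AWS/ECS` namespace is one of the cluster metrics this announcement makes available:

```python
# Parameters for a CloudWatch alarm that fires when the cluster's CPU
# reservation stays above 75% for five minutes. The alarm action (an
# Auto Scaling policy ARN) is a placeholder, not a real resource.
scale_up_alarm = {
    "AlarmName": "ecs-cluster-cpu-reservation-high",   # illustrative name
    "Namespace": "AWS/ECS",
    "MetricName": "CPUReservation",                    # published per cluster by ECS
    "Dimensions": [{"Name": "ClusterName", "Value": "default"}],
    "Statistic": "Average",
    "Period": 60,                                      # seconds per datapoint
    "EvaluationPeriods": 5,                            # 5 consecutive periods
    "Threshold": 75.0,                                 # percent reserved
    "ComparisonOperator": "GreaterThanThreshold",
    "AlarmActions": ["arn:aws:autoscaling:..."],       # scaling policy ARN (placeholder)
}

# With boto3, these parameters would be passed to the CloudWatch API:
#   boto3.client("cloudwatch").put_metric_alarm(**scale_up_alarm)
```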

These are essential features for running a container-based application in production while maintaining both the availability of the application and the flexibility and speed to change its components.

Getting Started

The EC2 Container Service docs have a great tutorial to help get you started: Scaling Container Instances With CloudWatch Alarms. Enjoy.