Container orchestration is the automated arrangement, coordination, and management of containers within clusters. It controls different aspects of the container lifecycle, such as placement and initial deployment, scaling and replication, and so on.

Container orchestration is most commonly used for clusters consisting of many nodes, mainly to deploy and manage complex containerized applications. It can also be employed for simple clusters or for individual containers: for example, it lets the same unified tools be used across development, test, and production environments, and features such as auto-replication, auto-scaling, auto-healing, volume management, and networking are useful even for individual containers.

Currently, the most popular platform for orchestrating containerized applications is Kubernetes, created by Google. In our previous post, we introduced Kubernetes and showed how to install a single-node Kubernetes cluster on a virtual machine. In this post, we take a closer look at container orchestration, its features, and how they work in Kubernetes.

Container Orchestration Features

In computing, orchestration is used for coordination and management of services, middleware and other complex computer systems. Historically, two methods of orchestration have been implemented:

Traditional: imperative, flow-based orchestration ("if this, then that", as popularized by IFTTT).

New: declarative orchestration, in which the user defines a desired state and the orchestration tool works out how to move from the current observed state to that desired state. This method is used by the most popular container orchestration tools, such as Kubernetes, Docker Swarm, and Mesosphere Marathon.

The most common container orchestration features represent the typical needs of containerized applications. Here are eleven key orchestration features and how they are implemented in Kubernetes:

1 – Managing multiple containers as one entity logically groups a set of containers into a single unit, co-locating a main application with its helpers. This preserves the one-application-per-container model and makes it possible to define a complex multi-tier application as a set of smaller entities working together.

In Kubernetes, a pod is a logical group of one or more containers. It is the smallest deployable unit that can be managed. The main purpose of the pod is to support co-located, co-managed helper programs: for example, an application server with its local cache, or an application server with a log watcher.
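As an illustrative sketch (names and images are hypothetical), a pod grouping an application server with a log-watcher helper might look like this:

```yaml
# Hypothetical pod co-locating an app server with a log-watcher helper.
apiVersion: v1
kind: Pod
metadata:
  name: app-with-log-watcher
  labels:
    app: myapp
spec:
  containers:
  - name: app-server
    image: myorg/app-server:1.0     # hypothetical image
    ports:
    - containerPort: 8080
  - name: log-watcher
    image: myorg/log-watcher:1.0    # hypothetical image
    # Both containers share the pod's network namespace and can
    # reach each other via localhost.
```

Both containers are scheduled together, started together, and restarted together: the pod is the unit of management.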

For vertically integrated applications like LAMP (a web server plus a database server), Kubernetes provides the concept of a Service, which defines a logical group of pods and a policy for how external clients access that group. For example, such an application might expose two services: one for the application server and one for the database; the database would use a persistent volume, which we will cover later.

Kubernetes also supports Labels for pods and other resources (services, deployments, and so on). Labels are key/value pairs that can be attached to resources at creation time and then added or modified at any time. Labels can then be used via Selectors to organize and select subsets of resources (for example, pods) and manage them as one entity.
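For instance (label keys and names here are hypothetical), a Service can select every pod carrying a given set of labels:

```yaml
# Hypothetical Service selecting all pods labeled app=myapp, tier=frontend.
apiVersion: v1
kind: Service
metadata:
  name: frontend
spec:
  selector:
    app: myapp        # any pod with both of these labels
    tier: frontend    # becomes a backend of this Service
  ports:
  - port: 80          # port exposed by the Service
    targetPort: 8080  # port the selected pods listen on
```

Pods added later with the same labels are picked up automatically; removing a label takes a pod out of the group.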

2 – Container placement is an algorithm that selects a specific host for a specific container or set of containers, based on rules including the current load of the hosts, colocation constraints, and availability constraints. Placement can be triggered manually by the user or by an API call, or automatically by auto-scaling, auto-replication, or auto-healing.

In Kubernetes, containers in a pod run on the same host, share an IP address and port space, and can find each other via localhost. They can also communicate using standard inter-process communication mechanisms such as semaphores or shared memory.

3 – Container replication ensures that a specified number of equivalent containers ("replicas") are running at any given time. If there are too many containers, the surplus are stopped; if there are too few, new containers are started.

In Kubernetes, a Replication Controller manages a set of pods, making sure that the cluster is in the specified state. Specifically, it is responsible for running the specified number of pod copies ("replicas") across the cluster.
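A minimal sketch of a Replication Controller (names and images are hypothetical) that keeps three replicas of a pod running:

```yaml
# Hypothetical Replication Controller maintaining three pod replicas.
apiVersion: v1
kind: ReplicationController
metadata:
  name: app-rc
spec:
  replicas: 3           # desired number of pod copies
  selector:
    app: myapp          # pods counted toward the replica total
  template:             # pod template used to create new replicas
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: app-server
        image: myorg/app-server:1.0   # hypothetical image
```

If a pod dies or a node fails, the controller notices the count dropped below three and starts a replacement; if extra matching pods appear, it deletes the surplus.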

4 – Container auto-scaling automatically changes the number of running containers based on CPU utilization or other application-provided metrics.

In Kubernetes, it is possible to define an autoscaler for a deployment. The autoscaler keeps the number of replicas between the specified minimum and maximum while maintaining, for example, a specified average CPU utilization across all pods. Kubernetes 1.2 adds alpha support for scaling based on application-specific metrics such as queries per second or average request latency.
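A sketch of such an autoscaler (target name and thresholds are hypothetical), keeping a deployment between 2 and 10 replicas at roughly 80% average CPU:

```yaml
# Hypothetical autoscaler for a deployment named "app-deployment".
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app-deployment
  minReplicas: 2        # never scale below two replicas
  maxReplicas: 10       # never scale above ten replicas
  targetCPUUtilizationPercentage: 80
```

When average CPU across the pods rises above the target, the autoscaler adds replicas (up to the maximum); when it falls, replicas are removed (down to the minimum).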

5 – Volume management controls persistent storage for containers. According to best practices, containers should be stateless and files in a container should be ephemeral, so when a container crashes and is restarted, changes to its files are lost. Containers running in a logical group also often need to share files with one another. A volume is persistent storage for a container: it can be mounted at specified paths within the container's file system, it can be shared among multiple containers, and a container can use multiple volumes. A volume may also have its own lifecycle.

Kubernetes provides different types of volumes, each implemented as a plugin. Current plugins include the Google Compute Engine persistent disk, the AWS Elastic Block Store volume, the Ceph block device, and emptyDir (backed by storage on the node where the pod runs, so it simply uses that node's local storage and exists only until the pod is removed from the node). Kubernetes supports two kinds of volumes:

Regular: has the same lifecycle as the pod that encloses it

Persistent: has a lifecycle independent of any individual pod
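As an illustration of a regular volume (container names and images are hypothetical), two containers in one pod can share an emptyDir volume that lives exactly as long as the pod does:

```yaml
# Hypothetical pod whose two containers share an ephemeral emptyDir volume.
apiVersion: v1
kind: Pod
metadata:
  name: shared-volume-pod
spec:
  volumes:
  - name: shared-data
    emptyDir: {}        # regular volume: created with the pod, deleted with it
  containers:
  - name: writer
    image: myorg/producer:1.0       # hypothetical image
    volumeMounts:
    - name: shared-data
      mountPath: /var/output        # writer's view of the shared directory
  - name: reader
    image: myorg/consumer:1.0       # hypothetical image
    volumeMounts:
    - name: shared-data
      mountPath: /var/input         # reader sees the same files here
```

Each container mounts the same volume at its own path, so files written by one are immediately visible to the other.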

6 – Resource usage monitoring (CPU and RAM) is required at different levels: at the container level, at the logical group level (for example, Kubernetes pods), and at the cluster level.

Kubernetes includes cAdvisor, an open source container resource usage and performance analysis agent. It runs on each node, auto-discovers all of the containers on the node and collects CPU, memory, filesystem, and network usage statistics. cAdvisor also provides the overall node usage.


7 – Health checks are used to check a container's liveness or readiness status. Container status is determined by probes such as:

HTTP probe: execute an HTTP request against the specified URL and analyze the returned HTTP status code

TCP probe: check whether the specified port is open

Container execution test: execute the specified command in the container and analyze its exit code

A container orchestration tool allows defining automated actions for failed probes, so that a container can be automatically restarted if its liveness probe fails. This feature is called auto-healing.

Kubernetes allows defining liveness and readiness probes for pods. HTTP, TCP, and container execution probes are supported out of the box.
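A sketch of both probe styles on one container (paths, ports, and the readiness command are hypothetical):

```yaml
# Hypothetical pod with an HTTP liveness probe and an exec readiness probe.
apiVersion: v1
kind: Pod
metadata:
  name: probed-app
spec:
  containers:
  - name: app-server
    image: myorg/app-server:1.0     # hypothetical image
    livenessProbe:
      httpGet:                      # HTTP probe: GET /healthz on port 8080;
        path: /healthz              # a non-2xx/3xx status marks the container
        port: 8080                  # unhealthy and triggers a restart
      initialDelaySeconds: 15
      periodSeconds: 10
    readinessProbe:
      exec:                         # container execution test: exit code 0
        command: ["cat", "/tmp/ready"]   # means "ready to receive traffic"
```

A failed liveness probe restarts the container (auto-healing); a failed readiness probe only removes the pod from service endpoints until it recovers.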


8 – Networking plays a significant role in container orchestration: it isolates independent containers, connects coupled containers, and provides access to containers from external clients.

Kubernetes has a pluggable architecture and allows different network implementations to be used, including simple L2 networking with Linux bridges, and overlay networks built with Open vSwitch, Flannel, Calico, and so on.

9 – Service discovery allows containers to discover other containers and establish connections to them.

Kubernetes supports two primary modes of finding a service: environment variables and DNS. For each service, Kubernetes creates environment variables such as {SERVICE_NAME}_SERVICE_HOST and {SERVICE_NAME}_SERVICE_PORT. For the same service, a DNS record for "service.ns" is created, where "service" is the name of the service and "ns" is the Kubernetes namespace. Pods running in the "ns" namespace can find the service by performing a name lookup for "service". Kubernetes also supports DNS SRV records for named ports: if the "service.ns" service has a port named "http" using the TCP protocol, you can query "_http._tcp.service.ns" to discover the port number for "http".
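To make the SRV naming concrete, here is a sketch of a service with a named port (the service name "web" and namespace "default" are hypothetical):

```yaml
# Hypothetical Service whose named port "http" is discoverable via the
# DNS SRV record _http._tcp.web.default.
apiVersion: v1
kind: Service
metadata:
  name: web
  namespace: default
spec:
  selector:
    app: web
  ports:
  - name: http          # the port name used in the SRV record
    protocol: TCP
    port: 80
    targetPort: 8080
```

A client in the cluster can resolve "web" (or "web.default") for the service IP, and query the SRV record to learn the port number without hard-coding it.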

10 – Load balancing works in conjunction with container replication, scaling, and service discovery. A load balancer is a dedicated service that knows which replicas are running and provides an endpoint exposed to clients. Connections to the exposed endpoint are distributed over the running replicas using methods such as DNS round-robin.

Kubernetes allows you to define a service with a load balancer. To access the service, a client connects to the external IP address, which is forwarded to a port on a node and then forwarded to a port on a pod.
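A sketch of such a service (names are hypothetical; an external IP is only provisioned on clouds that support it):

```yaml
# Hypothetical Service exposed through a cloud provider's load balancer.
apiVersion: v1
kind: Service
metadata:
  name: app-lb
spec:
  type: LoadBalancer    # asks the cloud provider for an external IP
  selector:
    app: myapp          # traffic is spread over all matching pod replicas
  ports:
  - port: 80            # port clients connect to on the external IP
    targetPort: 8080    # port the pods listen on
```

Once the external IP is assigned, client connections to port 80 are forwarded through the node to one of the running replicas.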


11 – Rolling updates allow you to update a deployed containerized application with minimal downtime, using different update scenarios. The typical way to update such an application is to provide new images for its containers. A rolling update ensures that only a certain number of old replicas are down while they are being updated, and that only a certain number of new replicas are created above the desired count.

Kubernetes supports rolling updates for replication controllers and deployments. A Kubernetes Deployment defines a desired state for a logical group of pods or replica sets. A Deployment Controller drives the actual state toward the desired state at a controlled rate, creating new resources or replacing existing resources as necessary. Deployments can be updated, rolled out, or rolled back.
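A sketch of a Deployment with an explicit rolling-update strategy (names, image, and limits are hypothetical):

```yaml
# Hypothetical Deployment using a rolling update strategy.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-deployment
spec:
  replicas: 4
  selector:
    matchLabels:
      app: myapp
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # at most one old replica down during the update
      maxSurge: 1         # at most one extra replica above the desired count
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: app-server
        image: myorg/app-server:1.1   # changing the image triggers the rollout
```

Updating the pod template (for example, the image tag) starts the rollout: the controller replaces old replicas with new ones a few at a time, within the maxUnavailable and maxSurge bounds, and the change can be rolled back if it misbehaves.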

Summary

The Kubernetes container orchestration platform provides the most sought-after orchestration features for the automated arrangement, coordination, and management of containers in clusters. Initially designed by Google to operate at massive scale, Kubernetes is now widely used by organizations of all sizes to run containers in production.