Update 08/18/2020: Managed node groups now support launch templates to give you a wider range of controls!

When deploying a Kubernetes cluster, you have two major components to manage: the Control Plane (also known as the Master Nodes) and the Worker Nodes. AWS EKS is a managed service that helps run these components without worrying about the underlying infrastructure. Originally, EKS focused entirely on the Control Plane, leaving it up to users to manually configure and manage EC2 instances to register with the control plane as worker nodes. In the past few months, AWS has released several exciting new features for EKS, including Managed Node Groups and Fargate support. These features provide additional options for running your workloads on EKS beyond self managed EC2 instances and Auto Scaling Groups (ASGs). However, with these new choices, provisioning an EKS cluster now involves a complicated trade off between the different worker group options to decide which one is best for you.

In this guide, we would like to provide a comprehensive overview of these new options, including a breakdown of the various trade offs to consider when weighing the options against each other. The goal of this guide is to give you all the information you need to decide which option works best for your infrastructure needs.

We’ll start the guide with a brief overview of the EKS architecture that explains why you need worker nodes in the first place, before diving into each option that AWS gives you. Here is a brief outline of what we will cover:

Brief Overview of EKS Architecture: Control Plane and Worker Nodes

Self Managed Worker Nodes using Auto Scaling Groups and EC2 Instances

Managed Node Groups: Fully Managed ASGs Optimized for EKS

Serverless Worker Nodes with EKS Fargate

Brief Overview of EKS Architecture: Control Plane and Worker Nodes

Kubernetes component architecture diagram from the official documentation.

Every EKS cluster has two infrastructure components no matter what option you pick (even serverless): the EKS Control Plane (also known as the “EKS Cluster” in the AWS Console), and Worker Nodes. This component architecture stems from the basic Kubernetes architecture involving the Kubernetes Master Components and Kubernetes Node Components (see the official Kubernetes documentation). Specifically, the EKS control plane runs all the Master components of the Kubernetes architecture, while the Worker Nodes run the Node components.

The Kubernetes Master components are responsible for managing the cluster as a whole and making global decisions about it, such as where to schedule workloads. Additionally, the Master components include the API server, which provides the main interface for interacting with the cluster.

The Node components of Kubernetes, on the other hand, are responsible for actively running the workloads that are scheduled onto the EKS cluster. These components are designed to run on servers to turn them into Kubernetes worker nodes. When you interact with Kubernetes, you schedule workloads by applying manifest files to the API server (e.g., using kubectl). The Master components then schedule the workload on any available worker node in the cluster and monitor it for the duration of its lifetime.

For example, when you deploy a Node.js Docker container onto your Kubernetes cluster as a Deployment with 3 replicas, the Control Plane will pick worker nodes from its available pool to run these 3 containers. These worker nodes are then instructed to start and run the containers. This is done through API calls between the Master components running on the Control Plane and the Node components running on the worker nodes.
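To make this concrete, a Deployment like the following minimal sketch asks the Control Plane to keep 3 replicas of a Node.js container running somewhere in the worker pool (the names and image are placeholders, not from a real setup):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-node-app            # placeholder name
spec:
  replicas: 3                  # the Control Plane schedules 3 Pods across available workers
  selector:
    matchLabels:
      app: my-node-app
  template:
    metadata:
      labels:
        app: my-node-app
    spec:
      containers:
        - name: app
          image: node:14-alpine          # placeholder Node.js image
          command: ["node", "server.js"] # assumes server.js exists in the image
```

Applying this manifest with kubectl apply hands it to the API server, and the scheduler takes it from there.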

Note that this communication is two way: the Node components will also inform the Master components of any events that happen on the worker nodes. For example, if any containers stop running on a node, the Node components will notify the Master components so that the workload can be rescheduled.

If you want to learn more about the specific components that make up Kubernetes and EKS, you can check out the official docs on EKS.

A key thing to note here is that in most Kubernetes clusters, the Master nodes can also act as Nodes for scheduling workloads. In those clusters, it is not strictly necessary to have additional worker nodes. However, in EKS, the control plane is locked down such that you cannot schedule any workloads on the control plane nodes. Hence, every EKS cluster requires both the control plane and worker nodes to run workloads on.

The rest of the guide will cover the various options AWS provides for provisioning Worker Nodes to run your container workloads. We’ll start with the most flexible option available: Self Managed Worker Nodes.

Self Managed Worker Nodes using Auto Scaling Groups and EC2 Instances

The original option for running worker nodes, available since EKS was first announced at the end of 2017, was to manually provision EC2 instances or Auto Scaling Groups and register them with EKS as worker nodes.

This option does not benefit from any managed services provided by AWS. However, it gives you the most flexibility in configuring your worker nodes. Since you are not relying on any managed components, you must configure everything yourself, including the AMI to use, Kubernetes API access on the node, registering nodes with EKS, graceful termination, etc. In return, you get full control over the underlying infrastructure. This means that you can customize the nodes to your preference, allowing you to meet almost any infrastructure need you might have for running in the cloud. For example, because you have full access to the underlying AMI, you can configure the nodes to run on any operating system and install any additional components on the servers that you might need.

To provision EC2 instances as EKS workers, you need to ensure the underlying servers meet the following requirements:

The AMI has all the components installed to act as a Kubernetes Node. This includes the kubelet process and a container engine (e.g., Docker) at a minimum.

The associated Security Group needs to allow communication with the Control Plane and other workers in the cluster. See the relevant documentation for more details.

The user data or boot scripts of the servers need to include a step to register with the EKS control plane. On EKS optimized AMIs, this is handled by the bootstrap.sh script installed on the AMI. See the script source code for more details on what is involved.

The IAM role used by the worker nodes must be registered as a user in the cluster. See the section on managing users and IAM roles for your cluster in the official docs for more details.
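The last step is typically done by mapping the worker IAM role in the aws-auth ConfigMap in the kube-system Namespace. Here is a minimal sketch of what that looks like (the account ID and role name are placeholders for your own values):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - rolearn: arn:aws:iam::111122223333:role/eks-worker-role   # placeholder worker role ARN
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes
```

Without this mapping, the kubelet on the new instances can not authenticate to the API server, and the nodes will never show up in kubectl get nodes.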

Additionally, concerns like upgrading components must be handled with care. A naive approach to rotating or scaling down servers, for example, may disrupt your workloads and lead to downtime. You can check out our previous post on Zero Downtime Server Updates for your Kubernetes Cluster for an overview of the steps involved, but in general, expect a significant amount of configuration to achieve effects similar to the managed options described below.

That said, you can get close to a managed experience by implementing tooling to account for these concerns. For example, we open sourced a utility (kubergrunt) that will gracefully rotate the nodes of an ASG to the latest launch configuration (the eks deploy command), which helps automate rolling out AMI updates.

To summarize, self managed worker nodes have the highest infrastructure management overhead and cost of the three options, but in return give you full access to configure the workers to meet almost any infrastructure need. If you are willing to exchange some of that control (such as forgoing the ability to configure the AMI) for a better managed experience that addresses basic concerns like updating, then you can turn to the next option on our list: Managed Node Groups.

Managed Node Groups: Fully Managed ASGs Optimized for EKS

In the previous section, we covered the DIY option in the form of self managed ASGs that were manually configured to act as EKS worker nodes. In this section, we will cover Managed Node Groups. Managed Node Groups are designed to automate the provisioning and lifecycle management of nodes that can be used as EKS workers. This means that they handle various concerns about running EKS workers using EC2 instances such as:

Running the latest EKS optimized AMI.

Gracefully draining nodes before termination during a scale down event.

Gracefully rotating nodes to update the underlying AMI.

Applying labels to the resulting Kubernetes Node resources.

You can learn more about Managed Node Groups in the official docs.

Managed Node Groups can be created using the Console or API, if you are running a compatible EKS cluster (all EKS clusters running Kubernetes 1.14 and above are supported). You can also use Terraform to provision node groups using the aws_eks_node_group resource. Once a Managed Node Group is provisioned, AWS will start to provision and configure the underlying resources, which includes the Auto Scaling Group and associated EC2 instances. These resources are not hidden and can be monitored or queried using the EC2 API or the AWS Console’s EC2 page.
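As an illustration, a minimal Managed Node Group in Terraform might look like the following sketch (the resource names and references to the cluster, IAM role, and subnets are placeholders for resources assumed to be defined elsewhere in your configuration):

```hcl
resource "aws_eks_node_group" "example" {
  cluster_name    = aws_eks_cluster.example.name
  node_group_name = "example"
  node_role_arn   = aws_iam_role.node.arn    # IAM role assumed by the worker nodes
  subnet_ids      = aws_subnet.private[*].id # subnets to launch the workers in

  scaling_config {
    desired_size = 2
    min_size     = 1
    max_size     = 4
  }
}
```

On apply, AWS creates the underlying ASG and EC2 instances for you, using the latest EKS optimized AMI for your cluster's Kubernetes version.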

One thing to note is that while Managed Node Groups provide a managed experience for the provisioning and lifecycle of EC2 instances, they do not configure horizontal or vertical auto-scaling. This means that you still need to run a service like the Kubernetes Cluster Autoscaler to implement auto-scaling of the underlying ASG.

Additionally, Managed Node Groups do not automatically update the underlying AMI in reaction to patch releases or Kubernetes version updates, although they make such updates easier to perform. You still need to manually trigger a Managed Node Group update using the Console or API. See the docs on updating a Managed Node Group for more details.

Since Managed Node Groups use EC2 instances and ASGs under the hood, you still have access to all the Kubernetes features available with self managed worker nodes. You get a managed infrastructure experience without trading off too many features. To customize the underlying ASG, you can provide a launch template to AWS. This allows you to specify custom settings on the instances, such as an AMI you built with additional utilities, or a custom user data script with different boot options. You can read more about it in the official documentation.
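For instance, in Terraform you could point the node group at a launch template, as in this sketch (the launch template resource is assumed to be defined elsewhere with your custom AMI and user data; all names are placeholders):

```hcl
resource "aws_eks_node_group" "custom" {
  cluster_name    = aws_eks_cluster.example.name
  node_group_name = "custom"
  node_role_arn   = aws_iam_role.node.arn
  subnet_ids      = aws_subnet.private[*].id

  launch_template {
    id      = aws_launch_template.custom.id # carries the custom AMI / user data
    version = aws_launch_template.custom.latest_version
  }

  scaling_config {
    desired_size = 2
    min_size     = 1
    max_size     = 4
  }
}
```

Note that if your launch template specifies a custom AMI, you become responsible for including the bootstrap step (e.g., calling bootstrap.sh) in the user data, since EKS will not merge in its own in that case.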

The one notable feature that is still unsupported, even with launch templates, is spot instances: you can not run a Managed Node Group on spot instances. If you have workloads that can survive intermittent instance failures, spot instances can help fine tune your costs.

To summarize, Managed Node Groups are a good solution if you want a managed experience for your worker nodes without giving up too many Kubernetes features. However, you still have worker nodes to maintain yourself. This means that you still have to worry about concerns like SSH access, auto scaling, patching, etc. What if you could get rid of the overhead of managing servers entirely? The third and final option, Fargate, gives us exactly that.

Serverless Worker Nodes with EKS Fargate

AWS Fargate is a serverless compute engine managed by AWS to run container workloads without actively managing servers to run them. With AWS Fargate, all you need to do is tell AWS what containers you want to run; AWS will then figure out how to run them, including, under the hood, automatically spinning servers and clusters up and down as necessary. This means that you can schedule your workloads without actively maintaining servers to use as worker nodes, removing the need to choose server types, worry about security patches, decide when to scale your clusters, or optimize cluster packing.

Originally Fargate was only available with ECS, the proprietary managed container orchestration service that AWS provided as an alternative to Kubernetes. However, on December 3rd 2019, AWS announced support for using Fargate to schedule Kubernetes Pods on EKS, providing you with a serverless Kubernetes option.

Note that while Fargate removes the need for you to actively manage servers as worker nodes, AWS still provisions and manages VM instances to run the scheduled workloads. As such, you still have Nodes with EKS Fargate, and you can view detailed information about the underlying nodes used by Fargate by querying for them with kubectl get nodes.

All EKS clusters running Kubernetes 1.14 and above automatically have Fargate support. If you have a compatible cluster, you can start using Fargate by creating an AWS Fargate Profile. The Fargate Profile is used by the Kubernetes scheduler to decide which Pods should be provisioned on AWS Fargate. It specifies a Kubernetes Namespace and associated Labels to use as selectors for Pods. For example, if you had a Fargate Profile for the Namespace kube-system and the Label compute-type=fargate, then any Pod in the kube-system Namespace with the Label compute-type=fargate would be scheduled onto Fargate, while all others would be routed to the EC2 based worker nodes available in your cluster.
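In Terraform, such a profile might be sketched like this (the cluster, pod execution role, and subnet references are placeholders for resources assumed to exist elsewhere in your configuration):

```hcl
resource "aws_eks_fargate_profile" "example" {
  cluster_name           = aws_eks_cluster.example.name
  fargate_profile_name   = "kube-system-fargate"
  pod_execution_role_arn = aws_iam_role.fargate_pod.arn # role Fargate uses to run the Pods
  subnet_ids             = aws_subnet.private[*].id     # Fargate requires private subnets

  # Pods matching this Namespace AND these Labels get scheduled onto Fargate
  selector {
    namespace = "kube-system"
    labels = {
      compute-type = "fargate"
    }
  }
}
```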

You can learn more about how to provision Fargate Profiles and what is required to create one in the official AWS docs.

While Fargate gives you a fully managed Kubernetes experience with minimal infrastructure overhead, there are some downsides. Due to the way Fargate works, there are many features of Kubernetes that are not available. You can see the full list of limitations in the official docs. Here we will highlight a few that stand out:

Fargate does not support NodePort or LoadBalancer Service types. This means that you can not use a Classic Load Balancer or Network Load Balancer to load balance your Pods. You must use ALBs.

Fargate does not support DaemonSets. This means that traditional methods of running cluster-wide administrative services (e.g., shipping container logs to a log aggregation service) are not available, and you must use sidecar containers instead. This adds overhead to your Kubernetes configuration, as you must ensure all Pods have the necessary sidecars to achieve the same effects.

Fargate does not support PersistentVolumes. This means that you should not run stateful Pods (e.g., a database) on Fargate.

Fargate is only available in select regions. See the official docs for availability.
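To illustrate the DaemonSet limitation, log shipping on Fargate is typically handled by a per-Pod sidecar sharing a volume with the application container. A minimal sketch (the image names are placeholders, and a real setup would also need a forwarder configuration):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-log-sidecar   # placeholder name
spec:
  volumes:
    - name: logs
      emptyDir: {}             # scratch volume shared by both containers
  containers:
    - name: app
      image: my-app:latest     # placeholder application image writing logs to files
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
    - name: log-shipper
      image: fluent/fluent-bit:1.5  # log forwarder as a sidecar, since DaemonSets are unavailable
      volumeMounts:
        - name: logs
          mountPath: /var/log/app
          readOnly: true
```

Every Pod that needs log shipping must carry this extra container, which is the configuration overhead mentioned above.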

The following are additional limitations of Fargate that are not officially documented by AWS, but can be observed empirically through continuous usage:

Fargate works by dynamically allocating a dedicated VM for your Pods. This naturally means that it can take longer for your Pods to provision. Most Pods provision within a minute, but we have occasionally seen some Pods take up to 10 minutes to provision.

Because you can not configure the underlying servers that run the Pods, you can get a wide range of instance classes running your workloads. We ran a test container that inspected the contents of /proc/cpuinfo to collect information on the underlying CPU models. Below you can find a bar chart showing the observed CPU models from 90 Pods deployed on EKS Fargate. We observed 4 CPU models that correspond to the m4 and t3 instance families. Note that the difference in CPU between the two instance families can translate to a performance difference of ~15% for a CPU intensive task (e.g., the native-json benchmark), which means that you can expect non-negligible performance variance for the same Pods depending on which Fargate hardware you are scheduled on.

Observed CPU model counts from 90 Fargate Pods. The processors are a mix of m4 instance class family (E5–2686 and E5–2676) and t3 instance class family (Platinum 8259CL and Platinum 8175M).

To summarize, Fargate is a great way to run your workloads on EKS without having to worry about managing servers to run them. This means that concerns around security, upgrades/patches, cost optimization, etc. are all taken care of for you. However, not all workloads are compatible with Fargate. Non-HTTP based, performance critical, or stateful workloads are a few examples that should avoid Fargate due to its limitations. If you have these kinds of workloads, you need to rely on one of the other two options.

Summary

Summary of supported configurations for the three worker node types. Click for full image.

In this guide we covered in detail the three options available to you for running your workloads on EKS. Each option has various trade offs to consider, but in general you should prefer more managed solutions over unmanaged ones to gain the peace of mind of not having to manage your own infrastructure. When deciding which to use, we recommend starting with Fargate and progressing to increasingly more manual options as your workload needs and compatibility require.

To get a production grade, battle tested EKS cluster with support for all three worker group types, all defined as code, check out Gruntwork.io. Our EKS clusters support: (a) Fargate only EKS clusters with default Fargate Profiles, (b) mixed worker clusters with all three options, (c) auto scaling and graceful scaling for self managed workers, and (d) batteries included EKS clusters with container logs, ALB ingress controller, etc.