Works with Kubernetes 1.16!

INFRASTRUCTURE

For my preparation for the Cloud Native Computing Foundation Certified Kubernetes Administrator exam (CNCF CKA for short), it is important to know the ins and outs of creating Kubernetes clusters by hand. This includes generating all the certificates, systemd unit files and K8s configs, and installing all the components.

You should already have some basic knowledge about Kubernetes in general.

Most of you may have already seen Kelsey Hightower’s fantastic “Kubernetes The Hard Way” tutorial on GitHub. There is even a LinuxAcademy course and forks of the tutorial for Bare Metal installations (GitHub / Medium). And of course there are already several guides for AWS (here and here).

So why write another one? Simple answer: I could not find a single all-in-one guide for a multi-master, non-stacked Kubernetes setup on AWS. Besides, it was a good use case for some first baby steps with the AWS CDK for Python :)

If you haven’t already checked out the CDK, give it a try! => https://aws.amazon.com/cdk

So what is this all about?

Executive Summary: it creates the infrastructure and components for a multi-node non-stacked Kubernetes Cluster (v1.16!) on AWS - but doin’ it the real hard way!

Features

- CDK code available (Python3, generates CloudFormation)
- Terraform code available (>= 0.12 / HCL2)
- Multi-Master HA Kubernetes control plane
- Non-stacked setup (etcd servers are running on their own instances)
- 10x EC2 instances:
  - 1x Bastion Host (single-node ASG)
  - 3x etcd nodes (ASG)
  - 3x Kubernetes Master nodes (ASG)
  - 3x Kubernetes Worker nodes (ASG)
- 3x Load Balancers:
  - Bastion Host LB: for safe access to the Bastion Host instance
  - K8s-master-public-LB: external kubectl access
  - K8s-master-private-LB: fronts the Kubernetes API Servers
- Route53 records for all EC2 instances' internal IPv4 addresses (ease of use)
- External access via Bastion Host (SSH) & public Load Balancer (kubectl) only
- Access to BastionLB & MasterPublicLB from your workstation IP only (by default)
- Strict SecurityGroups
- Infrastructure as Code

Create Infrastructure

⚡ The infrastructure created with Terraform and/or CDK may not be production-ready, but it is safe to use: it creates AutoScalingGroups (with LaunchConfigurations), a Bastion Host and public/private Load Balancers, and assigns tightened SecurityGroups to all resources.

First we need to create our infrastructure. You can use either the AWS CDK or the Terraform repository above. Both create the same infrastructure (ten EC2 instances by default) for a fully non-stacked Kubernetes setup. This means the etcd nodes run on their own instances and not on top of the K8s Master nodes (= stacked setup).

If you change the number of nodes, you have to adapt the below instructions accordingly.

In the IaC, set the Route53 Hosted Zone you want to use (Terraform: var.hosted_zone / CDK: zone_fqdn). This will create an A record for the Bastion Host like this:

bastion.example.com

It also provisions the Bastion Host with a User Data script to install the cfssl binary (by CloudFlare) for easy creation of all the CSRs, certificates and keys.

SecurityGroups

Overview of SecurityGroups:

| Component | Source | Protocol | Port | Description |
|---|---|---|---|---|
| Bastion | Bastion LB | TCP | 22 | SSH from Bastion LB |
| Bastion LB | Workstation | TCP | 22 | SSH from Workstation |
| etcd | Bastion | TCP | 22 | SSH from Bastion |
| etcd | K8s-Master | TCP | 2379 | etcd-client |
| etcd | K8s-Master | TCP | 2380 | etcd-server |
| K8s-PublicLB | Workstation | TCP | 6443 | kubectl from Workstation |
| K8s-PrivateLB | Masters | TCP | 6443 | kube-api from Masters |
| K8s-PrivateLB | Workers | TCP | 6443 | kube-api from Workers |
| K8s-Master | Bastion | TCP | 22 | SSH from Bastion |
| K8s-Master | MasterPublicLB | TCP | 6443 | kubectl from Public LB |
| K8s-Master | MasterPrivateLB | TCP | 6443 | kube-api from Private LB |
| K8s-Master | Bastion | TCP | 6443 | kubectl access from Bastion |
| K8s-Master | K8s-Worker | any | any | Allow any from K8s-Worker |
| K8s-Worker | Bastion | TCP | 22 | SSH from Bastion |
| K8s-Worker | Bastion | TCP | 6443 | kubectl access from Bastion |
| K8s-Worker | K8s-Master | any | any | Allow any from K8s-Master |

Load Balancers

And finally we will have these three Classic Elastic Load Balancers (CLB):

K8s-Master Public ELB (for remote kube-apiserver / kubectl access from your workstation)

K8s-Master Private ELB (fronts kube-apiservers)

Bastion ELB (for secure SSH access into the Private Subnets)

Terraform

To apply the infrastructure with Terraform >=0.12, just clone my repository and create a terraform.tfvars file with your configuration:

git clone git@github.com:hajowieland/terraform-k8s-the-real-hard-way-aws.git

Requirements

Terraform >= 0.12 (macOS: brew install terraform)

awscli with default profile configured

Required Variables

You have to change the values of at least these two TF variables in your terraform.tfvars file:

| TF Variable | Description | Type | Default | Example |
|---|---|---|---|---|
| owner | Your Name | string | napo.io | Max Mustermann |
| hosted_zone | Route53 Hosted Zone Name for DNS records | string | "" | test.example.com |

There are more variables to configure, but most of them have sane default values.
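For orientation, a minimal terraform.tfvars could look like this (the values are placeholders, use your own name and Hosted Zone):

```hcl
# terraform.tfvars - minimal example with placeholder values
owner       = "Max Mustermann"
hosted_zone = "test.example.com"
```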

SSH KEY: If you do not specify a pre-existing AWS Key Pair with var.aws_key_pair_name, Terraform creates a new one from your ~/.ssh/id_rsa key by default. You can change this path by setting var.ssh_public_key_path.

Deploy the infrastructure with Terraform:

terraform init
terraform plan
terraform apply

CDK

To apply the infrastructure with the AWS CDK, just clone my repository and edit the cdk_python_k8s_right_way_aws_stack.py file accordingly:

git clone git@github.com:hajowieland/cdk-py-k8s-the-real-hard-way-aws.git

Requirements

cdk installed via npm install -g aws-cdk

awscli with default profile configured

Existing AWS Key Pair

Python3

In the repository, create a VirtualEnv with virtualenv .env -p python3 and then source .env/bin/activate

Install pip3 requirements: pip3 install -r requirements.txt

Required Variables

You have to change the values of at least these two variables in your cdk_python_k8s_right_way_aws_stack.py file:

| Variable | Description | Type | Default | Example |
|---|---|---|---|---|
| ssh_key_pair | Existing AWS Key Pair | string | id_rsa | MyKeyPairName |
| tag_owner | Your Name | string | napo.io | Holy Kubernetus |
| zone_fqdn | Route53 Hosted Zone Name | string | '' | test.example.com |

There are more variables to configure, most of them have sane default values.

Deploy the infrastructure with CDK:

cdk synth    # outputs the rendered CloudFormation code
cdk deploy

BE AWARE:

If you change the default value for project in CDK (tag_project) or Terraform (var.project), then you have to adapt the filters in all following aws-cli commands accordingly!

Defaults

By default, all EC2 instances are created in the us-east-1 AWS region.

Connect to Bastion Host

Now everything will take place on the Bastion host.

Connect via SSH, either with the AWS Key Pair name you configured or, when using Terraform, with the newly created AWS Key Pair (CDK/CloudFormation does not support creating Key Pairs):

ssh ec2-user@bastion.napo.io

It can take a few moments until all resources are ready.

Get the internal IPs

On the Bastion Host, we first need to know the IPv4 addresses for all EC2 instances we created (the internal IPs - all nodes are running in private subnets).

In the UserData we already set some global environment variables to make your life easier, so you do not have to set them every time:

# Already set during infrastructure deployment
# Verify the values:
# echo $HOSTEDZONE_NAME && echo $AWS_DEFAULT_REGION
export HOSTEDZONE_NAME=napo.io   # << from var.hosted_zone / zone_fqdn
export AWS_DEFAULT_REGION=$(curl -s http://169.254.169.254/latest/dynamic/instance-identity/document | grep region | awk -F\" '{print $4}')   # << AWS Region the EC2 instance is running in, for use with awscli

Before we continue with the next steps, get the Hosted Zone ID and export it as environment variable:

# Gets HostedZone ID from its name
export HOSTEDZONE_ID=$(aws route53 list-hosted-zones-by-name --dns-name $HOSTEDZONE_NAME --query 'HostedZones[].Id' --output text | cut -d/ -f3)
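The cut -d/ -f3 at the end is needed because the API returns IDs in the form /hostedzone/&lt;ID&gt;. A quick illustration with a made-up Hosted Zone ID:

```shell
# list-hosted-zones-by-name returns IDs like "/hostedzone/Z0123456789ABCDEFGHIJ"
# (made-up ID). With "/" as delimiter, field 3 is the bare ID:
echo "/hostedzone/Z0123456789ABCDEFGHIJ" | cut -d/ -f3
# prints Z0123456789ABCDEFGHIJ
```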

The next Shell commands help us identify the EC2 instances and create Route53 records for ease of use:

In a for loop, we get all instances by querying the AutoScalingGroups with a specific LaunchConfiguration prefix

Tag the EC2 instance with incrementing number

Get the private IP address of the EC2 instance

Create a Route53 Recordset JSON file with heredoc (see example below)

Create Route53 record with the private IP address and component name + incremented number

The temporary JSON files for the Route53 records look like this (just an example):

{
  "Comment": "Create/Update etcd A record",
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "etcd$i.$ZONENAME",
      "Type": "A",
      "TTL": 30,
      "ResourceRecords": [{ "Value": "$IP" }]
    }
  }]
}

⚠️ Be aware that you have to manually delete these Route53 records when you're finished.

Now execute the following shell commands on the Bastion Host:

Etcd:

i=1
for INSTANCE in $(aws autoscaling describe-auto-scaling-instances --query 'AutoScalingInstances[?starts_with(LaunchConfigurationName, `etcd`)].[InstanceId]' --output text); do
  aws ec2 create-tags --resources $INSTANCE --tags Key=Name,Value=etcd$i
  IP=$(aws ec2 describe-instances --instance-id $INSTANCE --query 'Reservations[].Instances[].[PrivateIpAddress]' --output text)
  cat << EOF > /tmp/record.json
{
  "Comment": "Create/Update etcd A record",
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "etcd$i.internal.$HOSTEDZONE_NAME",
      "Type": "A",
      "TTL": 30,
      "ResourceRecords": [{ "Value": "$IP" }]
    }
  }]
}
EOF
  aws route53 change-resource-record-sets --hosted-zone-id $HOSTEDZONE_ID --change-batch file:///tmp/record.json
  export ETCD${i}_INTERNAL=$IP
  i=$((i+1))
done

Master:

i=1
for INSTANCE in $(aws autoscaling describe-auto-scaling-instances --query 'AutoScalingInstances[?starts_with(LaunchConfigurationName, `master`)].[InstanceId]' --output text); do
  aws ec2 create-tags --resources $INSTANCE --tags Key=Name,Value=master$i
  IP=$(aws ec2 describe-instances --instance-id $INSTANCE --query 'Reservations[].Instances[].[PrivateIpAddress]' --output text)
  cat << EOF > /tmp/record.json
{
  "Comment": "Create/Update master A record",
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "master$i.internal.$HOSTEDZONE_NAME",
      "Type": "A",
      "TTL": 30,
      "ResourceRecords": [{ "Value": "$IP" }]
    }
  }]
}
EOF
  aws route53 change-resource-record-sets --hosted-zone-id $HOSTEDZONE_ID --change-batch file:///tmp/record.json
  export MASTER${i}_INTERNAL=$IP
  i=$((i+1))
done

Worker:

i=1
for INSTANCE in $(aws autoscaling describe-auto-scaling-instances --query 'AutoScalingInstances[?starts_with(LaunchConfigurationName, `worker`)].[InstanceId]' --output text); do
  aws ec2 create-tags --resources $INSTANCE --tags Key=Name,Value=worker$i
  IP=$(aws ec2 describe-instances --instance-id $INSTANCE --query 'Reservations[].Instances[].[PrivateIpAddress]' --output text)
  cat << EOF > /tmp/record.json
{
  "Comment": "Create/Update worker A record",
  "Changes": [{
    "Action": "UPSERT",
    "ResourceRecordSet": {
      "Name": "worker$i.internal.$HOSTEDZONE_NAME",
      "Type": "A",
      "TTL": 30,
      "ResourceRecords": [{ "Value": "$IP" }]
    }
  }]
}
EOF
  aws route53 change-resource-record-sets --hosted-zone-id $HOSTEDZONE_ID --change-batch file:///tmp/record.json
  export WORKER${i}_INTERNAL=$IP
  i=$((i+1))
done
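The export lines in these loops build the variable name itself dynamically: the shell expands ${i} before export sees the assignment, so after the loops you end up with ETCD1_INTERNAL, ETCD2_INTERNAL and so on. A standalone sketch of the trick with made-up IPs:

```shell
# The loop counter becomes part of the variable name
for i in 1 2 3; do
  export DEMO${i}_INTERNAL="10.0.0.$i"
done
echo "$DEMO2_INTERNAL"
# prints 10.0.0.2
```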

Setup SSH config

This step really makes your life easier: in the following steps we will use the names configured in the SSH client config file.

ℹ️ Replace the IdentityFile (your OpenSSH key used as AWS EC2 Key Pair) and HostName Domain accordingly - do not forget to add your SSH private key to the Bastion Host (e.g. place it at $HOME/.ssh/id_rsa ).

# On your workstation (macOS - and if your private key is id_rsa)
cat ~/.ssh/id_rsa | pbcopy

~/.ssh/id_rsa:

# On Bastion
vi ~/.ssh/id_rsa
chmod 400 ~/.ssh/id_rsa

On the Bastion Host, open the SSH config file and adapt to your setup (replace napo.io with your domain):

~/.ssh/config:

Host etcd1 etcd2 etcd3 master1 master2 master3 worker1 worker2 worker3
  User ubuntu
  HostName %h.internal.napo.io
  IdentityFile ~/.ssh/id_rsa

chmod 600 ~/.ssh/config
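The %h token in HostName expands to the name you typed on the command line, so ssh etcd1 really connects to etcd1.internal.napo.io. You can verify the expansion without connecting anywhere via ssh -G, shown here against a throwaway demo config:

```shell
# Throwaway config just to demonstrate the %h expansion
cat > /tmp/ssh-demo-config <<'EOF'
Host etcd1 etcd2 etcd3
  HostName %h.internal.napo.io
EOF
# -G prints the resolved client config without connecting
ssh -G -F /tmp/ssh-demo-config etcd1 | grep '^hostname '
# prints: hostname etcd1.internal.napo.io
```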

Create Kubernetes Cluster the (real) hard way

With the preparation done, we now start doing Kubernetes The (real) Hard Way on AWS 🥳

Notes

For parallel execution on multiple instances at once, use tmux and this multiplexer script: https://gist.github.com/dmytro/3984680.

Both are already installed on the Bastion Host (the multiplexer script can be found in ec2-user $HOME directory).

Later, use tmux and execute tmux-multi.sh script to run commands on multiple instances (all etcd/master/worker nodes) at once.

But first we create the certificates on the Bastion Host and transfer them to the instances.

Create certificates

Now let's start creating all the stuff needed for Kubernetes (CA, Signing Requests, certificates, keys, etc.).

Certificate Authority

For our Certificate Authority (lifetime 17520h => 2 years) we create a CA config file and a Certificate Signing Request (CSR) for the 4096-bit RSA key:

cat > ca-config.json <<EOF
{
  "signing": {
    "default": { "expiry": "17520h" },
    "profiles": {
      "kubernetes": {
        "usages": ["signing", "key encipherment", "server auth", "client auth"],
        "expiry": "17520h"
      }
    }
  }
}
EOF

cat > ca-csr.json <<EOF
{
  "CN": "Kubernetes",
  "key": { "algo": "rsa", "size": 4096 },
  "names": [
    {
      "C": "DE",
      "L": "Munich",
      "O": "Kubernetes",
      "OU": "Kubernetes The Real Hard Way",
      "ST": "$HOSTEDZONE_NAME"
    }
  ]
}
EOF

Generate the Certificate Authority key from the previous CA config and CSR:

cfssl gencert -initca ca-csr.json | cfssljson -bare ca

Client and Server Certificates

Now create client and server certificates with their corresponding CSRs:

Admin Client Certificate

cat > admin-csr.json <<EOF
{
  "CN": "admin",
  "key": { "algo": "rsa", "size": 4096 },
  "names": [
    {
      "C": "DE",
      "L": "Munich",
      "O": "system:masters",
      "OU": "Kubernetes The Real Hard Way",
      "ST": "$HOSTEDZONE_NAME"
    }
  ]
}
EOF

Generate Admin client key:

cfssl gencert \
  -ca=ca.pem \
  -ca-key=ca-key.pem \
  -config=ca-config.json \
  -profile=kubernetes \
  admin-csr.json | cfssljson -bare admin

Kubelet Client Certificates

Here we get all the Worker nodes, identify them via their LaunchConfiguration name and create the CSRs:

WORKERCOUNT=$(aws autoscaling describe-auto-scaling-instances --query 'AutoScalingInstances[?starts_with(LaunchConfigurationName, `worker`)].[InstanceId]' --output text | wc -l)
i=1
while [ "$i" -le "$WORKERCOUNT" ]; do
  cat > worker${i}-csr.json <<EOF
{
  "CN": "system:node:worker${i}.internal.${HOSTEDZONE_NAME}",
  "key": { "algo": "rsa", "size": 4096 },
  "names": [
    {
      "C": "DE",
      "L": "Munich",
      "O": "system:nodes",
      "OU": "Kubernetes The Real Hard Way",
      "ST": "${HOSTEDZONE_NAME}"
    }
  ]
}
EOF
  i=$(($i + 1))
done

Create the keys for all Worker nodes:

i=1
while [ "$i" -le "$WORKERCOUNT" ]; do
  cfssl gencert \
    -ca=ca.pem \
    -ca-key=ca-key.pem \
    -config=ca-config.json \
    -hostname=worker${i}.internal.${HOSTEDZONE_NAME} \
    -profile=kubernetes \
    worker${i}-csr.json | cfssljson -bare worker${i}
  i=$(($i + 1))
done

kube-controller-manager Certificate

Generate the kube-controller-manager client certificate and private key:

cat > kube-controller-manager-csr.json <<EOF
{
  "CN": "system:kube-controller-manager",
  "key": { "algo": "rsa", "size": 4096 },
  "names": [
    {
      "C": "DE",
      "L": "Munich",
      "O": "system:kube-controller-manager",
      "OU": "Kubernetes The Real Hard Way",
      "ST": "${HOSTEDZONE_NAME}"
    }
  ]
}
EOF

cfssl gencert \
  -ca=ca.pem \
  -ca-key=ca-key.pem \
  -config=ca-config.json \
  -profile=kubernetes \
  kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager

kube-proxy Client Certificate

Now create everything needed for the kube-proxy component.

First, again, the CSR:

cat > kube-proxy-csr.json <<EOF
{
  "CN": "system:kube-proxy",
  "key": { "algo": "rsa", "size": 4096 },
  "names": [
    {
      "C": "DE",
      "L": "Munich",
      "O": "system:node-proxier",
      "OU": "Kubernetes The Real Hard Way",
      "ST": "$HOSTEDZONE_NAME"
    }
  ]
}
EOF

… and then generate the key for kube-proxy:

cfssl gencert \
  -ca=ca.pem \
  -ca-key=ca-key.pem \
  -config=ca-config.json \
  -profile=kubernetes \
  kube-proxy-csr.json | cfssljson -bare kube-proxy

kube-scheduler Client Certificate

Generate the kube-scheduler client certificate and private key:

cat > kube-scheduler-csr.json <<EOF
{
  "CN": "system:kube-scheduler",
  "key": { "algo": "rsa", "size": 4096 },
  "names": [
    {
      "C": "DE",
      "L": "Munich",
      "O": "system:kube-scheduler",
      "OU": "Kubernetes The Real Hard Way",
      "ST": "$HOSTEDZONE_NAME"
    }
  ]
}
EOF

cfssl gencert \
  -ca=ca.pem \
  -ca-key=ca-key.pem \
  -config=ca-config.json \
  -profile=kubernetes \
  kube-scheduler-csr.json | cfssljson -bare kube-scheduler

kube-controller-manager ServiceAccount Token

So that the kube-controller-manager can sign ServiceAccount tokens (see Documentation), create the certificate and private key:

cat > service-account-csr.json <<EOF
{
  "CN": "service-accounts",
  "key": { "algo": "rsa", "size": 4096 },
  "names": [
    {
      "C": "DE",
      "L": "Munich",
      "O": "Kubernetes",
      "OU": "Kubernetes The Real Hard Way",
      "ST": "$HOSTEDZONE_NAME"
    }
  ]
}
EOF

cfssl gencert \
  -ca=ca.pem \
  -ca-key=ca-key.pem \
  -config=ca-config.json \
  -profile=kubernetes \
  service-account-csr.json | cfssljson -bare service-account

Kubernetes API Server Certificate

And finally the CSR for kube-apiserver:

cat > kubernetes-csr.json <<EOF
{
  "CN": "kubernetes",
  "key": { "algo": "rsa", "size": 4096 },
  "names": [
    {
      "C": "DE",
      "L": "Munich",
      "O": "Kubernetes",
      "OU": "Kubernetes The Real Hard Way",
      "ST": "$HOSTEDZONE_NAME"
    }
  ]
}
EOF

For generating the kube-apiserver keys, we need to include every IP address and hostname under which the API server will be reachable (its certificate SANs).

First get the Kubernetes Master ELBs DNS names via their prefixes and assign them to envvars:

MASTER_ELB_PRIVATE=$(aws elb describe-load-balancers --query 'LoadBalancerDescriptions[?starts_with(DNSName, `internal-master`)]|[].DNSName' --output text)
MASTER_ELB_PUBLIC=$(aws elb describe-load-balancers --query 'LoadBalancerDescriptions[?starts_with(DNSName, `master`)]|[].DNSName' --output text)

Generate the API Server certificate and key, but adapt the number of etcd, worker and master nodes to your setup (three of each by default). The 10.32.0.1 entry is the ClusterIP that the in-cluster kubernetes Service will get: the first usable address of the 10.32.0.0/24 service range configured later via --service-cluster-ip-range.

We use the EnvVars we created earlier:

cfssl gencert \
  -ca=ca.pem \
  -ca-key=ca-key.pem \
  -config=ca-config.json \
  -hostname=10.32.0.1,${ETCD1_INTERNAL},\
${ETCD2_INTERNAL},${ETCD3_INTERNAL},\
${MASTER1_INTERNAL},${MASTER2_INTERNAL},\
${MASTER3_INTERNAL},${WORKER1_INTERNAL},\
${WORKER2_INTERNAL},${WORKER3_INTERNAL},\
etcd1.internal.${HOSTEDZONE_NAME},\
etcd2.internal.${HOSTEDZONE_NAME},\
etcd3.internal.${HOSTEDZONE_NAME},\
master1.internal.${HOSTEDZONE_NAME},\
master2.internal.${HOSTEDZONE_NAME},\
master3.internal.${HOSTEDZONE_NAME},\
worker1.internal.${HOSTEDZONE_NAME},\
worker2.internal.${HOSTEDZONE_NAME},\
worker3.internal.${HOSTEDZONE_NAME},\
${MASTER_ELB_PRIVATE},${MASTER_ELB_PUBLIC},\
127.0.0.1,kubernetes.default \
  -profile=kubernetes \
  kubernetes-csr.json | cfssljson -bare kubernetes

Distribute the Client and Server Certificates

Now with everything in place, we scp all certificates to the instances.

ℹ️ The below commands only work if you have created ~/.ssh/config entries as stated at the beginning!

If you have changed the default, adapt the number of etcd/master/worker nodes to match your setup

etcd

for etcd in etcd1 etcd2 etcd3; do
  scp ca.pem ca-key.pem kubernetes-key.pem kubernetes.pem ${etcd}:~/
done

Masters/Controllers

for master in master1 master2 master3; do
  scp ca.pem ca-key.pem kubernetes-key.pem kubernetes.pem \
    service-account-key.pem service-account.pem \
    ${master}:~/
done

Workers

for worker in worker1 worker2 worker3; do
  scp ca.pem ${worker}-key.pem ${worker}.pem ${worker}:~/
done

Generating Kubernetes Authentication Files

Now in this step we generate the files needed for authentication in Kubernetes.

Client Authentication Configs

kubelet Kubernetes Configuration Files

Generate kubeconfig configuration files for kubelet of every worker:

for i in 1 2 3; do
  instance="worker${i}"
  instance_hostname="worker${i}.internal.$HOSTEDZONE_NAME"
  kubectl config set-cluster kubernetes-the-real-hard-way \
    --certificate-authority=ca.pem \
    --embed-certs=true \
    --server=https://${MASTER_ELB_PRIVATE}:6443 \
    --kubeconfig=${instance}.kubeconfig
  kubectl config set-credentials system:node:${instance_hostname} \
    --client-certificate=${instance}.pem \
    --client-key=${instance}-key.pem \
    --embed-certs=true \
    --kubeconfig=${instance}.kubeconfig
  kubectl config set-context default \
    --cluster=kubernetes-the-real-hard-way \
    --user=system:node:${instance_hostname} \
    --kubeconfig=${instance}.kubeconfig
  kubectl config use-context default --kubeconfig=${instance}.kubeconfig
done

The kube-proxy Kubernetes Configuration File

Generate the kube-proxy kubeconfig:

kubectl config set-cluster kubernetes-the-real-hard-way \
  --certificate-authority=ca.pem \
  --embed-certs=true \
  --server=https://${MASTER_ELB_PRIVATE}:6443 \
  --kubeconfig=kube-proxy.kubeconfig
kubectl config set-credentials kube-proxy \
  --client-certificate=kube-proxy.pem \
  --client-key=kube-proxy-key.pem \
  --embed-certs=true \
  --kubeconfig=kube-proxy.kubeconfig
kubectl config set-context default \
  --cluster=kubernetes-the-real-hard-way \
  --user=kube-proxy \
  --kubeconfig=kube-proxy.kubeconfig
kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig

The kube-controller-manager Kubernetes Configuration File

Generate the kube-controller-manager kubeconfig:

kubectl config set-cluster kubernetes-the-real-hard-way \
  --certificate-authority=ca.pem \
  --embed-certs=true \
  --server=https://127.0.0.1:6443 \
  --kubeconfig=kube-controller-manager.kubeconfig
kubectl config set-credentials system:kube-controller-manager \
  --client-certificate=kube-controller-manager.pem \
  --client-key=kube-controller-manager-key.pem \
  --embed-certs=true \
  --kubeconfig=kube-controller-manager.kubeconfig
kubectl config set-context default \
  --cluster=kubernetes-the-real-hard-way \
  --user=system:kube-controller-manager \
  --kubeconfig=kube-controller-manager.kubeconfig
kubectl config use-context default --kubeconfig=kube-controller-manager.kubeconfig

The kube-scheduler Kubernetes Configuration File

Generate the kubeconfig file for the kube-scheduler component:

kubectl config set-cluster kubernetes-the-real-hard-way \
  --certificate-authority=ca.pem \
  --embed-certs=true \
  --server=https://127.0.0.1:6443 \
  --kubeconfig=kube-scheduler.kubeconfig
kubectl config set-credentials system:kube-scheduler \
  --client-certificate=kube-scheduler.pem \
  --client-key=kube-scheduler-key.pem \
  --embed-certs=true \
  --kubeconfig=kube-scheduler.kubeconfig
kubectl config set-context default \
  --cluster=kubernetes-the-real-hard-way \
  --user=system:kube-scheduler \
  --kubeconfig=kube-scheduler.kubeconfig
kubectl config use-context default --kubeconfig=kube-scheduler.kubeconfig

The admin Kubernetes Configuration File

And finally, the kubeconfig file for our admin user (that’s you 🙂):

kubectl config set-cluster kubernetes-the-real-hard-way \
  --certificate-authority=ca.pem \
  --embed-certs=true \
  --server=https://127.0.0.1:6443 \
  --kubeconfig=admin.kubeconfig
kubectl config set-credentials admin \
  --client-certificate=admin.pem \
  --client-key=admin-key.pem \
  --embed-certs=true \
  --kubeconfig=admin.kubeconfig
kubectl config set-context default \
  --cluster=kubernetes-the-real-hard-way \
  --user=admin \
  --kubeconfig=admin.kubeconfig
kubectl config use-context default --kubeconfig=admin.kubeconfig

Distribute the Kubernetes Configuration Files

Now transfer the kubelet & kube-proxy kubeconfig files to the worker nodes:

for worker in worker1 worker2 worker3; do
  scp ${worker}.kubeconfig kube-proxy.kubeconfig ${worker}:~/
done

And then the admin, kube-controller-manager & kube-scheduler kubeconfig files to the master nodes:

for master in master1 master2 master3; do
  scp admin.kubeconfig kube-controller-manager.kubeconfig kube-scheduler.kubeconfig ${master}:~/
done

Generating the Data Encryption Config and Key

For encryption we first create a secure encryption key and then the EncryptionConfiguration.

The Encryption Key

ENCRYPTION_KEY=$(head -c 32 /dev/urandom | base64)
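The aescbc provider used below expects exactly 32 random bytes, base64-encoded. A quick sanity check that the generated key decodes back to 32 bytes:

```shell
# Same key generation as above
ENCRYPTION_KEY=$(head -c 32 /dev/urandom | base64)
# base64-decode and count the raw bytes - must be 32 for AES-256-CBC
echo -n "$ENCRYPTION_KEY" | base64 -d | wc -c
# prints 32
```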

The Encryption Config File

cat > encryption-config.yaml <<EOF
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: ${ENCRYPTION_KEY}
      - identity: {}
EOF

Transfer the encryption config file to the master nodes:

for master in master1 master2 master3; do
  scp encryption-config.yaml ${master}:~/
done

Bootstrapping the etcd Cluster

Now it is time to bootstrap our etcd cluster, which is our highly available key-value store for the Kubernetes API.

Think of it as the Kube API’s persistent storage for saving the state of all resources.

Now it is time to use the power of tmux and the multiplexer script:

Start tmux

Execute $HOME/tmux-multi.sh

Enter etcd1 etcd2 etcd3 (or more, according to your setup and how you configured your SSH config at the beginning)

Now we can execute the following commands in parallel on each etcd node.

First we get the etcdhost internal IPv4 address and set the hostname:

export ETCDHOST=$(aws ec2 describe-tags --filters "Name=resource-id,Values=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)" "Name=key,Values=Name" --output=text | cut -f 5)
sudo hostnamectl set-hostname --static $ETCDHOST.internal.$HOSTEDZONE_NAME
echo "$INTERNAL_IP $ETCDHOST.internal.$HOSTEDZONE_NAME" | sudo tee -a /etc/hosts
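The cut -f 5 works because aws ec2 describe-tags --output text prints tab-separated columns (TAGS, key, resource-id, resource-type, value), so field 5 is the tag value. A simulation of that output line with a made-up instance ID:

```shell
# Simulated `aws ec2 describe-tags --output text` line, tab-separated:
# TAGS <key> <resource-id> <resource-type> <value>
printf 'TAGS\tName\ti-0abc123\tinstance\tetcd1\n' | cut -f 5
# prints etcd1
```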

Install etcd and move the files:

wget -q --show-progress --https-only --timestamping \
  "https://github.com/etcd-io/etcd/releases/download/v3.4.3/etcd-v3.4.3-linux-amd64.tar.gz"
{
  tar -xvf etcd-v3.4.3-linux-amd64.tar.gz
  sudo mv etcd-v3.4.3-linux-amd64/etcd* /usr/local/bin/
}
{
  sudo mkdir -p /etc/etcd /var/lib/etcd
  sudo cp ca.pem kubernetes-key.pem kubernetes.pem /etc/etcd/
}

Get the etcd nodes' IPv4 addresses and export them as envvars.

Again: by default you have three etcds - adapt to your setup if necessary.

for i in 1 2 3; do export ETCD${i}_INTERNAL=$(dig +short etcd${i}.internal.${HOSTEDZONE_NAME}); done

Generate the etcd systemd unit file:

cat > etcd.service <<EOF
[Unit]
Description=etcd
Documentation=https://github.com/coreos

[Service]
ExecStart=/usr/local/bin/etcd \\
  --name ${ETCDHOST}.internal.${HOSTEDZONE_NAME} \\
  --cert-file=/etc/etcd/kubernetes.pem \\
  --key-file=/etc/etcd/kubernetes-key.pem \\
  --peer-cert-file=/etc/etcd/kubernetes.pem \\
  --peer-key-file=/etc/etcd/kubernetes-key.pem \\
  --trusted-ca-file=/etc/etcd/ca.pem \\
  --peer-trusted-ca-file=/etc/etcd/ca.pem \\
  --peer-client-cert-auth \\
  --client-cert-auth \\
  --initial-advertise-peer-urls https://${INTERNAL_IP}:2380 \\
  --listen-peer-urls https://${INTERNAL_IP}:2380 \\
  --listen-client-urls https://${INTERNAL_IP}:2379,http://127.0.0.1:2379 \\
  --advertise-client-urls https://${INTERNAL_IP}:2379 \\
  --initial-cluster-token etcd-cluster-0 \\
  --initial-cluster etcd1.internal.${HOSTEDZONE_NAME}=https://${ETCD1_INTERNAL}:2380,etcd2.internal.${HOSTEDZONE_NAME}=https://${ETCD2_INTERNAL}:2380,etcd3.internal.${HOSTEDZONE_NAME}=https://${ETCD3_INTERNAL}:2380 \\
  --initial-cluster-state new \\
  --data-dir=/var/lib/etcd
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

Move the files to the right place, reload systemd and enable + start the etcd service:

sudo mv etcd.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable etcd
sudo systemctl start etcd

Check if etcd works

Check for any errors in systemd:

systemctl status etcd

List etcd members:

ETCDCTL_API=3 etcdctl member list

The output should look like this:

2d2d6426a2ba46f2, started, etcd3.internal.napo.io, https://10.23.1.109:2380, https://10.23.1.109:2379, false
7e1b60cbd871ed2f, started, etcd1.internal.napo.io, https://10.23.3.168:2380, https://10.23.3.168:2379, false
a879f686f293ea99, started, etcd2.internal.napo.io, https://10.23.2.33:2380, https://10.23.2.33:2379, false

Debug

⚠️ If you somehow messed up your etcd, start the key-value store from scratch like this (Reference: https://github.com/etcd-io/etcd/issues/10101 )

ETCDCTL_API=3 etcdctl del "" --from-key=true
sudo systemctl stop etcd
sudo rm -rf /var/lib/etcd/default.etcd
sudo systemctl start etcd

Bootstrapping the Kubernetes Control Plane

Now that we have our working etcd cluster, it is time to bootstrap our Kubernetes Master Nodes.

Exit the tmux multiplexer on the etcd nodes so that you're back on the Bastion Host. Now execute $HOME/tmux-multi.sh again and type in the master nodes:

SSH to master1 master2 master3 via tmux multiplexer and execute in parallel on each master node.

First we get the masterhost internal IPv4 address and set the hostname:

export MASTERHOST=$(aws ec2 describe-tags --filters "Name=resource-id,Values=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)" "Name=key,Values=Name" --output=text | cut -f 5)
sudo hostnamectl set-hostname --static $MASTERHOST.internal.$HOSTEDZONE_NAME
echo "$INTERNAL_IP $MASTERHOST.internal.$HOSTEDZONE_NAME" | sudo tee -a /etc/hosts

Get the latest stable Kubernetes version (currently 1.16.3 as of this writing):

KUBERNETES_STABLE=$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)
echo $KUBERNETES_STABLE

Generate the Kubernetes config directory, download kube components and move them to /usr/local/bin :

sudo mkdir -p /etc/kubernetes/config

wget -q --show-progress --https-only --timestamping \
  "https://storage.googleapis.com/kubernetes-release/release/${KUBERNETES_STABLE}/bin/linux/amd64/kube-apiserver" \
  "https://storage.googleapis.com/kubernetes-release/release/${KUBERNETES_STABLE}/bin/linux/amd64/kube-controller-manager" \
  "https://storage.googleapis.com/kubernetes-release/release/${KUBERNETES_STABLE}/bin/linux/amd64/kube-scheduler" \
  "https://storage.googleapis.com/kubernetes-release/release/${KUBERNETES_STABLE}/bin/linux/amd64/kubectl"
chmod +x kube-apiserver kube-controller-manager kube-scheduler kubectl
sudo mv kube-apiserver kube-controller-manager kube-scheduler kubectl /usr/local/bin/

Create the directory for certificates, keys and encryption config and move them there:

sudo mkdir -p /var/lib/kubernetes/
sudo mv ca.pem ca-key.pem kubernetes-key.pem kubernetes.pem encryption-config.yaml /var/lib/kubernetes/

Get the etcd nodes IPv4 addresses for the systemd unit file generation:

for i in 1 2 3; do export ETCD${i}_INTERNAL=$(dig +short etcd${i}.internal.${HOSTEDZONE_NAME}); done

Create the kube-apiserver systemd file.

Here all the fun takes place: options and parameters for kube-apiserver.

You can find the current documentation of all options here: https://kubernetes.io/docs/reference/command-line-tools-reference/kube-apiserver/

cat > kube-apiserver.service <<EOF
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes

[Service]
ExecStart=/usr/local/bin/kube-apiserver \\
  --enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,PersistentVolumeClaimResize,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,ResourceQuota \\
  --advertise-address=${INTERNAL_IP} \\
  --allow-privileged=true \\
  --apiserver-count=3 \\
  --audit-log-maxage=30 \\
  --audit-log-maxbackup=3 \\
  --audit-log-maxsize=100 \\
  --audit-log-path=/var/log/audit.log \\
  --authorization-mode=Node,RBAC \\
  --bind-address=0.0.0.0 \\
  --client-ca-file=/var/lib/kubernetes/ca.pem \\
  --etcd-cafile=/var/lib/kubernetes/ca.pem \\
  --etcd-certfile=/var/lib/kubernetes/kubernetes.pem \\
  --etcd-keyfile=/var/lib/kubernetes/kubernetes-key.pem \\
  --etcd-servers=https://${ETCD1_INTERNAL}:2379,https://${ETCD2_INTERNAL}:2379,https://${ETCD3_INTERNAL}:2379 \\
  --event-ttl=1h \\
  --encryption-provider-config=/var/lib/kubernetes/encryption-config.yaml \\
  --insecure-bind-address=127.0.0.1 \\
  --kubelet-certificate-authority=/var/lib/kubernetes/ca.pem \\
  --kubelet-client-certificate=/var/lib/kubernetes/kubernetes.pem \\
  --kubelet-client-key=/var/lib/kubernetes/kubernetes-key.pem \\
  --kubelet-https=true \\
  --runtime-config=api/all \\
  --service-account-key-file=/var/lib/kubernetes/ca-key.pem \\
  --service-cluster-ip-range=10.32.0.0/24 \\
  --service-node-port-range=30000-32767 \\
  --tls-cert-file=/var/lib/kubernetes/kubernetes.pem \\
  --tls-private-key-file=/var/lib/kubernetes/kubernetes-key.pem \\
  --v=5
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

Move kube-controller-manager kubeconfig to Kubernetes directory:

sudo mv kube-controller-manager.kubeconfig /var/lib/kubernetes/

Create kube-controller-manager systemd unit file:

cat > kube-controller-manager.service <<EOF
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/kubernetes/kubernetes

[Service]
ExecStart=/usr/local/bin/kube-controller-manager \\
  --address=0.0.0.0 \\
  --cluster-cidr=10.200.0.0/16 \\
  --cluster-name=kubernetes \\
  --cluster-signing-cert-file=/var/lib/kubernetes/ca.pem \\
  --cluster-signing-key-file=/var/lib/kubernetes/ca-key.pem \\
  --leader-elect=true \\
  --master=http://127.0.0.1:8080 \\
  --root-ca-file=/var/lib/kubernetes/ca.pem \\
  --service-account-private-key-file=/var/lib/kubernetes/ca-key.pem \\
  --service-cluster-ip-range=10.32.0.0/24 \\
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

Move kube-scheduler kubeconfig to Kubernetes directory:

sudo mv kube-scheduler.kubeconfig /var/lib/kubernetes/

Create the KubeSchedulerConfiguration config:

cat <<EOF | sudo tee /etc/kubernetes/config/kube-scheduler.yaml
apiVersion: componentconfig/v1alpha1
kind: KubeSchedulerConfiguration
clientConnection:
  kubeconfig: "/var/lib/kubernetes/kube-scheduler.kubeconfig"
leaderElection:
  leaderElect: true
EOF

Create the kube-scheduler systemd unit file:

cat > kube-scheduler.service <<EOF
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/kubernetes/kubernetes

[Service]
ExecStart=/usr/local/bin/kube-scheduler \\
  --leader-elect=true \\
  --master=http://127.0.0.1:8080 \\
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

Move the files to the right place, reload systemd and enable + start the kube-* services:

sudo mv kube-apiserver.service kube-scheduler.service kube-controller-manager.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable kube-apiserver kube-controller-manager kube-scheduler
sudo systemctl start kube-apiserver kube-controller-manager kube-scheduler

Verify that everything works

Sadly, kubectl get componentstatus (short: kubectl get cs) is somehow deprecated and does not work correctly with Kubernetes 1.16 - the tables get mixed up: https://github.com/kubernetes/kubernetes/issues/83024

But we can check with increased verbosity if everything is healthy:

# Gives lots of output
# Use the curl commands below for health checking
kubectl get cs -v=8

Additionally we check for errors via systemd:

systemctl status kube-apiserver
systemctl status kube-controller-manager
systemctl status kube-scheduler

curl the healthz health check endpoint (you should get an HTTP 200 back):

curl --cacert /var/lib/kubernetes/ca.pem \
  --key /var/lib/kubernetes/kubernetes-key.pem \
  --cert /var/lib/kubernetes/kubernetes.pem \
  -i https://127.0.0.1:6443/healthz

If you’re curious, you can check the version info, too:

curl --cacert /var/lib/kubernetes/ca.pem \
  --key /var/lib/kubernetes/kubernetes-key.pem \
  --cert /var/lib/kubernetes/kubernetes.pem \
  -i https://127.0.0.1:6443/version

If everything looks good we can now move on to RBAC.

RBAC for Kubelet Authorization

Role-based access control (RBAC) is the authorization (authz) concept of Kubernetes. We need to create a ClusterRole and its ClusterRoleBinding for the kubelets on the worker nodes.

Exit the tmux multiplexer and SSH to the first Master instance (master1):

ssh master1

Create and apply a ClusterRole for kubelet (Worker) to kube-apiserver (Master) authorization:

cat <<EOF | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:kube-apiserver-to-kubelet
rules:
  - apiGroups:
      - ""
    resources:
      - nodes/proxy
      - nodes/stats
      - nodes/log
      - nodes/spec
      - nodes/metrics
    verbs:
      - "*"
EOF

Create and apply the corresponding ClusterRoleBinding for the above ClusterRole:

cat <<EOF | kubectl apply -f -
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: system:kube-apiserver
  namespace: ""
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kube-apiserver-to-kubelet
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: User
    name: kubernetes
EOF

Bootstrapping the Kubernetes Worker Nodes

Now we create the Worker Nodes, which run the Pods in our cluster. They do all the heavy lifting and run most of the user software we deploy on Kubernetes.

Provisioning Kubernetes Worker Nodes

SSH to worker1, worker2 and worker3 via the tmux multiplexer and execute the following commands in parallel on each worker node.

First we get the worker host's name from its EC2 Name tag, set the hostname and add an /etc/hosts entry:

export WORKERHOST=$(aws ec2 describe-tags --filters "Name=resource-id,Values=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)" "Name=key,Values=Name" --output text | cut -f5)
sudo hostnamectl set-hostname --static $WORKERHOST.internal.$HOSTEDZONE_NAME
echo "$INTERNAL_IP $WORKERHOST.internal.$HOSTEDZONE_NAME" | sudo tee -a /etc/hosts

Get current Kubernetes stable version:

KUBERNETES_STABLE=$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)

Install the OS dependencies on Ubuntu via apt-get:

sudo apt-get update
sudo apt-get -y install socat conntrack ipset

Download & Install Worker Binaries

Download the CNI Plugins and worker binaries (kubelet, kube-proxy, kubectl):

wget -q --show-progress --https-only --timestamping \
  https://github.com/containernetworking/plugins/releases/download/v0.8.2/cni-plugins-linux-amd64-v0.8.2.tgz \
  https://github.com/containerd/containerd/releases/download/v1.3.0/containerd-1.3.0.linux-amd64.tar.gz \
  https://storage.googleapis.com/kubernetes-release/release/${KUBERNETES_STABLE}/bin/linux/amd64/kubectl \
  https://storage.googleapis.com/kubernetes-release/release/${KUBERNETES_STABLE}/bin/linux/amd64/kube-proxy \
  https://storage.googleapis.com/kubernetes-release/release/${KUBERNETES_STABLE}/bin/linux/amd64/kubelet

Create the installation directories:

sudo mkdir -p \
  /etc/cni/net.d \
  /opt/cni/bin \
  /var/lib/kubelet \
  /var/lib/kube-proxy \
  /var/lib/kubernetes \
  /var/run/kubernetes

and finally extract and move the CNI plugins and binaries there:

sudo tar -xvf cni-plugins-linux-amd64-v0.8.2.tgz -C /opt/cni/bin/
sudo tar -xvf containerd-1.3.0.linux-amd64.tar.gz -C /
chmod +x kubectl kube-proxy kubelet
sudo mv kubectl kube-proxy kubelet /usr/local/bin/

Configure CNI Networking

Now we configure the CIDR ranges for the Pod network. This sets up the network which the Pods on every Worker node use to communicate with each other across nodes.

We configure the Kubernetes CNI here with bridge and loopback interfaces and add the routes in the AWS Route Tables later. We could of course use another overlay network CNI like flannel or Calico, but for our "the hard way" setup there is more to learn by creating it ourselves.

But please, play around with other CNIs later, get to know the pros and cons and when it makes sense to use one over the other (for example because of NetworkPolicies).

ℹ️ A little shady trick 🤯 I do in the IaC of every worker node's UserData: it generates a random number between 10 and 250 and exports the resulting CIDR as the environment variable POD_CIDR. This envvar is used in the next command to create the bridge config. Default value of POD_CIDR: 10.200.$RANDOM_NUMBER.0/24

echo $POD_CIDR
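For reference, the UserData trick can be sketched roughly like this (a minimal, hypothetical reconstruction - the variable name RANDOM_OCTET and the exact randomization are assumptions, not the real IaC code):

```shell
# Sketch: pick a random third octet between 10 and 250 and derive the
# Pod CIDR from it, mirroring what the UserData does.
RANDOM_OCTET=$(( (RANDOM % 241) + 10 ))   # yields a value in 10..250
export POD_CIDR="10.200.${RANDOM_OCTET}.0/24"
echo "${POD_CIDR}"
```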

Create the bridge network configuration file:

cat <<EOF | sudo tee /etc/cni/net.d/10-bridge.conf
{
    "cniVersion": "0.3.1",
    "name": "bridge",
    "type": "bridge",
    "bridge": "cnio0",
    "isGateway": true,
    "ipMasq": true,
    "ipam": {
        "type": "host-local",
        "ranges": [
          [{"subnet": "${POD_CIDR}"}]
        ],
        "routes": [{"dst": "0.0.0.0/0"}]
    }
}
EOF

Create the loopback network configuration file:

cat <<EOF | sudo tee /etc/cni/net.d/99-loopback.conf
{
    "cniVersion": "0.3.1",
    "type": "loopback"
}
EOF

Configure containerd

Install runc, a CLI tool for spawning and running containers according to the OCI runtime specification.

sudo apt-get install runc -y

Create the containerd configuration TOML file:

sudo mkdir -p /etc/containerd/
cat << EOF | sudo tee /etc/containerd/config.toml
[plugins]
  [plugins.cri.containerd]
    snapshotter = "overlayfs"
    [plugins.cri.containerd.default_runtime]
      runtime_type = "io.containerd.runtime.v1.linux"
      runtime_engine = "/usr/sbin/runc"
      runtime_root = ""
    [plugins.cri.containerd.untrusted_workload_runtime]
      runtime_type = "io.containerd.runtime.v1.linux"
      runtime_engine = "/usr/sbin/runsc"
      runtime_root = "/run/containerd/runsc"
EOF

ℹ️ INFO: Untrusted workloads will be run using the gVisor (runsc) container runtime sandbox.

Create the containerd.service systemd unit file:

cat <<EOF | sudo tee /etc/systemd/system/containerd.service
[Unit]
Description=containerd container runtime
Documentation=https://containerd.io
After=network.target

[Service]
ExecStartPre=/sbin/modprobe overlay
ExecStart=/bin/containerd
Restart=always
RestartSec=5
Delegate=yes
KillMode=process
OOMScoreAdjust=-999
LimitNOFILE=1048576
LimitNPROC=infinity
LimitCORE=infinity

[Install]
WantedBy=multi-user.target
EOF

Configure the Kubelet

Now to the configuration of kubelet. Move the certs and keys to the right directory.

sudo mv $(hostname -s)-key.pem /var/lib/kubelet/
sudo mv $(hostname -s).pem /var/lib/kubelet/
sudo mv $(hostname -s).kubeconfig /var/lib/kubelet/kubeconfig
sudo mv ca.pem /var/lib/kubernetes/

Create a simple kubelet configuration file (KubeletConfiguration):

cat <<EOF | sudo tee /var/lib/kubelet/kubelet-config.yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
authentication:
  anonymous:
    enabled: false
  webhook:
    enabled: true
  x509:
    clientCAFile: "/var/lib/kubernetes/ca.pem"
authorization:
  mode: Webhook
clusterDomain: "cluster.local"
clusterDNS:
  - "10.32.0.10"
podCIDR: "${POD_CIDR}"
runtimeRequestTimeout: "15m"
tlsCertFile: "/var/lib/kubelet/$(hostname -s).pem"
tlsPrivateKeyFile: "/var/lib/kubelet/$(hostname -s)-key.pem"
EOF

Create the kubelet.service systemd unit file:

cat <<EOF | sudo tee /etc/systemd/system/kubelet.service
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/kubernetes/kubernetes
After=containerd.service
Requires=containerd.service

[Service]
ExecStart=/usr/local/bin/kubelet \\
  --config=/var/lib/kubelet/kubelet-config.yaml \\
  --container-runtime=remote \\
  --container-runtime-endpoint=unix:///var/run/containerd/containerd.sock \\
  --image-pull-progress-deadline=2m \\
  --kubeconfig=/var/lib/kubelet/kubeconfig \\
  --network-plugin=cni \\
  --register-node=true \\
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

Configure the Kubernetes Proxy

And finally the kube-proxy configuration.

Move the kube-proxy kubeconfig to the right directory:

sudo mv kube-proxy.kubeconfig /var/lib/kube-proxy/kubeconfig

Create the kube-proxy-config.yaml configuration file. Here we define the overall Cluster CIDR network range (10.200.0.0/16):

cat <<EOF | sudo tee /var/lib/kube-proxy/kube-proxy-config.yaml
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
clientConnection:
  kubeconfig: "/var/lib/kube-proxy/kubeconfig"
mode: "iptables"
clusterCIDR: "10.200.0.0/16"
EOF

Create the kube-proxy.service systemd unit file:

cat <<EOF | sudo tee /etc/systemd/system/kube-proxy.service
[Unit]
Description=Kubernetes Kube Proxy
Documentation=https://github.com/kubernetes/kubernetes

[Service]
ExecStart=/usr/local/bin/kube-proxy \\
  --config=/var/lib/kube-proxy/kube-proxy-config.yaml
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

Start the Worker Services

Reload systemd and enable + start the containerd, kubelet and kube-proxy services:

sudo systemctl daemon-reload
sudo systemctl enable containerd kubelet kube-proxy
sudo systemctl start containerd kubelet kube-proxy

Check via systemd that there are no errors:

sudo systemctl status containerd
sudo systemctl status kubelet
sudo systemctl status kube-proxy

Verification

Exit the multiplexer and copy the admin.kubeconfig file to the first master node (master1):

scp admin.kubeconfig master1:~/

Connect to the first master server via SSH and get the worker nodes via kubectl:

ssh master1 "kubectl get nodes --kubeconfig admin.kubeconfig"

OUTPUT:

NAME                       STATUS   ROLES    AGE   VERSION
worker1.internal.napo.io   Ready    <none>   90s   v1.16.3
worker2.internal.napo.io   Ready    <none>   90s   v1.16.3
worker3.internal.napo.io   Ready    <none>   90s   v1.16.3

Configuring kubectl for Remote Access

We want to access the Kubernetes Cluster with the kubectl commandline utility from our Bastion Host as well as from our local Workstation.

👍 The Bastion Host already has kubectl installed.

=> On your workstation, you can see here how to install kubectl for all Operating Systems.
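For a Linux amd64 workstation, a hedged sketch of downloading a pinned kubectl release could look like this (the version v1.16.3 is an assumption chosen to match the cluster; adapt it to your setup):

```shell
# Assumption: Linux amd64 workstation; pin kubectl to the cluster version.
KUBECTL_VERSION="v1.16.3"
URL="https://storage.googleapis.com/kubernetes-release/release/${KUBECTL_VERSION}/bin/linux/amd64/kubectl"
echo "${URL}"
# Then download and install it:
# curl -LO "${URL}" && chmod +x kubectl && sudo mv kubectl /usr/local/bin/
```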

The Admin Kubernetes Configuration File

We need to configure a Kubernetes API server endpoint to connect to. For High Availability we created an internal Load Balancer that fronts the Kubernetes Master Servers (kube-apiservers). The other, public Load Balancer's DNS name is our external endpoint for remote access.

Internally, e.g. on the Bastion Host, we use our internal Load Balancer but for external access we use the public-facing one. This may sound unnecessary, but this way we can tighten the SecurityGroups even more.

Bastion Host / Internal access

Generate the kubeconfig file suitable for authenticating as admin user on the Bastion Host:

MASTER_ELB_PRIVATE=$(aws elb describe-load-balancers --query 'LoadBalancerDescriptions[?starts_with(DNSName, `internal-master`)]|[].DNSName' --output text)
kubectl config set-cluster kubernetes-the-real-hard-way \
  --certificate-authority=ca.pem \
  --embed-certs=true \
  --server=https://${MASTER_ELB_PRIVATE}:6443
kubectl config set-credentials admin \
  --client-certificate=admin.pem \
  --client-key=admin-key.pem
kubectl config set-context kubernetes-the-real-hard-way \
  --cluster=kubernetes-the-real-hard-way \
  --user=admin
kubectl config use-context kubernetes-the-real-hard-way

Verify everything works from Bastion Host:

kubectl get nodes

Workstation / Remote access

Copy the admin client cert and key together with the CA cert from the Bastion Host to your local workstation:

~/ca.pem

~/admin.pem

~/admin-key.pem

Generate the kubeconfig file suitable for authenticating as admin user on your workstation.

Notes:

- you may have to set --region us-east-1 to the region where your infrastructure is running
- you may have to edit the paths to the certs and key if they aren't in the current directory

MASTER_ELB_PUBLIC=$(aws elb describe-load-balancers --query 'LoadBalancerDescriptions[?starts_with(DNSName, `master`)]|[].DNSName' --region us-east-1 --output text)
kubectl config set-cluster kubernetes-the-real-hard-way \
  --certificate-authority=ca.pem \
  --embed-certs=true \
  --server=https://${MASTER_ELB_PUBLIC}:6443
kubectl config set-credentials admin \
  --client-certificate=admin.pem \
  --client-key=admin-key.pem
kubectl config set-context kubernetes-the-real-hard-way \
  --cluster=kubernetes-the-real-hard-way \
  --user=admin
kubectl config use-context kubernetes-the-real-hard-way

Verify everything works from your Workstation:

kubectl get nodes

Hooray congratulations 🤗

Now we have safe remote access to our Kubernetes Cluster. But to really use it, we have to configure the Pod Network routes in the next step.

BE AWARE: If your workstation IP changes, you have to update the MasterPublicLB SecurityGroup to access Kubernetes!
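A hedged sketch of what such an update could look like with the aws-cli (the SecurityGroup ID and IP below are placeholders, not values from this setup - look up your own first):

```shell
# Placeholders: replace with your MasterPublicLB SecurityGroup ID and
# your current public IP (e.g. from https://checkip.amazonaws.com).
SG_ID="sg-0123456789abcdef0"
MY_IP="203.0.113.10"
CMD="aws ec2 authorize-security-group-ingress --group-id ${SG_ID} --protocol tcp --port 6443 --cidr ${MY_IP}/32"
echo "${CMD}"   # inspect, then run the command once the values are correct
```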

Provisioning Pod Network Routes

Pods scheduled to a node receive an IP address from the node's Pod CIDR range (the POD_CIDR envvar). At this point Pods cannot communicate with Pods running on different nodes due to missing network routes.

Now it's time to create the routes in each Worker Node's AWS Route Table. This establishes a network route from the Node's POD_CIDR to the Node's internal IPv4 address.

ℹ️ This way we do not have to install any additional CNI. As mentioned before, we could use Flannel or some other way of achieving Kubernetes networking.

Routes

Connect back to the Bastion Host and create the network routes for each worker instance via aws-cli.

First get all private Route Tables and save them into the Bash Array ROUTE_TABLES :

ROUTE_TABLES=($(aws ec2 describe-route-tables --filters "Name=tag:Attribute,Values=private" --query 'RouteTables[].Associations[].[RouteTableId]' --region us-east-1 --output text))

Then the next command connects to the Worker nodes via SSH, reads the value of the POD_CIDR envvar and saves it into the Bash array WORKER_POD_CIDRS:

WORKER_POD_CIDRS=()
for i in 1 2 3; do
  WORKER_POD_CIDRS+=($(ssh worker$i 'echo $POD_CIDR'))
done

Now create the Routes for every worker node’s POD_CIDR to the node’s ENI (Elastic Network Interface):

for rt in ${ROUTE_TABLES[@]}; do
  i=1
  for cidr in ${WORKER_POD_CIDRS[@]}; do
    ENI_ID=$(aws ec2 describe-instances --filters "Name=tag:Name,Values=worker${i}" --query 'Reservations[].Instances[].NetworkInterfaces[].[NetworkInterfaceId]' --output text)
    echo "${rt}: ${cidr} => ${ENI_ID}"
    aws ec2 create-route \
      --route-table-id ${rt} \
      --destination-cidr-block ${cidr} \
      --network-interface-id ${ENI_ID}
    i=$((i+1))
  done
done

OUTPUT:

You should see "Return": true once per route, i.e. (Number of Workers) × (Number of Private Route Tables) = 9 times (by default):

rtb-093ea7f2ab5e6c2d6: 10.200.188.0/24 => eni-0f9e482a3d6ac5797
{ "Return": true }
rtb-093ea7f2ab5e6c2d6: 10.200.166.0/24 => eni-0487ae6ec86bbef5c
{ "Return": true }
rtb-093ea7f2ab5e6c2d6: 10.200.152.0/24 => eni-009f0deb164d3fafa
{ "Return": true }
rtb-00b8aae6926b2e250: 10.200.188.0/24 => eni-0f9e482a3d6ac5797
{ "Return": true }
rtb-00b8aae6926b2e250: 10.200.166.0/24 => eni-0487ae6ec86bbef5c
{ "Return": true }
rtb-00b8aae6926b2e250: 10.200.152.0/24 => eni-009f0deb164d3fafa
{ "Return": true }
rtb-03288ee836e727375: 10.200.188.0/24 => eni-0f9e482a3d6ac5797
{ "Return": true }
rtb-03288ee836e727375: 10.200.166.0/24 => eni-0487ae6ec86bbef5c
{ "Return": true }
rtb-03288ee836e727375: 10.200.152.0/24 => eni-009f0deb164d3fafa
{ "Return": true }
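If you changed the default layout, you can recompute the expected route count yourself (one route per worker node and private Route Table pair):

```shell
# Default layout: 3 workers, 3 private route tables (one per AZ).
WORKERS=3
PRIVATE_ROUTE_TABLES=3
EXPECTED_ROUTES=$(( WORKERS * PRIVATE_ROUTE_TABLES ))
echo "${EXPECTED_ROUTES}"   # 9
```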

Verify the routes:

for rt in ${ROUTE_TABLES[@]}; do
  aws ec2 describe-route-tables --route-table-ids ${rt} | \
    jq -j '.RouteTables[].Routes[] | .DestinationCidrBlock, " ", .NetworkInterfaceId // .GatewayId, " ", .State, "\n"'
done

Deploy DNS Cluster Add-on

And as our last step, we configure a DNS add-on which provides DNS based service discovery to all applications running inside our Kubernetes cluster.

The DNS Cluster Add-on

Create kube-dns.yaml file (working with Kubernetes v1.16):

# Copyright 2016 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# CHANGELOG:
# 08/11/2019
# Support for Kubernetes v1.16 added
# by @hajowieland https://wieland.tech | https://napo.io
#
apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "KubeDNS"
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 10.32.0.10
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  # replicas: not specified here:
  # 1. In order to make Addon Manager do not reconcile this replicas parameter.
  # 2. Default is 1.
  # 3. Will be tuned in real time if DNS horizontal auto-scaling is turned on.
  strategy:
    rollingUpdate:
      maxSurge: 10%
      maxUnavailable: 0
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      volumes:
      - name: kube-dns-config
        configMap:
          name: kube-dns
          optional: true
      containers:
      - name: kubedns
        image: gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.7
        resources:
          # TODO: Set memory limits when we've profiled the container for large
          # clusters, then set request = limit to keep this container in
          # guaranteed class. Currently, this container falls into the
          # "burstable" category so the kubelet doesn't backoff from restarting it.
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
        livenessProbe:
          httpGet:
            path: /healthcheck/kubedns
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        readinessProbe:
          httpGet:
            path: /readiness
            port: 8081
            scheme: HTTP
          # we poll on pod startup for the Kubernetes master service and
          # only setup the /readiness HTTP server once that's available.
          initialDelaySeconds: 3
          timeoutSeconds: 5
        args:
        - --domain=cluster.local.
        - --dns-port=10053
        - --config-dir=/kube-dns-config
        - --v=2
        env:
        - name: PROMETHEUS_PORT
          value: "10055"
        ports:
        - containerPort: 10053
          name: dns-local
          protocol: UDP
        - containerPort: 10053
          name: dns-tcp-local
          protocol: TCP
        - containerPort: 10055
          name: metrics
          protocol: TCP
        volumeMounts:
        - name: kube-dns-config
          mountPath: /kube-dns-config
      - name: dnsmasq
        image: gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.7
        livenessProbe:
          httpGet:
            path: /healthcheck/dnsmasq
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        args:
        - -v=2
        - -logtostderr
        - -configDir=/etc/k8s/dns/dnsmasq-nanny
        - -restartDnsmasq=true
        - --
        - -k
        - --cache-size=1000
        - --no-negcache
        - --log-facility=-
        - --server=/cluster.local/127.0.0.1#10053
        - --server=/in-addr.arpa/127.0.0.1#10053
        - --server=/ip6.arpa/127.0.0.1#10053
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        # see: https://github.com/kubernetes/kubernetes/issues/29055 for details
        resources:
          requests:
            cpu: 150m
            memory: 20Mi
        volumeMounts:
        - name: kube-dns-config
          mountPath: /etc/k8s/dns/dnsmasq-nanny
      - name: sidecar
        image: gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.7
        livenessProbe:
          httpGet:
            path: /metrics
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        args:
        - --v=2
        - --logtostderr
        - --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.local,5,SRV
        - --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local,5,SRV
        ports:
        - containerPort: 10054
          name: metrics
          protocol: TCP
        resources:
          requests:
            memory: 20Mi
            cpu: 10m
      dnsPolicy: Default  # Don't use cluster DNS.
      serviceAccountName: kube-dns

Deploy kube-dns to the cluster:

kubectl create -f kube-dns.yaml

IT IS DONE! Great Work 👨‍💻

Now deploy some services on your shiny new the-hard-way created Kubernetes Cluster!

# Create deployment of nginx with 10 replicas
kubectl run nginx --image=nginx --replicas=10

Cleaning Up

If you are finished and want to destroy your whole infrastructure, just execute:

# Terraform
terraform destroy

# CDK
cdk destroy

The beauty of Infrastructure as Code 🥰

Further Steps / Ideas

For further training/learning you can do a lot of things with your handmade cluster!

Here just some ideas:

Deploy an Ingress service (like nginx-ingress / aws-alb-ingress)

Increase the master/worker node size in the IaC (CDK/Terraform), deploy the changes and join the new nodes to your cluster

Manually kill etcd/master/worker instances and learn how Kubernetes reacts:

- what info do you get?
- where do you find important logs?
- what steps can you take to improve cluster healthiness?
- what happens when the AutoScalingGroup starts a new instance, e.g. a new K8s Worker node? (no certs, keys available for this new IP address, etc.)

Enhance UserData to assign ENIs from a pre-defined internal IP address pool (adapt the LaunchConfigurations for etcd, master, worker)

Get to know why it makes sense to use tools like kubeadm

Final Words

If you encounter any problems or have some ideas on how to enhance the IaC code ➡️ please let me know!

I would be very happy to see some Pull Requests on GitHub for the Terraform and CDK Python code of this blog post: