This post is based on material from Docker in Practice, available on Manning’s Early Access Program. Get 39% off with the code: 39miell

Background

In case you don’t know, Kubernetes is a Google open source project that tackles the problem of how to orchestrate your Docker containers on a data centre.

In a sentence, it allows you to treat groups of Docker containers as single units, each with its own addressable IP across hosts, and to scale those units as you wish. You can be declarative about services, much as you can be declarative about configuration with Puppet or Chef, and let Kubernetes take care of the details.

Terminology

Kubernetes has some terminology it’s worth noting here:

Pods: groupings of containers

Controllers: entities that drive the state of the Kubernetes cluster towards the desired state

Service: a set of pods that work together

Label: a simple name-value pair

Hyperkube: an all-in-one binary that can run a server

Kubelet: an agent that runs on nodes and monitors containers, restarting them if necessary

Labels are central to Kubernetes. By labelling Kubernetes entities, you can take actions across all relevant pods in your data centre. For example, you might want to ensure web server pods run only on specific nodes.

Play

I tried to follow Kubernetes’ Vagrant stand-up, but got frustrated with its slow pace and clunkiness, which I characterized uncharitably as ‘soviet’. Amazingly, a Twitter-whinge about this later and I got a message from Google’s Lead Engineer on Kubernetes saying they were ‘working on it’. Great, but this moved from great to awesome when I was presented with this, a Docker-only way to get Kubernetes running quickly.

NOTE: this code is not presented as stable, so if this walkthrough doesn’t work for you, check the central Kubernetes repo for the latest.

Step One: Start etcd

Kubernetes uses etcd to distribute information across the cluster, so as a core component we start that first:

docker run \
  --net=host \
  -d kubernetes/etcd:2.0.5.1 \
  /usr/local/bin/etcd \
  --addr=$(hostname -i):4001 \
  --bind-addr=0.0.0.0:4001 \
  --data-dir=/var/etcd/data

Step Two: Start the Master

docker run \
  --net=host \
  -d \
  -v /var/run/docker.sock:/var/run/docker.sock \
  gcr.io/google-containers/hyperkube:dev \
  /hyperkube kubelet \
  --api_servers=http://localhost:8080 \
  --v=2 \
  --address=0.0.0.0 \
  --enable_server \
  --hostname_override=127.0.0.1 \
  --config=/etc/kubernetes/manifests

Kubernetes has a simple Master-Minion architecture (for now – I understand this may be changing). The master handles the APIs for running the pods on the Kubernetes nodes, the scheduler (which determines what should run where based on capacity and constraints), and the replication controller, which ensures the right number of pod replicas are running.

If you run it immediately, your docker ps should now look something like this:

imiell@rothko:~$ docker ps
CONTAINER ID  IMAGE                                COMMAND                CREATED         STATUS         PORTS  NAMES
98b25161f27f  gcr.io/google-containers/hyperkube   "/hyperkube kubelet    2 seconds ago   Up 1 seconds          drunk_rosalind
57a0e18fce17  kubernetes/etcd:2.0.5.1              "/usr/local/bin/etcd   31 seconds ago  Up 29 seconds         compassionate_sinoussi

One thing to note here is that this master is run from a hyperkube kubelet call, which in turn brings up the master’s containers as a pod. That’s a bit of a mouthful, so let’s break it down.

Hyperkube, as we noted above, is an all-in-one binary for Kubernetes. It will go off and enable the services for the Kubernetes master in a pod. We’ll see what these are below.
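The --config=/etc/kubernetes/manifests flag in Step Two is what makes this happen: the kubelet watches that directory and runs any pod manifests it finds there, apiserver or no apiserver. As a rough illustration only – this is NOT the actual file shipped in the hyperkube image, and the exact schema varies by version – such a static manifest might look like:

```yaml
# Illustrative static pod manifest sketch: the kubelet runs every pod
# described in its manifest directory and restarts containers that die.
# Field names are assumptions based on the beta API of this era.
version: v1beta2
id: nginx-127
containers:
  - name: apiserver
    image: gcr.io/google-containers/hyperkube:v0.14.1
    command:
      - /hyperkube
      - apiserver
      - --address=127.0.0.1
      - --etcd_servers=http://127.0.0.1:4001
```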

Now that we have a running Kubernetes cluster, we can manage it from outside using the API by downloading the kubectl binary:

imiell@rothko:~$ wget http://storage.googleapis.com/kubernetes-release/release/v0.14.1/bin/linux/amd64/kubectl
imiell@rothko:~$ chmod +x kubectl
imiell@rothko:~$ ./kubectl version
Client Version: version.Info{Major:"0", Minor:"14", GitVersion:"v0.14.1", GitCommit:"77775a61b8e908acf6a0b08671ec1c53a3bc7fd2", GitTreeState:"clean"}
Server Version: version.Info{Major:"0", Minor:"14+", GitVersion:"v0.14.1-dirty", GitCommit:"77775a61b8e908acf6a0b08671ec1c53a3bc7fd2", GitTreeState:"dirty"}

Let’s see how many minions we’ve got using the get sub-command:

imiell@rothko:~$ ./kubectl get minions
NAME       LABELS  STATUS
127.0.0.1          Ready

We have one, running on localhost. Note the (empty) LABELS column. Think about how we could label this minion: if it were running on the tin needed to run our db beastie, we could mark it "heavy_db_server=true" and direct db server pods there only.
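As a hedged sketch of how that would work: assuming your kubectl version has a label subcommand, something like ./kubectl label minion 127.0.0.1 heavy_db_server=true would tag the minion, and a pod definition could then carry a matching node selector. The fragment below is illustrative only – the field name is taken from later versions of the pod schema:

```yaml
# Illustrative pod definition fragment: schedule this pod only onto
# minions carrying the label heavy_db_server=true.
nodeSelector:
  heavy_db_server: "true"
```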

What about these pods then?

imiell@rothko:~$ ./kubectl get pods
POD        IP  CONTAINER(S)        IMAGE(S)                                    HOST                 LABELS  STATUS   CREATED
nginx-127      controller-manager  gcr.io/google-containers/hyperkube:v0.14.1  127.0.0.1/127.0.0.1          Running  16 minutes
               apiserver           gcr.io/google-containers/hyperkube:v0.14.1
               scheduler           gcr.io/google-containers/hyperkube:v0.14.1

This ‘nginx-127’ pod has got three containers from the same Docker image running the master services: the controller-manager, the apiserver, and the scheduler.

Now that we’ve waited a bit, we should be able to see the containers using a normal docker ps:

imiell@rothko:~$ docker ps -a
CONTAINER ID  IMAGE                                       COMMAND                CREATED         STATUS         PORTS  NAMES
25c781d7bb93  kubernetes/etcd:2.0.5.1                     "/usr/local/bin/etcd   4 minutes ago   Up 4 minutes          suspicious_newton
8922d0ba9a75  gcr.io/google-containers/hyperkube:v0.14.1  "/hyperkube controll   40 seconds ago  Up 39 seconds         k8s_controller-manager.bca40ef7_nginx-127_default_a8ae24cd98c73bd6d873bc54c030606b_c40c7396
943498867bd6  gcr.io/google-containers/hyperkube:v0.14.1  "/hyperkube schedule   40 seconds ago  Up 40 seconds         k8s_scheduler.b41bfb6e_nginx-127_default_a8ae24cd98c73bd6d873bc54c030606b_871c00e2
354039df992d  gcr.io/google-containers/hyperkube:v0.14.1  "/hyperkube apiserve   41 seconds ago  Up 40 seconds         k8s_apiserver.c24716ae_nginx-127_default_a8ae24cd98c73bd6d873bc54c030606b_4b062320
033edd18ff9c  kubernetes/pause:latest                     "/pause"               41 seconds ago  Up 41 seconds         k8s_POD.7c16d80d_nginx-127_default_a8ae24cd98c73bd6d873bc54c030606b_da72f541
beddf250f4da  gcr.io/google-containers/hyperkube:v0.14.1  "/hyperkube kubelet    43 seconds ago  Up 42 seconds         kickass_ardinghelli
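Those long k8s_ names are generated by the kubelet and encode where each container belongs – roughly k8s_<container>.<hash>_<pod>_<namespace>_<pod-uid>_<attempt>, going by the output above. Splitting on underscores recovers the pod name, for example:

```shell
# Pull the pod name (the third underscore-separated field) out of a
# kubelet-generated container name taken from the docker ps output above.
name="k8s_controller-manager.bca40ef7_nginx-127_default_a8ae24cd98c73bd6d873bc54c030606b_c40c7396"
echo "$name" | cut -d_ -f3   # nginx-127
```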

Step Three: Run the Service Proxy

The Kubernetes service proxy allows you to expose pods as services from a consistent address. We’ll see this in action later.

docker run \
  -d \
  --net=host \
  --privileged \
  gcr.io/google_containers/hyperkube:v0.14.1 \
  /hyperkube proxy \
  --master=http://127.0.0.1:8080 \
  --v=2

This is run separately as it requires privileged mode to manipulate iptables on your host.

A docker ps will show the proxy as being up:

imiell@rothko:~$ docker ps -a
CONTAINER ID  IMAGE                                       COMMAND                CREATED         STATUS         PORTS  NAMES
2c8a4efe0e01  gcr.io/google_containers/hyperkube:v0.14.1  "/hyperkube proxy --   2 seconds ago   Up 1 seconds          loving_lumiere
8922d0ba9a75  gcr.io/google-containers/hyperkube:v0.14.1  "/hyperkube controll   15 minutes ago  Up 15 minutes         k8s_controller-manager.bca40ef7_nginx-127_default_a8ae24cd98c73bd6d873bc54c030606b_c40c7396
943498867bd6  gcr.io/google-containers/hyperkube:v0.14.1  "/hyperkube schedule   15 minutes ago  Up 15 minutes         k8s_scheduler.b41bfb6e_nginx-127_default_a8ae24cd98c73bd6d873bc54c030606b_871c00e2
354039df992d  gcr.io/google-containers/hyperkube:v0.14.1  "/hyperkube apiserve   16 minutes ago  Up 15 minutes         k8s_apiserver.c24716ae_nginx-127_default_a8ae24cd98c73bd6d873bc54c030606b_4b062320
033edd18ff9c  kubernetes/pause:latest                     "/pause"               16 minutes ago  Up 15 minutes         k8s_POD.7c16d80d_nginx-127_default_a8ae24cd98c73bd6d873bc54c030606b_da72f541
beddf250f4da  gcr.io/google-containers/hyperkube:v0.14.1  "/hyperkube kubelet    16 minutes ago  Up 16 minutes         kickass_ardinghelli

Step Four: Run an Application

Now that we have our Kubernetes cluster set up locally, let's run an application with it.

imiell@rothko:~$ ./kubectl -s http://localhost:8080 run-container todopod --image=dockerinpractice/todo --port=8000
CONTROLLER  CONTAINER(S)  IMAGE(S)               SELECTOR               REPLICAS
todopod     todopod       dockerinpractice/todo  run-container=todopod  1

This creates a pod from a single image (a simple todo application):

imiell@rothko:~$ kubectl get pods
POD            IP  CONTAINER(S)        IMAGE(S)                                    HOST        LABELS                 STATUS   CREATED
nginx-127          controller-manager  gcr.io/google-containers/hyperkube:v0.14.1  127.0.0.1/                         Running  About a minute
                   apiserver           gcr.io/google-containers/hyperkube:v0.14.1
                   scheduler           gcr.io/google-containers/hyperkube:v0.14.1
todopod-c8n0r      todopod             dockerinpractice/todo                                   run-container=todopod  Pending  About a minute

Lots of interesting stuff here – the HOST for our todopod (which has been given a unique name as a suffix) has not been set yet, because the provisioning is still Pending (it’s downloading the image from the Docker Hub).

Eventually you will see it’s running:

imiell@rothko:~$ kubectl get pods
POD            IP           CONTAINER(S)        IMAGE(S)                                    HOST                 LABELS                 STATUS   CREATED
nginx-127                   controller-manager  gcr.io/google-containers/hyperkube:v0.14.1  127.0.0.1/127.0.0.1                         Running  About a minute
                            apiserver           gcr.io/google-containers/hyperkube:v0.14.1
                            scheduler           gcr.io/google-containers/hyperkube:v0.14.1
todopod-c8n0r  172.17.0.43  todopod             dockerinpractice/todo                       127.0.0.1/127.0.0.1  run-container=todopod  Running  5 seconds

and it has an IP address (172.17.0.43). A replication controller is also set up for it, to ensure it gets replicated:

imiell@rothko:~$ ./kubectl get rc
CONTROLLER  CONTAINER(S)  IMAGE(S)               SELECTOR               REPLICAS
todopod     todopod       dockerinpractice/todo  run-container=todopod  1
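run-container generated that replication controller for us. Written out by hand it would look roughly like this – a sketch from memory of the v1beta1-era schema, so the field names here are assumptions; the authoritative version is whatever the API server returns for the rc:

```yaml
# Rough v1beta1-style replication controller definition (illustrative
# only; field names may not match your Kubernetes version exactly).
id: todopod
kind: ReplicationController
apiVersion: v1beta1
desiredState:
  replicas: 1
  replicaSelector:
    run-container: todopod
  podTemplate:
    labels:
      run-container: todopod
    desiredState:
      manifest:
        version: v1beta1
        id: todopod
        containers:
          - name: todopod
            image: dockerinpractice/todo
            ports:
              - containerPort: 8000
```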

Step Five: Address the Pod Directly

We can address the application directly using the pod IP:

imiell@rothko:~$ wget -qO- 172.17.0.43:8000 | head -1

Step Six: Set up a Service

But this is not enough – we want to expose these pods as a service to port 80 somewhere:

imiell@rothko:~$ ./kubectl expose rc todopod --target-port=8000 --port=80
NAME     LABELS  SELECTOR               IP         PORT
todopod          run-container=todopod  10.0.0.79  80

So now it’s available on 10.0.0.79:

imiell@rothko:~$ ./kubectl get service
NAME           LABELS                                   SELECTOR               IP         PORT
kubernetes     component=apiserver,provider=kubernetes                         10.0.0.2   443
kubernetes-ro  component=apiserver,provider=kubernetes                         10.0.0.1   80
todopod                                                 run-container=todopod  10.0.0.79  80

and we’ve successfully mapped port 8000 on the pod to port 80 on the service.
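Behind the scenes, expose created a service object. In the v1beta1-era schema it would look roughly like this – again a hedged sketch from memory, with the field names as assumptions:

```yaml
# Rough v1beta1-style service definition (illustrative only): anything
# matching the selector is load-balanced behind the service IP and port.
id: todopod
kind: Service
apiVersion: v1beta1
port: 80
containerPort: 8000
selector:
  run-container: todopod
```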

Let’s make things interesting by killing off the todo container:

imiell@rothko:~$ docker ps | grep dockerinpractice/todo
3724233c6637  dockerinpractice/todo:latest  "npm start"  13 minutes ago  Up 13 minutes  k8s_todopod.6d3006f8_todopod-c8n0r_default_439950e4-dc4d-11e4-be97-d850e6c2a11c_da1467a2
imiell@rothko:~$ docker kill 3724233c6637
3724233c6637

and then after a moment (to be sure, wait 20 seconds), call it again:

imiell@rothko:~$ wget -qO- 10.0.0.79 | head -1

The service is still there even though the container isn’t! The replication controller picked up that the container died, and restored service for us:

imiell@rothko:~$ docker ps -a | grep dockerinpractice/todo
b80728e90d3f  dockerinpractice/todo:latest  "npm start"  About a minute ago  Up About a minute                k8s_todopod.6d3006f8_todopod-c8n0r_default_439950e4-dc4d-11e4-be97-d850e6c2a11c_00316aec
3724233c6637  dockerinpractice/todo:latest  "npm start"  15 minutes ago      Exited (137) About a minute ago  k8s_todopod.6d3006f8_todopod-c8n0r_default_439950e4-dc4d-11e4-be97-d850e6c2a11c_da1467a2

Step Seven: Make the Service Resilient

Management’s angry that the service was down momentarily. We’ve figured out this is because the container died (and the service was automatically recovered) and want to take steps to prevent a recurrence. So we decide to resize the todopod:

imiell@rothko:~$ ./kubectl resize rc todopod --replicas=2
resized

and there are now two pods running todo containers:

imiell@rothko:~$ kubectl get pods
POD            IP           CONTAINER(S)        IMAGE(S)                                    HOST                 LABELS                 STATUS   CREATED
nginx-127                   controller-manager  gcr.io/google-containers/hyperkube:v0.14.1  127.0.0.1/127.0.0.1                         Running  28 minutes
                            apiserver           gcr.io/google-containers/hyperkube:v0.14.1
                            scheduler           gcr.io/google-containers/hyperkube:v0.14.1
todopod-c8n0r  172.17.0.43  todopod             dockerinpractice/todo                       127.0.0.1/127.0.0.1  run-container=todopod  Running  27 minutes
todopod-pmpmt  172.17.0.44  todopod             dockerinpractice/todo                       127.0.0.1/127.0.0.1  run-container=todopod  Running  3 minutes

and here are the two containers:

imiell@rothko:~$ docker ps | grep dockerinpractice/todo
217feb6f25e8  dockerinpractice/todo:latest  "npm start"  16 minutes ago  Up 16 minutes  k8s_todopod.6d3006f8_todopod-pmpmt_default_8e645492-dc50-11e4-be97-d850e6c2a11c_480f79b7
b80728e90d3f  dockerinpractice/todo:latest  "npm start"  26 minutes ago  Up 26 minutes  k8s_todopod.6d3006f8_todopod-c8n0r_default_439950e4-dc4d-11e4-be97-d850e6c2a11c_00316aec

It’s not just the containers that are resilient – try running:

./kubectl delete pod todopod-c8n0r

and see what happens!

It’s Not Magic

Management now thinks that the service is bullet-proof and perfect – but they’re wrong!

The service is still exposed to failure: if the machine that Kubernetes is running on dies, the service goes down.

Perhaps more importantly, they don’t understand that the todo app keeps its state per browser session only, so todos will not be retained across sessions. Kubernetes does not magically make applications scalable: the application itself needs some kind of persistent storage and authentication method to make this work as they want.

Conclusion

This only scratches the surface of Kubernetes’ power. We’ve not looked at multi-container pods and some of the patterns that can be used there, or using labels, for example.

Kubernetes is changing fast, and is being incorporated into other products (such as OpenShift), so it’s worth getting to understand the concepts underlying it. Hyperkube’s a great way to do that fast.
