Let’s backtrack quite a bit. In the beginning there were namespaces: the mount namespace arrived around 2001, and finally the user namespace in 2013. In parallel came Control Groups (cgroups), initially from Google developers and expanded on considerably between roughly 2006 and 2014. Note that systemd became the de facto init process in most Linux distributions around that time, and it has very deep integration with cgroups for process/service management.

Once namespaces and cgroups were reasonably stable in the kernel, Docker arrived around 2013 and introduced the world to containers, using namespaces to jail processes and cgroups to limit their resources; a sort of lightweight VM.

Kubernetes was born around this time (2014–15) as a means of container orchestration. It grew out of Borg, an internal Google project that predates it by many years, and was open-sourced during this period.

In 2015, Docker separated the runtime out into a lower layer, runc, and contributed it to the OCI (https://www.docker.com/blog/runc/). runc is the reference implementation of the OCI runtime specification.

In 2016, the CRI was introduced by Kubernetes to support container runtimes other than Docker. Recall that on each Kubernetes worker node there is a kubelet service that runs and listens to the master node to orchestrate the container lifecycle.

At the lowest layers of a Kubernetes node is the software that, among other things, starts and stops containers. We call this the “Container Runtime”. The most widely known container runtime is Docker, but it is not alone in this space. … In the Kubernetes 1.5 release, we are proud to introduce the Container Runtime Interface (CRI) — a plugin interface which enables kubelet to use a wide variety of container runtimes, without the need to recompile. https://kubernetes.io/blog/2016/12/container-runtime-interface-cri-in-kubernetes/
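The CRI itself is a gRPC interface that the kubelet calls over a Unix socket. As an illustrative sketch (method names are from the CRI API; request/response fields are elided for brevity), the runtime side of the interface looks roughly like this:

```protobuf
// Trimmed sketch of the CRI RuntimeService; fields elided.
// The kubelet is the gRPC client, the container runtime the server.
service RuntimeService {
    // Pod sandbox lifecycle (the pod's shared namespaces/cgroup)
    rpc RunPodSandbox(RunPodSandboxRequest) returns (RunPodSandboxResponse) {}
    rpc StopPodSandbox(StopPodSandboxRequest) returns (StopPodSandboxResponse) {}

    // Container lifecycle within a sandbox
    rpc CreateContainer(CreateContainerRequest) returns (CreateContainerResponse) {}
    rpc StartContainer(StartContainerRequest) returns (StartContainerResponse) {}
    rpc StopContainer(StopContainerRequest) returns (StopContainerResponse) {}
    rpc RemoveContainer(RemoveContainerRequest) returns (RemoveContainerResponse) {}
}
```

Any runtime that implements this service (containerd, CRI-O, and so on) can be plugged under the kubelet without recompiling it.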

In 2017, Docker spun out the integration layer to runc as a separate project, containerd, and open-sourced it.

So we started the containerd project to move the container supervision out of the core Docker Engine and into a separate daemon. containerd has full support for starting OCI bundles and managing their lifecycle. This allows users to replace the runc binary on their system with an alternate runtime and get the benefits of still using Docker’s API. https://www.docker.com/blog/docker-containerd-integration/

Note: in February 2019 containerd became the fifth project to graduate from the CNCF, no mean achievement, as the only others to graduate so far are Kubernetes, Prometheus, Envoy, and CoreDNS; an elite batch.

Docker gets its work done via containerd, which uses runc. Kubernetes, in the overwhelming majority of installations, also gets its work done via containerd and runc, either through Docker or through a CRI shim in place of Docker.

Now, this is how the Docker call flow goes.

Back to the kubelet: the kubelet can talk to containerd directly. If you look at some older docs, this was initially done through a separate service called cri-containerd.

However, this functionality is now part of containerd itself, and the standalone cri-containerd is defunct.
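As a sketch of what this direct wiring looks like (flag names as they exist in Kubernetes 1.17; the file path is kubeadm’s default and may differ on your distro), the kubelet is pointed at containerd’s CRI socket like so:

```
# /var/lib/kubelet/kubeadm-flags.env (fragment, kubeadm-managed)
KUBELET_KUBEADM_ARGS="--container-runtime=remote --container-runtime-endpoint=unix:///run/containerd/containerd.sock"
```

With --container-runtime=remote, the kubelet stops using its built-in dockershim and speaks CRI over the given Unix socket, which containerd’s built-in CRI plugin serves.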

If you have installed your Kubernetes cluster with kubeadm, you may see that the kubelet service is by default dependent on the docker service, like below.

The Kubernetes built-in dockershim CRI does not support runtime handlers. However, this does not prevent you from using the containerd service and configuring the kubelet to use containerd directly. containerd can in turn be configured to use other OCI-compatible runtimes, such as gVisor or Kata Containers, via its runtime plugins, as shown below.
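For example, with containerd 1.3 and the shim v2 gVisor runtime, a fragment of /etc/containerd/config.toml registering runsc as an additional handler might look like this (a sketch; the handler name “runsc” is a choice, and runsc plus its shim must already be installed on the node):

```toml
# /etc/containerd/config.toml (fragment) — CRI plugin runtime handlers
[plugins.cri.containerd.runtimes.runc]
  # default handler: ordinary OCI containers via runc (shim v2)
  runtime_type = "io.containerd.runc.v1"

[plugins.cri.containerd.runtimes.runsc]
  # extra handler: sandboxed containers via gVisor's runsc (shim v2)
  runtime_type = "io.containerd.runsc.v1"
```

Each entry under plugins.cri.containerd.runtimes becomes a named handler that Kubernetes can select per pod via a RuntimeClass.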

Those experimenting with a kubeadm-based cluster may find the current documentation a bit incomplete. You can refer to the bug/support issue I raised until the documentation is improved.

If you follow https://github.com/google/gvisor-containerd-shim/issues/46, you can set up your worker node (worker-1 below) to use containerd instead of Docker.

NAME       STATUS   ROLES    AGE   VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                KERNEL-VERSION                CONTAINER-RUNTIME
master-1   Ready    master   69d   v1.17.0   192.168.0.26   <none>        CentOS Linux 7 (Core)   4.4.211-1.el7.elrepo.x86_64   docker://1.13.1
worker-1   Ready    <none>   69d   v1.17.0   192.168.0.6    <none>        CentOS Linux 7 (Core)   4.4.211-1.el7.elrepo.x86_64   containerd://1.3.2

gVisor has a lightweight user-mode kernel that traps the syscalls (kernel invocations) of applications running in the sandbox and acts as a sort of firewall. More information here: https://gvisor.dev/docs/architecture_guide/, and in more depth here: https://blog.loof.fr/2018/06/gvisor-in-depth.html

You can see below how, in an existing K8s setup, this sample nginx runs under runsc while the rest run under runc, after swapping out Docker and using containerd directly, with gVisor and its shim installed.
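The per-pod routing is done with the RuntimeClass API. A minimal sketch (the names “gvisor” and “runsc” are choices, and “runsc” must match the handler configured in containerd; the API group is v1beta1 as of Kubernetes 1.17):

```yaml
apiVersion: node.k8s.io/v1beta1   # v1beta1 in Kubernetes 1.17
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc                    # must match the containerd runtime handler name
---
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  runtimeClassName: gvisor        # run this pod's containers under runsc
  containers:
  - name: nginx
    image: nginx
```

Pods that omit runtimeClassName keep using the default handler (runc), which is why only the sample nginx ends up sandboxed.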

References

If you have 35 minutes, the video below is well worth it for a good perspective on many aspects of Kubernetes.

Other References

Container runtimes

containerd: integration

https://cloud.google.com/blog/products/gcp/open-sourcing-gvisor-a-sandboxed-container-runtime

https://access.redhat.com/security/cve/cve-2019-5736

Breaking out of Docker via runC — Explaining CVE-2019–5736

https://lwn.net/Articles/531114/

http://moi.vonos.net/linux/linux-containers/

The History of Kubernetes on a Timeline | @RisingStack

Isolate containers with a user namespace

Introducing runC: a lightweight universal container runtime — Docker Blog

Docker containerd integration — Docker Blog

https://kubernetes.io/blog/2016/12/container-runtime-interface-cri-in-kubernetes/

Architecture Guide

https://blog.loof.fr/2018/06/gvisor-in-depth.html

That thing that makes KubeVirt a little different — or the Kubernetes virtualization API.

https://medium.com/@yannalbou/the-kubernetes-containers-runtime-jungle-b3eff8c471f3

https://kubernetes.io/docs/concepts/containers/runtime-class/

Shim v1 and v2: between containerd and the container there is a containerd-shim interface (for runc). This is the v1 version. Recently the shim v2 version was released: https://www.alibabacloud.com/blog/cri-and-shimv2-a-new-idea-for-kubernetes-integrating-container-runtime_594783

gVisor Related Configuration https://gist.github.com/alexcpn/8b0550b01dd69df5e0a8fd1116dbd073