How does Docker work?

Now that we have learned about containerization and how containers work, it’s time to face the ultimate truth: Docker is nothing but containerization software, and the Docker Engine is nothing but a container engine.

The Docker Engine consists of the Docker daemon and other utilities to create, destroy and manage containers. The Docker daemon is a process running in the background that receives commands from a local or remote Docker client (CLI) over a REST API (via HTTP or a UNIX socket) to manage containers. Hence Docker is said to follow a client-server architecture, where the server is the Docker daemon.
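We can even talk to that REST API ourselves. Here is a minimal sketch, assuming the daemon is listening on its default UNIX socket at /var/run/docker.sock (this is essentially what the Docker CLI does under the hood):

# Ask the daemon for its version over the REST API
curl --unix-socket /var/run/docker.sock http://localhost/version

# List running containers, the same data docker ps shows
curl --unix-socket /var/run/docker.sock http://localhost/containers/json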

When you install Docker on your system, you get the Docker Engine, the Docker command-line interface (the Docker client) and, depending on the platform, GUI utilities as well. When you start Docker, it starts the Docker daemon.
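A quick way to see both halves of this client-server architecture is docker version, which prints a Client section (the CLI) and a Server section (the engine and daemon); the exact output depends on your installation:

# Show the client (CLI) and the server (engine/daemon) versions side by side
docker version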

☛ What is a Docker container?

The container we have discussed so far is a general interpretation of what a container is and how it works. A Docker container is far more sophisticated than that.

A Docker container packages application code along with its other dependencies. These other dependencies are what make a container a “container”: the necessary (application-specific) libraries, binaries and other resources that our application needs to function.

An example of a container would be a Node.js server. Our application code would consist of server.js and the node_modules libraries. But to run it, we need Node installed in the container, hence we need the node binary. Node.js might itself depend on other binaries and system libraries, hence we need those too. Finally, Node.js needs an operating system userland to run on, for example CentOS, so the container also carries that OS’s filesystem and libraries; the kernel itself is not included, because every container shares the host’s kernel through the Docker Engine.
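As a rough sketch, a Dockerfile for such a Node.js server might look like the following (the file names server.js and package.json, and the node:18-alpine base image, are assumptions for illustration):

# Official Node.js base image: OS userland plus the node binary
FROM node:18-alpine

# Work inside /app in the container's filesystem
WORKDIR /app

# Install the node_modules libraries from the (assumed) package manifest
COPY package.json .
RUN npm install

# Copy the application code and define how to start the server
COPY server.js .
CMD ["node", "server.js"]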

☛ What is a Docker image?

The Node server example we just talked about involves many pieces that need to be present in the container for our application to work. A Docker image is like a zipped box that contains all these pieces.
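Images live in a registry (Docker Hub by default) until we pull or build them onto our machine. For example (the image name here is just illustrative):

# Download an image from Docker Hub
docker pull node:18-alpine

# List images stored locally, with their sizes
docker images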

We instruct the Docker client to create a container from this image. The Docker client instructs the Docker daemon to unpack the image, read its contents and launch the container with server.js executing as a process. Depending on other instructions in the image (and the flags we pass), the Docker daemon might expose ports from the container that we can connect to, mount volumes, and do other things.
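In practice this is a single docker run command. A rough sketch, assuming a hypothetical image called my-node-app whose server listens on port 3000:

# Create and start a container from the image, in the background
# -p publishes container port 3000 on host port 3000
# -v mounts a host directory into the container as a volume
docker run -d --name web -p 3000:3000 -v "$(pwd)/data:/app/data" my-node-app

# List running containers
docker ps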

To create a Docker image, we need a Dockerfile. A Dockerfile is a configuration file with instructions that tell the Docker Engine how to build an image: which base image to start from, what the working directory inside the container should be, which application-specific files need to be copied in from our system, which ports need to be exposed from the container, and zillions of other things.

A base image is an existing image (often an official image from Docker Hub) on top of which we add our application-specific code and instructions. A base image can contain, for example, the CentOS operating system with the Apache server installed.
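As an illustrative sketch, such a base image could itself be built from a plain CentOS image (the package commands below are assumptions, not taken from any official image):

# Start from the official CentOS image and add the Apache HTTP server
FROM centos:7
RUN yum install -y httpd
CMD ["httpd", "-DFOREGROUND"]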

A Docker image uses a union filesystem such as AuFS (or OverlayFS on modern installations). Each instruction in the Dockerfile creates a read-only layer. These layers are stacked on each other in the order they appear in the Dockerfile. Each layer is only a set of differences from the layer before it.

When we create a container from this image, Docker stacks all these read-only layers and adds a new read-write layer on top of them. The read-only layers are called image layers, while the thin read-write layer in the container is called the container layer.

A typical Dockerfile would look like the one below (see the official Dockerfile reference for all the details; this is only a sample).

FROM ubuntu:15.04
COPY . /app
RUN make /app
CMD python /app/app.py

In the above Dockerfile, we create our image from the base Ubuntu image of version 15.04 (provided by Docker Hub), which creates the first layer. Then we copy everything from the current directory to the /app location in the Ubuntu filesystem, which creates a new layer stacked on the previous one. Then we build the application using the make command, which writes its output to another new layer stacked on top of the previous one. Finally, CMD records the command to run when a container starts (here, the Python program). This last instruction does not take any space in a layer, because it only adds metadata and does not change the filesystem.
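We can watch these layers come into existence by building the image and inspecting its history. A rough sketch, assuming the Dockerfile above is in the current directory and we tag the resulting image my-app:

# Build the image from the Dockerfile in the current directory
docker build -t my-app .

# Show the image's layers, one row per instruction; metadata-only
# instructions such as CMD report a size of 0B
docker history my-app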

When we run a container, Docker creates a read-write layer on top of the image layers. All the changes made to the running container, such as writing new files, modifying existing files and deleting files, are written to this thin writable container layer.
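We can inspect exactly what has accumulated in this writable layer with docker diff, which lists files added (A), changed (C) or deleted (D) relative to the image layers. Assuming the container named web from the earlier sketch:

# Show files added (A), changed (C) or deleted (D) in the container layer
docker diff web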

When a container is running, the container layer and the layers below it need to be merged, so that the differences in each layer add up to one coherent filesystem that the container sees. This is done by the storage drivers provided by the Docker Engine.
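Which storage driver is in use depends on your installation (overlay2 is the common default today, with AuFS found on older setups); we can check with docker info:

# Print daemon details and filter for the storage driver in use
docker info | grep "Storage Driver"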

When a layer (including the container layer) needs to read a file from a layer below it, it reads the file from that layer directly. While building an image, when an instruction needs to modify a file that lives in a layer below, that file is copied up into the current layer and the changes are made there (only the diff is saved in the layer).

In the container, when the container layer wants to write to a file from a layer below it, that file is copied up into the container layer and the changes are made to the copy. This strategy of copying a file only when we want to write to (modify) it is called the copy-on-write (CoW) strategy.

This keeps the writable layer lightweight, which is why we call it a thin layer. All modifications that would otherwise touch the image layers live in this writable container layer instead. When the container is destroyed, the container layer is destroyed too, but the image layers are preserved as they are. If we want to keep the changes in the writable layer, we can save them, for example by committing the container to a new image, which is sometimes described as a persistent Docker container.
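A rough sketch of saving that writable layer with docker commit (the container name web and the image tags are carried over from the earlier illustrative examples):

# Capture the container's writable layer as a new image layer
docker commit web my-node-app:snapshot

# Destroying the container removes its writable layer, but the image
# layers, including the newly committed one, are preserved
docker rm -f web
docker images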

Multiple containers can share some or all filesystem layers from one or many images. Since each layer is identified by a digest that is a checksum of the layer’s content, layers are very reusable. If two containers are made from the same image, they share 100% of the image layers and each has only its own unique writable layer.
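We can see this sharing in the sizes Docker reports. A rough sketch, again using the hypothetical my-node-app image:

# Two containers created from the same image share all of its read-only layers
docker run -d --name web1 my-node-app
docker run -d --name web2 my-node-app

# SIZE shows only each container's thin writable layer; the "virtual"
# size in parentheses includes the shared image layers
docker ps -s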

Having a layered filesystem with a copy-on-write (CoW) strategy, along with layer re-usability, is what makes Docker containers so blazingly fast to create. It is also why containers are lightweight and take up so little space on disk (only the size of the writable layer).