Recently at Cars.com, we have begun to leverage Docker for building our Node.js middleware applications and front-end code. This originally came out of necessity: some repositories depended on version 4 of Node and others on version 6, but only version 4 was installed on the Jenkins agents. Rather than install both versions of Node and toggle between them using a Node version manager like n or, worse yet, use Jenkins tags so that some agents have one version and some the other, we decided to try using Docker to manage the version of Node. This has been working quite well so far, and it has come with many other benefits.

These benefits, along with some Docker basics, will be the primary topics of this post; we will explore some common patterns that arose in our Docker usage in Part II.

What is Docker and why use it?

Docker is a containerization platform that includes many tools to make building and running containers simple and approachable. Docker makes containers much easier to use, but the better question here is “What is a container?”

Containers are a lightweight form of virtualization that leverages the kernel of the host and does not have the overhead of running an entire operating system. Typically you would want a container to run a single process and nothing more. You can also bundle the container’s dependencies, such as binaries and runtimes, inside the container itself. What you end up with is an isolated application that is packaged with its dependencies. For many reasons, Docker containers are great not only for running applications, but also for building them.

A key benefit to using Docker in both the development and the release process is that it begins to bridge the gap between a developer’s workstation and the servers used to verify, build, deploy, or run the application. When the various lifecycle tasks of the application run in a container, it does not matter whether that container is on a developer’s machine or on a build server: the tasks run against the exact same dependencies and the same base operating system image. This lends itself well to continuous delivery, since you have a controlled environment in which you can build your applications and the resulting Docker image can serve as the runnable artifact. As a result, you gain increased confidence that a change will successfully build and deploy, and it becomes easier for developers to reproduce and troubleshoot a failed build on their own machines.

Another benefit to running builds in containers is that it gives the developer the opportunity to manage the dependencies, runtime/language, and OS with which the application needs to run. Developers will control how the application is built as well, which will reduce the friction between development and operations teams and save everyone a great deal of time. This will allow your organization to adopt new technologies faster without worrying about how to build or deploy them and will make engineers happier for the same reason. No longer will technologies need to be dictated by what operations is comfortable with deploying.

A feature of containers is that they provide a strong degree of isolation from the host system and from other containers, even though they share the host’s kernel. This helps when different apps need different versions of the same dependency, and it also provides a more secure way to run multiple applications on a single host. Even if you have multiple apps that need to bind to the same port, you can leave it to Docker to manage the port mappings. This isolation comes in handy for deployment, but also for things like integration tests, which may need to start up a mock server or some associated application on a specified port. In these cases, you do not have to coordinate when the tests are run to avoid port conflicts.
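To make the port-mapping point concrete, here is a minimal sketch. It assumes the public nginx image as a stand-in for your application and uses hypothetical container names and host ports; the DOCKER variable and the publish_web function are illustrative helpers introduced here, not part of Docker itself.

```shell
# DOCKER defaults to the real CLI; set DOCKER=echo to preview commands only.
# (Hypothetical convenience variable, not a Docker feature.)
DOCKER="${DOCKER:-docker}"

# Start an nginx container in the background and publish the container's
# port 80 on the given host port.
#   $1 = container name (hypothetical, e.g. web-a)
#   $2 = host port (e.g. 8080)
publish_web() {
  "$DOCKER" container run -d --name "$1" -p "$2:80" nginx
}
```

Running publish_web web-a 8080 followed by publish_web web-b 8081 should leave two containers that both listen on port 80 internally without conflicting on the host, since each is published on its own host port. Setting DOCKER=echo first prints the commands instead of running them.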

Hopefully at this point, it is clear why you may want to use Docker in your own pipelines, so now let’s learn a bit more about the tool and how it can be used.

Docker basics

In this section we will run through some examples of using Docker. To install Docker on your machine, follow the installation instructions in the official Docker documentation.

Let us start by running a container and seeing what it has to offer:

docker container run -it ubuntu bash

In the above command we are running a Docker container with the -it flags: -i keeps standard input open and -t allocates a terminal, so together they give us an interactive session in which we see the output from the container and can provide input. The container is based on the ubuntu image (more on this shortly), and the command we run to start the container is bash , which starts a shell. You will find yourself at a command prompt in an Ubuntu container.

Note: If you are already familiar with Docker, you may be accustomed to seeing commands like docker run or docker images . These commands and others still exist, but the newer syntax introduced in version 1.13 is recommended. Their new counterparts are docker container run and docker image ls , respectively.

Now let’s try to run a curl command against an open API:

curl wttr.in/Chicago

You should see output like this: bash: curl: command not found . That’s strange because I have curl installed on my machine and you most likely do as well. This is because the container is isolated from your host operating system and will only have access to binaries or dependencies within the container. What this means is that we will need to go ahead and install curl in the container:

apt-get update -y && apt-get install -y curl

curl wttr.in/Chicago

Now when we run the curl command it works! Great, but we shouldn’t have to go into the container and install curl every time. Luckily for us, Docker also provides a way to capture a container’s state and start new containers from that point. To do this we will use the docker container commit command, but first we will need the ID of the container we are currently running.

The easiest way to get this is to leave our current session in the container open and get the ID in a new terminal window. Let’s start a new terminal window and run the command docker container ls . You should see an ubuntu container running, and the first column of its row (CONTAINER ID) will have the ID. Copy the ID and run the following:

docker container commit {container id} {your github username}/curl

docker image ls

After the second command you will see another list that should, at the very least, have ubuntu and your new image. Now we can stop our old container in the other terminal session (typing exit at its prompt will do) and start a new one from where we left off:

docker container run -it {your github username}/curl bash

curl wttr.in/Chicago
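If you would rather not copy the container ID by hand in the commit step above, the ID can be looked up with a filter. This is a minimal sketch that assumes exactly one container is running from the ubuntu image; the DOCKER variable and the container_ids_for function are hypothetical helpers introduced for illustration.

```shell
# DOCKER defaults to the real CLI; set DOCKER=echo to preview commands only.
# (Hypothetical convenience variable, not a Docker feature.)
DOCKER="${DOCKER:-docker}"

# Print only the IDs (--quiet) of running containers that were started
# from the given image (--filter ancestor=...).
#   $1 = image name, e.g. ubuntu
container_ids_for() {
  "$DOCKER" container ls --quiet --filter "ancestor=$1"
}
```

With that in place, docker container commit "$(container_ids_for ubuntu)" {your github username}/curl would commit the running container without any copy and paste. If more than one ubuntu container is running, the function prints several IDs and you would need to pick one yourself.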

Now would be a good point to distinguish a Docker container from an image. It is important to know the difference and also to use the correct terms when talking about one or the other. An analogy I typically use to differentiate the two is that an image is to a class as a container is to an instance in your object-oriented language of choice. This is a good start, but as you get into more advanced features it is good to know that the two are more closely related than the analogy suggests.

A Docker image is created from a containerized environment (as we have seen above) and is read-only once created. Running a container from that image adds a layer on top of the image that can be read and written, while the container still has access to the image’s filesystem. In fact, an image usually consists of a number of read-only layers stacked on top of each other, combined into a single copy-on-write filesystem (you can find more details about this in the documentation for Docker). We will expand upon this idea shortly, as it will be important in the second part of this post.

Manual Configuration? No, Thanks

We have seen that we can run containers and even configure them and commit the changes to be used later, but at this point you should be skeptical. Manually configuring a container to commit as an image is not ideal. Doing this leaves room for error, and whenever possible it is nice to have those configurations committed to version control. As you may have guessed, Docker has our back with this one in the form of Dockerfiles.

A Dockerfile is a script with its own syntax that declares how to create a Docker image, and it addresses the qualms we had with the process we saw previously. It gives us the reproducibility and configuration as code that we so love in continuous delivery.

As you may have realized from our previous example, when we run a container we start it from an existing image. We initially started with the base ubuntu image ( docker container run -it ubuntu bash ) and we will do the same here. Dockerfiles have their own set of commands, or directives (Docker’s documentation calls them instructions), and the first one we will use is FROM , which specifies our base image. From there we need to install curl, which we can do using RUN followed by the command we wish to run as part of the image creation.

Up until this point we have been running containers by passing in the command we want them to start with, but we can also set a default that will be run in the case where we omit this. I think that our curl command to the weather API would make sense here, so we will add this via the CMD directive. Here is the result:
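The listing below is a reconstruction based on the three directives just described ( FROM , RUN , and CMD ), so treat the exact lines as an assumption rather than a verbatim copy of the file we use:

```dockerfile
# Start from the same base image we ran interactively earlier
FROM ubuntu

# Refresh the package index and install curl in a single step
RUN apt-get update -y && apt-get install -y curl

# Default command, used when the container is started without one
CMD ["curl", "wttr.in/Chicago"]
```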

Most of this is straightforward, but there are a couple of things I would like to point out before moving on.

The first is that you may have noticed that I combined the apt-get update and install commands into a single command. This was intentional: with a single RUN we create just one layer in the resulting image instead of two. Directives that change the filesystem, such as RUN , each add a new layer to the image, while directives such as CMD only record metadata ( FROM is different again, since it references the top layer of another image).
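For contrast, here is a sketch of what the split version would look like next to the combined one. Both are illustrative fragments rather than files we actually maintain:

```dockerfile
FROM ubuntu

# Split version: two RUN directives produce two layers. The "update"
# layer can also be served from the build cache on a later build, so the
# install step may end up working from a stale package index.
# RUN apt-get update -y
# RUN apt-get install -y curl

# Combined version: one RUN directive, one layer, and the package index
# is always refreshed in the same step that uses it.
RUN apt-get update -y && apt-get install -y curl
```

After building either variant, docker image history followed by the image name lists the layers and the directive that produced each one, which is a quick way to see the difference for yourself.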

The next thing to take a look at is the CMD section, to which we passed an array. This is known as the exec form: Docker runs the command directly instead of wrapping it in a shell, which avoids surprises from shell parsing and from differences between the shells available in different base images. It is the best practice with Dockerfiles. Before moving on, be sure to create this Dockerfile on your machine by putting the above code in a file simply called Dockerfile .
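As a small illustration of the two ways CMD can be written (a fragment, not a complete Dockerfile; only the last CMD in a file takes effect, so the second form is left commented out):

```dockerfile
# Exec form (an array): curl is started directly as the container's
# main process, with no shell in between.
CMD ["curl", "wttr.in/Chicago"]

# Shell form (a plain string): Docker runs this through /bin/sh -c,
# so a shell must exist in the image and becomes the main process.
# CMD curl wttr.in/Chicago
```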

Now we have our Dockerfile, which tells Docker how to create our image, so all we have left to do is build the image and run a container from it. To build, hop into your terminal, change to the directory containing the Dockerfile, and run:

docker image build -t {your github username}/curl .

As before, when we committed our container to an image, we pass the name of our image prefixed with our GitHub, or Docker Hub, username by using the -t flag. This avoids conflicts with others who may have created another curl image and is best practice when publishing your images to the public image registry.

Note: If you have a private registry, you would replace your username with the DNS name of your registry, which lets Docker know it should push the image into that private registry as opposed to making it public.
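As a rough sketch of the private-registry case, the helper below tags a local image with a registry prefix and pushes it. The registry name registry.example.com is a placeholder, and the DOCKER and REGISTRY variables and the push_private function are hypothetical names introduced for illustration.

```shell
# DOCKER defaults to the real CLI; set DOCKER=echo to preview commands only.
# (Hypothetical convenience variable, not a Docker feature.)
DOCKER="${DOCKER:-docker}"

# Placeholder registry host; replace with the DNS name of your own registry.
REGISTRY="${REGISTRY:-registry.example.com}"

# Give a local image a second name that starts with the registry host,
# then push it. The host prefix is what tells Docker where to push.
#   $1 = existing local image name
#   $2 = repository name to use inside the private registry, e.g. curl
push_private() {
  "$DOCKER" image tag "$1" "$REGISTRY/$2" &&
  "$DOCKER" image push "$REGISTRY/$2"
}
```

You could equally pass the registry-prefixed name straight to the -t flag when building; tagging afterwards is shown here only because it works with an image you have already built.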

The final argument is called the build context. It is the directory from which you want to build the image. In our case we use the current directory, . , and Docker will expect a file called Dockerfile to be in this directory (the location or name of the Dockerfile can be overridden using the -f flag). The context will also serve as the root directory for any references to files from the host, but there will be more to come on this in later posts. After running the build command, give your machine a moment to create the image, then let’s run it:

docker container run -i {your github username}/curl

If all goes according to plan, you should see the container start up, run the curl command we set using CMD , then return you to your host’s command prompt.
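Two of the options mentioned above, the -f flag and the default set by CMD , can be sketched as follows. The image name, the Dockerfile path, and the alternative city are placeholders, and the DOCKER variable and both functions are hypothetical helpers introduced for illustration.

```shell
# DOCKER defaults to the real CLI; set DOCKER=echo to preview commands only.
# (Hypothetical convenience variable, not a Docker feature.)
DOCKER="${DOCKER:-docker}"

# Build an image from a Dockerfile that is not named "Dockerfile" or does
# not live at the root of the build context.
#   $1 = image name, $2 = path to the Dockerfile, $3 = build context
build_with_file() {
  "$DOCKER" image build -t "$1" -f "$2" "$3"
}

# Run a container, replacing the image's default CMD with another command.
# --rm removes the container again once the command has finished.
#   $1 = image name, remaining arguments = command to run instead of CMD
run_with_command() {
  image="$1"
  shift
  "$DOCKER" container run --rm "$image" "$@"
}
```

For example, run_with_command {your github username}/curl curl wttr.in/London would use the same image but ask for a different city, because anything passed after the image name takes the place of CMD .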

Final Thoughts

This post will hopefully serve as a good jumping-off point for my next post, in which we will go into detail about how we use Docker at Cars.com and review the common design patterns that have arisen from that usage and that can benefit your CI/CD pipelines.