Do you have thousands of infrastructure-dependent scripts with huge development and testing configurations to handle?

All of them are old excuses conspiring against the Continuous Delivery concept. Must it always be so? No. The following post will explain why all those excuses are fading now thanks to Docker.

It’s early Wednesday morning in Chicago. Yesterday the developers and QA engineers completed a new software feature, and today they’re deploying it into production. The company now has the newest feature in its web application 24 hours ahead of all competitors. How is it possible?

Here is where Docker comes in.

The same issues come up again and again when deploying a web app, and they can make the process an absolute nightmare, even on Unix-based systems.

Docker enables you to build isolated and repeatable development environments. You can have your development environment configured as an exact mirror of production, and deploy your web app into a container without having to worry about configurations or dependencies. It sounds amazing…

But, what is Docker itself?

Docker is a container-based virtualization approach that builds on Linux Container technology in order to enable better application portability across platforms. Docker containers are also lightweight: unlike virtual machines, each container doesn’t have to run its own full guest operating system.

Docker currently uses Linux Containers (LXC), which rely on features built into the Linux kernel to isolate operating system processes. This allows containers to share the host operating system’s resources. Moreover, Docker uses AuFS for the file system and manages the networking for you. Pretty nice, isn’t it?

AuFS is a layered file system. It takes an existing file system and transparently overlays it, storing only the changes that you apply to it. It allows files and directories from separate file systems to coexist under a single roof. AuFS can merge several directories and provide a single merged view of them. Thus, it’s possible to keep the common parts of the operating system read-only (and shared among all of your containers) and then give each container its own mount for writing.
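The layering idea can be sketched in a few lines of Python. This is only an analogy, not how AuFS is implemented: plain dictionaries stand in for directories, and the file names and contents are made up for illustration. The point is that reads fall through to the shared read-only layer, while writes stay in each container’s private layer.

```python
from collections import ChainMap

# Read-only base layer shared by every container (hypothetical file contents).
base_image = {
    "/bin/sh": "shell binary",
    "/etc/hostname": "base",
}


def new_container(base):
    """Return a merged view: a private writable layer stacked on the shared base."""
    writable_layer = {}
    # ChainMap looks keys up layer by layer, but writes only to the first mapping.
    return ChainMap(writable_layer, base)


web = new_container(base_image)
db = new_container(base_image)

# A write in one container lands in that container's own layer only.
web["/etc/hostname"] = "web-1"

print(web["/etc/hostname"])         # web-1
print(db["/etc/hostname"])          # base
print(base_image["/etc/hostname"])  # base
print(web.maps[0])                  # {'/etc/hostname': 'web-1'}
```

Both containers still read /bin/sh from the same shared base, which is what keeps per-container storage small.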

A fully virtualized system gets its own set of resources allocated to it, and does minimal sharing. You get more isolation, but it is much heavier and requires more resources. With LXC, by contrast, you get less isolation in exchange for a solution that is more lightweight and requires fewer resources.

Let’s say you have a container image that is 1GB in size. If you wanted to use full VMs, you would need 1GB multiplied by the number of VMs you want. With LXC and AuFS you can share the bulk of that 1GB, so even with 1000 containers you might still use only a little over 1GB of space for the containers’ OS, assuming they are all running the same OS image.
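Here is that arithmetic as a rough Python sketch. The 1GB image size comes from the example above; the 100KB of private changes per container is purely our assumption, and real numbers depend on what each container writes.

```python
IMAGE_SIZE_KB = 1024 * 1024     # the 1GB image from the example
PER_CONTAINER_DELTA_KB = 100    # assumed: ~100KB of private changes per container


def full_vm_storage_kb(count):
    """Every VM carries its own complete copy of the image."""
    return IMAGE_SIZE_KB * count


def layered_storage_kb(count):
    """One shared read-only image plus a small writable layer per container."""
    return IMAGE_SIZE_KB + PER_CONTAINER_DELTA_KB * count


for count in (1, 10, 1000):
    vm_gb = full_vm_storage_kb(count) / IMAGE_SIZE_KB
    layered_gb = layered_storage_kb(count) / IMAGE_SIZE_KB
    print(f"{count:>5} instances: VMs {vm_gb:.2f} GB, containers {layered_gb:.2f} GB")
```

Under these assumptions, 1000 VMs need 1000GB while 1000 containers need roughly 1.1GB.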

A fully virtualized system usually takes minutes to start. LXC containers take seconds, and sometimes even less than a second!

What’s the difference between LXCs and VMs?

VM: A virtual machine emulates a physical computing environment, but its requests for CPU, memory, hard disk, network, and other hardware resources are managed by a virtualization layer that translates these requests to the underlying hardware. In this context the VM is called the ‘Guest’ while the environment it runs on is called the ‘Host.’

LXCs: Linux Containers (LXC) are operating system-level capabilities that make it possible to run multiple isolated Linux containers on one control host (the LXC Host). LXC is a lightweight alternative to virtual machines, as it doesn’t require a hypervisor.

Docker has prompted some to argue that cloud computing environments should abandon virtual machines (VMs) and replace them with containers, due to their lower overhead and potentially better performance.

Dependencies? Not a problem…

Nowadays more and more applications are built on top of other components and external services. Each of these comes with its own dependencies, which may conflict with one another. Changing one component might easily break another. With Docker, the days of “dependency hell” have come to an end.

The question Docker asks is: Why should we try to solve these problems when we can avoid them beforehand? By packaging each component and its dependencies into separate containers, Docker manages to solve the following:

Two websites with different PHP versions? Not a problem. Simply run a different container for each version.

Missing dependencies? Each Docker image already comes with all the dependencies wrapped together and ready for use.

Moving from one distribution to another? Again, not a problem! As long as both run Docker, the same container will start cleanly right away.
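To make the first case concrete, here is a small Python sketch that builds (but does not execute) one `docker run` command line per site. The container names, host ports, and the `php:5.6-apache` and `php:7.4-apache` image tags are our own illustrative choices, so swap in whatever versions your sites actually need.

```python
import shlex


def docker_run_command(name, image, host_port, container_port=80):
    """Build (but do not execute) a `docker run` command line as an argv list."""
    return [
        "docker", "run", "--detach",
        "--name", name,
        "--publish", f"{host_port}:{container_port}",
        image,
    ]


# Hypothetical sites: each one gets its own container and its own PHP version.
sites = [
    ("legacy-site", "php:5.6-apache", 8081),
    ("new-site", "php:7.4-apache", 8082),
]

for name, image, port in sites:
    print(shlex.join(docker_run_command(name, image, port)))
```

Each site brings its own PHP runtime inside its image, so neither version has to be installed on the host.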

Who uses Docker?

eBay Inc. uses Docker to implement an efficient, automated path from the developer’s laptop through testing and QA.

eBay Inc. has several different build clusters. They are primarily partitioned due to a number of factors: requirements for running different OS flavors (mostly RHEL and Ubuntu), software version conflicts, associated application dependencies, and special hardware. When using Mesos, the team tries to operate on a single cluster with heterogeneous workloads instead of having specialized clusters. Docker provides a good solution to isolate the different dependencies inside the container, irrespective of the host setup where the Mesos slave is running. This helps them operate on a single cluster.

Baidu, Inc. is the number one Chinese-language Internet search provider. They have a broad portfolio of products, including social-networking products, music products, mobile-related products, and other products and services.

“We were drawn to Docker because it replaces sandboxing with containerization. Docker provides a multi-language, agile, and cost effective solution that provides our developers with the flexibility needed to support a growing number of frameworks and applications.” (Yifei Chen, Baidu App Engine Team Lead)

Gilt Groupe, Inc., a leading online shopping company, operates an online flash sale site in the United States.

“At Gilt, we are moving all of our software to run on Docker’s platform. Gilt runs on a very modern micro services architecture. Docker helps us keep services isolated and simplifies our continuous delivery pipeline that in turn encourages innovation and experimentation across all of our teams.” (Michael Bryzek, Gilt Groupe Founder and Chief Technology Officer)

Spotify streams music to more than 40 million users in 57 countries around the world.

“Docker is changing the way we deploy services and run our data centers. We are accelerating our continuous delivery process by leveraging Docker containers for testing and deployment, and internal teams who have switched to Helios, our open sourced and Docker-powered platform, are experiencing productivity gains within weeks of adoption.” (Simon Cohen, Spotify Team Lead)

New Relic is a Software Analytics company that analyses billions of metrics across millions of apps.

“Docker’s platform has become an important part of how we build and deploy new services at New Relic. For example, when we launched the New Relic Insights product, Docker gave our development team more independence and let them focus on bringing a great piece of software live on an aggressive schedule. Docker has helped us deliver new services faster than ever before.” (Nic Benders, Director of Site Engineering)

Now, it’s your turn to adopt Docker.

Docker has become a tool that fits well with current trends, like using the language of the web for application integration and leveraging application composition and SOA patterns for developing big web applications. However, handling all the small details can be painful. Imagine having to deploy a dozen services independently. The effort and investment required are considerable, but Docker provides a minimal footprint and doesn’t tie you to large application servers that have difficulty scaling. It also helps that you don’t have to spin up hundreds of VMs.

The path for adoption may vary depending on the current maturity of your organization, but don’t be deterred by this. Even freelancers make good use of this wonderful technology.

We don’t encourage you to start running Docker right away when building real-life applications. You will need certain skills in order to build applications, handle continuous delivery practices, and maintain automatic infrastructure provisioning. You should also know about virtualization, and your organization will need to know how to deploy things to the cloud.

Let’s talk about what we think will be the first step in Docker adoption and how you should develop your application. The key is composition.

The Big and Scary Monolithic Application: In some instances you must support existing applications. The first step is to start splitting application functionality into smaller pieces. This can be accomplished by simply using programming interfaces and separating concerns using contracts. After you have done this, you can start moving those pieces into containers and connecting them. The implementations behind those interfaces will then make REST calls.

Greenfield Applications: The industry is already leveraging continuous delivery tools, which is why the path to automatic provisioning is getting easier. But if you don’t follow these practices already, you will need to spend some time setting up all the required infrastructure. This may be difficult at first, but it will make building new containerized applications a breeze.
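For the monolithic case, the “interface first, REST later” step might look like the following Python sketch. Everything here is hypothetical: the `PriceService` contract, the class names, and the `/prices/<sku>` endpoint returning JSON such as `{"price": 12.5}` are invented for illustration, not taken from any real system.

```python
import json
from typing import Protocol
from urllib.request import urlopen


class PriceService(Protocol):
    """The contract the rest of the application codes against."""

    def price_for(self, sku: str) -> float: ...


class InProcessPriceService:
    """Step 1: the monolith's existing logic, hidden behind the contract."""

    def __init__(self, prices):
        self._prices = prices

    def price_for(self, sku):
        return self._prices[sku]


class RestPriceService:
    """Step 2: same contract, now backed by a separate container over HTTP.

    Assumes a hypothetical service that answers GET <base_url>/prices/<sku>
    with a JSON body like {"price": 12.5}.
    """

    def __init__(self, base_url):
        self._base_url = base_url.rstrip("/")

    def price_for(self, sku):
        with urlopen(f"{self._base_url}/prices/{sku}") as response:
            return float(json.load(response)["price"])


def checkout_total(service: PriceService, skus):
    """Caller code: unaware of whether prices are local or remote."""
    return sum(service.price_for(sku) for sku in skus)


local = InProcessPriceService({"book": 12.5, "pen": 1.5})
print(checkout_total(local, ["book", "pen"]))  # 14.0
```

Once the pricing logic runs in its own container, only the object you pass to `checkout_total` changes; the calling code stays the same.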

We’ve already talked about the need to split applications into smaller pieces. Doing so requires you to manage how those pieces interact with one another and how they are deployed.

Once you have a couple of services ready, you will start to notice the burden of handling the delivery process for multiple components. An orchestration tool should give you the ability to deploy several containers, configure the network topology, and allow containers to scale properly. One of the most important qualities to look for in this kind of tool is the capacity to scale from a single machine or VM to multiple machines or data centers.

Several solutions, known as scheduling frameworks, address these requirements. Given that Docker allows a process to run, you are now basically scheduling a task. Some of these tools are part of the Docker ecosystem, like Docker Machine and Docker Swarm. Others come from big players, like Kubernetes, Amazon ECS, and Mesos.

Once you have the means of deploying several containers, it’s time to have them talking to each other. For tiny applications that don’t span multiple machines, simply link containers. This will expose HTTP ports in a way that allows an application to contact others by making an HTTP request to that port. As good as that sounds, it will add some configuration complexity by requiring you to store all the ports in a configuration while also having to worry about collisions.
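The bookkeeping described above can be sketched as a tiny port table in Python. The `PortRegistry` class and the service names are hypothetical; the sketch only shows why a central record of ports helps catch collisions on a single host.

```python
class PortRegistry:
    """Single-host table of which service owns which published port."""

    def __init__(self):
        self._owner_by_port = {}

    def assign(self, service, port):
        """Record a port for a service, refusing ports owned by another service."""
        owner = self._owner_by_port.get(port)
        if owner is not None and owner != service:
            raise ValueError(f"port {port} is already used by {owner}")
        self._owner_by_port[port] = service

    def url_for(self, service):
        """Return the HTTP address other containers on this host should call."""
        for port, owner in self._owner_by_port.items():
            if owner == service:
                return f"http://localhost:{port}"
        raise KeyError(service)


ports = PortRegistry()
ports.assign("catalog", 8081)
ports.assign("orders", 8082)
print(ports.url_for("orders"))  # http://localhost:8082

try:
    ports.assign("billing", 8081)
except ValueError as error:
    print(error)  # port 8081 is already used by catalog
```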

Things can get out of control when you scale beyond a single host. That’s when techniques like an Ambassador container are better suited to handle these kinds of setups.

Let’s go one step further and consider other solutions that may apply to you, like a distributed configuration system such as ZooKeeper or etcd. Of course, it will depend on your use case, but in this situation a service registry tool is useful. Having a common registry for all other containers is one of the preferred approaches. This will store all the network configurations for a particular application. Some registries even work as distributed configuration stores, thus killing two birds with one stone.
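As a rough picture of what such a registry does, here is an in-memory stand-in written in Python. It is not the ZooKeeper or etcd API; the `ServiceRegistry` class, the service name, and the addresses are all made up, and a real registry would add persistence, health checks, and replication.

```python
class ServiceRegistry:
    """In-memory stand-in for a shared registry of service endpoints."""

    def __init__(self):
        self._instances = {}

    def register(self, name, host, port):
        """Called by a container when it starts."""
        self._instances.setdefault(name, set()).add((host, port))

    def deregister(self, name, host, port):
        """Called when a container stops or is found to be dead."""
        self._instances.get(name, set()).discard((host, port))

    def lookup(self, name):
        """Return every known endpoint for a service, in a stable order."""
        return sorted(self._instances.get(name, set()))


registry = ServiceRegistry()
registry.register("orders", "10.0.0.5", 8080)
registry.register("orders", "10.0.0.6", 8080)
print(registry.lookup("orders"))  # [('10.0.0.5', 8080), ('10.0.0.6', 8080)]

registry.deregister("orders", "10.0.0.5", 8080)
print(registry.lookup("orders"))  # [('10.0.0.6', 8080)]
```

Clients ask the registry for a service by name instead of hard-coding hosts and ports, so the same lookup works on one machine or across many.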

Containers are generally ephemeral in essence. Although this is not always true, it is a great way to design your applications, especially given that a container is your unit of failure or your means to scale. If a container fails, you will probably want to at least restart it or to have it running elsewhere if your machine gets compromised.

Once containers have accomplished their work, it’s nice to think of them as disposable units. So how do we resolve our storage requirements? You can use Docker volumes, which persist beyond the lifetime of a single container, and you can of course mount a host machine directory into the container. There are also more options when working in the cloud, like using storage buckets.

But for database requirements we don’t recommend that you switch to a fully Dockerized environment unless you really know what you are doing. There are some efforts out there that have to do with rethinking databases and splitting them for better horizontal scale, but that goes beyond the scope of this post.



Current Panorama

Every day, containers are gaining more support from the big companies out there, and more uses are starting to appear, such as using them as an alternative to VMs. For example, a platform-as-a-service cloud needs scalability and interoperability between many applications. Here, a Docker container is simpler to set up, run, and manage than a VM.

The Docker community continues to grow, creating and cultivating generic service containers that anyone can use as starting points. The fact that any of these containers can be run on any system that runs Docker shows just what an incredible engineering achievement Docker really is.

Writers: Eduardo Rodríguez, Diego Marinelli, Marcelo Cejas, Francisco Chiotta

Collaborators: Gustavo Cipriani, Pablo Frías, Alejandro Garro, Mark Noce
