Packer + Ansible - Dockerfile = AwesomeContainer

December 03, 2017

As a trendy software engineer, I use Docker because it’s a nice way to try software without the hassle of setting up an environment. But as an SRE/DevOps kinda guy, I also create my own images: for CI environments, for experimenting, and sometimes even for production.

We all know that Docker images are built with Dockerfiles, but in my not so humble opinion, Dockerfiles are silly: they are fragile, make bloated images and look like crap. For me, building Docker images was tedious, grumpy work until I found Ansible. Once you get your first Ansible playbook working, you never look back. I immediately felt grateful for Ansible’s simple automation tools and started using Ansible to provision Docker containers. Around that time I found the Ansible Container project and tried to use it, but in 2016 it was not ready for me. Soon after, I found HashiCorp’s Packer, which has Ansible provisioning support, and since then I have used this powerful combo to build all of my Docker images.

Below, I want to show you an example of how it all works together, but first let’s return to my point about Dockerfiles.

Why Dockerfiles are silly

In short, because each instruction in a Dockerfile creates a new layer. While it’s awesome to see the layered filesystem and be able to reuse layers for other images, in reality it’s madness. Your image size grows without control, and now you have a 2GB image for a Python app where 90% of the layers are never reused. So you don’t actually need all these layers.

To squash layers, you either take additional steps like invoking docker-squash, or you give as few instructions as possible. That’s why real production Dockerfiles contain way too many &&s: chaining commands with && inside one RUN creates a single layer.
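If you want to see how many layers an image really carries, docker history lists them. Here is a minimal sketch: count_layers is a hypothetical helper name, not a Docker command, and it counts every history entry, including metadata-only ones such as ENV or CMD that add no filesystem data.

```shell
# count_layers IMAGE: print the number of history entries for a local image.
# `docker history -q` prints one line (an ID or <missing>) per entry.
count_layers() {
    docker history -q "$1" | wc -l | tr -d ' '
}

# Usage (assumes a running Docker daemon and a locally available image):
#   count_layers redis:3.2
```

Comparing the number before and after chaining your RUN commands gives a rough idea of what the && trick buys you.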

To illustrate my point, look at the Dockerfiles for two of the most popular Docker images, Redis and nginx. The main part of each is a giant chain of commands with newline escaping, in-place config patching with sed, and cleanup as the last command. Here is the one from Redis:

    RUN set -ex; \
        \
        buildDeps=' \
            wget \
            \
            gcc \
            libc6-dev \
            make \
        '; \
        apt-get update; \
        apt-get install -y $buildDeps --no-install-recommends; \
        rm -rf /var/lib/apt/lists/*; \
        \
        wget -O redis.tar.gz "$REDIS_DOWNLOAD_URL"; \
        echo "$REDIS_DOWNLOAD_SHA *redis.tar.gz" | sha256sum -c -; \
        mkdir -p /usr/src/redis; \
        tar -xzf redis.tar.gz -C /usr/src/redis --strip-components=1; \
        rm redis.tar.gz; \
        \
    # disable Redis protected mode [1] as it is unnecessary in context of Docker
    # (ports are not automatically exposed when running inside Docker, but rather explicitly by specifying -p / -P)
    # [1]: https://github.com/antirez/redis/commit/edd4d555df57dc84265fdfb4ef59a4678832f6da
        grep -q '^#define CONFIG_DEFAULT_PROTECTED_MODE 1$' /usr/src/redis/src/server.h; \
        sed -ri 's!^(#define CONFIG_DEFAULT_PROTECTED_MODE) 1$!\1 0!' /usr/src/redis/src/server.h; \
        grep -q '^#define CONFIG_DEFAULT_PROTECTED_MODE 0$' /usr/src/redis/src/server.h; \
    # for future reference, we modify this directly in the source instead of just supplying a default configuration flag because apparently "if you specify any argument to redis-server, [it assumes] you are going to specify everything"
    # see also https://github.com/docker-library/redis/issues/4#issuecomment-50780840
    # (more exactly, this makes sure the default behavior of "save on SIGTERM" stays functional by default)
        \
        make -C /usr/src/redis -j "$(nproc)"; \
        make -C /usr/src/redis install; \
        \
        rm -r /usr/src/redis; \
        \
        apt-get purge -y --auto-remove $buildDeps

All of this madness is for the sake of avoiding layer creation. And that’s where I want to ask a question: is this the best way to do things in 2017? Really? To me, all these Dockerfiles look like a poor man’s bash script. And gosh, I hate bash. But on the other hand, I like containers, so I need a neat way to fight this insanity.

Ansible in Dockerfile

Instead of putting in raw bash commands, we can write a reusable Ansible role and invoke it from a playbook that runs inside the Docker container to provision it.

This is how I do it:

    FROM debian:9

    # Bootstrap Ansible via pip
    RUN apt-get update && apt-get install -y wget gcc make python python-dev python-setuptools python-pip libffi-dev libssl-dev libyaml-dev
    RUN pip install -U pip
    RUN pip install -U ansible

    # Prepare Ansible environment
    RUN mkdir /ansible
    COPY . /ansible
    ENV ANSIBLE_ROLES_PATH /ansible/roles
    ENV ANSIBLE_VAULT_PASSWORD_FILE /ansible/.vaultpass

    # Launch Ansible playbook from inside container
    RUN cd /ansible && ansible-playbook -c local -v mycontainer.yml

    # Cleanup
    RUN rm -rf /ansible
    RUN for dep in $(pip show ansible | grep Requires | sed 's/Requires: //g; s/,//g'); do pip uninstall -y $dep; done
    RUN apt-get purge -y python-dev python-pip
    RUN apt-get autoremove -y && apt-get autoclean -y && apt-get clean -y
    RUN rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* /usr/share/doc/*

    # Environment setup
    ENV HOME /home/test
    WORKDIR /
    USER test

    CMD ["/bin/bash"]

Drop this Dockerfile into the root of your Ansible repo and it will build a Docker image using your playbooks, roles, inventory and vault secrets.

It works and it’s reusable: for example, I have some base roles that are applied both to Docker containers and to bare metal machines, and provisioning is easier to maintain in Ansible. But still, it feels awkward.

Packer with Ansible provisioner

So I went a step further and started to use Packer. Packer is a tool built specifically for creating machine images. It can build not only container images but also VM images for cloud providers like AWS and GCP.

It immediately hooked me with these lines in the documentation:

Packer builds Docker containers without the use of Dockerfiles. By not using Dockerfiles, Packer is able to provision containers with portable scripts or configuration management systems that are not tied to Docker in any way. It also has a simple mental model: you provision containers much the same way you provision a normal virtualized or dedicated server.

That’s what I wanted to achieve previously with my Ansiblized Dockerfiles.

So let’s see how we can build a Redis image that is almost identical to the official one.

Building Redis image with Packer and Ansible

First, let’s create a playground dir:

$ mkdir redis-packer && cd redis-packer

Packer is controlled with a declarative configuration in JSON format. Here is ours:

    {
      "builders": [{
        "type": "docker",
        "image": "debian:jessie-slim",
        "commit": true,
        "changes": [
          "VOLUME /data",
          "WORKDIR /data",
          "EXPOSE 6379",
          "ENTRYPOINT [\"docker-entrypoint.sh\"]",
          "CMD [\"redis-server\"]"
        ]
      }],
      "provisioners": [{
        "type": "ansible",
        "user": "root",
        "playbook_file": "provision.yml"
      }],
      "post-processors": [[
        {
          "type": "docker-tag",
          "repository": "docker.io/alexdzyoba/redis-packer",
          "tag": "latest"
        }
      ]]
    }

Put this in a redis.json file and let’s figure out what all of this means.

First, we describe our builders, that is, what kind of image we’re going to build. In our case, it’s a Docker image based on debian:jessie-slim. commit: true tells Packer to commit the container to an image after all the setup is done, and the changes list holds Dockerfile instructions that are applied to that committed image. The alternative to committing is exporting to a tar archive with the export_path option.

Next, we describe our provisioner, and that’s where Ansible steps into the game. Packer supports Ansible in 2 modes: local and remote.

Local mode ("type": "ansible-local") means that Ansible will be launched inside the Docker container, just like in my previous setup. But Packer won’t install Ansible for you, so you have to do it yourself with a shell provisioner, similar to my Ansible bootstrapping in the Dockerfile.

Remote mode means that Ansible runs on your build host and connects to the container via SSH, so you don’t need a full-blown Ansible installation in the Docker container, just a Python interpreter.

So, I’m using remote Ansible, which connects as the root user and launches the provision.yml playbook.

After provisioning is done, Packer does post-processing. I’m only tagging the image, but you can also push it to a Docker registry.

Now let’s see the provision.yml playbook:

    ---
    - name: Provision Python
      hosts: all
      gather_facts: no
      tasks:
        - name: Bootstrap python
          raw: test -e /usr/bin/python || (apt-get -y update && apt-get install -y python-minimal)

    - name: Provision Redis
      hosts: all
      tasks:
        - name: Ensure Redis configured with role
          import_role:
            name: alexdzyoba.redis

        - name: Create workdir
          file:
            path: /data
            state: directory
            owner: root
            group: root
            mode: 0755

        - name: Put runtime programs
          copy:
            src: files/{{ item }}
            dest: /usr/local/bin/{{ item }}
            mode: 0755
            owner: root
            group: root
          with_items:
            - gosu
            - docker-entrypoint.sh

    - name: Container cleanup
      hosts: all
      gather_facts: no
      tasks:
        - name: Remove python
          raw: apt-get purge -y python-minimal && apt-get autoremove -y

        - name: Remove apt lists
          raw: rm -rf /var/lib/apt/lists/*

The playbook consists of 3 plays:

1. Provision Python for Ansible
2. Provision Redis using my role
3. Container cleanup

To provision a container (or any other host) with Ansible, we need Python installed on it. But how do you install Python via Ansible for Ansible? There is a special Ansible raw module for exactly this case: it doesn’t require a Python interpreter on the target because it runs bare shell commands over SSH. We need to invoke it with gather_facts: no to skip fact gathering, which is done in Python.
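The raw task in the first play relies on a plain shell idiom, test -e ... || (...), so that running the playbook again does not reinstall Python. A minimal sketch of that idiom follows; ensure_file is a hypothetical helper name used only for illustration and is not part of Ansible or the playbook.

```shell
# ensure_file PATH CMD...: run CMD only when PATH does not exist yet.
# This is the same "test -e X || install X" pattern as in the raw task.
ensure_file() {
    path=$1
    shift
    test -e "$path" || "$@"
}

# Usage mirroring the playbook (assumes root inside a Debian container):
#   ensure_file /usr/bin/python sh -c 'apt-get -y update && apt-get install -y python-minimal'
```

When the file is already there, the command is skipped and the helper returns success, which is why a second run is cheap.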

Redis provisioning is done with my Ansible role, which performs exactly the same steps as the official Redis Dockerfile: it creates the redis user and group, downloads the source tarball, disables protected mode, compiles it and does the after-build cleanup. Check out the details on GitHub.

Finally, we do the container cleanup by removing Python and cleaning up package management stuff.

There are only 2 things left: the gosu and docker-entrypoint.sh files. These files, along with the Packer config and the Ansible role, are available in my redis-packer GitHub repo.

Finally, all we do is launch it like this:

$GOPATH/bin/packer build redis.json
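Since a typo in the JSON only shows up when the build starts, it can help to run packer validate first. Here is a small sketch, assuming packer is on your PATH (the command above calls it via $GOPATH/bin instead); build_image is a hypothetical wrapper name.

```shell
# build_image TEMPLATE: validate the Packer template and build it
# only if validation succeeds.
build_image() {
    packer validate "$1" && packer build "$1"
}

# Usage:
#   build_image redis.json
```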

You can see example output in this gist

In the end, we get an image that is even a bit smaller than the official one:

    $ docker images
    REPOSITORY                          TAG     IMAGE ID      CREATED        SIZE
    docker.io/alexdzyoba/redis-packer   latest  05c7aebe901b  3 minutes ago  98.9 MB
    docker.io/redis                     3.2     d3f696a9f230  4 weeks ago    99.7 MB
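To check that the changes from redis.json (ENTRYPOINT, CMD, EXPOSE and so on) really ended up in the image metadata, docker inspect with a Go template can print individual fields. A minimal sketch, where image_config is a hypothetical helper name:

```shell
# image_config IMAGE FIELD: print one field of the image's Config
# section as JSON, e.g. Entrypoint, Cmd, ExposedPorts or WorkingDir.
image_config() {
    docker inspect --format "{{json .Config.$2}}" "$1"
}

# Usage (assumes the image was built and tagged as in redis.json):
#   image_config docker.io/alexdzyoba/redis-packer:latest Entrypoint
#   image_config docker.io/alexdzyoba/redis-packer:latest Cmd
```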

Any drawbacks?

Of course, my solution has its own drawbacks. First, you have to learn new tools: Packer and Ansible. But I strongly advise learning Ansible, because you’ll need it for other kinds of automation in your projects. And you DO automate your tasks, right?

The second drawback is that container building is now more involved, with all the Packer config, Ansible roles, playbooks and stuff. Counting lines of code, there are 174 lines now:

    $ (find alexdzyoba.redis -type f -name '*.yml' -exec cat {} \; && cat redis.json provision.yml) | wc -l
    174

While originally it was only 77:

    $ wc -l Dockerfile
    77 Dockerfile

Still, I would advise you to go down this path because:

- It’s reusable. You can apply the Redis role not only to the container but also to your EC2 instance, a bare metal server, or pretty much anything that runs Linux with SSH.
- It’s maintainable. Come back a few months later and you’ll still understand what’s going on, because the Packer config, playbook and role are structured and even commented. And you build the image with a simple packer build redis.json command that produces a ready, tagged image.
- It’s extensible. You can use pretty much the same role to provision Redis version 4.0.5 by simply passing the redis_version and redis_download_sha variables. No new Dockerfile needed.
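As a rough sketch of the reuse and extensibility points, this is how the same role could be pointed at an ordinary SSH host with a different Redis version. Everything except the two variable names is an assumption: redis-host.yml is a hypothetical playbook that only imports the alexdzyoba.redis role (provision.yml itself would remove Python from the host in its cleanup play), provision_redis_host is a made-up helper name, and the checksum is whatever the Redis release you pick publishes.

```shell
# provision_redis_host HOST VERSION SHA256: apply the hypothetical
# redis-host.yml playbook to one SSH host, overriding the role variables.
# ANSIBLE_PLAYBOOK lets you point at a different ansible-playbook binary.
provision_redis_host() {
    "${ANSIBLE_PLAYBOOK:-ansible-playbook}" -i "$1," redis-host.yml \
        -e "redis_version=$2" -e "redis_download_sha=$3"
}

# Usage (replace the last argument with the real checksum of the tarball):
#   provision_redis_host my-ec2-host 4.0.5 <sha256-of-redis-4.0.5.tar.gz>
```

The trailing comma after the host name makes ansible-playbook treat the -i value as an inline host list rather than an inventory file.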

Conclusion

So that’s my Docker image building setup for now. It works well for me and I kinda enjoy the process now. I would also like to look at Ansible Container again, but that will be another post, so stay tuned: this blog has an Atom feed, and I also post on Twitter as @AlexDzyoba.