There comes a point in time in every hacker's career when the need to compile a program for an alternate CPU architecture arises. Perhaps you want to compile a program for your Raspberry Pi project, create a custom image for an embedded device, or support your software across multiple platforms. Perhaps you just want to learn how the process works, or are curious what the assembly code looks like for architectures other than the ubiquitous x86-64/amd64 on desktop PCs.

Either way, you were typically required to pack your bags and attempt the programmer's equivalent of a spiritual pilgrimage; but instead of ascending to a forlorn mountaintop, you would begin a hellish descent, a journey that took you from the sunny plains of application development down into the dark caverns of the computing stack: the elusive world of low-level systems and embedded programming. Given the questionable prospects of this trek, most hackers who attempt the journey end up hitting Ctrl+Z and dashing back to the surface, gasping madly for air as they warn their colleagues of the terrors of cross-compilation, QEMU, and chroots.

OK, maybe I'm exaggerating a bit. But the truth is, it's still not straightforward to build your programs for other CPU architectures. Thankfully, this story has significantly improved with the recent introduction of a new experimental plugin in Docker 19.03 that makes multi-arch builds easier than ever.

In order to appreciate the significance of Docker's new multi-arch build support, we first need to learn a bit about building programs for foreign architectures.

Background — Methods for compiling programs for foreign architectures

Note: If you're familiar with this concept already or just want to get some damn images built already, feel free to skip this section.

Let's take a quick survey of the methods that exist today for compiling programs that target foreign architectures.

Method #1 — Build on the target hardware itself

If you have access to hardware for the target architecture, and the OS has support for all the build tooling you need, you can just compile the program directly on the hardware itself.

For our specific use-case of building multi-arch Docker images, you could, for instance, install the Docker runtime on a Raspberry Pi and build your app's Dockerfile directly on the Pi just like you normally would on your developer machine. This is possible because Raspbian, the Raspberry Pi's official OS, supports installing Docker natively.

But what if you don't have convenient access to your target hardware? Is there any way we can somehow build programs for non-native architectures directly on our developer workstation?

And that brings us to...

Method #2 — Emulate the target hardware

Do you remember the good ol' 16-bit days with the Super Nintendo? I was just a toddler at the time, but when I grew a little older, I discovered the reverence with which gamers would reminisce about classic games like Super Mario World and Chrono Trigger. I never had the chance to own a SNES, but thanks to emulators like ZSNES, I was able to travel back in time and experience the delight of playing those classic games, all from the comfort of my 32-bit PC.

It turns out that we can use emulation to not only play video games, but to build non-native binaries as well. Instead of using ZSNES, we can use a much more powerful and flexible emulator: QEMU. QEMU is a free and open-source emulator that supports many common architectures including ARM, PowerPC, and RISC-V. By running in full system emulation mode, you can run a generic ARM virtual machine that can boot Linux, set up your development environment as usual, and compile your app from within the VM.

But if you think about it, full system emulation seems a bit wasteful. In this mode, QEMU will emulate an entire system, including hardware like timers, memory controllers, bus controllers like SPI and I2C, etc. — but the binaries we're compiling most of the time don't care at all about these hardware-specific features. Can we do better?

Method #3 — Emulate just the target's user-space via binfmt_misc

On Linux, QEMU has an alternative operating mode that can run Linux binaries compiled for non-native architectures via user-mode emulation. This mode skips the overhead of emulating the entire target system hardware as in Method #2. Rather, QEMU will register binary format handlers with the Linux kernel via binfmt_misc and interpret the foreign binary transparently when it is run, converting system calls from the target to the host system as needed. The end result for a user is that it appears that she can run foreign binaries "natively".

Using user-mode emulation and QEMU, we could install a foreign Linux distribution via lightweight virtualization (chroot or container) and build our binary as though we were building natively on our target.

We will soon see that this has become the method of choice for multi-arch Docker images.

Method #4 — Use a cross-compiler

Finally, we have the standard method used in the embedded systems community: cross-compilation.

A cross-compiler is a compiler that is specifically built to run on a given host architecture, yet output binaries for a different target architecture. For example, you may have a C++ cross-compiler for an amd64 host that targets an embedded device (perhaps a smartphone or something) that is aarch64 (64-bit ARM). To give you a real-world example of this, consider the billions of Android devices around the world that have their software built precisely with this method.

Performance-wise, this approach is just as efficient as building on the target hardware itself (Method #1) since it runs without emulation. But the complexity of cross-compilation varies depending on your language (it's super easy with Go).

Confused yet? It gets more complicated with Docker images...

Keep in mind that all of these compilation hoops we had to jump through were just to generate a single program binary. In the modern age of containers, when we throw Docker images into the mix, we are not only talking about building single binaries, we are talking about building an entire foreign container image! It's even more annoying than before!

If all of this sounds like a pain in the ass, don't feel bad, because it kind of is a pain in the ass to build binaries for non-native platforms. Add the complexity of Docker on top of that and it seems like something best left to the experts.

But thanks to a new experimental extension in the latest Docker runtime, building multi-arch images is now easier than ever.

Building multi-arch Docker images

To build multi-arch Docker images the easy way, we can make use of a recently announced Docker extension called buildx. buildx is the next-generation frontend for the standard docker build ... command that we are all so familiar with for building Docker images. buildx extends the standard functionality of docker build by leveraging the full functionality of BuildKit, the new backend build system for Docker.

Let's see how we can use buildx to cook up some multi-arch images in just a few minutes.

Step 1 — Enable buildx

To use buildx, make sure your Docker runtime is at least version 19.03. buildx actually comes bundled with Docker by default, but needs to be enabled by setting the environment variable DOCKER_CLI_EXPERIMENTAL. Let's enable it for our current terminal session by running:

$ export DOCKER_CLI_EXPERIMENTAL=enabled

Verify that you now have access to buildx by checking your version:

$ docker buildx version
github.com/docker/buildx v0.3.1-tp-docker 6db68d029599c6710a32aa7adcba8e5a344795a7

Optional: Build from source

If you want the bleeding edge release or in case setting DOCKER_CLI_EXPERIMENTAL is not working for you (I couldn't get it to work on Arch Linux for example), you can always build from source:

$ export DOCKER_BUILDKIT=1
$ docker build --platform=local -o . https://github.com/docker/buildx.git
$ mkdir -p ~/.docker/cli-plugins && mv buildx ~/.docker/cli-plugins/docker-buildx

Step 2 — Enable binfmt_misc to run non-native Docker images

If you're using Docker Desktop (Mac and Windows), you can skip this step because binfmt_misc is set up by default.

If you're on Linux, you need to set up binfmt_misc. This is pretty easy in most distributions, but is even easier now that you can just run a privileged Docker container to set it up for you:

$ docker run --rm --privileged docker/binfmt:66f9012c56a8316f9244ffd7622d7c21c1f6f28d

Verify that binfmt_misc is set up correctly by inspecting the QEMU handlers:

$ ls -al /proc/sys/fs/binfmt_misc/
total 0
drwxr-xr-x 2 root root 0 Nov 12 09:19 .
dr-xr-xr-x 1 root root 0 Nov 12 09:16 ..
-rw-r--r-- 1 root root 0 Nov 12 09:25 qemu-aarch64
-rw-r--r-- 1 root root 0 Nov 12 09:25 qemu-arm
-rw-r--r-- 1 root root 0 Nov 12 09:25 qemu-ppc64le
-rw-r--r-- 1 root root 0 Nov 12 09:25 qemu-s390x
--w------- 1 root root 0 Nov 12 09:19 register
-rw-r--r-- 1 root root 0 Nov 12 09:19 status

And verify that the handlers are enabled, for example:

$ cat /proc/sys/fs/binfmt_misc/qemu-aarch64
enabled
interpreter /usr/bin/qemu-aarch64
flags: OCF
offset 0
magic 7f454c460201010000000000000000000200b7
mask ffffffffffffff00fffffffffffffffffeffff
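If you're wondering what the magic and mask values are for: they are how the kernel recognizes which files to hand to the interpreter. It compares the first bytes of the file against magic, looking only at the bits that are set in mask. Here is a minimal sketch of that comparison in Go, using the two hex strings from the qemu-aarch64 handler above. The matches, elfHeader, and mustHex helpers are hypothetical names for this illustration, and the two sample headers are hand-built rather than read from real binaries.

```go
package main

import (
	"encoding/hex"
	"fmt"
)

// matches reports whether the leading bytes of a file satisfy a binfmt_misc
// rule: each header byte must equal the magic byte in every bit position
// that the mask selects.
func matches(header, magic, mask []byte) bool {
	if len(header) < len(magic) || len(mask) != len(magic) {
		return false
	}
	for i := range magic {
		if (header[i]^magic[i])&mask[i] != 0 {
			return false
		}
	}
	return true
}

// elfHeader builds the first 20 bytes of a little-endian 64-bit ELF header.
// Hypothetical helper for this example only.
func elfHeader(osabi, etype, machine byte) []byte {
	h := make([]byte, 20)
	copy(h, []byte{0x7f, 'E', 'L', 'F', 2, 1, 1, osabi})
	h[16] = etype   // e_type, low byte
	h[18] = machine // e_machine, low byte
	return h
}

// mustHex decodes a hex string and panics on malformed input.
func mustHex(s string) []byte {
	b, err := hex.DecodeString(s)
	if err != nil {
		panic(err)
	}
	return b
}

func main() {
	// Values copied from /proc/sys/fs/binfmt_misc/qemu-aarch64.
	magic := mustHex("7f454c460201010000000000000000000200b7")
	mask := mustHex("ffffffffffffff00fffffffffffffffffeffff")

	arm64PIE := elfHeader(3, 3, 0xb7) // OSABI=Linux, ET_DYN, EM_AARCH64
	amd64Exe := elfHeader(0, 2, 0x3e) // OSABI=none, ET_EXEC, EM_X86_64

	fmt.Println("aarch64 binary matches:", matches(arm64PIE, magic, mask)) // true
	fmt.Println("amd64 binary matches:", matches(amd64Exe, magic, mask))   // false
}
```

The 00 byte in the mask means the OS ABI byte is ignored, and the fe byte lets both regular executables (type 2) and position-independent ones (type 3) through. The trailing b7 is what pins the rule to aarch64.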

Step 3 — Switch from the default Docker builder to a multi-arch builder

By default, Docker will use the old builder instance without multi-arch support.

To create a new builder with multi-arch support, run:

$ docker buildx create --use --name mybuilder

Verify that our new builder is selected:

$ docker buildx ls
NAME/NODE    DRIVER/ENDPOINT             STATUS   PLATFORMS
mybuilder *  docker-container
  mybuilder0 unix:///var/run/docker.sock inactive
default      docker
  default    default                     running  linux/amd64, linux/arm64, linux/ppc64le, linux/s390x, linux/386, linux/arm/v7, linux/arm/v6

That's it — now Docker will use our new builder that's capable of building for multiple platforms.

Step 4 — Build a multi-arch image

OK, now we can finally build a multi-arch image! To do that, we'll first need an example app.

Let's create a simple Go program that echoes back the host's runtime architecture:

$ cat hello.go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	fmt.Printf("Hello, %s!\n", runtime.GOARCH)
}

And let's create a Dockerfile to containerize this app:

$ cat Dockerfile
FROM golang:alpine AS builder
RUN mkdir /app
ADD . /app/
WORKDIR /app
RUN go build -o hello .

FROM alpine
RUN mkdir /app
WORKDIR /app
COPY --from=builder /app/hello .
CMD ["./hello"]

This is a multi-stage Dockerfile that builds our app with the Go compiler and creates a minimal Alpine Linux image with the resulting binary.

Now let's build a multi-arch image with buildx that supports arm, arm64, and amd64, and push it to Docker Hub all in one go:

$ docker buildx build -t mirailabs/hello-arch --platform=linux/arm,linux/arm64,linux/amd64 . --push

Yup, that's it. We now have a multi-arch Docker image for arm, arm64, and amd64 available on Docker Hub! When you run docker pull mirailabs/hello-arch, Docker will take care of fetching the matching image for your host architecture.

How does this buildx magic work, you ask? Well, behind the scenes, buildx builds three Docker images (one for each of arm, arm64, and amd64) using QEMU and binfmt_misc as needed. When it's done building, it will create a Docker manifest list which contains pointers to the three images. In other words, a "multi-arch image" is really just a manifest list with links to images built per architecture.

Step 5 — Test the multi-arch image

Let's quickly test our multi-arch image and make sure everything is working as expected. Since we have already set up binfmt_misc, we can actually run any of our images on our development machine, regardless of architecture.

First, we list the digests for each of our images:

$ docker buildx imagetools inspect mirailabs/hello-arch
Name:      docker.io/mirailabs/hello-arch:latest
MediaType: application/vnd.docker.distribution.manifest.list.v2+json
Digest:    sha256:bbb246e520a23e41b0c6d38b933eece68a8407eede054994cff43c9575edce96
Manifests:
  Name:      docker.io/mirailabs/hello-arch:latest@sha256:5fb57946152d26e64c8303aa4626fe503cd5742dc13a3fabc1a890adfc2683df
  MediaType: application/vnd.docker.distribution.manifest.v2+json
  Platform:  linux/arm/v7
  Name:      docker.io/mirailabs/hello-arch:latest@sha256:cc6e91101828fa4e464f7eddec3fa7cdc73089560cfcfe4af16ccc61743ac02b
  MediaType: application/vnd.docker.distribution.manifest.v2+json
  Platform:  linux/arm64
  Name:      docker.io/mirailabs/hello-arch:latest@sha256:cd0b32276cdd5af510fb1df5c410f766e273fe63afe3cec5ff7da3f80f27985d
  MediaType: application/vnd.docker.distribution.manifest.v2+json
  Platform:  linux/amd64

With these digests handy, we can run each image and observe the output:

$ docker run --rm docker.io/mirailabs/hello-arch:latest@sha256:5fb57946152d26e64c8303aa4626fe503cd5742dc13a3fabc1a890adfc2683df
Hello, arm!
$ docker run --rm docker.io/mirailabs/hello-arch:latest@sha256:cc6e91101828fa4e464f7eddec3fa7cdc73089560cfcfe4af16ccc61743ac02b
Hello, arm64!
$ docker run --rm docker.io/mirailabs/hello-arch:latest@sha256:cd0b32276cdd5af510fb1df5c410f766e273fe63afe3cec5ff7da3f80f27985d
Hello, amd64!

That was pretty easy, wasn't it?

Conclusion

To recap, in this post we learnt about the challenges of supporting software on multiple CPU architectures, and how buildx, an experimental extension to Docker's build engine, can solve some of these challenges for us. Using buildx, we were able to quickly build a multi-arch Docker image for arm, arm64, and amd64 without a single change to our Dockerfile, and push it up to Docker Hub, from where any Docker-supported platform could transparently pull down the correct image for its architecture.

In the future, it is likely that buildx capabilities will become part of the standard docker build command and we will end up taking these features for granted. The tales of descending into the depths of the computing stack to cross-compile programs will soon be nothing more than ghost stories from a more primitive age.

Go forth and multi-arch without fear!

References