tl;dr - After a bad kernel upgrade ( pacman -sYu ) on my Arch-powered server I decided to go back to Container Linux, after being equal parts annoyed by Arch and encouraged by the Press release put out by red hat. This time, I spent much more time with the Ignition config files in conjunction with kubeadm and ended up with a bootable master node. Feel free to check out the tldr at the end.

After the relatively recent acquisition of CoreOS by Red Hat I switched off of Container Linux, the linux distribution maintained by CoreOS for use in “cloud-native” heavily containerized environments. I (prematurely) thought that Red Hat was going to kill it off in support of their own Atomic project, but it looks like Container Linux will live on, according to their recent press release. The premise of CoreOS is really attractive – it’s basically a distribution that is built to do not much more than run containers and be relatively secure while doing so, also with a good update mechanism to prevent updates that break the machine.

Along with this cluster install I’m also going to go from Kubernetes 1.9 -> 1.10 (partly the reason I was messing around with the cluster to begin with). After the kernel upgrade on the old machine, the next reboot was prompted with an error saying that the iptables kernel module was not found (which I could only see through a direct KVM connection, sshd couldn’t running). This is basically unacceptable – updating with pacman -Syu should not break my system on a restart, and I’m super tired of it happening (it’s happened once before). I’m a bit dissapointed because outside of this, Arch is super stable, well documented, and a relatively minimal OS without too many security holes. It moves at the pace that’s just about perfect for me, but it looks like it isn’t the best fit for me, as a lazy consumer, and especially in the context of servers as cattle. So rather than an in-place k8s upgrade (1.9 -> 1.10), it turned into a full cluster re-install – I’m pretty glad I don’t have any “production” critical workflows.

coreos + ignition + kubeadm = <3

While I was pretty frustrated to have my cluster need to be rebuilt, rebuilding the cluster from scratch allowed me to revisit how I build the cluster and learn a bit more about the ecosystem. Unfortunately, there’s a bit of context missing here that I haven’t written about yet, I actually build my cluster “the hard way” once (that blog post is still in the raw state, I haven’t released it), so this time I wanted to give kubeadm a try.

I chose kubeadm instead of other tools like kubespray or kops because it is (and has always been IIRC) the best choice for a “baremetal” installation, outside of a custom cluster install – so no Terraform or AWS related set up for me. kubeadm has the ideal interface for me - assuming the machine is set up properly initially, it’s down to kubeadm init and kubeadm join commands. Doesn’t get simpler than that.

One big change this time around was also that I chose to go with the hosted control plane. This means that all I need to set up on the server is the kubelet – and all control plane components ( apiserver , controller-manager , scheduler , kube-proxy ) are actually created/managed by the kubelet , by using manifests (stored @ /etc/kubernetes/manifests ). This was actually the pattern I used the very first time I set up Kubernetes on a server running CoreOS using their guide (that is now mostly gone). The next time that I did it, using the hard-way guide (again, I haven’t written about this yet), I started the control plane components outside kubelet and managed them with systemd .

OK now let’s get into the scripts I actually used to get everything done.

Tool download script

I use a baremetal provider I really like, Hetzner (famous for their Robot Market), and they provide a “rescue” image that one can use, so the starting point is there. It’s basically like running a LiveCD, you can access the server’s file systems and whatever else is necessary to install an OS. Unfortunately they don’t support CoreOS as an installable OS right now but this just means I needed to follow the CoreOS guide for installing Container Linux to disk.

You might want to read a little about Container Linux before getting reading on – don’t worry, I’ll wait.

To get Container Linux installed to disk, we’re going to need two tools primarily – ct (the Config Transpiler) and the coreos-install script. Here’s a quick script to download these two tools:

#!/bin/bash echo "Installing coreos-install..." curl -sSL https://raw.githubusercontent.com/coreos/init/master/bin/coreos-install > /bin/coreos-install chmod +x /bin/coreos-install echo "Installing ct..." curl -sSL https://github.com/coreos/container-linux-config-transpiler/releases/download/v0.8.0/ct-v0.8.0-x86_64-unknown-linux-gnu > /bin/ct echo "9f9d9d9c802a6dc875067295aa9d7f44f1e66914cc992cf215dd459ee2b719fde4ebfa036bb8488cfd87ae2efafc5d767de776fe11a4661fc45e8d54385953a4 ct" | sha512sum -c || (echo "ERROR: SHA512 Hash does not match for ct v0.8.0" && exit 1) chmod +x /bin/ct

This script could be better – a sha512sum check on coreos-install would be good, but I don’t know how often the coreos-install script changes, and the init repo from CoreOS it’s in doesn’t seem to do releases. I did include one for ct (since it’s @ v0.8.0 ), so that’s an example.

Ignition YAML configuration

The YAML configuration that is going to be fed into ct is pretty intense, but it completely lays out the process. On the CoreOS site, the listed latest Ignition Specification is 2.1, however if you download ct version 0.8.0 (like above) you can find the changed specification for it in the github repo.

Without any further ado, here is the (currently working) monstrosity:

# This config is meant to be consumed by the config transpiler, which will # generate the corresponding Ignition config. Do not pass this config directly # to instances of Container Linux. # NOTE: This configuration is meant to work with Config Transpiler v0.8.0 # The spec is available at (https://github.com/coreos/container-linux-config-transpiler/blob/v0.8.0/doc/configuration.md) passwd: users: - name: core ssh_authorized_keys: - ssh-rsa <REST OF SSH KEY> user@somewhere.com - ssh-rsa <REST OF ANOTHER SSH KEY> user@somewhere.com systemd: units: # Docker will be configured initially but we'll be using containerd exclusively and will disable it after containerd setup - name: docker.service enabled: true # containerd without docker as a shim, thanks to containerd.service.d/ overrides - name: containerd.service enabled: true - name: k8s-install.service enabled: true contents: | [Install] WantedBy=multi-user.target [Unit] Description=k8s installation script Wants=network-online.target After=network.target network-online.target [Service] Type=oneshot ExecStart=/ignition/init/k8s/install.sh - name: cni-install.service enabled: true contents: | [Install] WantedBy=multi-user.target [Unit] Description=cni plugin installation script Requires=k8s-install.service After=k8s-install.service [Service] Type=oneshot ExecStart=/ignition/init/cni/install.sh - name: containerd-install.service enabled: true contents: | [Install] WantedBy=multi-user.target [Unit] Description=containerd installation script Requires=cni-install.service After=cni-install.service [Service] Type=oneshot ExecStart=/ignition/init/cri-containerd/install.sh - name: kubeadm-install.service enabled: true contents: | [Install] WantedBy=multi-user.target [Unit] Description=kubeadm installation script Requires=containerd-install.service After=containerd-install.service [Service] Type=oneshot Environment="PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin:/opt/bin" ExecStart=/ignition/init/kubeadm/kubeadm-install.sh - name: k8s-setup.service enabled: true contents: | [Install] WantedBy=multi-user.target [Unit] Description=kubernetes setup script Requires=kubeadm-install.service After=kubeadm-install.service [Service] Type=oneshot User=core Environment="PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/opt/bin" ExecStart=/ignition/init/k8s/setup.sh storage: filesystems: - mount: device: /dev/disk/by-label/ROOT format: xfs wipe_filesystem: true label: ROOT files: - path: /opt/bin/kubeadm filesystem: root mode: 493 # 0755 contents: remote: url: https://storage.googleapis.com/kubernetes-release/release/v1.10.2/bin/linux/amd64/kubeadm verification: hash: function: sha512 sum: fc96e821fd593a212c632a6c9093143fab5817f6833ba1df1ced2ce4fb82f1ebefde71d9a898e8f9574515e9ba19e40f6ab09a907f6b1b908d7adfcf57b3bf8b - path: /opt/bin/kubelet filesystem: root mode: 493 # 0755 contents: remote: url: https://storage.googleapis.com/kubernetes-release/release/v1.10.2/bin/linux/amd64/kubelet verification: hash: function: sha512 sum: 5cf4bde886d832d1cc48c47aeb43768050f67fe0458a330e4702b8071567665c975ed1fe2296cba5aea95a6de0bec4b731a32525837cac24646fb0158e2c2f64 - path: /opt/bin/kubectl filesystem: root mode: 511 # 0777 contents: remote: url: https://storage.googleapis.com/kubernetes-release/release/v1.10.2/bin/linux/amd64/kubectl verification: hash: function: sha512 sum: 38a2746ac7b87cf7969cf33ccac177e63a6a0020ac593b7d272d751889ab568ad46a60e436d2f44f3654e2b4b5b196eabf8860b3eb87368f0861e2b3eb545a80 - path: /etc/systemd/system/kubelet.service filesystem: root mode: 420 # 0644 contents: remote: url: https://raw.githubusercontent.com/kubernetes/kubernetes/v1.10.2/build/debs/kubelet.service verification: hash: function: sha512 sum: b9ca0db34fea67dfd0654e65d3898a72997b1360c1e802cab5adc4288199c1a08423f90751757af4a7f1ff5932bfd81d3e215ce9b9d3f4efa1c04a202228adc8 - path: /etc/systemd/system/kubelet.service.d/10-kubeadm.conf filesystem: root mode: 420 # 0644 contents: remote: url: https://raw.githubusercontent.com/kubernetes/kubernetes/v1.10.2/build/debs/10-kubeadm.conf verification: hash: function: sha512 sum: 32cfc8e56ec6e5ba93219852a68ec5eb25938a39c3e360ea4728fc71a14710b6ff85d0d84c2663eb5297d5dc21e1fad6914d6c0a8ce3357283f0b98ad4280ef7 - path: /ignition/init/cri-containerd/cri-containerd-1.1.0.linux-amd64.tar.gz filesystem: root mode: 420 # 0644 contents: remote: url: https://storage.googleapis.com/cri-containerd-release/cri-containerd-1.1.0.linux-amd64.tar.gz verification: hash: function: sha512 sum: c5db1b99e1155ed0dfe945bc1d53842fd015cb891f8b773c60fea73f9fe7c3e0bda755133765aa618c08765eb13dbf244affb5a1572d5a888ff4298ba3d790cf - path: /ignition/init/cni/cni-plugins-v0.7.1.tgz filesystem: root mode: 420 # 0644 contents: remote: url: https://github.com/containernetworking/plugins/releases/download/v0.7.1/cni-plugins-amd64-v0.7.1.tgz verification: hash: function: sha512 sum: b3b0c1cc7b65cea619bddae4c17b8b488e7e13796345c7f75e069af93d1146b90a66322be6334c4c107e8a0ccd7c6d0b859a44a6745f9b85a0239d1be9ad4ccd - path: /ignition/init/canal/rbac.yaml filesystem: root mode: 493 # 0755 contents: remote: url: https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/canal/rbac.yaml verification: hash: function: sha512 sum: e045645a1f37b4974890c3e4f8505a10bbb138ed0723869d7a7bc399c449072dfd2c8c2c482d3baac9bf700b7b0cfdca122cb260e70b437fb495eb86f9f6cccc - path: /ignition/init/canal/canal.yaml filesystem: root mode: 493 # 0755 contents: remote: url: https://docs.projectcalico.org/v3.1/getting-started/kubernetes/installation/hosted/canal/canal.yaml verification: hash: function: sha512 sum: 5c4953d74680a2ff84349c730bba48d0d43e28b70217c59cdf736f65e198f362c25b534b686c10ed5370dbb1bca4a1326e42039ddee7eb7b0dd314cc08523b67 - path: /ignition/init/k8s/install.sh filesystem: root mode: 480 # 740 contents: inline: | #!/bin/bash # Unzip the kubernetes binaries if not already present test -d /opt/bin/kubeadm && echo "k8s binaries (kubeadm) already installed" && exit 0 # NOTE: If RELEASE is updated, the SHA512 SUMs will need to be as well echo -e "=> Installing k8s v1.10.2" echo "=> Cusomizing kubelet.service..." sed -i "s:/usr/bin:/opt/bin:g" /etc/systemd/system/kubelet.service sed -i "s:/usr/bin:/opt/bin:g" /etc/systemd/system/kubelet.service.d/10-kubeadm.conf systemctl daemon-reload systemctl enable kubelet systemctl start kubelet - filesystem: root path: /ignition/init/cri-containerd/install.sh mode: 480 # 740 contents: inline: | #!/bin/bash # Unzip the kubernetes binaries if not already present test -d /opt/containerd && echo "containerd binaries already installed" && exit 0 VERSION=1.1.0 echo -e "=> Installing containerd v${VERSION}" echo "=> Installing containerd...." cd /ignition/init/cri-containerd tar -C / -k -xzf cri-containerd-${VERSION}.linux-amd64.tar.gz echo "=> Copying /usr/local binaries to /opt/bin ...." mkdir -p /ignition/init/cri-containerd/unzipped tar -C unzipped -k -xzf cri-containerd-${VERSION}.linux-amd64.tar.gz cp -r unzipped/usr/local/bin/* /opt/bin systemctl start containerd echo "=> Adding dropins...." cat > /etc/systemd/system/kubelet.service.d/0-containerd.conf <<EOF [Service] Environment="KUBELET_EXTRA_ARGS=--container-runtime=remote --runtime-request-timeout=15m --container-runtime-endpoint=unix:///run/containerd/containerd.sock --volume-plugin-dir=/var/lib/kubelet/volumeplugins" EOF mkdir -p /etc/systemd/system/containerd.service.d/ cat > /etc/systemd/system/containerd.service.d/0-direct-containerd.conf <<EOF [Service] ExecStart= ExecStart=/opt/bin/containerd EOF echo "=> Triggering systemctl daemon-reload...." systemctl daemon-reload systemctl restart containerd - filesystem: root path: /ignition/init/cni/install.sh mode: 480 # 740 contents: inline: | #!/bin/bash # Unzip the kubernetes binaries if not already present test -d /opt/cni/bin && echo "CNI binaries already installed" && exit 0 VERSION=0.7.1 echo -e "=> Installing CNI (v${VERSION}) binaries to /opt/cni/bin" cd /ignition/init/cni mkdir -p /opt/cni/bin tar -C /opt/cni/bin -k -xzf cni-plugins-v${VERSION}.tgz - filesystem: root path: /ignition/init/kubeadm/kubeadm-install.sh mode: 480 # 740 contents: inline: | #!/bin/bash # Ensure kubeadm binary is present test -f /opt/bin/kubeadm || (echo "Failed to find kubeadm binary" && exit 1) # Exit if kubeadm has already been run (/etc/kubernetes folder would have been created) test -d /etc/kubernetes && echo "/etc/kubernetes is present, kubeadm should have already been run once" && exit 0 echo "=> Running kubeadm init..." /opt/bin/kubeadm init --cri-socket "/run/containerd/containerd.sock" --pod-network-cidr "10.244.0.0/16" # Disable docker (kubelet will use containerd runtime directly) sudo systemctl stop docker sudo systemctl disable docker echo "=> Running kubeadm post-install set up for user 'core'" mkdir -p /home/core/.kube cp -i /etc/kubernetes/admin.conf /home/core/.kube/config chown $(id -u core):$(id -g core) /home/core/.kube/config - filesystem: root path: /ignition/init/k8s/setup.sh mode: 493 # 0755 contents: inline: | #!/bin/bash # Ensure /etc/kubernetes is present (created by kubeadm) test -d /etc/kubernetes || (echo "/etc/kubernetes not present, ensure kubeadm has run properly" && exit 1) echo "=> Enabling workload running on the master node" kubectl taint nodes --all node-role.kubernetes.io/master- echo "=> Installing canal" kubectl apply -f /ignition/init/canal/rbac.yaml kubectl apply -f /ignition/init/canal/canal.yaml

This is a lot to take in, and is pretty hacky, but it worked for me, and I didn’t have to invest too much in learning another tool. This kind of setup is definitely best automated with some tool like Ansible, but since I have to work with this file from the rescue OS I’m leaving it as-is. Currently there doesn’t seem to be a way to submit an image/userdata currently for baremetal servers on Hetzner – you have to request a KVM to load an ISO, so jamming everything into this file is good enough for me. Terraform would also have been a good candidate here, but their support for baremetal providers was somewhat lacking last time I checked (maybe that’s changed these days).

Astute readers might have also noticed that I moved off of using kube-router – I had some problems getting it set up properly in this setup with kubeadm and got frustrated, enough to just go with [canal]canal instead, which I’ve used in the past. I don’t really have good descriptions of the crashes I was running into (generally kube-router just crashing with no output and restarting despite being on the highest verbosity level --v=3 ), but I’ll probably give kube-router another try sometime in the future.

The overall install process looked something like this:

Boot the server into recovery mode (this specific to my baremetal provider, the rescue image is like when you boot off a LiveCD ISO) (from the installation computer) scp download-tools.sh root@<machine>:/download-tools.sh && scp master/ignition.yaml root@<machine>:/ignition.yaml (on the node) # /download-tools.sh (on the node) # ct -in-file /ignition.yaml -out-file /ignition.json (on the node) # coreos-install -d /dev/sda -i /ignition.json

Improvements

A few ways I can think of to improve upon this:

A different systemd target than multi-user.target , so I can get more

, so I can get more Run Kubernetes cluster checks after waiting for the cluster to be ready

There’s probalby much more, but I just haven’t done much thinking :).

TLDR

From a recovery/LiveCD (Hetzner rescue mode) environment on my dedicated server I was able to install & initialize CoreOS (using ct and coreos-install ) using Ignition in a way that sets up a Kubernetes master node with kubeadm . For my specific usecase, the master taint is removed so I can run workloads on it, and presto, I have a easy-to-boot master node. The process looks like this:

0. Boot the server into recovery mode (this specific to my baremetal provider, the rescue image is like when you boot off a LiveCD ISO)

1. Copy a script to install ct and coreos-install to the machine (see above for the download-tools script contents)

$ scp download-tools.sh root@<machine>:/download-tools.sh && scp master/ignition.yaml root@<machine>:/ignition.yaml

2. Install the Container Linux toolchain (on the node)

$ /download-tools.sh`

3. Generate the Ignitinon JSON (on the node, check above for the YAML content)

$ ct -in-file /ignition.yaml -out-file /ignition.json`

4. Run coreos-install (on the node)

$ coreos-install -d /dev/sda -i /ignition.json`

Wrapup

It took a lot of iterations (LOTS of KVM debugging and machine reboots) to get to this point, but I’m pretty happy with it, and figured I might try and save some people some time if they happen to be doing something similar.

Hope you find the post useful!