Update: Hetzner Cloud now offers load balancers, so this setup is no longer required. Check their website for more information.

I am working on a Rails app that allows users to add custom domains, and at the same time the app has some realtime features implemented with web sockets. I’m using the Nginx ingress controller in Kubernetes, as it’s the default ingress controller and it’s well supported and documented. Unfortunately, Nginx drops web sockets connections whenever it reloads its configuration. When a user of my app adds a custom domain, a new ingress resource is created, triggering a config reload and disrupting the web sockets connections. There are other ingress controllers like haproxy and Traefik that seem to handle reconfiguration more dynamically than Nginx, but I prefer using Nginx.

One way I figured I could prevent Nginx’s reconfiguration from affecting web sockets connections is to have separate deployments of the ingress controller: one for the normal web traffic and one for the web sockets connections. This way, when the Nginx controller for the normal http traffic has to reload its configuration, web sockets connections are not interrupted. To have multiple deployments of the Nginx controller in the same Kubernetes cluster, the controller has to be installed with a NodePort service or a LoadBalancer service. Unfortunately my provider Hetzner Cloud, while a great service overall at competitive prices, doesn’t offer a load balancer service yet, so I cannot provision load balancers from within Kubernetes like I would be able to do with bigger cloud providers.

Because of this, I decided to set up a highly available load balancer external to Kubernetes that would proxy all the traffic to the two ingress controllers. I did this by installing the two ingress controllers with services of type NodePort, and setting up two nodes with haproxy as the proxy and keepalived with floating IPs, configured in such a way that there is always one load balancer active. This way, if one load balancer node goes down, the other one becomes active within 1-2 seconds, with minimal to no downtime for the app. In this post, I am going to show how I set this up, for other customers of Hetzner Cloud who also use Kubernetes. Please note that if you only need one ingress controller, this is not really needed: you could just use one ingress controller configured to use the host ports directly.

Provisioning

The first thing you need to do is create two servers in Hetzner Cloud that will serve as the two load balancers. It’s important that you name these servers lb1 and lb2 if you are following along with my configuration, to make the scripts etc. easier. You can use the cheapest servers, since the load will be pretty light most of the time unless you have a lot of traffic; I suggest servers with Ceph storage instead of NVMe because over the span of several months I found that the performance, while lower, is somewhat more stable - but up to you of course.

You will also need to create one or more floating IPs depending on how many ingress controllers you want to load balance with this setup. In my case I have two floating IPs, one for the ingress that handles normal http traffic, and the other for the ingress that handles web sockets connections. The names of the floating IPs are important and must match those specified in a script we’ll see later - in my case I have named them http and ws. keepalived will ensure that these floating IPs are always assigned to one load balancer at any time. You’ll need to configure the DNS settings for your apps to use these floating IPs instead of the IPs of the cluster nodes.
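If you prefer the command line over the console, the floating IPs can also be created with the hcloud CLI (installed later in this post). This is just a sketch - the `--home-location` value below is an example and should match the location of your load balancer servers:

```shell
# Create the two floating IPs with the names expected by the failover script
# (nbg1 is an example location - use yours)
hcloud floating-ip create --type ipv4 --home-location nbg1 --name http
hcloud floating-ip create --type ipv4 --home-location nbg1 --name ws
```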

In order for the floating IPs to work, both load balancers need to have the main network interface eth0 configured with those IPs. On Debian systems, you need to create a config file as follows (all the steps from now on must be executed on each load balancer):

```
cat > /etc/network/interfaces.d/60-my-floating-ip.cfg <<EOF
auto eth0:1
iface eth0:1 inet static
  address <floating IP 1>
  netmask 32

auto eth0:2
iface eth0:2 inet static
  address <floating IP 2>
  netmask 32
EOF
```

Then you need to restart the networking service to apply this configuration:

```
sudo service networking restart
```

If you use a CentOS/RedHat system, take a look at this page.

Installing keepalived

We’ll install keepalived from source because the version bundled with Ubuntu is old. First you need to install some dependencies so that you can compile the software:

```
apt update
apt-get install build-essential libssl-dev
```

Then you can compile and install:

```
cd ~
wget http://www.keepalived.org/software/keepalived-2.0.20.tar.gz
tar xzvf keepalived*
cd keepalived-2.0.20
./configure
make
sudo make install
```

Next, we need to create a service:

Note that the heredoc delimiter is quoted so that the shell doesn’t expand `$MAINPID` when writing the file:

```
cat > /etc/systemd/system/keepalived.service <<'EOF'
#
# keepalived control files for systemd
#
# Incorporates fixes from RedHat bug #769726.

[Unit]
Description=LVS and VRRP High Availability monitor
After=network.target
ConditionFileNotEmpty=/etc/keepalived/keepalived.conf

[Service]
Type=simple

# Ubuntu/Debian convention:
EnvironmentFile=-/etc/default/keepalived

ExecStart=/usr/local/sbin/keepalived --dont-fork
ExecReload=/bin/kill -s HUP $MAINPID

# keepalived needs to be in charge of killing its own children.
KillMode=process

[Install]
WantedBy=multi-user.target
EOF
```

and enable it:

```
sudo systemctl enable keepalived
```

Finally, we need a configuration file that will differ slightly between the primary load balancer (MASTER) and the secondary one (BACKUP). On the primary LB:

```
cat > /etc/keepalived/keepalived.conf <<EOF
global_defs {
  script_user root
  enable_script_security
}

vrrp_script chk_haproxy {
  script "/usr/bin/pgrep haproxy"
  interval 2
}

vrrp_instance VI_1 {
  interface eth0
  state MASTER
  priority 200
  virtual_router_id 33
  unicast_src_ip <IP of the primary load balancer>

  unicast_peer {
    <IP of the secondary load balancer>
  }

  authentication {
    auth_type PASS
    auth_pass <a password - max 8 characters - that will be used by the keepalived instances to communicate with each other>
  }

  track_script {
    chk_haproxy
  }

  notify_master /etc/keepalived/master.sh
}
EOF
```

On the secondary LB:

```
cat > /etc/keepalived/keepalived.conf <<EOF
global_defs {
  script_user root
  enable_script_security
}

vrrp_script chk_haproxy {
  script "/usr/bin/pgrep haproxy"
  interval 2
}

vrrp_instance VI_1 {
  interface eth0
  state BACKUP
  priority 100
  virtual_router_id 33
  unicast_src_ip <IP of the secondary load balancer>

  unicast_peer {
    <IP of the primary load balancer>
  }

  authentication {
    auth_type PASS
    auth_pass <same password as before>
  }

  track_script {
    chk_haproxy
  }

  notify_master /etc/keepalived/master.sh
}
EOF
```

Note that we are going to use the script /etc/keepalived/master.sh to automatically assign the floating IPs to the active node. By “active”, I mean a node with haproxy running - either the primary, or if the primary is down, the secondary.

Before the master.sh script can work, we need to install the Hetzner Cloud CLI. This is a handy (official) command line utility that we can use to manage any resource in a Hetzner Cloud project, such as floating IPs.

To install the CLI, you just need to download it and place it somewhere in the PATH, so that keepalived can find it when it runs the notify script:

```
cd ~
wget https://github.com/hetznercloud/cli/releases/download/v1.16.1/hcloud-linux-amd64.tar.gz
tar xvfz hcloud-linux-amd64.tar.gz
chmod +x hcloud
mv hcloud /usr/local/bin/
```

Then we can create the script:

```
cat > /etc/keepalived/master.sh << 'EOF'
#!/bin/bash

export HCLOUD_TOKEN='<a token you need to create in the Hetzner Cloud project that has the load balancer servers and the floating IPs>'

# ID of this server, and IDs of the servers the floating IPs are currently assigned to
ME=$(hcloud server describe $(hostname) | head -n 1 | sed 's/[^0-9]*//g')
HTTP_IP_CURRENT_SERVER_ID=$(hcloud floating-ip describe http | grep 'Server:' -A 1 | tail -n 1 | sed 's/[^0-9]*//g')
WS_IP_CURRENT_SERVER_ID=$(hcloud floating-ip describe ws | grep 'Server:' -A 1 | tail -n 1 | sed 's/[^0-9]*//g')

if [ "$HTTP_IP_CURRENT_SERVER_ID" != "$ME" ]; then
  n=0
  while [ $n -lt 10 ]; do
    hcloud floating-ip assign http $ME && break
    n=$((n+1))
    sleep 3
  done
fi

if [ "$WS_IP_CURRENT_SERVER_ID" != "$ME" ]; then
  n=0
  while [ $n -lt 10 ]; do
    hcloud floating-ip assign ws $ME && break
    n=$((n+1))
    sleep 3
  done
fi
EOF
```

The script is pretty simple. All it does is check whether the floating IPs are currently assigned to the other load balancer, and if so, assign them to the current load balancer. Specifically, this script will be executed on the primary load balancer if haproxy is running on that node but the floating IPs are assigned to the secondary load balancer; or on the secondary load balancer, if the primary is down.

Don’t forget to make the script executable:

```
chmod +x /etc/keepalived/master.sh
```

Then restart keepalived:

```
service keepalived restart
```

haproxy

haproxy is what takes care of actually proxying all the traffic to the backend servers, that is, the nodes of the Kubernetes cluster. Each Nginx ingress controller needs to be installed with a service of type NodePort that uses different ports. For example, for the ingress controller for normal http traffic I use node port 30080 for port 80 and 30443 for port 443; for the ingress controller for web sockets, I use 31080 => 80 and 31443 => 443.
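For reference, installing the two controllers with those node ports might look something like this with Helm. The release names and ingress class names are my own choices, and the exact chart values depend on the version of the ingress-nginx chart you use - treat this as a sketch, not a definitive set of flags:

```shell
# Ingress controller for normal http traffic (class "http")
helm install ingress-http ingress-nginx/ingress-nginx \
  --set controller.ingressClass=http \
  --set controller.service.type=NodePort \
  --set controller.service.nodePorts.http=30080 \
  --set controller.service.nodePorts.https=30443

# Separate ingress controller for web sockets (class "ws"), on different node ports
helm install ingress-ws ingress-nginx/ingress-nginx \
  --set controller.ingressClass=ws \
  --set controller.service.type=NodePort \
  --set controller.service.nodePorts.http=31080 \
  --set controller.service.nodePorts.https=31443
```

With two distinct ingress classes, each ingress resource is picked up by only one controller, so a reload of the http controller never touches the web sockets one.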

First, we need to install haproxy:

```
apt install haproxy
```

Then we need to configure it with frontends and backends for each ingress controller. To create/update the config, run:

```
cat > /etc/haproxy/haproxy.cfg << 'EOF'
global
  log /dev/log local0
  log /dev/log local1 notice
  chroot /var/lib/haproxy
  stats socket /run/haproxy/admin.sock mode 660 level admin expose-fd listeners
  stats timeout 10s
  user haproxy
  group haproxy
  daemon
  maxconn 10000

  # Default SSL material locations
  ca-base /etc/ssl/certs
  crt-base /etc/ssl/private

  # Default ciphers to use on SSL-enabled listening sockets.
  # For more information, see ciphers(1SSL). This list is from:
  # https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
  # An alternative list with additional directives can be obtained from
  # https://mozilla.github.io/server-side-tls/ssl-config-generator/?server=haproxy
  ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS
  ssl-default-bind-options no-sslv3

defaults
  log global
  mode tcp
  option tcplog
  option dontlognull
  timeout connect 5000
  timeout client 10000
  timeout server 10000
  errorfile 400 /etc/haproxy/errors/400.http
  errorfile 403 /etc/haproxy/errors/403.http
  errorfile 408 /etc/haproxy/errors/408.http
  errorfile 500 /etc/haproxy/errors/500.http
  errorfile 502 /etc/haproxy/errors/502.http
  errorfile 503 /etc/haproxy/errors/503.http
  errorfile 504 /etc/haproxy/errors/504.http

frontend http
  mode tcp
  bind <floating IP for http traffic>:80
  option tcplog
  default_backend http

frontend https
  mode tcp
  bind <floating IP for http traffic>:443
  option tcplog
  default_backend https

backend http
  balance roundrobin
  mode tcp
  server http1 <IP of Kubernetes cluster node 1>:30080 check send-proxy-v2
  server http2 <IP of Kubernetes cluster node 2>:30080 check send-proxy-v2
  ...
  server httpN <IP of Kubernetes cluster node N>:30080 check send-proxy-v2

backend https
  balance roundrobin
  mode tcp
  option ssl-hello-chk
  server http1 <IP of Kubernetes cluster node 1>:30443 check send-proxy-v2
  server http2 <IP of Kubernetes cluster node 2>:30443 check send-proxy-v2
  ...
  server httpN <IP of Kubernetes cluster node N>:30443 check send-proxy-v2

frontend ws
  mode tcp
  bind <floating IP for web sockets>:80
  option tcplog
  default_backend ws

frontend wss
  mode tcp
  bind <floating IP for web sockets>:443
  option tcplog
  default_backend wss

backend ws
  balance roundrobin
  mode tcp
  server ws1 <IP of Kubernetes cluster node 1>:31080 check send-proxy-v2
  server ws2 <IP of Kubernetes cluster node 2>:31080 check send-proxy-v2
  ...
  server wsN <IP of Kubernetes cluster node N>:31080 check send-proxy-v2

backend wss
  balance roundrobin
  mode tcp
  option ssl-hello-chk
  server ws1 <IP of Kubernetes cluster node 1>:31443 check send-proxy-v2
  server ws2 <IP of Kubernetes cluster node 2>:31443 check send-proxy-v2
  ...
  server wsN <IP of Kubernetes cluster node N>:31443 check send-proxy-v2
EOF
```

A few important things to note in this configuration:

mode is set to tcp. This is required to proxy “raw” traffic to Nginx, so that SSL/TLS termination can be handled by Nginx;

send-proxy-v2 is also important: it ensures that information about the client, including the source IP address, is sent to Nginx, so that Nginx can “see” the actual IP address of the user and not the IP address of the load balancer. Remember to set use-proxy-protocol to true in the ingress configmap.
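For reference, the proxy protocol is enabled via the Nginx ingress controller’s ConfigMap. The ConfigMap name and namespace below are assumptions - use the ones from your particular installation (each of the two controllers has its own ConfigMap):

```shell
# Enable the proxy protocol in the ingress controller's ConfigMap
# (name and namespace are examples - match them to your install)
kubectl apply -f - <<EOF
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-configuration
  namespace: ingress-nginx
data:
  use-proxy-protocol: "true"
EOF
```

Without this setting, Nginx will try to parse the proxy protocol header as a regular request and fail with broken connections.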

Finally, you need to restart haproxy to apply these changes:

```
service haproxy restart
```

If all went well, you will see that the floating IPs are assigned to the primary load balancer automatically - you can see this from the Hetzner Cloud console. To ensure everything is working properly, shut down the primary load balancer: the floating IPs should be assigned to the secondary load balancer. When the primary is back up and running, the floating IPs will be assigned to the primary once again. The switch takes a couple of seconds at most, so it’s pretty quick and should cause almost no downtime at all.
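You can also verify the failover from the command line. The checks below assume the hcloud CLI installed earlier, and the floating IP names http and ws used throughout this post:

```shell
# On a load balancer node: the active one should have the floating IPs
# configured on eth0 (as eth0:1 and eth0:2)
ip addr show eth0 | grep inet

# From any machine with the hcloud CLI configured: check which server
# currently holds each floating IP
hcloud floating-ip describe http | grep -A 1 'Server:'
hcloud floating-ip describe ws | grep -A 1 'Server:'
```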

I wish I could solve my issue directly within Kubernetes while still using Nginx as the ingress controller, or better yet, that Hetzner Cloud offered load balancers, but this will do for now. Perhaps I should mention that there is another option with the Inlets Operator, which takes care of provisioning an external load balancer with DigitalOcean or other providers, when your provider doesn’t offer load balancers or when your cluster is on premises or just on your laptop, not exposed to the Internet. It’s an interesting option, but Hetzner Cloud is not supported yet, so I’d have to use something like DigitalOcean or Scaleway with added latency; plus, I couldn’t find some information I needed in the documentation and didn’t have much luck asking for it. Load balancers provisioned with Inlets are also a single point of failure, because only one load balancer is provisioned in a non-HA configuration.

For now, this setup with haproxy and keepalived works well and I’m happy with it. It’s cheap and easy to set up and automate with something like Ansible - which is what I did.