The Internet Security Research Group (ISRG), with Mozilla among its founding sponsors, launched a “free, automated and open” certificate authority called Let’s Encrypt. As the name suggests, it provides free certificates trusted by all major browsers and operating systems. I’m using it heavily (on this blog, for example).

This blog post shows how Syncthing can be used to deploy Let’s Encrypt certificates in an environment with multiple servers (e.g. in a round-robin scenario) without adding a single point of failure.

ACME

Let’s Encrypt automates the process of requesting and authenticating a certificate using a protocol called ACME. The client requesting a new certificate places a challenge file under a .well-known path on its webserver, and Let’s Encrypt retrieves this challenge over HTTP for authentication.
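
To make the challenge mechanism a bit more concrete, here is a minimal Ruby sketch of the file-based part of the HTTP challenge: a file named after the token is written below the webroot, and the CA later fetches it from the matching URL. The function names, the example token and the key authorization string are made up for illustration; a real ACME client obtains them from the ACME exchange.

```ruby
require 'fileutils'
require 'tmpdir'

# Path prefix used by the ACME HTTP challenge.
CHALLENGE_PREFIX = '.well-known/acme-challenge'

# Write the challenge response below the webroot and return the file path.
# `token` and `key_authorization` are normally supplied by the ACME client.
def place_challenge(webroot, token, key_authorization)
  dir = File.join(webroot, CHALLENGE_PREFIX)
  FileUtils.mkdir_p(dir)
  path = File.join(dir, token)
  File.write(path, key_authorization)
  path
end

# URL the CA requests during validation.
def challenge_url(domain, token)
  "http://#{domain}/#{CHALLENGE_PREFIX}/#{token}"
end

# Usage with a throwaway webroot and hypothetical values
Dir.mktmpdir do |webroot|
  puts place_challenge(webroot, 'example-token', 'example-token.example-thumbprint')
  puts challenge_url('www.example.com', 'example-token')
end
```

Whichever server receives the request for that URL has to find the file in its own webroot, which is exactly what becomes tricky with more than one server.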

The actual process is a little more complicated, though. If you want to know how it works in detail, I recommend Let’s Encrypt’s excellent ACME documentation.

The problem in high availability setups

When using multiple servers for SSL termination (e.g. in the load-balancing scenario described in the picture below, where SSL termination is handled by the nginx instances), each server requires a certificate for the domain(s) it serves.

In a setup that uses e.g. round-robin load balancing, we can’t guarantee that the incoming request for the ACME challenge ends up on the server that actually requested the certificate. Furthermore, each server needs to request (and renew) its own certificates.
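
A toy simulation illustrates the problem. This is only a sketch under simple assumptions: each server is modelled as the list of challenge tokens it can serve, requests are distributed strictly round-robin, and all names are hypothetical.

```ruby
# Count how many of `requests` round-robin requests reach a server
# that actually has the challenge file for `token`.
def round_robin_hits(servers, token, requests)
  requests.times.count do |i|
    servers[i % servers.length].include?(token)
  end
end

isolated = [['tok'], [], []]            # only the requesting server has the file
shared   = [['tok'], ['tok'], ['tok']]  # challenge directory is shared

puts round_robin_hits(isolated, 'tok', 6) # => 2
puts round_robin_hits(shared, 'tok', 6)   # => 6
```

With three servers and an unshared challenge directory, only every third request is answered correctly, so validation succeeds only by luck.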

The cleanest solution I found for this problem is to share the .well-known challenge directory (and maybe even the certificate) between multiple servers.

Syncthing to the rescue!

The tool I found best suited to synchronize those directories is Syncthing. It is one of the most exciting file-sharing tools around: it is completely decentralized and fully peer-to-peer, works without any central server (but can be configured to use one, if required), and is open-source, cross-platform and written in Go.

Syncthing fulfills all items on my wishlist:

Traffic between the instances is encrypted

The setup is automatically deployable

Instances can be easily added or removed

No single point of failure (all nodes connect to each other, synchronizing the same directory between all machines)

No additional services required

I chose it to synchronize the /etc/nginx/certs directory. It shares the dhparams, SSL certificates and the ACME challenges between all nginx instances. Here’s what the shared directory looks like:

    $ tree -a
    .
    ├── .stfolder
    ├── acme
    │   └── .well-known
    │       └── acme-challenge
    │           ├── 8xdoeH5OLPUij4xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    │           ├── cWaLNpzt_8v--xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    │           └── _wsvWIOyvP-45vt-xxxxxxxxxxxxxxxxxxxxxxxxxxx
    ├── dhparam.pem
    ├── www.example.com.crt
    └── www.example.com.key

    3 directories, 7 files

Implementation

We’re using Chef to automate our infrastructure at flinc, but the process should be easily adaptable to an automation tool of your choice.

Syncthing is easily deployed, as there’s an official repository available:

Install Syncthing

    # Use the official Syncthing apt repositories
    apt_repository 'syncthing-release' do
      uri 'http://apt.syncthing.net'
      distribution 'syncthing'
      components %w(release)
      key 'https://syncthing.net/release-key.txt'
    end

    package 'syncthing'

Set up systemd service

A systemd.service is quickly crafted from the provided example:

    [Unit]
    Description=Syncthing - Open Source Continuous File Synchronization for %I
    Documentation=man:syncthing(1)
    After=network.target

    [Service]
    User=%i
    ExecStart=/usr/bin/syncthing -no-browser -no-restart -logflags=0
    Restart=on-failure
    SuccessExitStatus=3 4
    RestartForceExitStatus=3 4

    [Install]
    WantedBy=multi-user.target

Generate Syncthing configuration and keys

We want to centrally manage our instances, so Syncthing certificates are stored centrally in Chef’s encrypted data bags, alongside their device IDs and API keys. Here’s how to generate and extract everything that’s required:

First, generate a new key-pair and save the device ID and API key for each node:

    NODE=1.nginx.example.com
    syncthing --generate="$NODE" | grep ID | awk '{ print $5 }' > "$NODE/device_id"
    grep apikey "$NODE/config.xml" | cut -d \> -f2 | cut -d \< -f1 > "$NODE/apikey"
    rm "$NODE/config.xml"

The resulting key.pem and cert.pem will then be deployed into the .config/syncthing directory on the target machine.
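
The generated files then need to end up in one data bag item per node. As a rough sketch, the following Ruby helper collects them into a hash and prints it as JSON. The field names device_id, apikey, key and cert are the ones the deployment recipe in this post reads; the helper name, the id field and the assumption that each node has its own directory are mine.

```ruby
require 'json'

# Collect the files generated for one node into a single hash.
# `dir` is the directory holding device_id, apikey, key.pem and cert.pem.
def build_data_bag_item(node_name, dir)
  {
    'id'        => node_name,
    'device_id' => File.read(File.join(dir, 'device_id')).strip,
    'apikey'    => File.read(File.join(dir, 'apikey')).strip,
    'key'       => File.read(File.join(dir, 'key.pem')),
    'cert'      => File.read(File.join(dir, 'cert.pem'))
  }
end

# Usage: ruby build_item.rb 1.nginx.example.com > item.json
if __FILE__ == $PROGRAM_NAME && ARGV.length == 1
  node_name = ARGV[0]
  puts JSON.pretty_generate(build_data_bag_item(node_name, node_name))
end
```

The resulting JSON still has to be encrypted and uploaded with your usual data bag tooling.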

After using Syncthing’s web-interface to configure the share, the resulting config.xml was then used to craft the following ERB template:

    <configuration version="16">
        <!-- This is our shared folder. Scan it every 5s, so updates are synchronized quickly -->
        <folder id="vm623-hxlsp" label="letsencrypt" path="/etc/nginx/certs/" type="readwrite" rescanIntervalS="5" ignorePerms="false" autoNormalize="true">
            <!-- Share the folder between all nodes -->
            <% @nodes.each do |name, config| %>
            <device id="<%= config['id'] %>"></device>
            <% end %>
            <!-- Share settings. Default settings, with simple versioning -->
            <minDiskFreePct>1</minDiskFreePct>
            <versioning type="simple">
                <param key="keep" val="5"></param>
            </versioning>
            <copiers>0</copiers>
            <pullers>0</pullers>
            <hashers>0</hashers>
            <order>random</order>
            <ignoreDelete>false</ignoreDelete>
            <scanProgressIntervalS>0</scanProgressIntervalS>
            <pullerSleepS>0</pullerSleepS>
            <pullerPauseS>0</pullerPauseS>
            <maxConflicts>10</maxConflicts>
            <disableSparseFiles>false</disableSparseFiles>
            <disableTempIndexes>false</disableTempIndexes>
        </folder>
        <!-- Make sure all nodes are connected to one another -->
        <% @nodes.each do |name, config| %>
        <device id="<%= config['id'] %>" name="<%= name %>" compression="metadata" introducer="false">
            <address><%= config['address'] %></address>
        </device>
        <% end %>
        <gui enabled="true" tls="false" debugging="false">
            <address>127.0.0.1:8384</address>
            <apikey><%= @apikey %></apikey>
            <theme>default</theme>
        </gui>
        <options>
            <listenAddress>default</listenAddress>
            <!-- Disable announcement, as we're automatically adding all servers above -->
            <globalAnnounceServer>default</globalAnnounceServer>
            <globalAnnounceEnabled>false</globalAnnounceEnabled>
            <localAnnounceEnabled>false</localAnnounceEnabled>
            <localAnnouncePort>21027</localAnnouncePort>
            <localAnnounceMCAddr>[ff12::8384]:21027</localAnnounceMCAddr>
            <maxSendKbps>0</maxSendKbps>
            <maxRecvKbps>0</maxRecvKbps>
            <reconnectionIntervalS>60</reconnectionIntervalS>
            <relaysEnabled>false</relaysEnabled>
            <relayReconnectIntervalM>10</relayReconnectIntervalM>
            <startBrowser>false</startBrowser>
            <natEnabled>false</natEnabled>
            <natLeaseMinutes>60</natLeaseMinutes>
            <natRenewalMinutes>30</natRenewalMinutes>
            <natTimeoutSeconds>10</natTimeoutSeconds>
            <urAccepted>1</urAccepted>
            <urUniqueID></urUniqueID>
            <urURL>https://data.syncthing.net/newdata</urURL>
            <urPostInsecurely>false</urPostInsecurely>
            <urInitialDelayS>1800</urInitialDelayS>
            <restartOnWakeup>true</restartOnWakeup>
            <autoUpgradeIntervalH>12</autoUpgradeIntervalH>
            <keepTemporariesH>24</keepTemporariesH>
            <cacheIgnoredFiles>false</cacheIgnoredFiles>
            <progressUpdateIntervalS>5</progressUpdateIntervalS>
            <symlinksEnabled>true</symlinksEnabled>
            <limitBandwidthInLan>false</limitBandwidthInLan>
            <minHomeDiskFreePct>1</minHomeDiskFreePct>
            <releasesURL>https://upgrades.syncthing.net/meta.json</releasesURL>
            <overwriteRemoteDeviceNamesOnConnect>false</overwriteRemoteDeviceNamesOnConnect>
            <tempIndexMinBlocks>10</tempIndexMinBlocks>
        </options>
    </configuration>
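
To see what the device loop of such a template expands to, a fragment can be rendered outside of Chef with Ruby’s standard ERB library. This is only a sketch: the device IDs, node names and addresses are placeholders, the TemplateContext class is a hypothetical stand-in for the way Chef exposes template variables as instance variables, and the `-%>` trimming is my addition to keep the output tidy.

```ruby
require 'erb'

# Reduced version of the device section of the template
DEVICE_TEMPLATE = <<~XML
  <% @nodes.each do |name, config| -%>
  <device id="<%= config['id'] %>" name="<%= name %>">
    <address><%= config['address'] %></address>
  </device>
  <% end -%>
XML

# Holds the variables the template refers to as instance variables
class TemplateContext
  def initialize(nodes)
    @nodes = nodes
  end

  def render(template)
    ERB.new(template, trim_mode: '-').result(binding)
  end
end

def render_devices(nodes)
  TemplateContext.new(nodes).render(DEVICE_TEMPLATE)
end

# Placeholder node data, shaped like the hash the recipe passes in
nodes = {
  'nginx1' => { 'id' => 'DEVICE-ID-1', 'address' => 'dynamic' },
  'nginx2' => { 'id' => 'DEVICE-ID-2', 'address' => 'tcp://nginx2.example.com:22000' }
}
puts render_devices(nodes)
```

Every node renders the same list of devices; only the entry for the local node differs in its address.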

Deploy Syncthing configuration

Here’s how we deploy Syncthing keys and configuration from encrypted data bags to the nginx nodes (Note: It probably makes sense to run Syncthing as the same user as nginx, as Syncthing needs to deploy a key that should only be readable by nginx and no one else):

    # Name of the user running Syncthing (probably the same user running nginx);
    # the Syncthing configuration lives in this user's home directory
    user = 'nginx'

    # Populate node information from data bag
    node_config = {}
    node_list.each do |node_name|
      config = Chef::EncryptedDataBagItem.load('syncthing', node_name, data_bag_secret)
      node_config[node_name] = {}
      node_config[node_name]['id'] = config['device_id']

      # Set address to "dynamic" if it's ourselves
      node_config[node_name]['address'] =
        if node.name == node_name
          'dynamic'
        else
          "tcp://#{node_name}.#{node['domain']}:22000"
        end
    end

    # Deploy Syncthing certificate (from data bag)
    local_config = Chef::EncryptedDataBagItem.load('syncthing', node.name, data_bag_secret)
    %w(key cert).each do |k|
      # Show an error message if key couldn't be retrieved
      Chef::Log.fatal("#{k}.pem is empty!") unless local_config[k]

      file "/home/#{user}/.config/syncthing/#{k}.pem" do
        mode 0o600
        owner user
        group user
        content local_config[k]
      end
    end

    # Deploy Syncthing configuration
    template "/home/#{user}/.config/syncthing/config.xml" do
      mode 0o600
      owner user
      group user
      source 'syncthing.config.xml.erb'
      variables nodes: node_config, apikey: local_config['apikey']
    end

    # Restart Syncthing upon configuration/key changes
    service "syncthing@#{user}" do
      subscribes :restart, "template[/home/#{user}/.config/syncthing/config.xml]"
      subscribes :restart, "file[/home/#{user}/.config/syncthing/key.pem]"
      subscribes :restart, "file[/home/#{user}/.config/syncthing/cert.pem]"
      action [:enable, :start]
    end
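
The address selection in the recipe is easy to try out without Chef. Here is a plain-Ruby sketch of the same logic: a node lists itself as dynamic and every peer with a fixed tcp:// address on port 22000. The helper names, node names, device IDs and the domain are hypothetical.

```ruby
SYNC_PORT = 22000

# Address a node should use for a peer: "dynamic" for itself,
# a fixed TCP address for everyone else.
def peer_address(peer_name, own_name, domain)
  return 'dynamic' if peer_name == own_name

  "tcp://#{peer_name}.#{domain}:#{SYNC_PORT}"
end

# Build the hash the template iterates over, from a name => device ID map.
def build_node_config(device_ids, own_name, domain)
  device_ids.each_with_object({}) do |(name, id), result|
    result[name] = { 'id' => id, 'address' => peer_address(name, own_name, domain) }
  end
end

device_ids = { 'nginx1' => 'DEVICE-ID-1', 'nginx2' => 'DEVICE-ID-2' }
build_node_config(device_ids, 'nginx1', 'backnet.example.com').each do |name, config|
  puts "#{name}: #{config['address']}"
end
```

Because every node computes this list from the same data bag, adding a node only requires a new data bag entry and a Chef run on all machines.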

Restrict Syncthing to private backnet

We have a dedicated backnet for all environments. Syncthing should only be allowed on this specific backnet (in our case eth1 ). I’m using the iptables-ng cookbook to manage iptables.

    # Allow Syncthing in backnet only
    iptables_ng_rule '50-syncthing' do
      rule [
        # Sync protocol
        '-i eth1 --protocol tcp --dport 22000 --match state --state NEW --jump ACCEPT',
        # Local announcement (matches localAnnouncePort in the template above)
        '-i eth1 --protocol udp --dport 21027 --match state --state NEW --jump ACCEPT'
      ]
    end

Get the certificates and automate renewal

To actually request the certificate, the acme cookbook has you covered. It uses the Ruby ACME library acme-client under the hood.

    # Get some bonus points for generating your own Diffie-Hellman parameters:
    execute 'openssl dhparam -out /etc/nginx/certs/dhparam.pem 2048' do
      creates '/etc/nginx/certs/dhparam.pem'
      notifies :restart, 'service[nginx]'
    end

    # Make sure acme-client gem is installed
    include_recipe 'letsencrypt::default'

    # Create a webroot for acme challenges
    directory '/etc/nginx/certs/acme' do
      owner user
      group user
    end

    # Deploy nginx site to answer ACME challenges
    template '/etc/nginx/conf.d/letsencrypt.example.com.conf' do
      mode 0o644
      source 'letsencrypt.nginx.erb'
      notifies :restart, 'service[nginx]', :immediately
      not_if 'test -f /etc/nginx/certs/www.example.com.crt'
    end

    letsencrypt_certificate 'www.example.com' do
      alt_names %w(example.com)
      owner user
      group user
      fullchain '/etc/nginx/certs/www.example.com.crt'
      key '/etc/nginx/certs/www.example.com.key'
      method 'http'
      wwwroot '/etc/nginx/certs/acme'
      notifies :restart, 'service[nginx]'
    end

    # Remove temporary letsencrypt site
    file '/etc/nginx/conf.d/letsencrypt.example.com.conf' do
      notifies :restart, 'service[nginx]', :immediately
      action :delete
    end
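
If you want to check by hand whether the shared certificate is due for renewal, Ruby’s OpenSSL bindings can read the expiry date. This is just a sketch: the 30 day threshold is an assumption of mine, not necessarily what the cookbook uses, and the helper names are made up.

```ruby
require 'openssl'

SECONDS_PER_DAY = 86_400

# True if the certificate expires within `threshold_days` of `now`.
def needs_renewal?(not_after, now: Time.now, threshold_days: 30)
  (not_after - now) < threshold_days * SECONDS_PER_DAY
end

# Expiry time of the first certificate in a PEM file.
def cert_not_after(path)
  OpenSSL::X509::Certificate.new(File.read(path)).not_after
end

cert_path = '/etc/nginx/certs/www.example.com.crt'
if File.exist?(cert_path)
  puts needs_renewal?(cert_not_after(cert_path)) ? 'renew soon' : 'still valid'
end
```

Since the certificate lives in the synchronized directory, the check gives the same answer on every node once Syncthing has caught up.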

The temporary letsencrypt.nginx.erb

    server {
      # This is for HAproxy with proxy_protocol, adapt if necessary
      listen [::]:80 ipv6only=off proxy_protocol;

      # Serve well-known path for letsencrypt
      location /.well-known/acme-challenge {
        root /etc/nginx/certs/acme;
        default_type text/plain;
      }
    }

Also make sure to include something like this in your actual nginx site configuration, so challenges of automatic renewals can be answered:

    server {
      # This is for HAproxy with proxy_protocol, adapt if necessary
      listen [::]:80 ipv6only=off proxy_protocol;

      # Use remote_addr from proxy_protocol
      real_ip_header proxy_protocol;
      set_real_ip_from 10.13.37.0/24;

      # Serve well-known path for letsencrypt
      location /.well-known/acme-challenge {
        root /etc/nginx/certs/acme;
        default_type text/plain;
      }

      location / {
        return 301 https://<%= @domain %>$request_uri;
      }
    }

    server {
      # This is for HAproxy with proxy_protocol, adapt if necessary
      listen [::]:443 ssl http2 ipv6only=off proxy_protocol;

      [ ... ]
    }

Wrap up

That’s it! We can now automatically request and renew free Let’s Encrypt SSL certificates in our high availability setup! Syncthing will happily keep the certificates and challenges in sync, even if some nodes go down. More nodes can be added by simply adding their credentials to the syncthing data bag, and the configuration of all nodes will adapt automatically.

If you have some feedback, feel free to contact me. I’m also available for hire as a freelancer.