At Aller Media, we manage most of our ingress resources directly in GKE. This feels good, as we can define our public-facing network infrastructure in much the same way as we define our app deployments. In YAML. In version control.

For historical reasons we have a few GCE LoadBalancer resources set up, and initially we supplied these with 1-year certificates from a vendor. That works, but it comes back to nag you once a year, and you have to enter the unpleasant underworld of paying money to use weird interfaces to create .pem files that you later need to dissect to stack the intermediate chain in the correct order, yada yada…

We would much rather have the automated cert-renewal loop that we use for Kubernetes ingresses work for this use case as well.

Cert-manager

Back in March 2018, Let’s Encrypt released their ACME v2 API, which allows for wildcard certificates. Cert-manager supports this API and lets us define certificate requests as YAML. To make this work, you will need a cert-manager issuer set up to use the dns01 validation method, since Let’s Encrypt only issues wildcard certificates through DNS validation. Setting up cert-manager is beyond the scope of this post.

Create a wildcard certificate
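A Certificate resource for this could look roughly like the sketch below. It assumes a current cert-manager release (API group `cert-manager.io/v1`; releases from 2018 used a different API group and spec layout) and a ClusterIssuer that is already configured for dns01. The issuer name `letsencrypt-dns01`, the resource name, the namespace and the file name are all hypothetical placeholders.

```shell
# Write a Certificate manifest for *.example.com to a local file.
# Sketch only: issuer name, resource name and namespace are placeholders.
cat > wildcard-certificate.yaml <<'EOF'
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: wildcard-example-com
  namespace: default
spec:
  # Secret that cert-manager fills with tls.crt and tls.key
  # (name spelled as it is used in this post).
  secretName: wilcard-example-com-tls
  issuerRef:
    name: letsencrypt-dns01   # assumed ClusterIssuer using dns01
    kind: ClusterIssuer
  dnsNames:
    - "*.example.com"
EOF

# Apply it to the cluster (needs kubectl access):
# kubectl apply -f wildcard-certificate.yaml
```

Once applied, `kubectl describe certificate wildcard-example-com` is one way to follow the validation progress.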

This creates a certificate request for *.example.com, pushes it through the validation cycle with Let’s Encrypt, and stores the resulting certificate in a Kubernetes secret called wilcard-example-com-tls.

Updating the HTTPS loadbalancer

Digging around in the IAM roles documentation, we figured out that the predefined role our future cronjobs would need is compute.securityAdmin, granted in every affected project. We knew from the start that we’d need to dish out some privileges, but this is too much by half. So we created a new role following the documentation, and came up with this:

Create a new compute IAM role
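A narrower custom role could be defined along these lines. The permission list is an assumption about what certificate replacement on a target HTTPS proxy needs, not a verified minimum, and the role ID `certUpdater`, the title and the file name are hypothetical.

```shell
# Write a custom role definition to a local file.
# Sketch only: the permission set below is an assumption and may need tuning.
cat > cert-updater-role.yaml <<'EOF'
title: Certificate Updater
description: Replace SSL certificates on HTTPS load balancer proxies
stage: GA
includedPermissions:
  - compute.globalOperations.get
  - compute.sslCertificates.create
  - compute.sslCertificates.delete
  - compute.sslCertificates.get
  - compute.sslCertificates.list
  - compute.targetHttpsProxies.get
  - compute.targetHttpsProxies.list
  - compute.targetHttpsProxies.setSslCertificates
EOF

# Create the role at organization level (ORG_ID is a placeholder):
# gcloud iam roles create certUpdater --organization="$ORG_ID" \
#   --file=cert-updater-role.yaml
```

If a gcloud call later fails with a permission error, the error message names the missing permission, which can then be added to the file and pushed with `gcloud iam roles update`.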

We added this role to our organization and granted it to a service account, then placed the keyfile for this service account in a kubernetes secret.

Update-cert

So we have a service account, and a certificate. We need some bash.

Do all the things
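A minimal sketch of what such a script might look like follows. The environment variable names, default paths and certificate naming scheme are assumptions for illustration. It also assumes a global target HTTPS proxy with a single certificate attached; recent gcloud releases may want explicit scope flags such as `--global`.

```shell
#!/bin/sh
# update-cert (sketch): replace the certificate on a target HTTPS proxy
# when it differs from the local one. All names and paths are placeholders.

PROJECT="${PROJECT:-my-project}"
TARGET_PROXY="${TARGET_PROXY:-my-https-proxy}"
CERT_PREFIX="${CERT_PREFIX:-wildcard-example-com}"
CERT_FILE="${CERT_FILE:-/etc/tls/tls.crt}"
KEY_FILE="${KEY_FILE:-/etc/tls/tls.key}"
SA_KEY_FILE="${SA_KEY_FILE:-/etc/gcloud/key.json}"

# Succeeds when two PEM files match, ignoring blank lines and
# trailing whitespace.
pem_equal() {
  a=$(sed -e 's/[[:space:]]*$//' -e '/^$/d' "$1") || return 2
  b=$(sed -e 's/[[:space:]]*$//' -e '/^$/d' "$2") || return 2
  [ "$a" = "$b" ]
}

update_cert() {
  gcloud auth activate-service-account --key-file="$SA_KEY_FILE" || return 1

  # Name of the first certificate currently attached to the proxy.
  current_url=$(gcloud compute target-https-proxies describe "$TARGET_PROXY" \
    --project="$PROJECT" --format='value(sslCertificates[0])') || return 1
  current_name=${current_url##*/}
  [ -n "$current_name" ] || { echo "no certificate on $TARGET_PROXY" >&2; return 1; }

  # Download the live PEM and compare it with the local one.
  live_pem=$(mktemp) || return 1
  gcloud compute ssl-certificates describe "$current_name" \
    --project="$PROJECT" --format='value(certificate)' > "$live_pem" || return 1
  if pem_equal "$CERT_FILE" "$live_pem"; then
    rm -f "$live_pem"
    echo "certificate on $TARGET_PROXY is up to date"
    return 0
  fi
  rm -f "$live_pem"

  # Upload the new certificate, switch the proxy over, remove the old one.
  new_name="$CERT_PREFIX-$(date +%Y%m%d%H%M%S)"
  gcloud compute ssl-certificates create "$new_name" --project="$PROJECT" \
    --certificate="$CERT_FILE" --private-key="$KEY_FILE" || return 1
  gcloud compute target-https-proxies update "$TARGET_PROXY" \
    --project="$PROJECT" --ssl-certificates="$new_name" || return 1
  gcloud compute ssl-certificates delete "$current_name" \
    --project="$PROJECT" --quiet
}

# Run only when invoked as the update-cert command, not when sourced.
case "${0##*/}" in
  update-cert) update_cert ;;
esac
```

The comparison is done on the PEM text rather than on certificate names, so the job stays a no-op until cert-manager has actually renewed the certificate.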

Build this into a container with Google Cloud SDK
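One way to do this, sketched under the assumption that the script is saved as `update-cert` next to the Dockerfile. The base image tag and the registry name are placeholders; pin whichever SDK version you have tested.

```shell
# Write a small Dockerfile that layers the script on the Cloud SDK image.
# Sketch only: image tag and install path are assumptions.
cat > Dockerfile <<'EOF'
FROM google/cloud-sdk:slim
COPY update-cert /usr/local/bin/update-cert
RUN chmod +x /usr/local/bin/update-cert
ENTRYPOINT ["/usr/local/bin/update-cert"]
EOF

# Build and push (registry name is a placeholder):
# docker build -t registry.example.com/update-cert:latest .
# docker push registry.example.com/update-cert:latest
```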

Or even mount the script into Google’s official SDK image.

We make assumptions here about where to find our local certificate files, credentials for the gcloud CLI, etc. It all connects when we introduce our cronjob specification.

CronJob

Run periodically
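A CronJob along these lines would tie the pieces together. This is a sketch assuming a cluster that serves `batch/v1` CronJobs; the image name, the environment variable names read by the script, the mount paths and the secret `cert-updater-key` (assumed to hold the service account keyfile under the key `key.json`) are all hypothetical.

```shell
# Write a CronJob manifest that runs the update-cert container weekly.
# Sketch only: image, env names, paths and secret names are placeholders.
cat > update-cert-cronjob.yaml <<'EOF'
apiVersion: batch/v1
kind: CronJob
metadata:
  name: update-cert-example-com
spec:
  # Mondays at 10:00, in the time zone the cluster schedules CronJobs in.
  schedule: "0 10 * * 1"
  concurrencyPolicy: Forbid
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: update-cert
              image: registry.example.com/update-cert:latest
              env:
                - name: PROJECT
                  value: my-project
                - name: TARGET_PROXY
                  value: my-https-proxy
                - name: CERT_FILE
                  value: /etc/tls/tls.crt
                - name: KEY_FILE
                  value: /etc/tls/tls.key
                - name: SA_KEY_FILE
                  value: /etc/gcloud/key.json
              volumeMounts:
                - name: tls
                  mountPath: /etc/tls
                  readOnly: true
                - name: gcloud-key
                  mountPath: /etc/gcloud
                  readOnly: true
          volumes:
            - name: tls
              secret:
                secretName: wilcard-example-com-tls   # written by cert-manager
            - name: gcloud-key
              secret:
                secretName: cert-updater-key          # holds key.json
EOF

# kubectl apply -f update-cert-cronjob.yaml
```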

We have our service account and certs mounted to where ‘update-cert’ can find them.

Every Monday during office hours, this job checks whether there is a diff between our Let’s Encrypt certificate, managed by cert-manager, and what is currently running in the selected target-https-proxy. A diff prompts a replacement and cleanup of the certificate in the target-https-proxy.

Failures in our cronjob setup are alerted on by Prometheus/Alertmanager. I would love to see cert-manager expose success/error rates to Prometheus; there are pull requests pending for this, so I’m confident this will be the case soon-ish.

Any combination of project, certificate, service account and target-proxy can be set up. All you need to manage several proxies is to add cronjobs.