Problem statement

The problem statement for this piece of work was as follows:

As an engineer

I want to be confident in my HelmReleases before they are deployed.

So that deploys are more likely to succeed than fail in Kubernetes.

The current problem

At Mettle, as discussed before, we follow GitOps principles religiously when deploying workloads to our Kubernetes clusters. However, in the past we have been bitten by invalid HelmRelease manifests being deployed, with Tiller swallowing the errors instead of bubbling them to the surface.

The migration to Helm 3 made these errors visible when describing the HelmRelease itself (see below):



Status:
  Conditions:
    Last Transition Time:  2020-04-15T16:40:49Z
    Last Update Time:      2020-04-15T16:40:49Z
    Message:               Chart fetch failed for Helm release 'redacted' in 'redacted'.
    Reason:                ChartFetchFailed
    Status:                False
    Type:                  ChartFetched
  Observed Generation:     1
  Phase:                   ChartFetchFailed
Events:
  Type     Reason             Age   From           Message
  ----     ------             ----  ----           -------
  Warning  FailedReleaseSync  58s   helm-operator  synchronization of release 'redacted' in namespace 'redacted' failed: failed to prepare chart for release: chart unavailable: chart "redacted" version "2.0.34" not found in https://redacted-charts.storage.googleapis.com repository

The ideal solution

The ideal solution is for all components of a HelmRelease to be as thoroughly tested as possible before the HelmRelease itself is deployed to a cluster or clusters. These components include:

The helm chart itself (with the default values)

The helm chart with the values from the HelmRelease also applied

Helm Chart Testing

Our custom helm charts already have strict linting run against them as part of CI. We use the following tools:

helm/chart-testing

Our first linting job runs the upstream helm/chart-testing Docker container against our Helm charts. The project describes the tool as follows:

ct is the tool for testing Helm charts. It is meant to be used for linting and testing pull requests. It automatically detects charts changed against the target branch.

See CircleCI config below:

lint:
  docker:
    - image: quay.io/helmpack/chart-testing:v3.0.0-rc.1
  steps:
    - checkout
    - run:
        name: lint
        command: ct lint --all --config test/ct.yaml
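The job above passes --all, so every chart is linted on each run. The changed-chart detection mentioned in the tool's description could be used instead, roughly as sketched here. This is only an illustration, not our actual CI step: the function name lint_changed_charts and the master default branch are assumptions, while list-changed and --target-branch are chart-testing's own command and flag.

```shell
#!/usr/bin/env bash

# Hypothetical helper: lint only the charts that differ from the target branch.
# The "master" default is an assumption; pass your own branch name if it differs.
lint_changed_charts() {
  local target_branch="${1:-master}"
  local changed

  # Ask ct which chart directories changed relative to the target branch.
  changed="$(ct list-changed --config test/ct.yaml --target-branch "${target_branch}")" || return 1

  if [ -z "${changed}" ]; then
    echo "No chart changes detected, skipping lint"
    return 0
  fi

  printf 'Linting changed charts:\n%s\n' "${changed}"

  # Without --all, ct lint restricts itself to the changed charts.
  ct lint --config test/ct.yaml --target-branch "${target_branch}"
}
```

Calling lint_changed_charts with no argument compares against master; on a pull request build this keeps lint time proportional to the size of the change rather than the size of the repository.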

Kubeval

We also run kubeval against all our Helm charts by looping over each chart, rendering it with helm template, and then running strict kubeval validation against the output (see below):

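#! /usr/bin/env bash

set -euo pipefail

# Rendered output is written to /tmp/mettle/<chart>.yaml
mkdir -p /tmp/mettle

# Validate against local copies of the Kubernetes JSON schemas
export KUBEVAL_SCHEMA_LOCATION=file:///usr/local/kubeval/schemas

for chart in mettle/*; do
  printf "\nChecking %s\n" "${chart#*/}"

  # Render the chart with its default values
  helm template "${chart}" > "/tmp/${chart}.yaml"

  # Strictly validate the rendered manifests
  kubeval --kubernetes-version 1.17.0 --strict --force-color --ignore-missing-schemas "/tmp/${chart}.yaml"
done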

So our custom Helm charts are linted and statically analyzed using both the upstream helm/chart-testing container and kubeval, but nothing takes our HelmRelease values and blends them with the chart's default values. That would be the holy grail!

The missing piece of the puzzle

After some digging online, I found that my good friend Stefan Prodan had already created hrval (https://github.com/stefanprodan/hrval-action).

The action works by:

Downloading the Helm chart.

Running a helm template using both the default values and the values specified in the given HelmRelease.

Running a strict kubeval on the rendered output.

This sounded perfect and was the missing piece of our ideal solution.
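To make the three steps concrete, here is a minimal sketch of them for a single release. It is not Stefan's actual script. It assumes the Helm 3 CLI (helm pull; Helm 2 calls the same operation helm fetch) and that the values block of the HelmRelease has already been extracted into a plain YAML file. The function name validate_release and its argument order are hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical sketch of the download / template / kubeval flow for one release.
# Usage: validate_release <repo-url> <chart-name> <chart-version> <values-file>
# The body runs in a subshell so the EXIT trap cleans up on every path.
validate_release() (
  repo="$1"
  chart="$2"
  version="$3"
  values_file="$4"   # assumed to hold the HelmRelease's values as plain YAML

  workdir="$(mktemp -d)" || exit 1
  trap 'rm -rf "${workdir}"' EXIT

  # 1. Download the chart at the exact version the HelmRelease pins.
  helm pull "${chart}" --repo "${repo}" --version "${version}" \
    --untar --untardir "${workdir}" || exit 1

  # 2. Render it: chart defaults first, HelmRelease values layered on top.
  helm template "${workdir}/${chart}" -f "${values_file}" \
    > "${workdir}/rendered.yaml" || exit 1

  # 3. Strictly validate the rendered manifests; this sets the exit status.
  kubeval --strict --ignore-missing-schemas "${workdir}/rendered.yaml"
)
```

A pinned chart version that does not exist fails at step 1, while invalid values surface at step 2 or step 3, depending on whether they break rendering or only produce invalid manifests.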

Adding to our toolkit container

At Mettle, we are already using CircleCI for all our Kubernetes linting and static analysis so we ported Stefan’s scripts into our toolkit container using his Dockerfile as a reference. Our Dockerfile now includes the following additional section:

# Install hrval scripts

COPY src/hrval.sh /usr/local/bin/hrval.sh

COPY src/hrval-all.sh /usr/local/bin/hrval

RUN chmod +x /usr/local/bin/hrval.sh

RUN chmod +x /usr/local/bin/hrval

Hooking it into our CI pipelines

If you’ve read my previous blog posts, you will know that at Mettle we lean heavily on Kustomize to keep our repositories as DRY (Don’t Repeat Yourself) as possible.

Therefore we wrote a bash script that would run hrval against the output of a kustomize build for a given environment (see below).

#! /usr/bin/env bash

ENV=$1
IGNORE_VALUES=false
KUBE_VERSION=1.17.0
HELM_VERSION=v2

set -eu

printf "Running hrval against all %s HelmReleases\n" "${ENV}"

mkdir -p /tmp/"${ENV}"
kustomize build kustomize/"${ENV}" -o /tmp/"${ENV}"
hrval /tmp/"${ENV}"/ ${IGNORE_VALUES} ${KUBE_VERSION} ${HELM_VERSION}

Then in our CircleCI config we simply run this script per environment:

defaults: &defaults
  working_directory: ~/project
  docker:
    - image: quay.io/mettle/kubernetes-toolkit:1.17.2

hrval-for-sbx:
  <<: *defaults
  steps:
    - checkout
    - run:
        name: kubeval helmreleases for sbx
        command: bash -c "bin/hrval-for-environment sbx"

Note: We run hrval against both our custom Helm charts and upstream ones to make sure our HelmRelease manifests are as compliant as possible.
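For a local run across several environments before pushing, a small wrapper around the per-environment script might look like the sketch below. It is an illustration only: the function name run_all_environments and the HRVAL_SCRIPT override are hypothetical, and any environment names other than sbx are placeholders. Rather than stopping at the first failure, it collects every failing environment so all problems show up in one pass.

```shell
#!/usr/bin/env bash

# Hypothetical wrapper: run the per-environment hrval script for each
# environment given as an argument and report every failure at the end.
# HRVAL_SCRIPT is an assumed override, defaulting to the script shown earlier.
run_all_environments() {
  local script="${HRVAL_SCRIPT:-bin/hrval-for-environment}"
  local failed=()
  local env

  for env in "$@"; do
    # Keep going after a failure so one run reports all broken environments.
    if ! "${script}" "${env}"; then
      failed+=("${env}")
    fi
  done

  if [ "${#failed[@]}" -gt 0 ]; then
    printf 'hrval failed for: %s\n' "${failed[*]}"
    return 1
  fi

  printf 'hrval passed for all environments\n'
}
```

For example, run_all_environments sbx checks only the sandbox environment, and further environment names can be appended as extra arguments.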

Summary

Adding hrval to our toolkit and CI pipelines has given us what is possibly the most complete CI workflow available from a kubeval perspective, and it ticks every box in our ideal solution. Since setting it up, it has already found two issues:

We were referencing a Helm chart version that does not exist.

We were passing invalid values in our HelmRelease.

We are hoping this is going to save us from similar issues in the future. 🤞