Think twice before using Helm

Beyond hype — a critical look at Helm

Helm — The package manager for Kubernetes.

Sounds quite nice, right? It will simplify your release process a lot, but sometimes it will give you a hard time. That’s life!

Recently Helm has been promoted to an official top-level @CloudNativeFdn project and it is widely used in the community. That means something, but I would like to briefly share my concerns about Helm with you.

This blog post explains my reasons for not believing the hype.

What’s the real value of Helm?

After some time with it, I’m still unclear about the value it adds. It doesn’t provide anything special. What value does Tiller (the server-side part) bring?

Many Helm charts are far from perfect and require some effort to make them usable on your Kubernetes cluster, e.g. they lack RBAC, resource limits or network policies. This means that you cannot install a Helm chart like a binary and forget what’s inside.

I would love to have someone explain it to me, especially in terms of a secure multi-tenant production environment, instead of just glorifying how cool it is based on a hello world example.

“Talk is cheap. Show me the code.” Linus Torvalds

This is another auth & access control layer

Someone compared Tiller to “a giant sudo server”. For me it’s just another authorization layer that lacks access control and comes with additional TLS certs to maintain. Why not leverage the Kubernetes API and rely on the existing security model with proper audit and RBAC?

Is it just a glorified templating tool?

It’s all about rendering and linting Go template files using configuration from values.yaml, then applying the rendered Kubernetes manifests along with the corresponding metadata stored in a ConfigMap.

All of which can be replaced by a few simple commands:

$ # render go-template files using golang or python script

$ kubectl apply --dry-run -f .

$ kubectl apply -f .
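To make the "render" step above concrete, here is a minimal sketch of what such a Python script could look like. It is only an illustration under a few assumptions of mine: it uses Python's built-in string.Template ($name placeholders) as a stand-in for Go templates, the *.yaml.tmpl naming and the render_manifests function are hypothetical, and the values are passed in as a plain dict rather than read from values.yaml.

```python
from pathlib import Path
from string import Template


def render_manifests(template_dir: Path, output_dir: Path, values: dict) -> list:
    """Render every *.yaml.tmpl file in template_dir into output_dir.

    Placeholders use string.Template syntax ($name or ${name}); this is a
    stand-in for Go templates, not a reimplementation of them.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    rendered = []
    for template_path in sorted(template_dir.glob("*.yaml.tmpl")):
        # substitute() raises KeyError when a placeholder has no value,
        # so a missing setting fails the render instead of reaching the cluster.
        text = Template(template_path.read_text()).substitute(values)
        target = output_dir / template_path.name[: -len(".tmpl")]
        target.write_text(text)
        rendered.append(target)
    return rendered
```

The rendered directory is then what you would point kubectl apply -f at. Note that a literal dollar sign in a manifest has to be written as $$ with this placeholder syntax.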

I have observed that teams tend to keep a values.yaml per environment, or even render it from a values.yaml.tmpl before using it.
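The per-environment pattern usually boils down to layering a small set of overrides on top of shared defaults. A rough sketch of that idea, assuming the values files have already been parsed into plain dicts (the merge_values helper and the sample keys are made up for illustration):

```python
import copy


def merge_values(base: dict, override: dict) -> dict:
    """Return base with override layered on top, merging nested dicts."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = merge_values(merged[key], value)
        else:
            # Scalars, lists and type mismatches: the override wins outright.
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical shared defaults and production overrides.
base = {"image": {"repository": "example/foo", "tag": "1.0.0"}, "replicas": 1}
production = {"image": {"tag": "1.2.3"}, "replicas": 3}
production_values = merge_values(base, production)
```

Even in this tiny form you can see where the complexity creeps in: every environment adds another layer whose interaction with the defaults someone has to keep in their head.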

This approach doesn’t make sense for Kubernetes Secrets, which are often encrypted and versioned in the repository. You can either use the helm-secrets plugin to handle them or override the values using --set key=value, but either way it adds another layer of complexity.

Helm as an infrastructure lifecycle management tool

Forget about it. It won’t work, especially for core Kubernetes components like kube-dns, CNI provider, cluster autoscaler, etc. These components have a different lifecycle and Helm doesn’t fit there.

My experience with Helm shows that it works fine for simple deployments using basic Kubernetes resources which can be easily recreated from scratch and don’t have a complex release process.

Sadly, Helm can’t handle more advanced and frequent deployments that include resources such as Namespace, RBAC, NetworkPolicy, ResourceQuota or PodSecurityPolicy.

I know it may offend someone who is super obsessed with Helm, but this is a sad truth.

Helm state

The Tiller server stores information in ConfigMaps located inside of Kubernetes. It does not need its own database.

Unfortunately, the size of a ConfigMap is limited to 1MB because of etcd limits.

Hopefully, someone will come up with an idea to extend the ConfigMap storage driver to compress the serialized release before storing it, but for me, this still doesn’t solve the actual issue.
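To illustrate what compressing a serialized release would buy, here is a small sketch. It is not Helm's storage format: JSON stands in for the real serialization, the function names are hypothetical, and the 1MB constant is simply the limit mentioned above.

```python
import base64
import gzip
import json

# Assumed ceiling, taken from the 1MB ConfigMap limit discussed above.
CONFIGMAP_LIMIT_BYTES = 1024 * 1024


def encode_release(release: dict) -> str:
    """Serialize, gzip and base64-encode a release so it can be stored as text."""
    raw = json.dumps(release, sort_keys=True).encode("utf-8")
    return base64.b64encode(gzip.compress(raw)).decode("ascii")


def decode_release(encoded: str) -> dict:
    """Reverse encode_release."""
    return json.loads(gzip.decompress(base64.b64decode(encoded)))


def fits_in_configmap(encoded: str) -> bool:
    """Check the encoded payload against the assumed size ceiling."""
    return len(encoded.encode("ascii")) < CONFIGMAP_LIMIT_BYTES
```

Rendered manifests are repetitive text and compress well, so this raises the ceiling considerably. It only postpones the problem, though: a large enough chart will still hit the limit.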

Random failures and error handling

This is something that worries me the most — I can’t rely on it.

Error: UPGRADE FAILED: "foo" has no deployed releases

IMHO this is one of the most annoying problems with Helm.

If the first release failed, then every subsequent attempt will return an error saying that it cannot upgrade from an unknown state.

The following PR “fixes” it by adding the --force flag, which actually hides the problem by doing helm delete & helm install --replace underneath.

However, most of the time you will end up cleaning up the entire release:

helm delete --purge $RELEASE_NAME

Error: release foo failed: timed out waiting for the condition

For example, if a ServiceAccount is missing or RBAC doesn’t allow the creation of a specific resource, Helm will return the following error message:

Error: release foo failed: timed out waiting for the condition

Unfortunately, the root cause is not visible in Helm. You have to dig it out of the Kubernetes events yourself:

kubectl -n foo get events --sort-by='{.lastTimestamp}'

Error creating: pods "foo-5467744958" is forbidden: error looking up service account foo/foo: serviceaccount "foo" not found
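If you end up doing this after every failed release, the digging can be scripted. The sketch below assumes you feed it the output of kubectl get events -o json for the namespace in question. It relies only on the standard Event fields type, reason, message and lastTimestamp, and the warning_messages helper is my own naming.

```python
import json


def warning_messages(events_json: str) -> list:
    """Return 'reason: message' for every Warning event, oldest first."""
    items = json.loads(events_json).get("items", [])
    warnings = [event for event in items if event.get("type") == "Warning"]
    # lastTimestamp is an RFC 3339 UTC string, so a plain string sort is
    # chronological; events without one sort first.
    warnings.sort(key=lambda event: event.get("lastTimestamp") or "")
    return [
        f'{event.get("reason", "")}: {event.get("message", "")}'
        for event in warnings
    ]
```

Printing these lines next to Helm's timeout error would surface the missing ServiceAccount immediately, which is the kind of feedback I would expect from the tool itself.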

Helm fails-successfully

There are some edge cases in which Helm fails-successfully without doing anything. For example, sometimes it doesn’t update Resource Limits.

helm init runs tiller with a single replica — not HA

Tiller is not HA by default, and the PR below is still open:

This may cause downtime one day…

Helm 3? Operators? The future?

The next version of Helm will bring some additional promising features like:

single-service architecture with no client/server split — no more Tiller

embedded Lua engine for scripting

pull-based DevOps workflow, for which a new Helm Controller project will be started

For more details, take a look at the Helm 3 Design Proposal.

I really like the idea of a Tiller-less architecture but I’m not sure about Lua scripting because it can add additional complexity to the charts.

Recently I have observed a big shift towards operators, which are more Kubernetes-native than Helm charts.

I really hope that the community will sort this out (with our help ofc) sooner rather than later, but in the meantime, I prefer to use Helm as little as possible.

Don’t get me wrong — it’s just my personal point of view after spending some time building a hybrid-cloud deployment platform on top of Kubernetes.