At Kubecost, we help teams monitor and manage Kubernetes spend. Teams commonly implement their first Kubernetes chargeback or showback solution with our software, and we frequently get asked questions about how to make this process go as smoothly as possible. This guide shares some of the different approaches to cost monitoring, best practices, and pitfalls that we’ve seen.

Our advice is distilled from our experience directly implementing chargeback at dozens of enterprises and from using chargeback systems at our previous companies. These experiences have taught us that situational variables matter, so think of these as general frameworks to apply to your organization’s specific situation. Reach out (team@kubecost.com) if you want to discuss!

Why bother with Kubernetes cost monitoring?

Kubernetes is a powerful platform that provides a set of APIs to dynamically provision compute and infrastructure resources, and it is commonly used in dynamic, multi-tenant environments. This combination lets teams ramp up resource consumption and costs quickly without clear visibility into why costs increased. The risk of overspending is compounded by how easy it is to provision expensive resources (e.g. GPUs) and by programmatic provisioning tools like cluster autoscaling. As a result, uncaught bugs or oversights can cause major cost overruns. Implementing a cost monitoring solution, along with a culture of cost awareness, can help catch these issues faster or even avoid them altogether.

Implementing cost monitoring can also help avoid the tragedy of the commons, which is not uncommon in multi-tenant Kubernetes environments. In this context, the tragedy of the commons occurs when individual application engineering teams each overprovision resources to keep their applications performant without appreciating the full cost of those resources. With every team overprovisioning pods, deployments, and clusters independently, the company as a whole can become grossly inefficient. Small nudges towards awareness and accountability can help overcome many of these pitfalls. They can also reduce security risks and developer productivity losses, because teams often become more aware of abandoned or orphaned cloud resources, e.g. old deployments or persistent volumes.
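To make the overprovisioning problem concrete, here is a minimal sketch of how the cost of requested-but-unused CPU adds up across teams. The team names, core counts, price, and time window are all hypothetical figures chosen for illustration, and the function is our own simplification rather than how any particular tool measures idle cost.

```python
def idle_request_cost(requested_cores, used_cores, price_per_core_hour, hours):
    """Estimate the cost of CPU that was requested but never used."""
    if requested_cores < 0 or used_cores < 0:
        raise ValueError("core counts must be non-negative")
    # Usage above the request is not counted as negative idle capacity.
    idle_cores = max(requested_cores - used_cores, 0.0)
    return idle_cores * price_per_core_hour * hours


# Hypothetical teams: (cores requested, average cores actually used).
teams = {
    "checkout": (8.0, 2.0),
    "search": (16.0, 4.0),
    "batch": (4.0, 3.5),
}
PRICE_PER_CORE_HOUR = 0.25  # assumed price, not a real cloud rate
HOURS = 100                 # assumed measurement window

idle_by_team = {
    name: idle_request_cost(requested, used, PRICE_PER_CORE_HOUR, HOURS)
    for name, (requested, used) in teams.items()
}
total_idle = sum(idle_by_team.values())

for name, cost in idle_by_team.items():
    print(f"{name}: ${cost:.2f} idle")
print(f"total: ${total_idle:.2f} idle")
```

Each team's idle cost can look modest on its own. The aggregate is what the organization actually pays for, which is why visibility across teams matters.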

Anecdotally, we have seen organizations consistently reduce spend by 30–70% after implementing cost monitoring solutions. We recently worked with a team that was unknowingly overspending by ~500%. Even limited monitoring can reduce costs immediately, but we have seen that implementing showback or chargeback can lead to greater savings and better infrastructure hygiene.

Cost monitoring approaches

Kubernetes chargeback is an accounting system in which an organization allocates k8s infrastructure costs and related cloud spend (e.g. databases and storage buckets) back to the individual teams or business units that consumed those resources. These internal or external clients actually receive bills for what they have consumed. Successful rollouts typically have broad organizational buy-in, with expectations clearly communicated to engineering team leads.
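As a rough illustration of the allocation step, the sketch below splits a shared bill across teams in proportion to the CPU core-hours their workloads requested. The proportional-by-core-hours rule, the team names, and the numbers are assumptions made for this example. Real allocation models usually also account for memory, storage, network, and shared or idle costs.

```python
from collections import defaultdict


def allocate_costs(total_cost, workload_usage):
    """Split total_cost across teams in proportion to requested core-hours.

    workload_usage is an iterable of (team, core_hours) pairs, one per workload.
    """
    core_hours_by_team = defaultdict(float)
    for team, core_hours in workload_usage:
        core_hours_by_team[team] += core_hours

    total_core_hours = sum(core_hours_by_team.values())
    if total_core_hours == 0:
        return {}  # nothing was consumed, so there is nothing to allocate

    return {
        team: total_cost * hours / total_core_hours
        for team, hours in core_hours_by_team.items()
    }


# Hypothetical monthly bill and per-workload usage.
bill = allocate_costs(
    1000.0,
    [("payments", 300.0), ("search", 100.0), ("payments", 100.0)],
)
for team, cost in bill.items():
    print(f"{team}: ${cost:.2f}")
```

In a chargeback model, the resulting per-team amounts would be billed to each team. In a showback model, the same figures would simply be shared for information.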

Showback is a similar concept but does not actually cross-charge costs and instead broadly shares cost allocation data with teams for informational purposes only. While there are key differences between showback and chargeback, the majority of lessons we have learned apply to both.

There are also limited monitoring approaches where a subset of teams (e.g. DevOps and/or Finance) observes spend and reacts as needed. This can work in small organizations but is difficult to sustain in larger, multi-tenant Kubernetes environments.