Monitor excessive resource requests with Prometheus
Working hours autoscaling
Nodes autoscaling
Horizontal pod autoscaling
Vertical pod autoscaling

Extra: More tips for reducing costs on cloud

Hi! Today I want to share how I achieved a great cost reduction in my cluster just by using the tools above.

This article describes many open source projects, but it is not intended to be a detailed guide. The focus is on setting them up quickly so you can get a taste of each project.

I know open source can be hard to approach, so I will include all the code needed to deploy these projects, along with a short description and links for further reading. Please read the full documentation if you decide to go ahead with any of these tools.

One last note: I will work only with CPU metrics, but the same approach applies to memory metrics.

Monitor excessive resource requests with Prometheus

First of all, I will show how to measure how many resources are being wasted in your Kubernetes cluster.

Prometheus is a Kubernetes admin’s best friend. The easiest way to deploy it is via this Helm chart:

helm install stable/prometheus

If you already have a Prometheus instance in your cluster, make sure kube-state-metrics is deployed and scraped, and that you have configured a job to scrape each node’s cAdvisor. If you are using the Helm chart linked above, you are ready to continue.

I am going to measure resource efficiency as the ratio between the cores requested for a container and the cores it actually uses. The larger this figure is, the more resources are reserved unnecessarily.

Don’t be afraid of the query below, it’s actually simple and you will learn some Prometheus tricks 😉

label_replace(
  label_replace(
    kube_pod_container_resource_requests_cpu_cores{},
    "pod_name", "$1", "pod", "(.+)"
  ),
  "container_name", "$1", "container", "(.+)"
)
/
on(pod_name,namespace,container_name)
sum(
  rate(
    container_cpu_usage_seconds_total{pod_name=~".+"}[60m]
  )
)
by (pod_name,namespace,container_name)

The numerator is the CPU request for the container. The metric kube_pod_container_resource_requests_cpu_cores is provided by kube-state-metrics. I use the label_replace function twice to copy the pod and container labels into pod_name and container_name, so they match the labels in the denominator (matching labels are necessary to divide two different metrics).

The denominator is the CPU usage, computed with rate over a 60-minute window to smooth out peaks. The on keyword limits which labels are matched between both sides.
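If the label matching feels abstract, here is a minimal Python sketch of the same division done by hand. The sample values and the helper name request_usage_ratio are made up for illustration; Prometheus does this join for you, and this sketch simply skips series with zero usage.

```python
# Hypothetical samples keyed by (namespace, pod_name, container_name),
# mirroring the labels listed in the on(...) clause of the query.
requests_cores = {
    ("default", "api-0", "api"): 1.0,
    ("default", "worker-0", "worker"): 0.5,
    ("kube-system", "dns-0", "dns"): 0.1,  # no usage sample below
}
usage_cores = {
    ("default", "api-0", "api"): 0.25,
    ("default", "worker-0", "worker"): 0.5,
}


def request_usage_ratio(requests, usage):
    """Divide request by usage for every key present on both sides."""
    ratios = {}
    for key, requested in requests.items():
        used = usage.get(key)
        if used:  # this sketch skips keys with no match or zero usage
            ratios[key] = requested / used
    return ratios


ratios = request_usage_ratio(requests_cores, usage_cores)
for key, ratio in sorted(ratios.items()):
    print(key, ratio)
```

Only the keys present on both sides produce a result, which is why the labels have to be renamed before dividing.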

This is the output I got in my test cluster:

Chart for CPU request/usage per container

Yes, one container is requesting 25,000 times more resources than it is actually using 😒.

With this metric you can configure an alert to notify you every time a request exceeds the usage that much.

But I am not a fan of receiving lots of emails or Slack messages about stuff that can be automated. We will learn how to improve this metric automatically in the Vertical Pod Autoscaling section.

Working hours autoscaling

In my case, the biggest savings actually come from avoiding unnecessary workloads. It is normal to deploy tools or environments that are used only during working hours.

Thankfully, there is a useful tool, kube-downscaler, to schedule when to scale your deployments to 0 (kudos to hjacobs).

Its repository contains the YAMLs to deploy this controller in your cluster:

git clone https://github.com/hjacobs/kube-downscaler.git
cd kube-downscaler
kubectl apply -f deploy/

Next, you only need to annotate the deployments you want to be downscaled:

kubectl annotate deploy <MY DEPLOY> "downscaler/uptime=Mon-Fri 07:30-20:00 Europe/Madrid"

This deployment will be scaled to zero at 20:00 and will return to its previous number of replicas at 07:30, from Monday to Friday.
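To picture what that schedule means, here is a small Python sketch of the uptime check. It is not kube-downscaler's code: the function name should_be_up is mine, the input is assumed to be already in Europe/Madrid local time, and treating 20:00 as the first "down" minute is an assumption.

```python
from datetime import datetime, time

# Assumed schedule, copied from the annotation above: Mon-Fri 07:30-20:00.
UPTIME_DAYS = range(0, 5)  # Monday=0 ... Friday=4
UPTIME_START = time(7, 30)
UPTIME_END = time(20, 0)


def should_be_up(local_now):
    """Return True if local_now (Europe/Madrid local time) is within uptime."""
    in_days = local_now.weekday() in UPTIME_DAYS
    in_hours = UPTIME_START <= local_now.time() < UPTIME_END
    return in_days and in_hours


print(should_be_up(datetime(2018, 10, 1, 9, 0)))   # a Monday morning
print(should_be_up(datetime(2018, 10, 1, 20, 0)))  # the same Monday at 20:00
print(should_be_up(datetime(2018, 10, 6, 9, 0)))   # a Saturday morning
```

Anything outside that window is time your deployment can sit at zero replicas.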

Used along with the next tool, it will help you to save lots of money while you sleep.

Nodes autoscaling

Cluster autoscaler is an essential plugin for any Kubernetes cluster. It takes care of the total amount of resources available in your cluster, and communicates with your cloud provider to scale it up or down. After all, it is not a real saving if you don’t use fewer nodes.

It can be deployed with this Helm chart in kubeapps. See the documentation to configure it depending on your cloud provider. It is also advisable to read the FAQ.

Cluster autoscaler just works. With a little configuration it can auto-discover the autoscaling groups of your nodes, and you won’t have pods stuck in pending state anymore. It will launch a new node every time the available CPU or memory is not enough to deploy a new pod. Plus, it also downscales when necessary.
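The scale-up decision can be pictured with a tiny Python sketch. This is a big simplification, not the real cluster autoscaler logic: the node and pod figures, the field names and the function fits_somewhere are all hypothetical.

```python
def fits_somewhere(pod, nodes):
    """Return True if any node has enough free CPU and memory for the pod."""
    return any(
        node["free_cpu"] >= pod["cpu"] and node["free_mem_mib"] >= pod["mem_mib"]
        for node in nodes
    )


# Hypothetical free capacity per node (cores and MiB).
nodes = [
    {"name": "node-a", "free_cpu": 0.3, "free_mem_mib": 2048},
    {"name": "node-b", "free_cpu": 1.5, "free_mem_mib": 256},
]
small_pod = {"cpu": 0.25, "mem_mib": 512}
big_pod = {"cpu": 1.0, "mem_mib": 1024}

for pod in (small_pod, big_pod):
    if fits_somewhere(pod, nodes):
        action = "schedule on an existing node"
    else:
        action = "ask the cloud provider for a new node"
    print(pod, "->", action)
```

Note that the big pod does not fit even though the cluster as a whole has enough CPU and memory: the resources have to be free on a single node.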

Extra saving: you can scale a node group to 0 just by adding a tag to your autoscaling group, learn how here.

Horizontal Pod Autoscaling

This one is useful in case you have multiple replicas of your service to avoid outages when there are peaks of traffic. Use this feature to adjust the number of replicas to the actual traffic.

Kubernetes has improved its ability to horizontally scale pods release after release. It takes only one command to configure autoscaling in any deployment, and that can be done using CPU and memory boundaries, but also any other custom metric.

I will help you with prerequisites in case your cluster doesn’t already comply with them. The only prerequisite is the metrics-server, but to deploy it you need to enable the aggregation layer.

The aggregation layer is the piece of Kubernetes that lets you extend the default apiserver with additional APIs (such as the metrics API we need). It can be enabled by adding these parameters to the apiserver:

--requestheader-client-ca-file=/etc/kubernetes/certificates/ca.crt
--requestheader-allowed-names=
--requestheader-extra-headers-prefix=X-Remote-Extra-
--requestheader-group-headers=X-Remote-Group
--requestheader-username-headers=X-Remote-User
--proxy-client-cert-file=/etc/kubernetes/certificates/apiserver.crt
--proxy-client-key-file=/etc/kubernetes/certificates/apiserver.key

Yes, in this example I am reusing the certificates I already have for the apiserver and leaving requestheader-allowed-names empty, which works as a wildcard. I don’t do that in my production cluster; it is only to move faster. You can find explanations for all these parameters in the apiserver reference.

We are set to deploy metrics-server. First clone the repo, it provides YAMLs to deploy everything.



git clone https://github.com/kubernetes-incubator/metrics-server.git
cd metrics-server

We just need to add a parameter in the file deploy/1.8+/metrics-server-deployment.yaml. This is the resulting file:

# File: deploy/1.8+/metrics-server-deployment.yaml
........
........
........
        command:
        - /metrics-server
        - --source=kubernetes.summary_api:''
        - --requestheader-allowed-names=

Save, exit and run:

kubectl create -f deploy/1.8+/

Check the pod logs to verify it is working properly.

All right! I am going to create a test deployment that uses 0.9 cores:

kubectl create -f - <<EOF
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: scaling-test
  namespace: default
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: scaling-test
    spec:
      containers:
      - name: hamster
        image: k8s.gcr.io/ubuntu-slim:0.1
        command: ["/bin/sh"]
        args: ["-c", "while true; do timeout 0.9s yes >/dev/null; sleep 0.1s; done"]
EOF

Now, let’s configure a horizontal pod autoscaling (HPA) rule for CPU usage above 80%:

kubectl autoscale deploy scaling-test --min=1 --max=5 --cpu-percent=80

Check the newly created HPA resource and wait for events:

kubectl describe hpa scaling-test

After a minute the deployment should scale to the max (5). You may see some errors in the first few seconds.

Now edit the deployment and change the args to:

- args:
  - -c
  - while true; do timeout 0.5s yes >/dev/null; sleep 0.5s; done

This command will use only 50% of a core, so the deployment should downscale to 1 replica automatically.
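If you are curious about how the HPA picks a replica count, the Kubernetes documentation describes the rule as desired = ceil(current * currentUtilization / targetUtilization), skipped when the ratio is within a tolerance (0.1 by default) and clamped to the min and max you set. Here is a minimal Python sketch of that rule; the function name and the utilization figures are mine, and the real controller has more details (readiness, missing metrics, stabilization) that I am leaving out.

```python
import math


def desired_replicas(current, utilization_pct, target_pct,
                     min_replicas, max_replicas, tolerance=0.1):
    """Approximate the documented HPA rule: ceil(current * utilization / target)."""
    ratio = utilization_pct / target_pct
    if abs(ratio - 1.0) <= tolerance:
        return current  # close enough to the target, keep the current count
    wanted = math.ceil(current * ratio)
    return max(min_replicas, min(max_replicas, wanted))


# Hypothetical readings against a target of 80%, with min=1 and max=5.
print(desired_replicas(1, 90, 80, 1, 5))   # above target: scale up
print(desired_replicas(3, 400, 80, 1, 5))  # far above target: capped at max
print(desired_replicas(4, 40, 80, 1, 5))   # below target: scale down
print(desired_replicas(2, 84, 80, 1, 5))   # within tolerance: no change
```

Keep in mind that the utilization percentage is measured against the CPU request of the pods, not against a whole core.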

Vertical Pod Autoscaling

This project is in alpha status but it is the most useful for me. Properly configured, it automatically sets the resource requests for your pods!

You have seen in the first section that my pods were requesting many more CPU cores than they actually use. Once vertical pod autoscaler was set up, this metric went down drastically, adjusting my cluster size as needed, with no more effort.

Advice: It is not compatible with horizontal pod autoscaler at the moment. Don’t use both in the same deployment!

Vertical Pod Autoscaler also relies on the metrics-server, which we learnt how to deploy in the previous section. The other prerequisite is to enable admission webhooks in the apiserver. Just add these admission controllers to the apiserver parameters:

--admission-control=ValidatingAdmissionWebhook,MutatingAdmissionWebhook

Now clone the repo and set up the autoscaler with the installation script:



git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-up.sh

This will create custom resource definitions, deployments and RBAC configs for its three components:

Recommender: It monitors pod metrics and estimates the usage.
Updater: It takes the recommendations, updates the verticalPodAutoscaler objects, and evicts pods if needed.
Admission-controller: It updates pods at creation with the recommended requests.
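To give an idea of what "estimates the usage" can mean, here is a minimal Python sketch that derives a CPU request from usage samples by taking a high percentile and adding a margin. This is not the Recommender's actual algorithm: the function name, the 90th percentile, the 15% margin and the sample values are all assumptions for illustration.

```python
import math


def recommend_cpu_request(samples, percentile=90, margin=0.15):
    """Suggest a CPU request: a high percentile of observed usage plus a margin."""
    if not samples:
        raise ValueError("need at least one usage sample")
    ordered = sorted(samples)
    # Nearest-rank percentile: smallest value covering `percentile`% of samples.
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1] * (1 + margin)


# Hypothetical usage samples (in cores) for a container that requests 0.1 cores.
usage = [0.48, 0.50, 0.52, 0.49, 0.51, 0.50, 0.47, 0.53, 0.50, 0.55]
current_request = 0.1
suggested = recommend_cpu_request(usage)
print(f"current request: {current_request} cores, suggested: {suggested:.2f} cores")
```

The point is simply that the request follows what the container really uses, instead of a number somebody guessed at deploy time.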

See the logs for the three components in the namespace kube-system to check the deployment was successful.

Now create a deployment that uses 0.5 cores but requests only 0.1, create a VPA (VerticalPodAutoscaler) object for it too, and see what happens:

kubectl create -f - <<EOF
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: vertical-scaling-test
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: vertical-scaling-test
    spec:
      containers:
      - name: ubuntu
        image: k8s.gcr.io/ubuntu-slim:0.1
        resources:
          requests:
            cpu: 0.1
            memory: 50Mi
        command: ["/bin/sh"]
        args: ["-c", "while true; do timeout 0.5s yes >/dev/null; sleep 0.5s; done"]
EOF

kubectl create -f - <<EOF
apiVersion: "poc.autoscaling.k8s.io/v1alpha1"
kind: VerticalPodAutoscaler
metadata:
  name: vertical-scaling-test
spec:
  selector:
    matchLabels:
      app: vertical-scaling-test
EOF

After some minutes, you should see that the pod was recreated with new requests. If you edit the pod to use fewer resources, it will scale down again.

This project is very useful for development environments but don’t forget that this is an alpha version. You can also customize each component to set minimums, limits, etc. See the README file inside each folder here.

Extra tips for saving in Kubernetes

Avoid LoadBalancer service type. It’s better and cheaper to use Ingress objects and an ingress controller.

Use small persistent volumes if you are in the cloud. Expand them when needed with this new feature.

Use the right instance type for your cluster, depending on your requirements for CPU and memory, to optimize usage.

I hope you can use these projects in your cluster to save some credit, let me know what you achieve!

ignaciomillan.com