
by Maria Terskikh

Kubernetes is certainly one of the fastest-growing technologies in the IT infrastructure world. The reason for this popularity is that it lets you deploy almost any kind of application easily, without the pain of provisioning specific infrastructure for each component. However, this ease and flexibility only come if you manage your Kubernetes cluster wisely. This article presents some best practices that will help you manage your cluster efficiently and get the most out of it.

#1 Design your apps to scale horizontally whenever possible

When it was announced more than a year ago, it was one of the most anticipated features of Kubernetes. We are talking, of course, about the HorizontalPodAutoscaler, a Kubernetes resource that natively scales the number of replicas of a Deployment or StatefulSet based on metrics such as CPU usage, memory usage, or a custom metric.

The advantage is fundamental: you can achieve a cost-effective infrastructure with a high level of performance.
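To make the scaling behaviour more concrete, here is a minimal Python sketch of the core rule the HorizontalPodAutoscaler applies: the desired replica count is the current count multiplied by the ratio of the observed metric to the target metric, rounded up. The function name, its parameters, and the default bounds are illustrative assumptions, and the sketch deliberately leaves out the tolerance band and stabilization behaviour of the real controller.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     target_metric: float,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Simplified HPA rule: ceil(current * observed / target), clamped to bounds.

    Hypothetical helper for illustration; not the controller's actual code.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Never go below the minimum or above the maximum replica count.
    return max(min_replicas, min(max_replicas, raw))


# 4 replicas at 90% average CPU with a 60% target: scale out.
scale_out = desired_replicas(4, 90, 60)
# 4 replicas at 30% average CPU with a 60% target: scale in.
scale_in = desired_replicas(4, 30, 60)
```

With these inputs the sketch scales out to 6 replicas and scales in to 2, which is where the cost savings come from: capacity follows the load instead of being sized for the peak.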

To do that, you have to keep in mind a basic principle for your application design: whichever replica processes a task, the result should be the same. This means we can route a request to any active replica without worrying about whether it is the replica that should be processing our request.

For stateless components, this is simple to achieve, since they are naturally scalable. For stateful components it’s trickier: all the active replicas must know which replica is responsible for processing a task, which requires some sort of synchronization between them. For example, user session data in a web application should be stored in shared storage such as Redis or Memcached, or even in a cookie in the user’s browser.
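The session example can be sketched in a few lines of Python. The `SharedSessionStore` class below is a hypothetical in-memory stand-in for an external store such as Redis, and `Replica` is an illustrative name for one instance of the web application. The point is only that replicas keep no session state of their own, so any of them can serve any request.

```python
class SharedSessionStore:
    """In-memory stand-in for an external store such as Redis (assumption)."""

    def __init__(self):
        self._data = {}

    def get(self, session_id):
        # Return a copy so callers cannot mutate the stored session in place.
        return dict(self._data.get(session_id, {}))

    def put(self, session_id, session):
        self._data[session_id] = dict(session)


class Replica:
    """One application replica; it holds no session state itself."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def add_to_cart(self, session_id, item):
        session = self.store.get(session_id)
        cart = session.get("cart", []) + [item]
        session["cart"] = cart
        self.store.put(session_id, session)
        return cart


store = SharedSessionStore()
replica_a = Replica("a", store)
replica_b = Replica("b", store)

# The same user is served by two different replicas in a row.
replica_a.add_to_cart("user-42", "book")
cart = replica_b.add_to_cart("user-42", "pen")
```

After both calls the cart contains both items, even though each request landed on a different replica.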

Sometimes horizontal scalability is difficult to achieve, although not impossible. This is especially true for databases like MySQL and PostgreSQL, which were not designed to scale horizontally. There are workarounds, such as database proxies and/or active-active replication, but they are not really native implementations and may need some plumbing.

#2 Don’t reinvent the wheel, use Helm

Helm is becoming the de facto standard for Kubernetes application deployment, and some people even consider it the ‘App Store’ for Kubernetes (or yum, or apt, or snap).

In the official repositories you can find a lot of already-packaged applications, such as Redis, PostgreSQL, and MySQL, which you can use without the pain of “kubernitizing” them:

https://github.com/helm/charts/tree/master/stable

https://github.com/helm/charts/tree/master/incubator

Of course, you can (and probably should) package your own apps as Helm charts and store them in a private repository. A repository is just a standard HTTP server serving files: GitHub, S3, Google Cloud Storage, or even ChartMuseum, an HTTP server with the basic Helm chart repository functions.

#3 Take Kubernetes security seriously

As with any infrastructure, security should be one of your focal points. Many companies and IT departments see security as a cost center. Maybe this was relevant a long time ago, when security threats were less frequent and their impact was rather low. Nowadays, a simple DDoS attack can cause millions of dollars in losses, and a data leak can seriously damage your reputation.

More and more companies are starting to treat their security efforts as a competitive advantage, something that differentiates them from the competition. The cost center then becomes a profit center.

From this point of view, here are some recommendations you can follow to harden your Kubernetes cluster:

Use Kubernetes Secrets to store sensitive information, and don’t forget to rotate the secrets frequently.

Don’t start your containerized applications as a root user.

Isolate your master nodes (especially access to the Kubernetes API server) and the worker nodes using firewall rules. Allow incoming traffic only when necessary.

Have a clear IAM policy: define the roles in Kubernetes and which users can access which resources. Don’t forget to revoke unused credentials.
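As an illustration of the "no root" recommendation above, here is a small Python sketch that inspects a Pod manifest (already loaded as a dictionary) and lists the containers that are not guaranteed to run as a non-root user. It relies on the `securityContext` fields `runAsNonRoot` and `runAsUser`, where container-level settings take precedence over Pod-level ones. The function name and the sample manifest are hypothetical, and the check is a sketch rather than a complete audit.

```python
def containers_that_may_run_as_root(pod: dict) -> list:
    """Return the names of containers not guaranteed to run as non-root."""
    spec = pod.get("spec", {})
    pod_ctx = spec.get("securityContext") or {}
    risky = []
    for container in spec.get("containers", []):
        ctx = container.get("securityContext") or {}
        # Container-level settings override the Pod-level ones.
        run_as_user = ctx.get("runAsUser", pod_ctx.get("runAsUser"))
        run_as_non_root = ctx.get("runAsNonRoot", pod_ctx.get("runAsNonRoot"))
        explicit_root = run_as_user == 0
        guaranteed_non_root = (
            run_as_non_root is True
            or (run_as_user is not None and run_as_user != 0)
        )
        if explicit_root or not guaranteed_non_root:
            risky.append(container["name"])
    return risky


# Sample manifest (illustrative): the Pod requires non-root by default,
# but one container explicitly asks for UID 0.
pod = {
    "spec": {
        "securityContext": {"runAsNonRoot": True},
        "containers": [
            {"name": "web"},
            {"name": "debug", "securityContext": {"runAsUser": 0}},
        ],
    }
}
risky_containers = containers_that_may_run_as_root(pod)
```

Here only the `debug` container is reported. A check like this could run in a CI pipeline before manifests reach the cluster.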

For more information about managing permissions in Kubernetes, read our previous article.

#4 Monitor and log (almost) everything

In any infrastructure it is fundamental to set up monitoring and logging processes that help you see what is really happening in your Information System and maintain a satisfactory level of performance. This is especially true for microservice architectures, for two main reasons:

With an increased number of services you have an increased level of complexity since there is a larger number of potential root causes in an incident. So you need to monitor the largest scope possible in order to be able to identify the root cause of an incident as soon as possible.

The underlying infrastructure is usually based on a third-party cloud provider, which usually bills you on a pay-per-use basis. Hence, you should closely watch which resources you are using in order to adapt the capacity to your real needs. By doing this you will optimize both your performance and your costs.

There are several solutions for implementing logging and monitoring in Kubernetes (check our previous article on this topic), but regardless of which one you use, you need to monitor two main things:

Resource usage in nodes and containers (CPU, RAM and Storage)

Application-specific metrics: number of requests, request duration, number of errors,…
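The application-specific metrics listed above (number of requests, request duration, number of errors) can be collected with very little code. The following Python sketch uses only the standard library; the `RequestMetrics` class and the `/orders` endpoint are hypothetical names chosen for illustration, and a real setup would export these values to your monitoring system rather than keep them in memory.

```python
import time
from collections import defaultdict


class RequestMetrics:
    """Per-endpoint request count, error count, and cumulative duration."""

    def __init__(self):
        self.requests = defaultdict(int)
        self.errors = defaultdict(int)
        self.duration_seconds = defaultdict(float)

    def observe(self, endpoint, handler, *args, **kwargs):
        """Run handler and record the outcome under the given endpoint."""
        start = time.perf_counter()
        self.requests[endpoint] += 1
        try:
            return handler(*args, **kwargs)
        except Exception:
            self.errors[endpoint] += 1
            raise
        finally:
            # Recorded for successful and failed requests alike.
            self.duration_seconds[endpoint] += time.perf_counter() - start

    def error_rate(self, endpoint):
        total = self.requests[endpoint]
        return self.errors[endpoint] / total if total else 0.0


metrics = RequestMetrics()
metrics.observe("/orders", lambda: "ok")
try:
    metrics.observe("/orders", lambda: 1 / 0)  # simulated failing request
except ZeroDivisionError:
    pass
orders_error_rate = metrics.error_rate("/orders")
```

After one successful and one failing request, the error rate for `/orders` is 0.5, and the cumulative duration can be divided by the request count to obtain an average latency.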

If your Information System is complex, it is also advisable to set up a dashboard that correlates logging and monitoring data (Elasticsearch does this very well).

#5 Train your teams and support them!

This is probably the most important piece of advice on our list. It is true not only for Kubernetes but for any innovation within a team. Technology, tools and processes are certainly important, but if the stakeholders don’t know how to get the most out of them, they will not see any advantage and will probably avoid using Kubernetes.

So the recommendation here is to plan a skills ramp-up whenever there is a migration, because in the end, it’s not about technology. It’s all about people.
