Image created by Dooder — Freepik.com

by Nikita Mazur

Logging and monitoring your IT infrastructure is a fundamental task, and it becomes even more critical in a microservice architecture. In this article we’ll look at some of the most popular options for monitoring your Kubernetes cluster, comparing them along five axes:

Ease of installation

Pricing

Monitoring features

Logging features

Alerting features

Prometheus or Heapster + InfluxDB + Grafana

This stack consists of Heapster or Prometheus as a data aggregator, InfluxDB as storage backend, and Grafana as a data visualization platform.

Prometheus has become the de facto standard for Kubernetes data aggregation, but Heapster is also a good option if you want something simple and easy to install.

Ease of installation

There are Helm charts for all the components, so you can deploy them easily on Kubernetes. However, it’s not a completely plug-and-play solution, so some specific configuration may still be needed.

One of the good things about this stack is that Grafana has multiple ready-to-use dashboards for Kubernetes, so there’s no need to build them yourself.

Pricing

All the components are open source. Enjoy the freedom! You will only pay your hosting costs.

Monitoring

A nice and simple UI lets you monitor your container and node metrics such as CPU and RAM. The metrics scraped by Prometheus are very rich, and you can also add your own data exporters for custom metrics (you would need a skilled developer to do it, though).
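To give an idea of what a custom exporter involves, here is a minimal sketch using only the Python standard library. It serves one gauge in the Prometheus text exposition format on /metrics. The metric name myapp_queue_depth, the queue names, the port and the read_queue_depths helper are all hypothetical placeholders, not part of any real application; a production exporter would more likely rely on an official Prometheus client library.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def read_queue_depths():
    # Hypothetical stand-in for real application state.
    return {"emails": 3, "thumbnails": 12}


def render_metrics(queue_depths):
    """Render gauge samples in the Prometheus text exposition format."""
    lines = [
        "# HELP myapp_queue_depth Number of jobs waiting in each queue.",
        "# TYPE myapp_queue_depth gauge",
    ]
    # One sample per queue; label values are assumed to need no escaping.
    for queue, depth in sorted(queue_depths.items()):
        lines.append(f'myapp_queue_depth{{queue="{queue}"}} {depth}')
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(read_queue_depths()).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    # Blocks forever; Prometheus would scrape http://<pod-ip>:<port>/metrics.
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() starts the endpoint; Prometheus then needs a scrape configuration that points at it, which depends on how your cluster is set up.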

Logging

Unfortunately, logging features are not available in this setup.

Alerting

Alerting features are available through Prometheus’ Alertmanager module. You can configure it through configuration files or the Prometheus UI, which is not very user-friendly, to be honest.
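As an illustration of the file-based approach, the sketch below builds a single Prometheus alerting rule and prints it as JSON, which is also valid YAML and so could be saved as a rule file. The rule fires when the built-in up metric has been 0 for a few minutes. The group name, the severity label and the summary text are assumptions chosen for the example; routing the resulting alert to email or chat is configured separately in Alertmanager.

```python
import json


def instance_down_rule(minutes=5):
    """Build a Prometheus rule group with one 'target is down' alert."""
    return {
        "groups": [
            {
                "name": "example",  # hypothetical group name
                "rules": [
                    {
                        "alert": "InstanceDown",
                        # 'up' is set to 0 by Prometheus when a scrape fails.
                        "expr": "up == 0",
                        # The condition must hold this long before firing.
                        "for": f"{minutes}m",
                        "labels": {"severity": "critical"},
                        "annotations": {
                            "summary": "Instance {{ $labels.instance }} is down",
                        },
                    }
                ],
            }
        ]
    }


print(json.dumps(instance_down_rule(), indent=2))
```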

Prometheus + ELK stack (Elasticsearch + Logstash + Kibana)

In this stack, Prometheus is used as a data aggregator, Elasticsearch as the storage backend, Logstash as a logging manager, and Kibana as a data visualization platform.

It’s probably the most flexible solution, but also the most difficult to install and configure.

Ease of installation

Medium to hard. It will take some effort to set everything up. Unlike Grafana and the SaaS alternatives, Kubernetes dashboards are not readily available here, so you have to create them yourself in Kibana. It may be a time-consuming task, but once done, they are easy to manage.

There are Helm charts for all the components, so you can deploy them easily on Kubernetes.

Pricing

All the components are open source, so they are free! You can also get an Elasticsearch instance as SaaS, but you have to pay for it, and in general it’s more expensive than hosting it yourself.

Monitoring

You can do almost everything with this setup; it’s very flexible. Kibana offers a large number of graphs and dashboards, so you can visualize any type of data.

Just as with the previous stack, Prometheus provides a broad variety of metrics, but you can also build your own data exporter.

Logging

Log management has become one of the core use cases for the ELK stack. The features available are plentiful, including log filtering, grouping and correlation.
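To make filtering and grouping more concrete, here is a sketch of an Elasticsearch search body, written as a Python dictionary, that keeps only recent error logs from one namespace and counts them per pod. The bool, term, range and terms constructs are standard Elasticsearch query DSL, but the field names (kubernetes.namespace, kubernetes.pod_name, level) are assumptions: the real names depend on how your log shipper and Logstash pipeline structure each document.

```python
import json


def error_logs_by_pod(namespace, since="now-15m"):
    """Build a search body: filter error logs, then group them by pod."""
    return {
        "size": 0,  # return only the aggregation, not the matching documents
        "query": {
            "bool": {
                "filter": [
                    # Field names below are hypothetical; adapt to your mapping.
                    {"term": {"kubernetes.namespace": namespace}},
                    {"term": {"level": "error"}},
                    {"range": {"@timestamp": {"gte": since}}},
                ]
            }
        },
        "aggs": {
            "by_pod": {"terms": {"field": "kubernetes.pod_name", "size": 10}}
        },
    }


print(json.dumps(error_logs_by_pod("production"), indent=2))
```

The printed JSON could be sent to the _search endpoint of a log index, or used as a starting point for a Kibana visualization.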

Alerting

In this setup, the alerting features can be implemented at two levels:

• In the Alertmanager module in Prometheus, through configuration files (not very user-friendly, though)

• In the Elasticsearch Watcher module, part of X-Pack. It makes it easy to set up alerts, but you have to install the X-Pack module on the ELK stack before using it.

Datadog

This is a popular solution for those who prefer having a fully managed SaaS solution. In this setup Datadog will manage everything from data collection to data visualization.

Ease of installation

Very easy. It’s arguably the most ‘plug-and-play’ solution. Once the Kubernetes YAML files are deployed, you will almost immediately see the statistics across your containers and clusters. The Kubernetes YAML files for the data collection agents can be deployed through Helm; you can find a chart in the official repository.

Pricing

Starting from $15 per node per month. There is a free plan for users with fewer than 5 nodes. The paid plan includes APM features.
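For a rough budget estimate, the figures above can be turned into a small calculation. This sketch assumes that the free plan covers the whole cluster while it has fewer than 5 nodes, and that the paid plan bills every node at the starting price once that limit is reached; the function name and defaults are illustrative only, and actual billing rules may differ.

```python
def datadog_monthly_cost(nodes, price_per_node=15, free_below=5):
    """Estimate the monthly bill in dollars for a cluster of `nodes` nodes."""
    if nodes < 0:
        raise ValueError("nodes must be non-negative")
    if nodes < free_below:
        return 0  # assumed to be covered by the free plan
    return nodes * price_per_node


for n in (4, 5, 20):
    print(n, "nodes:", datadog_monthly_cost(n), "dollars per month")
```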

Monitoring

A nice and simple UI lets you monitor container and host metrics such as CPU and RAM. The configuration is quite straightforward.

It is also possible to scrape data from a Prometheus instance deployed on the monitored Kubernetes cluster.

Logging

Logging features like collection, parsing and search are available, but they are not as advanced as in ELK, since logging is quite a new feature in Datadog.

Alerting

Available and easy to configure. Nice user interface.

NewRelic

NewRelic is a well-known company offering application monitoring as SaaS. However, Kubernetes monitoring here is still in beta and has fewer features than other platforms offer.

NewRelic will manage everything from data collection to data visualization.

Ease of installation

Easy to install, like Datadog. A Helm chart is available in the official repository if you want to deploy the agents quickly.

Pricing

Starting from $0.72, so it comes at quite an affordable price. However, the APM feature is not included; you will need an additional subscription (which starts at $10 per node per month).

Monitoring

Similar to Datadog. However, there is no native implementation of Prometheus data scraping, which is a bit unfortunate. Still, the available metrics should be enough for most users.

Logging

Logging features are not available natively, so you will need a plugin.

Alerting

Can be set up easily.

Conclusion

We have just reviewed the most common and efficient ways to monitor your Kubernetes clusters. There are many other alternatives like Dynatrace, Splunk or AppDynamics; I didn’t write about them here since I find them less universal and more focused on enterprise-grade clients.

In any case, the choice depends on what you are looking for. If you want something ready to use as SaaS, choose Datadog or NewRelic. If you can afford to set up a self-hosted solution with open source components, the first two options are probably the best.

What tools do you use to monitor your clusters?

Feel free to ask questions — I’ll be glad to help! Don’t forget to follow us on Twitter and join our Telegram chat to stay tuned!

You might also want to check our Containerum project on GitHub. We need your feedback to make it stronger: you can submit an issue, or just support the project by giving it a ⭐. Your support really matters to us!