Monitoring and logging systems are essential parts of DevOps culture. This article has three parts, covering monitoring, logging, and alerting. In the first part, we will talk about how to collect metrics from a Kubernetes cluster and how to visualize them. In the second part, we will talk about how to collect logs from applications running on Kubernetes. Finally, we will learn how to set alarms, for example when an application produces errors or when memory usage exceeds 80 percent.

EMG STACK

Setting up monitoring for Kubernetes clusters allows you to track system and application resource usage. There are many alternative stacks for Kubernetes monitoring, such as Prometheus-Grafana-Alertmanager and Telegraf-InfluxDB-Chronograf-Kapacitor (the TICK stack). Today I am introducing a different approach using Elasticsearch-Metricbeat-Grafana (EMG).

Metricbeat is a lightweight metric collector. It collects metrics from the system and from services through its various modules. We can easily collect system metrics with the system module and Kubernetes metrics with the kubernetes module. After collecting the metrics and sending them to Elasticsearch (see Diagram 1), we can use Kibana for visualization, since Elastic provides default dashboards, as you can see below.

Kibana Metricbeat
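
To make the module setup described above more concrete, here is a minimal sketch of a metricbeat.yml that enables the system and kubernetes modules and ships the results to Elasticsearch. It is modeled on the layout of Elastic's example configuration, but the metricset selection, the 10s period, the kubelet port, and the NODE_NAME, ELASTICSEARCH_HOST and ELASTICSEARCH_PORT environment variables are assumptions for illustration, so check them against the manifest for your Metricbeat version.

```yaml
# metricbeat.yml (sketch; values below are assumptions, not the author's exact config)
metricbeat.modules:
  # Host-level metrics from the node Metricbeat runs on
  - module: system
    period: 10s
    metricsets:
      - cpu
      - load
      - memory
      - network
      - filesystem

  # Node, pod and container metrics read from the local kubelet
  - module: kubernetes
    period: 10s
    metricsets:
      - node
      - pod
      - container
    # NODE_NAME is assumed to be injected into the pod via the Downward API
    hosts: ["https://${NODE_NAME}:10250"]
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    # Skips TLS verification of the kubelet certificate; tighten this for production
    ssl.verification_mode: "none"

# Send everything to Elasticsearch; host and port fall back to the defaults after the colon
output.elasticsearch:
  hosts: ["${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}"]
```

With a configuration along these lines, Metricbeat writes its documents into metricbeat-* indices by default, which is what Kibana or Grafana then query.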

I used Kibana for a while, then decided to build the same visualizations in Grafana with Elasticsearch as the data source, because Grafana gives you better visualization options, as you can see below. (This is my personal opinion, of course :) )

Grafana Metricbeat

To collect metrics, you need to run a Metricbeat agent on every node in the Kubernetes cluster, as shown in Diagram 1. Deploying Metricbeat as a DaemonSet ensures that it runs on each node in the cluster, except master nodes. DaemonSet pods are not scheduled on master nodes by default, because those nodes are typically tainted with NoSchedule. To schedule Metricbeat on the masters as well, you have to add a toleration to the pod spec. There is an example below.

tolerations:
  - key: node-role.kubernetes.io/master
    operator: Exists
    effect: NoSchedule
  - operator: "Exists"
    effect: "NoExecute"
  - operator: "Exists"
    effect: "NoSchedule"
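
For context, the sketch below shows where such a toleration sits inside a DaemonSet manifest. It is a trimmed-down illustration, not the official Elastic manifest: the kube-system namespace, the k8s-app label, and the image tag placeholder are assumptions, and the real manifest also mounts configuration and host paths that are left out here.

```yaml
# Minimal DaemonSet skeleton (sketch; names and namespace are assumptions)
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: metricbeat
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: metricbeat
  template:
    metadata:
      labels:
        k8s-app: metricbeat
    spec:
      # Tolerations belong to the pod spec, next to the containers list
      tolerations:
        - key: node-role.kubernetes.io/master
          operator: Exists
          effect: NoSchedule
      containers:
        - name: metricbeat
          # Replace <version> with the Metricbeat release you actually deploy
          image: docker.elastic.co/beats/metricbeat:<version>
          env:
            # Exposes the node name to Metricbeat through the Downward API
            - name: NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
```

Once the toleration matches the taint on the master nodes, the DaemonSet controller can place a Metricbeat pod there as well, so the pod count should equal the total number of nodes.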

You can download the official manifest files provided by Elastic from GitHub. After downloading and applying the Metricbeat manifest files, you can configure your Grafana dashboard. You could start with the example dashboard that I have shared on grafana.com. Metricbeat produces many metrics, so you can create dashboards different from mine.
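
Before a dashboard can show anything, Grafana needs Elasticsearch registered as a data source. You can do that in the Grafana UI, or with a provisioning file like the sketch below. The data source name, the Elasticsearch URL, and the metricbeat-* index pattern are assumptions for illustration. The field that holds the index pattern has also changed between Grafana releases (older versions read the top-level database field, newer ones read jsonData.index), and some versions additionally require an Elasticsearch version setting, so verify the file against the documentation for your Grafana version.

```yaml
# Grafana data source provisioning file (sketch; name, URL and index pattern are assumptions)
apiVersion: 1

datasources:
  - name: Elasticsearch-Metricbeat
    type: elasticsearch
    access: proxy
    # Address of Elasticsearch as seen from the Grafana server
    url: http://elasticsearch:9200
    # Index pattern for older Grafana releases
    database: "metricbeat-*"
    jsonData:
      # Index pattern for newer Grafana releases
      index: "metricbeat-*"
      # Metricbeat stores the event time in this field
      timeField: "@timestamp"
```

After Grafana picks up the data source, imported dashboards can be pointed at it from their data source selector.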