Not long ago we discussed how to build a Mesos cluster. Today I want to talk about how to monitor it. Monitoring can be fairly simple for a monolithic application, but once we are dealing with dozens of servers (or, really, any number greater than one), the problem becomes a bit more complicated.

Requirements

Let's start with what we want from our monitoring system:

gather linux-specific metrics (cpu, mem, disk);

gather mesos-specific metrics (job statuses, errors);

gather docker-specific metrics (containers start/stop);

gather haproxy-specific metrics (number of sessions, response times, errors);

gather application-specific metrics (if applicable);

have a pretty UI;

send notifications to Slack.

One may say "Use Zabbix, Luke". "Nooo!" - will be my answer :) One day I'll write my thoughts about this platform, but not today. Today we'll speak about Prometheus :)

Prometheus

What is Prometheus? In short, it is an open-source monitoring system built around a time-series database, with built-in alerting and visualization mechanisms.

Prometheus is based on the pull (polling) approach: the server itself scrapes metrics from its targets over HTTP. That means you have to install a set of so-called "exporters" on your hosts and register them in the main server configuration. Thank god, that is not a problem with Ansible. Also, you have to install a separate service for sending notifications ("alertmanager") and a separate service for data visualization (plain old "grafana", which already integrates with Prometheus).

Installation

Fortunately, all the necessary components can be installed as docker containers. So, let's start with the "exporters", which should be installed on every host that you want to monitor.

Exporters

First of all, we want to gather linux-specific metrics such as cpu, rss memory and free disk space. For that, you can use node-exporter:

- name: Install Node Exporter
  shell: >
    docker run -d --restart=always --log-driver=journald
    -p 9100:9100 --net="host"
    --name prom-node-exporter
    prom/node-exporter:0.12.0
  tags: node-exporter

The second thing to monitor is docker and its containers. Some time ago the prometheus folks had "container-exporter", but for now they recommend using cadvisor. To be honest, I'm not a fan of it (mostly due to the "device is busy" issue which seems to be with us forever), but it works fine with Mesos, so... why not?

- name: Install Cadvisor
  shell: >
    docker run -d --restart=always --log-driver=journald
    --privileged=true
    -v /:/rootfs:ro
    -v /var/run:/var/run:rw
    -v /sys:/sys:ro
    -v /var/lib/docker/:/var/lib/docker:ro
    -v /var/log/syslog:/var/log/syslog
    -p 3002:8080
    --name prom-container-exporter
    google/cadvisor:v0.23.1
  tags: container-exporter

The next step is HaProxy metrics. We can get them with haproxy-exporter, but not without problems. The current implementation doesn't expose such a valuable metric as average response time (which has been available in HaProxy since v1.5). There is a pull request that fixes this, but it seems the maintainers don't quite agree with the proposed solution. Anyway, you can build your own docker image based on the PR, or use my build.

# https://github.com/prometheus/haproxy_exporter
- name: Install HaProxy Exporter
  shell: >
    docker run -d --restart=always --log-driver=journald
    -p 9101:9101 --net="host"
    --name prom-haproxy-exporter
    krestjaninoff/prometheus_haproxy_exporter
    -haproxy.scrape-uri="http://localhost:9090/haproxy?stats;csv"
  tags: haproxy-exporter

And finally, it's time to add exporters for Mesos metrics! Unfortunately, I didn't find docker images for this project on DockerHub (despite the fact that a Dockerfile is provided on GitHub), so I built my own.

# https://github.com/mesosphere/mesos_exporter
- name: Install Mesos Master Exporter
  shell: >
    docker run -d --restart=always --log-driver=journald
    -p 9110:9110 --net="host"
    --name prom-mesos-master-exporter
    krestjaninoff/prometheus_mesos_exporter:latest
    -master=http://$(netstat -nl | grep 5050 | awk '{print $4}')
    -timeout=60s
  tags: mesos-master-exporter

and

# https://github.com/mesosphere/mesos_exporter
- name: Install Mesos Slave Exporter
  shell: >
    docker run -d --restart=always --log-driver=journald
    -p 9110:9110 --net="host"
    --name prom-mesos-slave-exporter
    krestjaninoff/prometheus_mesos_exporter:latest
    -master=http://$(netstat -nl | grep 5051 | awk '{print $4}')
    -timeout=60s
  tags: mesos-slave-exporter

I deliberately set the "-p" option to show the port number each exporter uses (with --net="host" Docker ignores port publishing anyway). You can test an exporter's output by requesting it over http: curl localhost:9100/metrics .
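If you would rather poke at that output from a script than by eye, here is a minimal Python sketch that parses the text format the exporters return. It assumes plain "name{labels} value" lines without trailing timestamps; the sample text and the function name parse_metrics are made up for illustration and are not part of any Prometheus library.

```python
# Hypothetical sample of what an exporter's /metrics endpoint might return.
SAMPLE = """\
# HELP node_cpu Seconds the cpus spent in each mode.
# TYPE node_cpu counter
node_cpu{cpu="cpu0",mode="idle"} 12345.67
node_cpu{cpu="cpu0",mode="user"} 234.5
node_load1 0.42
"""


def parse_metrics(text):
    """Map each series (name plus labels, as written) to its float value.

    Assumption: no timestamp follows the value, so the value is always
    the last space-separated field on the line.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        series, _, value = line.rpartition(" ")
        samples[series.strip()] = float(value)
    return samples


if __name__ == "__main__":
    for series, value in parse_metrics(SAMPLE).items():
        print(series, value)
```

To use it against a real exporter, you could feed it the body returned by the curl command above instead of SAMPLE.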

Of course, these aren't all the exporters that Prometheus has. And, of course, you can always write your own exporter :)
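To give a feeling of how little an exporter really is, here is a rough sketch of one in plain Python (standard library only). Everything in it is an assumption for illustration: the metric name demo_uptime_seconds, the port 9999 and the helper names are mine, and a real exporter would more likely be built on an official Prometheus client library.

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

START_TIME = time.time()


def render_metrics(uptime_seconds):
    """Render a single gauge in the Prometheus text exposition format."""
    return (
        "# HELP demo_uptime_seconds Seconds since this exporter started.\n"
        "# TYPE demo_uptime_seconds gauge\n"
        "demo_uptime_seconds %f\n" % uptime_seconds
    )


class MetricsHandler(BaseHTTPRequestHandler):
    """Serve the metrics text on /metrics and 404 on everything else."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(time.time() - START_TIME).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=9999):
    """Start the exporter; the port is an arbitrary choice for this sketch."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() starts the exporter, after which curl localhost:9999/metrics should return the gauge, and the endpoint can be registered in the Prometheus config like any other target.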

Server

The next step is the Prometheus server itself. But before starting it, we have to set up its configuration file, where we list all our nodes (I've limited the number of masters/nodes to make this article more readable):

/etc/prometheus/prometheus.yml

# See https://prometheus.io/docs/operating/configuration/
global:
  scrape_interval: 1m
  scrape_timeout: 10s
  evaluation_interval: 1m

rule_files:
  - '/etc/prometheus/alert.rules'

# One scrape job per exporter type, listing every host that runs it:
scrape_configs:
  - job_name: 'node'
    target_groups:
      - targets: ['10.1.2.132:9100']
        labels: {'host': 'master1'}
      - targets: ['10.1.2.184:9100']
        labels: {'host': 'node1'}
      - targets: ['10.1.3.74:9100']
        labels: {'host': 'node2'}
      - targets: ['10.1.4.62:9100']
        labels: {'host': 'node3'}

  - job_name: 'container'
    target_groups:
      - targets: ['10.1.2.132:3002']
        labels: {'host': 'master1'}
      - targets: ['10.1.2.184:3002']
        labels: {'host': 'node1'}
      - targets: ['10.1.3.74:3002']
        labels: {'host': 'node2'}
      - targets: ['10.1.4.62:3002']
        labels: {'host': 'node3'}

  # The mesos exporters listen on 9110 (9100 is the node exporter)
  - job_name: 'mesos'
    target_groups:
      - targets: ['10.1.2.132:9110']
        labels: {'host': 'master1'}
      - targets: ['10.1.2.184:9110']
        labels: {'host': 'node1'}
      - targets: ['10.1.3.74:9110']
        labels: {'host': 'node2'}
      - targets: ['10.1.4.62:9110']
        labels: {'host': 'node3'}

  - job_name: 'haproxy'
    target_groups:
      - targets: ['10.1.2.132:9101']
        labels: {'host': 'master1'}
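Since this file is mostly the same host list repeated per job, you may prefer to generate it. Below is a small Python sketch of that idea. The host names and IPs are the ones from the config above, the ports are the ones published by the exporter tasks earlier (9110 for the mesos exporters), and the function names are hypothetical. The haproxy job is left out because it only targets master1.

```python
# Host names and IPs copied from the config above.
HOSTS = {
    "master1": "10.1.2.132",
    "node1": "10.1.2.184",
    "node2": "10.1.3.74",
    "node3": "10.1.4.62",
}

# Exporter ports as published by the docker tasks earlier in the article.
JOB_PORTS = {"node": 9100, "container": 3002, "mesos": 9110}


def render_job(job_name, port, hosts):
    """Return one scrape job as YAML text in the target_groups style."""
    lines = ["  - job_name: '%s'" % job_name, "    target_groups:"]
    for host, ip in hosts.items():
        lines.append("      - targets: ['%s:%d']" % (ip, port))
        lines.append("        labels: {'host': '%s'}" % host)
    return "\n".join(lines)


def render_scrape_configs(job_ports, hosts):
    """Return the whole scrape_configs section for the given jobs."""
    jobs = [render_job(name, port, hosts) for name, port in job_ports.items()]
    return "scrape_configs:\n" + "\n".join(jobs) + "\n"


if __name__ == "__main__":
    print(render_scrape_configs(JOB_PORTS, HOSTS))
```

With Ansible in the picture, a Jinja template over the inventory would do the same job; the script is only meant to show the structure.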

Keep in mind that Prometheus also can retrieve information about exporters from consul, marathon, kubernetes and amazon ec2.

For now let's leave "/etc/prometheus/alert.rules" empty and start Prometheus server:

# https://prometheus.io/docs/introduction/install/
- name: Install Prometheus
  shell: >
    docker run -d --restart=always --log-driver=journald
    -v /etc/prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
    -v /etc/prometheus/alert.rules:/etc/prometheus/alert.rules
    -v /var/lib/prometheus:/var/lib/prometheus
    -p 9090:9090 --net="host"
    --name prom-prometheus
    prom/prometheus:0.19.2
    -config.file=/etc/prometheus/prometheus.yml
    -storage.local.retention=8760h
    -storage.local.path=/var/lib/prometheus
    -alertmanager.url=http://localhost:9093
    -web.external-url=http://youhost.com/prometheus/
    -web.enable-remote-shutdown=false
    -web.console.libraries=/etc/prometheus
    -web.console.templates=/etc/prometheus
  tags: prom-prometheus

Here we are! Now our server periodically "scrapes" all the metrics endpoints that we specified in the config file and stores the results in its own time-series database. Time to open the Prometheus web UI and take a look at what we can do!

So, now we can take a look at average cpu usage

100 - (avg by (host) (irate(node_cpu{mode="idle"}[5m])) * 100)

containers start time

container_start_time_seconds{image=~"artifactory.yourcompany.com/(x1|x2).*"}

or any of the dozens of other metrics that we get from our exporters!

On top of that, you have a special query language (PromQL) for querying the time-series database. Unfortunately, the official documentation is rather short on examples, so I would recommend you take a look at this article from DO or this one from somewhere else.
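To unpack the CPU query above: node_cpu is a per-CPU counter of seconds spent in each mode, irate turns the last two samples of each series into a per-second rate (for mode="idle" that is the fraction of time the CPU was idle), avg by (host) averages those fractions over the CPUs of a host, and the result is subtracted from 100. Here is the same arithmetic as a rough Python sketch; the sample numbers and function names are invented for illustration.

```python
def irate(samples):
    """Per-second rate computed from the last two (timestamp, value) samples."""
    (t1, v1), (t2, v2) = samples[-2:]
    return (v2 - v1) / (t2 - t1)


def cpu_usage_percent(idle_by_cpu):
    """Mimic 100 - (avg(irate(idle)) * 100) for the CPUs of one host."""
    rates = [irate(samples) for samples in idle_by_cpu.values()]
    return 100 - (sum(rates) / len(rates)) * 100


# Invented idle-counter samples for a two-CPU host, taken 60 seconds apart.
IDLE_SAMPLES = {
    "cpu0": [(0, 1000.0), (60, 1045.0)],  # idle for 45 of the 60 seconds
    "cpu1": [(0, 2000.0), (60, 2015.0)],  # idle for 15 of the 60 seconds
}

if __name__ == "__main__":
    print("cpu usage: %.1f%%" % cpu_usage_percent(IDLE_SAMPLES))
```

In this example the two CPUs are idle 75% and 25% of the time, so the host as a whole comes out at about 50% usage.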

Alerts

Of course, being able to analyze such data is very important, but we want a more convenient UI and real-time notifications! Let's start with the latter. For that, we have to set up the alerting rules:

/etc/prometheus/alert.rules

ALERT InstanceDown
  IF up == 0
  FOR 10m
  LABELS { severity = "page" }
  ANNOTATIONS {
    summary = "Instance {{$labels.host}} is down",
    description = "{{$labels.host}} of job {{$labels.job}} has been down for more than 10 minutes"
  }

ALERT InstanceHighCpu
  IF 100 - (avg by (host) (irate(node_cpu{mode="idle"}[5m])) * 100) > 5
  FOR 10m
  LABELS { severity = "page" }
  ANNOTATIONS {
    summary = "Instance {{$labels.host}}: cpu high",
    description = "{{$labels.host}} has high cpu activity"
  }

ALERT InstanceLowMemory
  IF node_memory_MemAvailable < 268435456
  FOR 10m
  LABELS { severity = "page" }
  ANNOTATIONS {
    summary = "Instance {{$labels.host}}: memory low",
    description = "{{$labels.host}} has less than 256M memory available"
  }

ALERT InstanceLowDisk
  IF node_filesystem_avail{mountpoint="/etc/hosts"} < 10737418240
  FOR 10m
  LABELS { severity = "page" }
  ANNOTATIONS {
    summary = "Instance {{$labels.host}}: low disk space",
    description = "{{$labels.host}} has less than 10G FS space"
  }

ALERT ContainerStarted
  IF time() - container_start_time_seconds{image=~"artifactory.yourcompany.com/.*"} <= 60
    and (
      container_start_time_seconds{image=~"artifactory.yourcompany.com/.*"} % (60 * 60 * 24) / (60 * 60) < 6.8
      or
      container_start_time_seconds{image=~"artifactory.yourcompany.com/.*"} % (60 * 60 * 24) / (60 * 60) > 7.5
    )
  LABELS { severity = "page" }
  ANNOTATIONS {
    summary = "Container {{$labels.image}} started",
    description = "Container {{$labels.image}} has been started on {{$labels.host}}"
  }

ALERT ContainerStopped
  IF time() - container_last_seen{image=~"artifactory.yourcompany.com/.*"} > 60 * 5
  LABELS { severity = "page" }
  ANNOTATIONS {
    summary = "Container {{$labels.image}} stopped",
    description = "Container {{$labels.image}} has been stopped on {{$labels.host}}"
  }

Notice the "ContainerStarted" rule. PromQL might be a good thing for dealing with time-series data, but when you have to solve ordinary administration problems (like turning off the container restart check for a specific period of time, e.g. deployment time), you run into trouble: the distinction between scalar values and vectors, the lack of date-specific functions, and many other surprises.
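The trick in "ContainerStarted" is plain modulo arithmetic: a Unix timestamp modulo the number of seconds in a day, divided by 3600, gives the hour of the day in UTC as a fraction, so 6.8 and 7.5 correspond to 06:48 and 07:30 UTC. Here is a small Python sketch of the same condition, just to make the logic easier to read; the function names and the example dates are mine.

```python
import calendar

SECONDS_PER_DAY = 60 * 60 * 24
SECONDS_PER_HOUR = 60 * 60


def hour_of_day_utc(timestamp):
    """Fractional hour of the day (UTC) for a Unix timestamp."""
    return timestamp % SECONDS_PER_DAY / SECONDS_PER_HOUR


def should_alert(start_timestamp, now):
    """Mirror the rule: started within the last 60 seconds AND
    outside the 06:48-07:30 UTC deployment window."""
    recently_started = now - start_timestamp <= 60
    hour = hour_of_day_utc(start_timestamp)
    outside_window = hour < 6.8 or hour > 7.5
    return recently_started and outside_window


if __name__ == "__main__":
    # Arbitrary example dates; timegm interprets the tuple as UTC.
    deploy_time = calendar.timegm((2016, 6, 1, 7, 0, 0))
    noon = calendar.timegm((2016, 6, 1, 12, 0, 0))
    print(should_alert(deploy_time, deploy_time + 30))
    print(should_alert(noon, noon + 30))
```

A container started at 07:00 UTC falls inside the window and stays silent, while one started at noon triggers the alert for the first minute after its start.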

But anyway, we've just configured our alerts! Now we can see them on a special page:

Notifications

That's good! But how do we make these messages appear in Slack? For that, we have to set up our final configuration file:

/etc/prometheus/alertmanager.conf

# https://prometheus.io/docs/alerting/configuration/#slack-receiver-<slack_config>
# https://prometheus.io/blog/2016/03/03/custom-alertmanager-templates
global:

route:
  receiver: 'slack'
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h

receivers:
  - name: 'slack'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/XXX/YYY'
        channel: 'your-monitoring'
        text: '{{ .CommonAnnotations.description }}'

We just wrote a simple route which pushes all the alerts into a pre-registered Slack incoming webhook. Here I want to draw your attention to the following facts:

we don't use "resolved" messages since they are annoying for container starts/stops;

the way we use the annotations' text is undocumented;

Prometheus authors have recently changed the configuration format, so even those few pages that you can google now contain outdated information :(
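Before blaming alertmanager for missing messages, it can help to check the webhook itself. The sketch below builds the kind of JSON request a Slack incoming webhook accepts; it is my own test helper, not what alertmanager does internally. The URL is the same placeholder as in the config above, and the function name is hypothetical.

```python
import json
import urllib.request


def build_slack_request(webhook_url, channel, text):
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    payload = {"channel": channel, "text": text}
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    request = build_slack_request(
        "https://hooks.slack.com/services/XXX/YYY",  # placeholder URL
        "your-monitoring",
        "Test message from the monitoring host",
    )
    # Nothing is sent here; pass the request to urllib.request.urlopen()
    # once the placeholder is replaced with a real webhook URL.
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

If the test message shows up in the channel, the webhook is fine and any remaining problem is on the alertmanager side.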

But let's go further! Now all we need is --love-- to install alertmanager itself:

# https://github.com/prometheus/alertmanager
- name: Install AlertManager
  shell: >
    docker run -d --restart=always --log-driver=journald
    -v /etc/prometheus:/etc/prometheus/
    -p 9093:9093
    --name prom-alertmanager
    prom/alertmanager:0.1.1
    -config.file=/etc/prometheus/alertmanager.conf
  tags: prom-alertmanager

That's it! Set up your channel and receive notifications right on your mobile phone!

Visualization

And the final step. As I said before, the Prometheus server has a UI, but it is designed for one specific purpose: data querying. It also has so-called "consoles", which let you present data in a slightly prettier way

http://youcompany.prometheus/consoles/node.html

but, as you can see, that is not an appropriate solution :)

So, if you want to have a nice dashboard with graphs and histograms, the guys from Prometheus recommend using Grafana, which is integrated with Prometheus out of the box!

# https://www.digitalocean.com/community/tutorials/how-to-add-a-prometheus-dashboard-to-grafana
# https://prometheus.io/docs/visualization/grafana/
- name: Create Grafana as persistent volume storage
  shell: >
    docker run -d --log-driver=journald
    -v /var/lib/grafana
    --name prom-grafana-storage
    busybox:latest

- name: Start Grafana
  shell: >
    docker run -d --restart=always --log-driver=journald
    --volumes-from prom-grafana-storage
    -e "GF_SERVER_ROOT_URL=http://youcompany/grafana/"
    -e "GF_AUTH_ANONYMOUS_ENABLED=true"
    -e "GF_AUTH_BASIC_ENABLED=false"
    -p 3000:3000
    --name prom-grafana
    grafana/grafana:3.0.4

After you install Grafana, you have to set up Prometheus as a datasource, and set up appropriate metrics:
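The datasource can be added by hand in the Grafana UI, but if you prefer to script it, Grafana also exposes an HTTP API for datasources. The following Python sketch only builds such a request and does not send it. Treat the details as assumptions to check against the API docs for your Grafana version: the "/api/datasources" endpoint, the payload fields, the use of an API key as a Bearer token, and both URLs in the example are not taken from the setup above.

```python
import json
import urllib.request


def build_datasource_request(grafana_url, api_key, prometheus_url):
    """Build (but do not send) a request that registers Prometheus
    as a Grafana datasource. All field values are assumptions."""
    payload = {
        "name": "prometheus",
        "type": "prometheus",
        "url": prometheus_url,
        "access": "proxy",  # Grafana's backend talks to Prometheus
        "isDefault": True,
    }
    return urllib.request.Request(
        grafana_url.rstrip("/") + "/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
    )


if __name__ == "__main__":
    request = build_datasource_request(
        "http://localhost:3000",   # hypothetical Grafana address
        "YOUR_API_KEY",            # placeholder, create one in Grafana
        "http://localhost:9090",   # hypothetical Prometheus address
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen() would actually create the datasource; the dashboards and graphs still have to be set up in the UI.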

Finally, your monitoring is ready. Enjoy it :)