Hi all, here is a very quick guide on how to configure system monitoring for one or more servers using a modern stack of technologies: Grafana, Docker, and Telegraf with InfluxDB.

The main goal of this article is to show how to start collecting system metrics from your servers quickly and easily, without spending a lot of time configuring a big and complicated monitoring system. This is especially useful if you only need to look after a few web servers rather than monitor a large company infrastructure with hundreds of devices.

We'll use Docker to deploy our monitoring system quickly. It also gives us the freedom to run any software without worrying about dependencies, and it keeps our system clean afterwards. Incidentally, you can look at Docker from this angle too: as a cross-platform package system.

We'll also use part of the TICK stack, namely InfluxDB to store our metrics and Telegraf as the agent on the remote systems. For nice and pretty graphs we'll use Grafana.

Legend:

server1 — monitoring server

server2, server3 and so on — servers from which we need metrics

We'll prepare our monitoring system on server1. For this we need a Linux host with Docker and Docker Compose installed.

First create a folder for our project, for example /opt/monitoring:

server1$ mkdir /opt/monitoring && cd /opt/monitoring

Inside this directory, create a docker-compose.yml file with the Grafana and InfluxDB services:

version: "2"
services:
  grafana:
    image: grafana/grafana
    container_name: grafana
    restart: always
    ports:
      - 3000:3000
    networks:
      - monitoring
    volumes:
      - grafana-volume:/var/lib/grafana
  influxdb:
    image: influxdb
    container_name: influxdb
    restart: always
    ports:
      - 8086:8086
    networks:
      - monitoring
    volumes:
      - influxdb-volume:/var/lib/influxdb
networks:
  monitoring:
volumes:
  grafana-volume:
    external: true
  influxdb-volume:
    external: true

As you can see, we use our own Docker network named monitoring for these services, and external volumes to store data and configuration. External volumes prevent data loss when the containers are restarted or recreated.

Using a dedicated network for a group of containers is good practice: it logically separates containers that belong to the same project, and it gives us easy built-in service discovery from the Docker engine.

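Now we need to create this Docker network and the volumes. The volumes are declared as external in docker-compose.yml, so they must exist before the first start. Note that the network is not marked as external in the compose file, so Compose will still create its own network, named monitoring_monitoring, as the startup output further down shows.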

server1$ docker network create monitoring

server1$ docker volume create grafana-volume

server1$ docker volume create influxdb-volume
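If you plan to rerun this setup (for example from a provisioning script), it can help to create each resource only when it is missing, since docker network create fails when the name is already taken. Here is a minimal sketch of that idea. The helper name ensure_docker_resource is hypothetical; it only relies on the docker network/volume inspect and create subcommands.

```shell
# Create a Docker network or volume only if it does not exist yet.
# $1 = resource kind ("network" or "volume"), $2 = resource name.
# ensure_docker_resource is a hypothetical helper name, not part of Docker.
ensure_docker_resource() {
  local kind="$1" name="$2"
  if docker "$kind" inspect "$name" >/dev/null 2>&1; then
    echo "$kind $name already exists"
  else
    docker "$kind" create "$name" >/dev/null && echo "$kind $name created"
  fi
}

# Usage on server1 (names taken from this article):
#   ensure_docker_resource network monitoring
#   ensure_docker_resource volume grafana-volume
#   ensure_docker_resource volume influxdb-volume
```

Running it twice should print "created" the first time and "already exists" the second, instead of stopping with an error.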

Make sure that everything was created correctly:

server1$ docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
8a744bc6ce04        bridge              bridge              local
a9fe3f026042        host                host                local
75c2b515def9        monitoring          bridge              local
c1a42ddaa998        none                null                local

server1$ docker volume ls
DRIVER              VOLUME NAME
local               69c5364fab3baa7b1c9418ace9c91dfcf13e54f0adce247136d887e46a347baf
local               grafana-volume
local               influxdb-volume

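As we can see, the network and volumes were created successfully. Now we need to prepare InfluxDB. The environment variables and the /init-influxdb.sh script used below are provided by the 1.x line of the official image, so if the default tag no longer offers them, use a 1.x tag such as influxdb:1.8 here and in docker-compose.yml. We run the container once with environment variables that create the database and users: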

server1$ docker run --rm \
  -e INFLUXDB_DB=telegraf -e INFLUXDB_ADMIN_ENABLED=true \
  -e INFLUXDB_ADMIN_USER=admin \
  -e INFLUXDB_ADMIN_PASSWORD=supersecretpassword \
  -e INFLUXDB_USER=telegraf -e INFLUXDB_USER_PASSWORD=secretpassword \
  -v influxdb-volume:/var/lib/influxdb \
  influxdb /init-influxdb.sh

We run this container with the --rm flag, so it only initializes the data in the volume and is removed afterwards.

All preparations are done, and we are ready to start our new monitoring system with docker-compose. Go to the /opt/monitoring directory and run:

server1$ docker-compose up -d
Creating network "monitoring_monitoring" with the default driver
Creating grafana
Creating influxdb

server1$ docker ps
CONTAINER ID   IMAGE             COMMAND                  CREATED          STATUS          PORTS                    NAMES
8128b72bdf44   grafana/grafana   "/run.sh"                23 seconds ago   Up 20 seconds   0.0.0.0:3000->3000/tcp   grafana
c00416d0d170   influxdb          "/entrypoint.sh infl…"   23 seconds ago   Up 21 seconds   0.0.0.0:8086->8086/tcp   influxdb

OK, all containers were created and started, so our monitoring system is ready to serve incoming requests. As you can see in the docker-compose file, we expose two ports: 8086, the HTTP API port for InfluxDB data, and 3000 for the Grafana web UI.
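Before opening the browser, you may want to confirm that both exposed ports really answer. Here is a small sketch using curl. It assumes InfluxDB 1.x, whose /ping endpoint replies with status 204, and Grafana's /api/health endpoint, which replies with 200 when healthy. The function names http_status and check_service are hypothetical.

```shell
# Print the HTTP status code returned by a URL (curl prints 000 if the request fails).
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1" || true
}

# Succeed only if the URL answers with the expected status code.
# $1 = label, $2 = URL, $3 = expected status code.
check_service() {
  local label="$1" url="$2" expected="$3" actual
  actual=$(http_status "$url")
  if [ "$actual" = "$expected" ]; then
    echo "$label: OK ($actual)"
  else
    echo "$label: FAILED (got $actual, expected $expected)"
    return 1
  fi
}

# Usage on server1:
#   check_service influxdb http://localhost:8086/ping 204
#   check_service grafana  http://localhost:3000/api/health 200
```

If one of the checks fails right after startup, give the containers a few seconds and try again; docker logs grafana or docker logs influxdb should show what is going on.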

We are almost done with our new monitoring system; with Docker it's really quick and easy. To finish, we only need to configure Grafana a bit: create a new data source for InfluxDB and a dashboard.

To do this, open server1's public IP on port 3000 (192.168.0.1:3000 in our example) in a browser and log in to the Grafana web UI for the very first time using:

login: admin

password: admin

Grafana will then ask you to change the password, and after that you'll be inside.

Select the Add data source menu to tell Grafana where to get the InfluxDB data.

There we need to select Type = InfluxDB, give the data source a name, and enter the URL using our influxdb container name as the address: http://influxdb:8086. As mentioned earlier, Docker gives us easy service discovery, so the container name resolves for other containers on the same network.

We also need to enter the database name and the user and password for our database. These were created when we ran the InfluxDB container earlier.

Click Save & Test to check that your data source works.
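If you prefer to script this step instead of clicking through the UI, Grafana also exposes an HTTP API for data sources (POST /api/datasources). The sketch below builds the request body from the same values we typed into the form. Treat the field layout as an assumption: it follows the InfluxDB (InfluxQL) data source format with the password under secureJsonData, and it may differ between Grafana versions. The function names are hypothetical.

```shell
# Build the JSON body for an InfluxDB data source.
# $1 = data source name, $2 = InfluxDB URL, $3 = database, $4 = user, $5 = password.
# Assumption: values contain no double quotes or backslashes (no JSON escaping is done).
datasource_payload() {
  printf '{"name":"%s","type":"influxdb","access":"proxy","url":"%s","database":"%s","user":"%s","secureJsonData":{"password":"%s"}}' \
    "$1" "$2" "$3" "$4" "$5"
}

# Send the payload to Grafana.
# $1 = Grafana base URL, $2 = Grafana admin credentials as user:password,
# remaining arguments are passed to datasource_payload.
create_datasource() {
  local grafana_url="$1" credentials="$2"
  shift 2
  curl -s -X POST -u "$credentials" -H 'Content-Type: application/json' \
    -d "$(datasource_payload "$@")" "$grafana_url/api/datasources"
}

# Usage on server1 (replace the admin password with the one you set at first login):
#   create_datasource http://localhost:3000 admin:your_new_password \
#     InfluxDB http://influxdb:8086 telegraf telegraf secretpassword
```

Whichever way you create it, the data source should show up in the Grafana UI, where Save & Test remains the quickest way to confirm the connection.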

Great, we have just added our InfluxDB as a data source for Grafana. To save time, we'll take a prepared dashboard that contains the most popular parameters. Go to grafana.com and select one you like, for example the dashboard with ID 914.

Copy the number 914 and enter it in your Grafana import menu.

That's all on the server side. Now we only need to install Telegraf on the systems we want to monitor and configure it to send data to InfluxDB on server1.

You can install Telegraf as a package, or compile the latest version and copy the binary to the remote server.

Then edit the Telegraf config to set the InfluxDB connection parameters, and enable the plugins you need.
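Instead of editing the full default config by hand, you can ask Telegraf to generate a smaller one that contains only the plugins you name. This is a sketch, assuming a Telegraf 1.x binary that supports the --input-filter and --output-filter flags together with the config command. The helper names and the plugin list (cpu, mem, disk) are only examples.

```shell
# Join plugin names with ':' as Telegraf's filter flags expect, e.g. cpu:mem:disk.
join_plugins() {
  local IFS=:
  echo "$*"
}

# Write a sample config containing only the given input plugins and the
# influxdb output plugin. $1 = output file, remaining arguments = input plugins.
generate_telegraf_config() {
  local out="$1"
  shift
  telegraf --input-filter "$(join_plugins "$@")" --output-filter influxdb config > "$out"
}

# Usage on server2 (then edit the [[outputs.influxdb]] section as described next):
#   generate_telegraf_config telegraf.conf cpu mem disk
```

The generated file still needs the InfluxDB URL, database, and credentials filled in.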

###############################################################################
#                            OUTPUT PLUGINS                                   #
###############################################################################

# Configuration for sending metrics to InfluxDB
[[outputs.influxdb]]
  ## The full HTTP or UDP URL for your InfluxDB instance.
  ##
  ## Multiple URLs can be specified for a single cluster, only ONE of the
  ## urls will be written to each interval.
  urls = ["http://server1_ip:8086"]

  ## The target database for metrics; will be created as needed.
  database = "telegraf"

  ## If true, no CREATE DATABASE queries will be sent. Set to true when using
  ## Telegraf with a user without permissions to create databases or when the
  ## database already exists.
  skip_database_creation = true

  ## Name of existing retention policy to write to. Empty string writes to
  ## the default retention policy. Only takes effect when using HTTP.
  # retention_policy = ""

  ## Write consistency (clusters only), can be: "any", "one", "quorum", "all".
  ## Only takes effect when using HTTP.
  # write_consistency = "any"

  ## Timeout for HTTP messages.
  timeout = "5s"

  ## HTTP Basic Auth
  username = "telegraf"
  password = "secretpassword"
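Before starting the agent for real, it is worth checking that the remote server can reach InfluxDB with these credentials. The sketch below writes one test point through the InfluxDB 1.x HTTP API (/write?db=...), which answers with status 204 on success. The measurement name smoke_test and the function names are hypothetical, chosen only for this example.

```shell
# Build one InfluxDB line-protocol record of the form: measurement,host=<host> <field>=<value>
# $1 = measurement, $2 = host tag, $3 = field name, $4 = numeric field value.
line_protocol() {
  printf '%s,host=%s %s=%s' "$1" "$2" "$3" "$4"
}

# Write one record and print the HTTP status code (204 means the write was accepted).
# $1 = InfluxDB base URL, $2 = database, $3 = credentials as user:password, $4 = record.
write_point() {
  curl -s -o /dev/null -w '%{http_code}' -XPOST "$1/write?db=$2" \
    -u "$3" --data-binary "$4"
}

# Usage on server2:
#   write_point http://server1_ip:8086 telegraf telegraf:secretpassword \
#     "$(line_protocol smoke_test server2 value 1)"
#
# To check the input plugins themselves without sending anything:
#   telegraf --config /etc/telegraf/telegraf.conf --test
```

A 401 status here usually points to a wrong user or password, and a 000 to a network or firewall problem between the servers.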

Well done :) We now have a pretty nice dashboard in a minimum of time.

You can install Telegraf on more than one server and send all their system metrics to the same InfluxDB database. Then you only need to choose the server name in the Grafana dashboard to view each one.

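One more important thing is how long InfluxDB keeps the data. Check the retention policy of the telegraf database, and if its duration doesn't suit you, exec into the influxdb container and change the retention policy manually. Note that in InfluxDB 1.x the default autogen policy has an infinite duration; the 7 days shown next to it is the shard group duration, not the retention period.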
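Changing the retention policy comes down to one InfluxQL statement run inside the container. Here is a sketch, assuming InfluxDB 1.x, where the default policy is named autogen and the influx CLI accepts -username, -password and -execute. The function names and the 30d value are only examples.

```shell
# Build the InfluxQL statement that changes the duration of the default policy.
# $1 = database name, $2 = duration such as 30d.
retention_query() {
  printf 'ALTER RETENTION POLICY "autogen" ON "%s" DURATION %s DEFAULT' "$1" "$2"
}

# Run the statement with the influx CLI inside the influxdb container.
# $1 = database, $2 = duration, $3 = admin user, $4 = admin password.
set_retention() {
  docker exec influxdb influx -username "$3" -password "$4" \
    -execute "$(retention_query "$1" "$2")"
}

# Usage on server1:
#   set_retention telegraf 30d admin supersecretpassword
#
# To inspect the current policies first:
#   docker exec influxdb influx -username admin -password supersecretpassword \
#     -execute 'SHOW RETENTION POLICIES ON "telegraf"'
```

Keep in mind that shortening the duration lets InfluxDB drop data older than the new limit.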

Well, I guess we now have a nice monitoring system in 5 minutes :)

Good luck.