Overview

In this post, I will go over using the Collectd input on Logstash to gather hardware metrics. For the Logstash output, I am going to use Elasticsearch. Logstash lets us centralize metrics from multiple computers into Elasticsearch. On top of Elasticsearch, I am going to use Kibana to display these metrics.

To this Kibana dashboard, we could add additional metrics for the processes taxing the system being monitored. This would effectively show a cause and effect story in one integrated dashboard.



Gist e94ad12dfe84426971bd

Pre-requisites

I am using one VM running Ubuntu 14.04 for this post. You may have to change these steps, as needed, to match your environment. Also, make sure you have updated Java before starting.

First, we have our Linux daemon gathering metrics.

Collectd – Used to collect metrics from our system(s). We could be using Collectd in multiple systems to collect each system’s metrics.

Then we have the rest of the tools, as follows:

Logstash – Used to transport and aggregate our metrics from each system into our destination. This destination could be a file, a database or something else. Logstash works on a system of plugins for input, filtering and output. In this case, our input is Collectd and our output is Elasticsearch.

Elasticsearch – Used to store and search our collected metrics. This is our Logstash output.

Kibana – Used to display our metrics stored in Elasticsearch.

The combination of the last three tools is commonly called the ELK stack.

Installing Everything

Collectd

The easiest way to install Collectd is to use apt-get. Install both collectd and collectd-utils.

sudo apt-get update
sudo apt-get install collectd collectd-utils

ELK Stack

Before we can install Logstash, we need to add the Logstash apt repo to our system. First, import the repository signing key:

wget -O - http://packages.elasticsearch.org/GPG-KEY-elasticsearch | sudo apt-key add -

Then add the following repo line to /etc/apt/sources.list:

deb http://packages.elasticsearch.org/logstash/1.4/debian stable main

Then, as with Collectd, install Logstash:

sudo apt-get update
sudo apt-get install logstash

Elasticsearch has no apt repo, so downloading and installing it looked like this for me:

wget http://packages.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-1.1.1.deb
sudo dpkg -i elasticsearch-1.1.1.deb

Kibana comes bundled with Logstash 1.4, so there is no separate installation needed for it.

Configuring Everything

Collectd

For Collectd, we have to create a configuration file.

# For each instance where collectd is running, we define a
# hostname proper to that instance. When metrics from
# multiple instances are aggregated, hostname will tell
# us where they came from.
Hostname "ilos-stats"

# Fully qualified domain name, false for our little lab
FQDNLookup false

# Plugins we are going to use with their configurations,
# if needed
LoadPlugin cpu

LoadPlugin df
<Plugin df>
    Device "/dev/sda1"
    MountPoint "/"
    FSType "ext4"
    ReportReserved "true"
</Plugin>

LoadPlugin interface
<Plugin interface>
    Interface "eth0"
    IgnoreSelected false
</Plugin>

LoadPlugin network
<Plugin network>
    Server "192.168.1.43" "25826"
</Plugin>

LoadPlugin memory

LoadPlugin syslog
<Plugin syslog>
    LogLevel info
</Plugin>

LoadPlugin swap

<Include "/etc/collectd/collectd.conf.d">
    Filter "*.conf"
</Include>

Each plugin will gather different pieces of information. For an extensive list of plugins and their details, go to the Collectd Plugins page.

The configuration above offers a solid set of metrics to begin our monitoring task. It provides details on CPUs, hard drive, network interface, memory and swap space. This is only a small fraction, maybe about 5%, of what Collectd can gather!

Logstash

For the purposes of this post, Logstash’s sole task is to pick up metrics from Collectd and deliver them to Elasticsearch. For this, we are going to define one input and one output.

input {
  udp {
    port => 25826         # 25826 matches port specified in collectd.conf
    buffer_size => 1452   # 1452 is the default buffer size for Collectd
    codec => collectd { } # specific Collectd codec to invoke
    type => collectd
  }
}
output {
  elasticsearch {
    cluster => logstash   # this matches our elasticsearch cluster.name
    protocol => http
  }
}

Elasticsearch

The only edits we are making to the Elasticsearch configuration are to revise both our cluster name and node name.

cluster.name: logstash
node.name: ilos

Kibana

Just like at installation time, there is no configuration necessary for Kibana, yay!

Testing The Setup

After starting all these services (Collectd, Logstash, Elasticsearch and Kibana), let's validate our setup.

Let's make sure everything is up and running. Go to the Kibana URL (port 9292) and load the default Logstash dashboard.
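If the dashboard stays empty, it can help to ask Elasticsearch directly whether any Collectd events were indexed. The following is a minimal sketch, not part of the original setup. It assumes Elasticsearch answers on localhost:9200 (its default HTTP port), that Logstash writes to its default daily indices named logstash-YYYY.MM.dd, and that the response carries the hit count as a plain number, as Elasticsearch 1.x does. The function names are hypothetical.

```python
import json
from datetime import date
from urllib.request import Request, urlopen


def logstash_index(day):
    """Return the default daily Logstash index name for a given date."""
    return "logstash-" + day.strftime("%Y.%m.%d")


def collectd_count_request(day, host="localhost", port=9200):
    """Build the URL and JSON body of a search that counts collectd events."""
    url = "http://%s:%d/%s/_search" % (host, port, logstash_index(day))
    # size 0: we only want the total number of hits, not the events themselves.
    body = {"size": 0, "query": {"query_string": {"query": "type:collectd"}}}
    return url, json.dumps(body).encode("utf-8")


def count_collectd_events(day):
    """Run the search. This needs a reachable Elasticsearch node."""
    url, body = collectd_count_request(day)
    request = Request(url, data=body, headers={"Content-Type": "application/json"})
    with urlopen(request) as response:
        # Assumption: hits.total is a plain integer (Elasticsearch 1.x).
        return json.load(response)["hits"]["total"]


# Example call, to be run on the machine hosting Elasticsearch:
# print(count_collectd_events(date.today()))
```

A count of zero usually points at the Collectd network plugin or the Logstash udp input rather than at Kibana. Note that Logstash names its indices by UTC date, so around midnight the local date may not match.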

I’ve created a gist of this dashboard to share. If the default dashboard loaded successfully, you can either follow this post to create your own dashboard or just load the gist from Kibana itself (try Load > Advanced…) and edit at will; enjoy.

If some of the displays do not register any metrics, it is most likely because the attribute names on your system differ from mine. Just edit the queries as needed.

Gist e94ad12dfe84426971bd

Reference Dashboard

To get our feet wet, let's create the simplest of dashboards. We are going to create one visualization showing the list of events coming from Collectd, which will serve as our reference for creating all the displays we want.

From Kibana’s home, select the last link from the right pane, Blank dashboard. Select a Time filter from the dropdown so that we bring some events in. Note how now we have a filter under filtering. Click the upper right gear, name this dashboard Collectd. Click Index tab, select day for timestamping and check Preload fields. Save. Click Add a row, name this row Events, click Create row and Save. There is now a column with three buttons on the left edge of the browser, click the green one to Add a panel. Select Table for panel type, title this All Events and pick a span of 12. Save. From the upper right menu, Save your dashboard.

This panel is showing the events as they come into Elasticsearch from Collectd. Way to go Logstash! From these events, it extracts fields for reporting on.

This is a good place to start thinking of attributes and metrics. Each of the fields shown is an attribute we can report metrics on. In this post’s case, these fields will vary depending on the plugins defined in our Collectd configuration. You can think about objects if you want as well. Either way, depending on the plugin, we will have a different set of attributes to report on. In turn, if we had a different Logstash input, we would end up with a completely different set of attributes.

Each record includes the following bits of information over time.

host – this matches the hostname defined in collectd.conf. Handy attribute for aggregating from multiple event sources.

plugin – Matches one of the plugins defined in collectd.conf

plugin_instance – Means of grouping a measurement from multiple instances of a plugin. For example, with the cpu plugin and a type_instance of system on a dual-cpu machine, we would have plugin_instance 0 and 1.

collectd_type – Mostly follows plugin.

type_instance – These are the available metrics per plugin.

value – This is the actual measurement for said type_instance for each plugin_instance for each plugin…

type – Collectd for this exercise.
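To make the field list above concrete, here is a small Python sketch of what one such record might look like and how its attributes combine to identify a single series. The sample values are made up for illustration, and series_key is a hypothetical helper, not something Logstash or Kibana provides.

```python
# Hypothetical sample event, shaped like the fields listed above.
sample_event = {
    "host": "ilos-stats",
    "plugin": "cpu",
    "plugin_instance": "0",
    "collectd_type": "cpu",
    "type_instance": "system",
    "value": 1234,  # made-up measurement
    "type": "collectd",
}


def series_key(event):
    """Identify the series an event belongs to: host, plugin, instance, metric."""
    parts = [event["host"], event["plugin"]]
    # Not every plugin reports a plugin_instance (memory in this post, for example).
    if event.get("plugin_instance"):
        parts.append(event["plugin_instance"])
    if event.get("type_instance"):
        parts.append(event["type_instance"])
    return "/".join(parts)


print(series_key(sample_event))
```

Each Kibana query we write later is, in effect, a filter on one or more of the parts of this key.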

Collectd Dashboard

For each plugin loaded, let's list its attributes of interest: types, instances and additional attributes. These will be used to write the Kibana queries which we will use later on to filter the Collectd data for each display we create.

cpu

plugin: cpu

type_instance: wait, system, softirq, user, interrupt, steal, idle, nice

plugin_instance: 0, 1, 2, 3

Kibana queries

plugin: "cpu" AND plugin_instance: "0"
plugin: "cpu" AND plugin_instance: "1"
plugin: "cpu" AND plugin_instance: "2"
plugin: "cpu" AND plugin_instance: "3"

Each of these should have an alias, such as: cpu1, cpu2, cpu3, cpu4.

In the Kibana dashboard we just created, go ahead and add each of these as individual queries on top.
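On a machine with more cores, typing these queries by hand gets tedious. As a rough sketch, the alias and query pairs above could be generated with a few lines of Python and then pasted into Kibana. cpu_queries is a hypothetical helper and assumes plugin_instance values start at 0, as they do on my system.

```python
def cpu_queries(cpu_count):
    """Return (alias, query) pairs, one per cpu plugin_instance."""
    pairs = []
    for instance in range(cpu_count):
        alias = "cpu%d" % (instance + 1)  # aliases start at cpu1
        query = 'plugin: "cpu" AND plugin_instance: "%d"' % instance
        pairs.append((alias, query))
    return pairs


for alias, query in cpu_queries(4):
    print(alias, "->", query)
```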

Now we can set up a display as complicated or as simple as we like. Let's try a few.

As before, let's start by creating a new row. We will dedicate this one to CPUs (Add row: CPU). To this row, let's add a Terms panel. A panel is the same as a display type, and each row is divided into up to 12 sections that its panels can span. Name this cpu1. With 4 cpus for me and 12 available sections per row, I select a span of 3 to fit all cpus on one row. In Parameters, select terms_stats as the Terms mode. For Stats type, select max. For Field, type type_instance. For Value field, type value. For Order, select max. For Style, select pie. For Queries, choose selected and click the query labeled cpu1 we recently created. Save.

Your work should look like this

Repeat these steps for each additional Kibana query above. Be sure to save your dashboard after this step. We’ve come far!

You should have something like this to show.
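For anyone wondering what the terms_stats settings used in the CPU panels compute, the Python sketch below mimics the idea on a handful of made-up events: group by the Field (type_instance), take the max of the Value field (value) per group, and order by that max. It is an illustration of the concept only, not how Kibana or Elasticsearch implement it.

```python
def terms_stats_max(events, term_field="type_instance", value_field="value"):
    """Group events by term_field and keep the largest value_field per group,
    ordered from largest to smallest (like Order: max)."""
    maxima = {}
    for event in events:
        term = event[term_field]
        value = event[value_field]
        if term not in maxima or value > maxima[term]:
            maxima[term] = value
    return sorted(maxima.items(), key=lambda item: item[1], reverse=True)


# Made-up events for a single cpu, already filtered by the cpu1 query.
cpu1_events = [
    {"type_instance": "idle", "value": 900},
    {"type_instance": "idle", "value": 950},
    {"type_instance": "system", "value": 40},
    {"type_instance": "system", "value": 35},
    {"type_instance": "user", "value": 60},
]

print(terms_stats_max(cpu1_events))
```

Each (term, max) pair becomes one slice of the pie.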

df

plugin: df

type_instance: reserved, used, free

plugin_instance: root

Kibana queries

plugin: "df" AND plugin_instance: "root"

Following the same pattern we used for the cpus, we end up with something that looks like this.

interface

plugin: interface

plugin_instance: eth0

additional attributes: rx, tx

Kibana queries

plugin: "interface" AND plugin_instance: "eth0"

Here we have only one query, just like for the df plugin, but we have two distinct attributes to report on: rx for received and tx for transmitted. We use the same query for two panels and specify either rx or tx in the Value field.

The creation of the rx visual, for example, looks something like this.

Adding another identical one but for the value field, now of tx, will result in a display combo similar to this.
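The difference from the cpu and df panels is that interface events carry their measurements in rx and tx rather than in value. A small Python sketch of that idea follows, using made-up events and assuming the same max statistic as the CPU panels. max_of_field is a hypothetical helper.

```python
# Hypothetical interface events; the numbers are made up for illustration.
interface_events = [
    {"plugin": "interface", "plugin_instance": "eth0", "rx": 1200, "tx": 300},
    {"plugin": "interface", "plugin_instance": "eth0", "rx": 1500, "tx": 450},
    {"plugin": "interface", "plugin_instance": "lo", "rx": 10, "tx": 10},
]


def max_of_field(events, plugin_instance, value_field):
    """Largest value_field among events of the given interface, or None."""
    values = [
        event[value_field]
        for event in events
        if event.get("plugin_instance") == plugin_instance and value_field in event
    ]
    return max(values) if values else None


# Same filter (eth0) used twice, once per value field.
print(max_of_field(interface_events, "eth0", "rx"))
print(max_of_field(interface_events, "eth0", "tx"))
```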

memory

plugin: memory

type_instance: free, buffered, cached, used

Let's try something different for memory. Create the following queries:

Kibana queries

plugin: "memory" AND type_instance: "free"
plugin: "memory" AND type_instance: "buffered"
plugin: "memory" AND type_instance: "cached"
plugin: "memory" AND type_instance: "used"

Alias appropriately and select colors of choice. This display is a bit different than the rest, if only because we combine all the type_instance into one histogram. It took me longer to figure this one out.

Use this one as a reference and change as needed.

You should end up with something like this.
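Conceptually, a histogram with one query per type_instance buckets events into time intervals and plots one line per series. The Python sketch below shows that bucketing on made-up data. It assumes a mean per bucket (you may prefer another mode in your panel) and, to keep things short, uses a hypothetical timestamp field in epoch seconds rather than the date string Logstash actually stores.

```python
from collections import defaultdict


def histogram_mean(events, interval_seconds=60):
    """Return {type_instance: {bucket_start: mean value in that bucket}}."""
    # For every series and bucket, keep a running [sum, count].
    sums = defaultdict(lambda: defaultdict(lambda: [0.0, 0]))
    for event in events:
        bucket = event["timestamp"] - event["timestamp"] % interval_seconds
        cell = sums[event["type_instance"]][bucket]
        cell[0] += event["value"]
        cell[1] += 1
    return {
        series: {bucket: total / count for bucket, (total, count) in buckets.items()}
        for series, buckets in sums.items()
    }


# Made-up memory events; "timestamp" is a simplification (epoch seconds).
memory_events = [
    {"type_instance": "free", "timestamp": 0, "value": 100},
    {"type_instance": "free", "timestamp": 30, "value": 300},
    {"type_instance": "free", "timestamp": 60, "value": 500},
    {"type_instance": "used", "timestamp": 10, "value": 50},
]

print(histogram_mean(memory_events))
```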

swap

plugin: swap

type_instance: cached, in, free, out, used

Kibana queries

plugin: "swap" AND type_instance: "free"
plugin: "swap" AND type_instance: "in"
plugin: "swap" AND type_instance: "out"
plugin: "swap" AND type_instance: "cached"
plugin: "swap" AND type_instance: "used"

Similar to our memory display, our swap display ought to look a lot like this.

That was quite the journey. Hopefully, you’ve ended up with a dashboard similar to the one shown at the beginning. Most likely, it would only take you a few minutes to make it better; go for it.

Using this dashboard as a base, you could aggregate these same metrics from multiple computers into a single dashboard. This beats having to log into each computer in use to figure out what is being taxed and what is not. Furthermore, enabling additional Collectd plugins would provide information on the cause of these loads. For example, there are plugins for databases, for monitoring the JVM and many others.

Maybe that will be the purpose of my next post, enjoy.