Greetings friends, today I bring you another one of those hidden gems that you like so much. Besides being free and deployable in a few minutes, it has potential that many commercial tools would envy.

Today we are going to create four fresh Grafana Dashboards within minutes. By the end of this post, we will have some Dashboards (in plural, friends) similar to these:

vSphere Overview Dashboard

vSphere Hosts Overview Dashboard

vSphere Datastore Overview

vSphere VM Overview

Telegraf Plugin for VMware vSphere

My friend Craig told me that an official Telegraf plugin for vSphere had been released a few days ago, so the first thing I did was go to its GitHub page and check it out:

The plugin is pure joy, not only because it speaks directly with the vCenter SDK, but also because we can monitor all the following parameters:

Cluster Stats:
  Cluster services: CPU, memory, failover
  CPU: total, usage
  Memory: consumed, total, vmmemctl
  VM operations: # changes, clone, create, deploy, destroy, power, reboot, reconfigure, register, reset, shutdown, standby, vmotion

Host Stats:
  CPU: total, usage, cost, mhz
  Datastore: iops, latency, read/write bytes, # reads/writes
  Disk: commands, latency, kernel reads/writes, # reads/writes, queues
  Memory: total, usage, active, latency, swap, shared, vmmemctl
  Network: broadcast, bytes, dropped, errors, multicast, packets, usage
  Power: energy, usage, capacity
  Res CPU: active, max, running
  Storage Adapter: commands, latency, # reads/writes
  Storage Path: commands, latency, # reads/writes
  System Resources: cpu active, cpu max, cpu running, cpu usage, mem allocated, mem consumed, mem shared, swap
  System: uptime
  Flash Module: active VMDKs

VM Stats:
  CPU: demand, usage, readiness, cost, mhz
  Datastore: latency, # reads/writes
  Disk: commands, latency, # reads/writes, provisioned, usage
  Memory: granted, usage, active, swap, vmmemctl
  Network: broadcast, bytes, dropped, multicast, packets, usage
  Power: energy, usage
  Res CPU: active, max, running
  System: operating system uptime, uptime
  Virtual Disk: seeks, # reads/writes, latency, load

Datastore Stats:
  Disk: capacity, provisioned, used



Impressive, right? If you do not yet have Telegraf, InfluxDB and Grafana, follow these steps (these for Grafana). For those of you who have already followed the whole series in Spanish, we only have to update our system to receive the vSphere plugin for Telegraf:

sudo apt-get upgrade

The telegraf package will show up with an update available, so we will say yes when it asks us to upgrade:

Reading package lists... Done
Building dependency tree
Reading state information... Done
Calculating upgrade... Done
The following packages have been kept back:
  linux-generic-lts-utopic linux-headers-generic-lts-utopic
  linux-image-generic-lts-utopic
The following packages will be upgraded:
  bind9-host curl dnsutils filebeat influxdb libbind9-90 libcurl3
  libcurl3-gnutls libdns100 libglib2.0-0 libglib2.0-data libisc95 libisccc90
  libisccfg90 liblwres90 telegraf tzdata
17 upgraded, 0 newly installed, 0 to remove and 3 not upgraded.
Need to get 50.8 MB of archives.
After this operation, 17.6 MB of additional disk space will be used.
Do you want to continue? [Y/n] y

Once we have the package installed, we only need to configure Telegraf. Let’s create a new file, /etc/telegraf/telegraf.d/vsphere-stats.conf, with the following content:

## Realtime instance
[[inputs.vsphere]]
## List of vCenter URLs to be monitored. These three lines must be uncommented
## and edited for the plugin to work.
  interval = "60s"
  vcenters = [ "https://someaddress/sdk" ]
  username = "someuser@vsphere.local"
  password = "secret"

  vm_metric_include = []
  host_metric_include = []
  cluster_metric_include = []
  datastore_metric_exclude = ["*"]

  max_query_metrics = 256
  timeout = "60s"
  insecure_skip_verify = true

## Historical instance
[[inputs.vsphere]]

  interval = "300s"

  vcenters = [ "https://someaddress/sdk" ]
  username = "someuser@vsphere.local"
  password = "secret"

  datastore_metric_include = [ "disk.capacity.latest", "disk.used.latest", "disk.provisioned.latest" ]
  insecure_skip_verify = true
  force_discover_on_init = true
  host_metric_exclude = ["*"] # Exclude realtime metrics
  vm_metric_exclude = ["*"] # Exclude realtime metrics
  max_query_metrics = 256
  collect_concurrency = 3

Of course, we will also have to un-comment all the parameters of the plugin.

Once done, if we are not using a certificate signed by a valid CA, or if the CA is not installed on the server running Grafana, InfluxDB and Telegraf, please uncomment this as well:

insecure_skip_verify = true

Another option is to download the SSL certificate from our vCenter to our Telegraf server, so that it can be trusted:

openssl s_client -servername YOURVCENTER -connect YOURVCENTER:443 </dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' >/etc/ssl/certs/vcsa.pem
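One thing to keep in mind with the openssl command above: if the connection to the vCenter fails, the redirect still creates the file, only empty. As a small sanity check, here is a sketch that refuses an empty file and otherwise prints who the certificate belongs to and when it expires. The function name show_cert is my own, and the default path is simply the one used above:

```shell
#!/bin/sh
# Sketch: inspect the certificate saved from the vCenter.
# show_cert is a hypothetical helper name; the default path matches
# the file written by the openssl s_client command in this post.
show_cert() {
    cert="${1:-/etc/ssl/certs/vcsa.pem}"

    # A failed download leaves an empty (or missing) file behind.
    if [ ! -s "$cert" ]; then
        echo "no certificate data in $cert" >&2
        return 1
    fi

    # Print subject, issuer and expiry date without dumping the PEM itself.
    openssl x509 -in "$cert" -noout -subject -issuer -enddate
}
```

Calling show_cert with no arguments checks /etc/ssl/certs/vcsa.pem; pass another path as the first argument if you saved the file elsewhere.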

Let’s finally restart the telegraf service:

service telegraf restart
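If nothing seems to arrive after the restart, a one-shot run of Telegraf limited to the vsphere input can help to tell a configuration problem from a database problem, because it prints what it gathers to the screen instead of writing to InfluxDB. The sketch below assumes the default package paths for the configuration, and the function name test_vsphere_input is only for illustration. Depending on your version, the realtime instance may print little or nothing on a single run if the object discovery has not finished yet, so treat an empty output as a hint and not as proof of a problem:

```shell
#!/bin/sh
# Sketch: run only the vsphere input once and print the metrics to stdout.
# test_vsphere_input is a hypothetical helper name. The default paths are
# the usual Debian/Ubuntu package locations (an assumption); pass your own
# as arguments if they differ.
test_vsphere_input() {
    conf="${1:-/etc/telegraf/telegraf.conf}"
    confdir="${2:-/etc/telegraf/telegraf.d}"

    # Fail early with a clear message when telegraf is not installed.
    if ! command -v telegraf >/dev/null 2>&1; then
        echo "telegraf not found in PATH" >&2
        return 127
    fi

    # --test gathers once and prints; --input-filter keeps other inputs quiet.
    telegraf --config "$conf" --config-directory "$confdir" \
        --input-filter vsphere --test
}
```

Run test_vsphere_input as a user that can read the configuration files, since they contain the vCenter password.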

Verifying that we are ingesting information with Chronograf

At this point, if we have followed all the steps correctly, Telegraf should already be sending the information it collects to InfluxDB. If we run a search using the wonderful Chronograf, we will be able to verify that we have information:

All the measurements of this new vSphere plugin for Telegraf are stored under vsphere_*, so it’s really easy to find them.
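For those who prefer the terminal to Chronograf, the same check can be sketched against the InfluxDB HTTP API. I am assuming here an InfluxDB 1.x listening on localhost:8086 and a database called telegraf, which is the Telegraf default; both function names are my own, so adjust everything to your setup:

```shell
#!/bin/sh
# Sketch: list the measurements written by the vSphere plugin.
# Assumptions: InfluxDB 1.x on http://localhost:8086 and a database named
# "telegraf". vsphere_query and list_vsphere_measurements are hypothetical
# helper names.

# Build the InfluxQL statement for a given measurement prefix.
vsphere_query() {
    prefix="${1:-vsphere_}"
    printf 'SHOW MEASUREMENTS WITH MEASUREMENT =~ /^%s/' "$prefix"
}

# Send the statement to the /query endpoint; the answer comes back as JSON.
list_vsphere_measurements() {
    db="${1:-telegraf}"
    url="${2:-http://localhost:8086}"
    curl -sS -G "$url/query" \
        --data-urlencode "db=$db" \
        --data-urlencode "q=$(vsphere_query)"
}
```

Calling list_vsphere_measurements without arguments uses the defaults above. If the JSON that comes back lists names starting with vsphere_, the plugin is writing data.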

Grafana Dashboards

This is where I have worked really hard, since I created the Dashboards from scratch: selecting the best queries to the database, polishing the colors, and thinking about which graph to use and how to show it. In addition, everything is automated so that it fits your environment without any problem and without you having to edit anything manually. You can find the Dashboards here. Once you have imported all four, you can move between them with the top menu on the right. Now it’s time to download them, or at least note their IDs:

How to easily import the Grafana Dashboards

So that you don’t have to waste hours configuring a new Dashboard and writing and debugging queries, I’ve already created four wonderful Dashboards with everything you need to monitor your environment in a very simple way. They will look like the images I showed you above.

From our Grafana, we will select Create – Import

Select the name you want and enter, one by one, the IDs 8159, 8162, 8165 and 8168, which are the unique IDs of the Dashboards, or the URLs:

https://grafana.com/dashboards/8159

https://grafana.com/dashboards/8162

https://grafana.com/dashboards/8165

https://grafana.com/dashboards/8168

With the menu at the top right, you can switch between the Dashboards of Hosts, Datastores, VMs and, of course, the main Overview one. One of the improvements that these Dashboards include is the variable selection at the top left: depending on what you select, you will be able to see only the Cluster, ESXi, or VM you are interested in. Please leave your feedback in the comments.

That’s all folks. If you want to follow the full Blog series about Grafana, InfluxDB and Telegraf, please click on the next links:

Note: If facing the error “Task Name: Remote View Manager, Status: The request refers to an unexpected or unknown type” please read the next Blog entry.
