Before we can start pushing data into InfluxDB though, we need an instance of it, don’t we? 😋 To keep things simple, we’ll just run one via docker:

docker run -p 8086:8086 --env=INFLUXDB_DB=mydb influxdb:alpine

The above command starts an ephemeral InfluxDB instance with its API port exposed to localhost and also pre-creates an empty database, mydb. InfluxDB supports segregating multiple datasets from each other, optionally requiring user authentication on top, but we won't be using that here.
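If you'd like a quick sanity check that the instance is indeed up, InfluxDB exposes a tiny health endpoint that replies with 204 No Content:

curl -i http://localhost:8086/ping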

Exporting metrics into InfluxDB from Go

With our InfluxDB instance up and running, let's get back to our demo server's metrics and, instead of printing them periodically to the console, push them into the time series database.

First things first, we need to be able to talk to the database from Go. Luckily, InfluxDB is itself written in Go, so naturally there's an official client library for it too, which we can simply import: github.com/influxdata/influxdb/client/v2.

Exporting the metrics boils down to three steps: creating an HTTP client for the database server; transforming our custom Go metrics into InfluxDB time series points; and finally pushing them.
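To make that concrete, here's a minimal sketch of the flow. The connections measurement and its count field mirror our demo metrics, but the value pushed is a stand-in, and most error handling is trimmed for brevity:

package main

import (
	"log"
	"time"

	client "github.com/influxdata/influxdb/client/v2"
)

func main() {
	// Create an HTTP client pointing at our dockerized InfluxDB instance
	db, err := client.NewHTTPClient(client.HTTPConfig{Addr: "http://localhost:8086"})
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Gather a batch of data points destined for the mydb database
	batch, _ := client.NewBatchPoints(client.BatchPointsConfig{Database: "mydb"})

	// Transform one custom metric into an InfluxDB time series point
	point, _ := client.NewPoint("connections",
		nil, // no tags for this demo
		map[string]interface{}{"count": 42}, // stand-in for a real measurement
		time.Now(),
	)
	batch.AddPoint(point)

	// Push the whole batch over to the database
	if err := db.Write(batch); err != nil {
		log.Fatal(err)
	}
}

In the real demo the batching and writing runs in a loop on a timer, replacing the previous periodic console printing.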

Since we replaced our console logger with the above reporting, executing the demo (full code) will just sit there silently. If, however, we look at the InfluxDB logs, we'll find a steady stream of write operations:

[httpd] 172.17.0.1 - - [19/Nov/2018:16:39:03 +0000] "POST /write?consistency=&db=mydb&precision=ns&rp= HTTP/1.1" 204 0 "-" "InfluxDBClient" 9fece4fb-ec19-11e8-8009-0242ac110002 9768
[httpd] 172.17.0.1 - - [19/Nov/2018:16:39:08 +0000] "POST /write?consistency=&db=mydb&precision=ns&rp= HTTP/1.1" 204 0 "-" "InfluxDBClient" a2e97c8a-ec19-11e8-800a-0242ac110002 4109
[httpd] 172.17.0.1 - - [19/Nov/2018:16:39:13 +0000] "POST /write?consistency=&db=mydb&precision=ns&rp= HTTP/1.1" 204 0 "-" "InfluxDBClient" a5e52ec1-ec19-11e8-800b-0242ac110002 4228
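To convince ourselves the points actually landed, we can also query the database directly via the influx CLI shipped inside the container (the container name below is a placeholder; look yours up with docker ps):

docker exec -it <influxdb-container> influx -database mydb -execute 'SELECT * FROM connections LIMIT 3'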

Visualizing metrics

Ok, we've managed to export the metrics we supposedly needed into this thing called a time series database, but that didn't get us any closer to analyzing them. If anything, we're seemingly further away than with the console logs, which we could at least take a peek at.

The final step is to create meaningful visualizations of these time data points. Although I’m sure there are many tools that could be used, one of the current leaders is an open source project called Grafana, also written in Go (sources)!

As with InfluxDB, we don't want to install any messy dependencies on our local machine, so we're going to launch a Grafana instance via docker too:

docker run -p 3000:3000 grafana/grafana

You can access your instance via http://localhost:3000 and sign in with the default credentials admin / admin.

After logging in, you'll be greeted with a dashboard telling you that the next thing to do is to add a new data source. Most options are fairly self-explanatory; perhaps the only curious one is the "Browser" access mode. It tells Grafana to query InfluxDB directly from your browser, saving us the step of linking the two docker containers.
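For our setup, the relevant fields boil down to the following (the exact labels may differ slightly across Grafana versions):

Type: InfluxDB
URL: http://localhost:8086
Access: Browser
Database: mydb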

You'll see a green "Data source is working" notification if everything is set up correctly. The last step is to create a new dashboard (the plus button in the side menu), add a graph visualization and fill it with data. You can find the Edit button by clicking the "Panel title" header bar.

Before editing the actual fields, let’s recap the metrics we have exported from our file server simulator:

We have two measurement streams exported: connections and bandwidth, the former containing the count field whilst the latter the egress. To create our first visualization, select connections for the "select measurement"; pick count for "value" inside "field(value)"; and remove "time($__interval)" from the query rule. You should end up with something like this:
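As an aside, the query builder is just assembling InfluxQL under the hood; the rule above corresponds to roughly the following statement, with $timeFilter being Grafana's placeholder for the currently selected time range:

SELECT "count" FROM "connections" WHERE $timeFilter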

You can close the chart editor with the small X in the top right corner (it might be middle-right on your screen if the chart preview is also shown). Since you most probably have only a few data points in the time series database yet, adjust the time range from "Last 6 hours" to "Last 5 minutes" on the top right.

Repeat the same for bandwidth.egress and voilà, you have two beautiful charts showing how the connection count and egress bandwidth evolve within your app, along with historical data retention and infinite analysis capabilities!

Digging further into Grafana is out of scope, but I wholeheartedly recommend exploring all its capabilities, as you’ll find it an exceedingly capable tool. Next up however, anonymity!

Anonymous metrics

If you are running your own private infrastructure and want to collect metrics to track what your machines are doing, you are pretty much done. Go, have a blast with your newfound knowledge!

If, however, you are a software vendor wanting to collect telemetry from devices you don't necessarily own, you'll quickly run into resistance: people will freak out (when Caddy introduced metrics, Hacker News blew up)!

Honestly, is anonymity a legitimate concern? Yes, yes it absolutely is! When the select few internet giants are data mining your every movement to feed you ads and use all your personal data for perfecting their own services, it’s natural to have a huge backlash against sweeping up metadata.

If you’re Microsoft, Facebook, Google or Apple, then you have a “get out of jail free” card because you’ve already locked the entire world into your ecosystem. If you’re a small fish however, you have to improvise… people will much more readily accept telemetry collection if you can prove it’s not possible to identify them (and as long as you don’t go overboard and collect too many details).

In the case of the Caddy web-server, do I care that it uploads how many requests it served yesterday? Nope! Do I care that the telemetry server knows *I* served that many requests yesterday? Hell yeah! Whilst there are understandable reasons for collecting metrics, there is no technical reason whatsoever for collecting personal metadata along the way.

We kill people based on metadata. ~General Michael Hayden (NSA)

But how can we break the link between the collection and the identification of the measurements? The answer is to tumble the data stream through the Tor network… and we’re going to do that from Go!

Wait, we can use Tor from within Go? Yep, I have an article on it, go read it!
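While the details are for that article, here's a minimal sketch of the idea: assuming a local Tor client is listening on its default SOCKS port, we can tunnel plain InfluxDB line protocol writes through it with nothing but a custom dialer. The onion address below is a made-up placeholder, and golang.org/x/net/proxy is just one way to wire this up, not necessarily the final solution:

package main

import (
	"fmt"
	"log"
	"net/http"
	"strings"
	"time"

	"golang.org/x/net/proxy"
)

func main() {
	// Assumption: a local Tor client is listening on its default SOCKS port
	dialer, err := proxy.SOCKS5("tcp", "127.0.0.1:9050", nil, proxy.Direct)
	if err != nil {
		log.Fatal(err)
	}
	// HTTP client that tunnels every connection through the Tor network
	web := &http.Client{
		Transport: &http.Transport{Dial: dialer.Dial},
		Timeout:   time.Minute,
	}
	// Push a single measurement via InfluxDB's plain line protocol
	// (the onion address is a made-up placeholder)
	line := fmt.Sprintf("connections count=42i %d", time.Now().UnixNano())
	res, err := web.Post("http://metricscollector.onion:8086/write?db=mydb", "text/plain", strings.NewReader(line))
	if err != nil {
		log.Fatal(err)
	}
	res.Body.Close()
}

Since the SOCKS5 dialer hands the hostname over to Tor for resolution, even .onion addresses just work, and the collection server never sees our real IP address.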