Announcing Prometheus SQL Exporter

At JustWatch we rely on different databases to store and process our data. There are some specialized ones like Elasticsearch and Aerospike, but the most versatile one for many of our use cases is still PostgreSQL. We enjoy using many of PostgreSQL's advanced features.

In order to keep our data safe we need to monitor the status of these databases. There were some existing exporters for this task, but they didn't really fit our use case: they were either not working up to spec or too focused on the system-monitoring side. What we needed was a more general approach that would allow us to export any KPI from SQL to Prometheus.

Out of this need the SQL Exporter was created. It's a very small codebase, but it allows for very flexible configuration, and together with the right recording rules and Grafana dashboards it has proved to be very helpful.

The core concept of this exporter is that a proper SQL query can easily be mapped onto a set of labels and one or more numbers that make up a valid Prometheus metric.
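For illustration, consider a hypothetical query like the following (the column aliases are ours; `pg_stat_activity` is a standard PostgreSQL view): the text column maps to a Prometheus label and the float column to the metric value.

```sql
-- Sketch: text columns become labels, float columns become values.
SELECT state::text     AS state    -- becomes a label
     , count(*)::float AS sessions -- becomes the metric value
  FROM pg_stat_activity
 GROUP BY state;
```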

Walkthrough

In the end, our SQL Exporter makes monitoring any metric you can imagine as simple as writing a SQL query that exposes the labels of a time series as text fields and the metric values as float fields. Add a database connection and a schedule – and you're done.

To better explain how this works, let's use an example. We'll use the following config to explain the available options and how the time series are built.

```yaml
---
jobs:
- name: "per-db-stats"
  interval: '15m'
  keepalive: false
  connections:
  - 'postgres://postgres_exporter@host/db1'
  - 'postgres://postgres_exporter@host/db2'
  queries:
  - name: "pg_statio_user_tables"
    help: "IO Stats"
    labels:
    - "schemaname"
    - "relname"
    values:
    - "heap_blks_read"
    - "heap_blks_hit"
    - "idx_blks_read"
    - "idx_blks_hit"
    query: |
      SELECT schemaname::text
           , relname::text
           , heap_blks_read::float
           , heap_blks_hit::float
           , idx_blks_read::float
           , idx_blks_hit::float
        FROM pg_statio_user_tables;
```

This configuration defines one job to export the PostgreSQL I/O statistics tables to Prometheus. These tables are different in each database, so we need to connect to each database in turn and collect all of the metrics. Two properties of the config come in handy here: the ability to terminate the connection between runs and the ability to define multiple connection strings over which the exporter will iterate.

The most interesting part is, of course, the queries. Each job takes any number of queries that will be executed sequentially on all connection URLs. Each query produces a metric named after the query name with an sql_ prefix. In this example the exported metric will be named sql_pg_statio_user_tables.

Running this query on one of our stage databases yields the following output (edited for brevity):

```
 schemaname |     relname     | heap_blks_read | heap_blks_hit | idx_blks_read | idx_blks_hit
------------+-----------------+----------------+---------------+---------------+--------------
 grafanadb  | plugin_setting  |              0 |             0 |             6 |      4076514
 grafanadb  | alert           |              0 |             0 |            12 |      2038576
 grafanadb  | preferences     |              0 |             0 |             3 |      1019123
 grafanadb  | dashboard       |             57 |           312 |           115 |           65
```

As we defined the columns schemaname and relname as labels, the fully-qualified metric name will look like this:

sql_pg_statio_user_tables{schemaname="grafanadb",relname="dashboard"}

Now there are several values per table, but as each fully-qualified time series in Prometheus can only have one float value per point in time, we need to introduce one more label: col. It contains the name of the column that each value was read from, e.g.

sql_pg_statio_user_tables{schemaname="grafanadb",relname="dashboard",col="heap_blks_read"} = 57.0

As the queries can be run on multiple connections, we need this information in the labels as well:

sql_pg_statio_user_tables{schemaname="grafanadb",relname="dashboard",col="heap_blks_read",database="db1",host="host",user="postgres_exporter"} = 57.0
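With these labels in place, the usual PromQL machinery applies. As a sketch (assuming the metric from the example above, and relying on the fact that the underlying pg_statio counters only ever increase), the heap cache hit ratio per table could be derived like this:

```promql
# Sketch: heap cache hit ratio per table over the last hour,
# assuming the sql_pg_statio_user_tables metric shown above.
sum by (database, relname) (increase(sql_pg_statio_user_tables{col="heap_blks_hit"}[1h]))
/
sum by (database, relname) (increase(sql_pg_statio_user_tables{col=~"heap_blks_hit|heap_blks_read"}[1h]))
```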

Deployment

We deploy the SQL Exporter in Kubernetes and use Grafana to display the metrics.

The repository contains everything to get started, including example Alerts and Recording Rules for Prometheus.
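As an illustration of what such a rule might look like (this is a sketch with names we made up, not necessarily the rules shipped in the repository):

```yaml
# Sketch of a Prometheus recording rule; rule and metric names
# are illustrative and may differ from those in the repository.
groups:
- name: sql-exporter-example
  rules:
  - record: relation:sql_pg_statio_user_tables:heap_read_increase1h
    expr: >
      sum by (database, relname)
      (increase(sql_pg_statio_user_tables{col="heap_blks_read"}[1h]))
```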

There is no such thing as a free lunch

My database professor at university always used to say: "There is no such thing as a free lunch."

While this exporter comes with great flexibility, it requires some caution: it is very easy to overload either your database or Prometheus. The database may suffer from expensive queries running too often, and the Prometheus server can easily be overwhelmed by a huge number of time series.

Closing Remarks

We showed how one can easily export system metrics, KPIs, and basically everything that is available to SQL queries to Prometheus in a very straightforward, time-saving way. We hope you enjoy this little tool as much as we do, and if you have any suggestions, feel free to open an issue or PR on the GitHub project.

This post was brought to you by Dominik Schulz, Head of Infrastructure. If you would like to work with Go, Kubernetes and the Google Cloud Platform, we're currently hiring Data Engineers and SREs.

Stay tuned.