This post explains how to use Prometheus relabeling configuration to filter and reshape scraped metrics, so that your storage stays clean and is not filled with unnecessary data.

Use cases:

- Drop unnecessary metrics
- Drop unnecessary time-series (metrics with the specific labels)
- Drop sensitive or unwanted labels from the metrics data
- Amend label format of the final metrics

In each of these scenarios, Prometheus applies the configured rules to the scraped data before it is ingested and permanently stored.

In my case, we monitor Docker containers 🐳 with Prometheus and cAdvisor, and a big chunk of the metrics data is something we don’t need to scrape and store. There is no easy way to tell Prometheus not to scrape specific metrics, but you can get the same effect on storage with relabeling rules in the config file.

Drop unnecessary metrics

    - job_name: cadvisor
      ...
      metric_relabel_configs:
        - source_labels: [__name__]
          regex: '(container_tasks_state|container_memory_failures_total)'
          action: drop

With this rule, the metrics named container_tasks_state and container_memory_failures_total are dropped entirely and never stored in the database. __name__ is the reserved internal label that holds the metric name. Dropping metrics you never query can reduce disk space usage considerably.
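If you want to check a drop rule before deploying it, the matching can be approximated offline. Below is a minimal Python sketch, not Prometheus code. It assumes that the relabel regex is anchored on both ends, as the Prometheus documentation describes, so re.fullmatch is a close stand-in for a simple pattern like this one. The helper keep_sample and the name container_tasks_state_extra are made up for illustration.

```python
import re

# Regex copied from the rule above. Prometheus anchors relabel regexes on
# both ends, which re.fullmatch approximates for simple patterns like this.
DROP_REGEX = re.compile(r"(container_tasks_state|container_memory_failures_total)")


def keep_sample(labels: dict) -> bool:
    """Return False when the sample would be removed by the drop rule."""
    # __name__ is the internal label that carries the metric name.
    return DROP_REGEX.fullmatch(labels.get("__name__", "")) is None


samples = [
    {"__name__": "container_tasks_state", "state": "running"},
    {"__name__": "container_memory_failures_total", "type": "pgfault"},
    {"__name__": "container_memory_rss"},
    # Hypothetical name: kept, because the match must cover the whole value.
    {"__name__": "container_tasks_state_extra"},
]

kept = [s["__name__"] for s in samples if keep_sample(s)]
print(kept)  # ['container_memory_rss', 'container_tasks_state_extra']
```

Because of the anchoring, a rule only drops names that match the pattern in full, so a longer name that merely starts with one of the listed names survives.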

Drop unnecessary time-series

    - job_name: cadvisor
      ...
      metric_relabel_configs:
        - source_labels: [id]
          regex: '/system.slice/var-lib-docker-containers.*-shm.mount'
          action: drop
        - source_labels: [container_label_JenkinsId]
          regex: '.+'
          action: drop

This snippet drops every time-series whose id label matches the regex /system.slice/var-lib-docker-containers.*-shm.mount, as well as every time-series that carries a non-empty container_label_JenkinsId label. The rules are not tied to a single metric: they apply to all metrics in the job that have a matching label. This helps to keep unnecessary garbage out of the metrics data. The container_label_JenkinsId rule is especially useful when Jenkins runs its build agents in Docker containers and you don’t want their metrics mixed in with those of the other containers on the host, e.g. the Jenkins server container itself, system ones etc.
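The two rules can be tried out on sample label sets with a small Python sketch. It is an approximation under two assumptions: regexes are anchored on both ends (hence re.fullmatch), and a label that is missing from a series is treated as an empty string. The helper is_dropped and all label values below are hypothetical.

```python
import re

# Rules copied from the config above as (source label, regex) pairs.
DROP_RULES = [
    ("id", re.compile(r"/system.slice/var-lib-docker-containers.*-shm.mount")),
    ("container_label_JenkinsId", re.compile(r".+")),
]


def is_dropped(labels: dict) -> bool:
    """Return True if any drop rule matches this series' labels."""
    for source_label, regex in DROP_RULES:
        # An absent label is read as an empty string, so '.+' only
        # matches series that actually carry the label.
        if regex.fullmatch(labels.get(source_label, "")):
            return True
    return False


samples = [
    # Hypothetical shm mount unit: matches the first rule.
    {"id": "/system.slice/var-lib-docker-containers-abc123-shm.mount"},
    # Hypothetical regular container without a Jenkins label: kept.
    {"id": "/docker/abc123"},
    # Hypothetical Jenkins agent container: matches the second rule.
    {"id": "/docker/def456", "container_label_JenkinsId": "42"},
]

results = [is_dropped(s) for s in samples]
print(results)  # [True, False, True]
```

Note that the metric name plays no part in the decision, which is why these rules affect every metric in the job.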

Drop sensitive or unwanted labels from the metrics

    - job_name: cadvisor
      ...
      metric_relabel_configs:
        - regex: 'container_label_com_amazonaws_ecs_task_arn'
          action: labeldrop

This snippet removes the label named container_label_com_amazonaws_ecs_task_arn from all metrics and time-series under the job. This is useful when you don’t want Prometheus to store sensitive data for security reasons. In my case, I prefer not to store AWS resource identifiers, as I don’t even need them. Note that when you drop labels, you need to make sure the resulting metrics are still uniquely labeled: if two time-series differed only by the dropped label, they end up as duplicate time-series with different values.
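One way to spot such collisions before rolling out a labeldrop rule is to apply it to a sample of label sets and look for duplicates. The following Python sketch does that under the assumption that labeldrop matches the anchored regex against label names. The functions labeldrop and find_collisions, and all label values, are hypothetical.

```python
import re
from collections import Counter

# Regex copied from the rule above; it is matched against label names.
LABELDROP_REGEX = re.compile(r"container_label_com_amazonaws_ecs_task_arn")


def labeldrop(labels: dict) -> dict:
    """Remove every label whose name fully matches the regex."""
    return {k: v for k, v in labels.items() if not LABELDROP_REGEX.fullmatch(k)}


def find_collisions(series: list) -> list:
    """Return label sets that occur more than once after label dropping."""
    counts = Counter(frozenset(labeldrop(s).items()) for s in series)
    return [dict(key) for key, n in counts.items() if n > 1]


ARN = "container_label_com_amazonaws_ecs_task_arn"

# These two series differ only by the label that gets dropped.
colliding = [
    {"__name__": "container_memory_rss", "name": "web", ARN: "arn-A"},
    {"__name__": "container_memory_rss", "name": "web", ARN: "arn-B"},
]
# These two also differ by 'name', so they stay unique.
unique = [
    {"__name__": "container_memory_rss", "name": "web-1", ARN: "arn-A"},
    {"__name__": "container_memory_rss", "name": "web-2", ARN: "arn-B"},
]

print(find_collisions(colliding))
print(find_collisions(unique))  # []
```

An empty result for a representative sample is a good sign, though it only covers the label sets you fed in.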

Amend label format of the final metrics

    - job_name: cadvisor
      ...
      metric_relabel_configs:
        - source_labels: [image]
          regex: '.*/(.*)'
          replacement: '$1'
          target_label: id
        - source_labels: [service]
          regex: 'ecs-.*:ecs-([a-z]+-*[a-z]*).*:[0-9]+'
          replacement: '$1'
          target_label: service

There are 2 rules here, executed in order. The first one takes the value of the image label, matches it against the regex .*/(.*) and writes the part after the last slash into the id label, replacing any value id had before. The image label itself is left in place. So say container_memory_rss{image="quiq/logspout:20170306"} becomes container_memory_rss{image="quiq/logspout:20170306", id="logspout:20170306"}.

The second rule parses the container names that ecs-agent creates on AWS ECS instances by default, extracts the service name from them and writes it back into the service label. This way you can work with human-readable container service names.
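Both rules can be tried on sample values with a rough Python sketch of the replace action. It assumes an anchored regex and a replacement of '$1' (the first capture group), and it ignores corner cases such as an empty replacement. The function replace_rule is made up for illustration, and the service value is hypothetical, constructed only to fit the regex rather than taken from a real ECS instance.

```python
import re


def replace_rule(labels: dict, source: str, regex: str, target: str) -> dict:
    """Approximate a relabel 'replace' action with replacement '$1'."""
    match = re.fullmatch(regex, labels.get(source, ""))
    if match is None:
        return labels  # no match: the series is left unchanged
    updated = dict(labels)
    updated[target] = match.group(1)  # the source label is kept as well
    return updated


# First rule, using the example from the text.
rss = {"__name__": "container_memory_rss", "image": "quiq/logspout:20170306"}
rss = replace_rule(rss, "image", r".*/(.*)", "id")
print(rss["id"])     # logspout:20170306
print(rss["image"])  # quiq/logspout:20170306

# An image name without a slash does not match, so nothing changes.
plain = replace_rule({"image": "logspout:20170306"}, "image", r".*/(.*)", "id")
print("id" in plain)  # False

# Second rule, with a hypothetical 'service' value.
svc = {"service": "ecs-foo:ecs-web-api-12-web-api-abcdef:1"}
svc = replace_rule(svc, "service", r"ecs-.*:ecs-([a-z]+-*[a-z]*).*:[0-9]+", "service")
print(svc["service"])  # web-api
```

Running your own label values through a check like this is a cheap way to confirm that the capture group picks up the part you expect.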

Thanks for reading.