Prometheus (https://prometheus.io) is an open source time series database that focuses on capturing measurements and exposing them via an API. I love Prometheus because it is so simple; its minimalism is its greatest feature. It achieves this by pulling metrics from instrumented applications, rather than having them pushed like many of its competitors. In other words, Prometheus “scrapes” the metrics from the application.

This means that it works very well in a distributed, cloud-native environment. None of the services are burdened by load on the monitoring system. This has knock-on effects: HA is supported through simple duplication, and scaling is supported through segmentation.

Are your team Prometheus experts? Winder Research offers Prometheus monitoring training to help teams run monitoring in production. Check out the Prometheus training course then get in touch to organise dates.

Architecture

Prometheus architecture

Prometheus scrapes metrics from instrumented applications, either directly or via an intermediary push gateway. Think of the push gateway as a form of buffer: it accepts and stores pushed metrics, and exposes a scrapable API for Prometheus.
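To make the pull model concrete, here is a minimal sketch of what an instrumented application exposes, using only the Python standard library. The sample values, the `render_metrics` helper and the `MetricsHandler` class are hypothetical names for illustration; in practice you would normally use an official client library rather than rendering the text format by hand.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical in-process samples, keyed by (metric name, label pairs).
SAMPLES = {
    ("http_requests_total", (("status", "200"),)): 12.0,
    ("http_requests_total", (("status", "404"),)): 3.0,
}


def render_metrics(samples):
    """Render samples as lines of the Prometheus text exposition format."""
    lines = []
    for (name, labels), value in sorted(samples.items()):
        if labels:
            label_text = ",".join(f'{key}="{val}"' for key, val in labels)
            lines.append(f"{name}{{{label_text}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Serves the current samples on /metrics, ready to be scraped."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(SAMPLES).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


# To try it locally (port 8000 is an arbitrary choice):
#   from http.server import HTTPServer
#   HTTPServer(("", 8000), MetricsHandler).serve_forever()
print(render_metrics(SAMPLES), end="")
```

The application never contacts the monitoring system; it only answers when Prometheus asks, which is why a slow or absent Prometheus does not burden the service.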

The bulk of the codebase and the effort spent by the authors goes into how the time series data is stored. By default, this is a file-based data store that scales well and is efficient.

Prometheus contains a simple query language that allows you to evaluate and aggregate the time series data. A query forms the basis of all monitoring tasks, which include visualisation of current statistics and alerting.

Finally, Prometheus exposes a REST API for external consumers, such as dashboards.

Where Does it Work Well?

Prometheus is ideally suited to mission-critical microservice-based applications because of its inherent simplicity. When all of your applications are failing, Prometheus will still be running.

It handles multi-dimensional data well through the use of key-value labels. It is fast and easy to filter data based upon their labels. For example, you would use the same metric name for the same type of data, only distinguishing context (e.g. the status code) through labels.

Where Does it Not (try to) Work Well?

Remember that Prometheus scrapes data. The time at which Prometheus performs that scrape is not guaranteed. Hence, if you have a use case that requires accurate second-by-second scrapes, this may not be a good choice.

Also, Prometheus is unreservedly HTTP focused. If you operate in a monolithic, SOAP-based or RPC-based environment where HTTP isn’t used, you may have integration problems.

Prometheus vs…

The authors have done a good job of describing Prometheus in terms of its competitors. I’d recommend reading the comparison in the Prometheus documentation.

But generally, I think Prometheus’ minimalism is the unique selling point. The pull model, the focus on using simple distribution mechanisms and the ease of getting started all mean that it is easier to run in production. The potential downside is that it is unashamedly developer focused, which means it might be difficult to sell to business users, especially with all the pretty business intelligence marketing materials around.

Data Model

All of the data is stored as time series, i.e. measurements with timestamps. Measurements are known as metrics. Each time series is uniquely identified by a metric name and a set of key-value pairs, a.k.a. labels.

This means that labels represent multiple dimensions of a metric. A combination of a metric name and one set of label values yields a single time series. In other words, each time you create a new key-value pair on a metric you will get a new time series in the database. Hence, be very careful that your key-value pairs are constrained. Do not store things such as IDs or email addresses, which are unbounded.
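A quick back-of-the-envelope calculation shows why this matters. The sketch below computes the worst-case number of time series for one metric as the product of the number of distinct values each label can take. The label names and counts are made up for illustration.

```python
from math import prod


def series_count(label_cardinalities):
    """Worst-case number of time series for a single metric name.

    label_cardinalities maps each label name to the number of distinct
    values that label can take.
    """
    return prod(label_cardinalities.values())


# Hypothetical label sets for one metric.
bounded = {"status": 5, "method": 4}
unbounded = {"status": 5, "method": 4, "user_id": 100_000}

print(series_count(bounded))    # 20
print(series_count(unbounded))  # 2000000
```

Adding a single high-cardinality label turns 20 series into two million, and that is for one metric on one service.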

An observation (they call it a sample!) is a combination of a float64 value and a millisecond precision timestamp. Prometheus does not store strings! It is not for logging!

Given a metric name and key-value labels, the following format is used to address metrics:

<metric name>{<label name>=<label value>, ...}
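As a rough sketch of that notation, the hypothetical helper below builds the identifier for a series from a metric name and a dictionary of labels. Sorting the labels is my own choice so that the same label set always produces the same string; label values are quoted and escaped.

```python
def format_series(name, labels):
    """Build a `name{key="value", ...}` identifier for one time series."""
    if not labels:
        return name
    pairs = []
    for key, value in sorted(labels.items()):
        # Escape backslashes, double quotes and newlines inside the value.
        escaped = (
            str(value)
            .replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
        )
        pairs.append(f'{key}="{escaped}"')
    return f"{name}{{{', '.join(pairs)}}}"


print(format_series("http_requests_total", {"status": "200", "path": "/users/"}))
# http_requests_total{path="/users/", status="200"}
```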

Types of Metrics

Prometheus caters for different types of measurements by having four different types of metrics.

Counter: A cumulative metric that only ever increases. (E.g. requests served, tasks completed, errors occurred)

Gauge: A metric that can arbitrarily go up or down. (E.g. temperature, memory usage)

Histogram: Binned measurement of a continuous variable. (E.g. latency, request duration, age)

Summary: Similar to a histogram, except the bins are converted into an aggregate (e.g. 99th percentile) immediately.
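The difference between the first two types is easiest to see in code. This is a toy sketch, not the API of any real client library: the `Counter` refuses to go down, while the `Gauge` can be set to anything.

```python
class Counter:
    """A value that can only increase."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("counters can only increase")
        self.value += amount


class Gauge:
    """A value that can go up or down arbitrarily."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = float(value)

    def inc(self, amount=1.0):
        self.value += amount

    def dec(self, amount=1.0):
        self.value -= amount


requests_served = Counter()
requests_served.inc()
requests_served.inc(2)

memory_usage_bytes = Gauge()
memory_usage_bytes.set(1024)
memory_usage_bytes.dec(24)

print(requests_served.value)     # 3.0
print(memory_usage_bytes.value)  # 1000.0
```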

If you are developing REST-based services, then you can usually measure most of the things you are interested in with a histogram. This is because you get not only a series of bins, but also a count and a total sum of the data. So you can use a histogram to calculate request rates, error rates and durations with only a single metric.
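Here is a simplified sketch of what a histogram tracks, assuming hypothetical bucket boundaries of 0.1, 0.5 and 1.0 seconds. Real client libraries differ in the details, but the idea is the same: each observation updates a bucket, the count and the sum, and buckets are reported cumulatively ("less than or equal to" each upper bound).

```python
import bisect
import math


class Histogram:
    """Toy histogram: buckets plus a running count and sum."""

    def __init__(self, upper_bounds):
        # A final +Inf bucket catches everything above the largest bound.
        self.upper_bounds = sorted(upper_bounds) + [math.inf]
        self.bucket_counts = [0] * len(self.upper_bounds)
        self.count = 0
        self.sum = 0.0

    def observe(self, value):
        # Index of the first bucket whose upper bound is >= value.
        index = bisect.bisect_left(self.upper_bounds, value)
        self.bucket_counts[index] += 1
        self.count += 1
        self.sum += value

    def cumulative_buckets(self):
        """Return (upper_bound, cumulative_count) pairs."""
        running, result = 0, []
        for bound, n in zip(self.upper_bounds, self.bucket_counts):
            running += n
            result.append((bound, running))
        return result


request_duration_seconds = Histogram([0.1, 0.5, 1.0])
for duration in (0.05, 0.3, 0.3, 0.7, 2.0):
    request_duration_seconds.observe(duration)

print(request_duration_seconds.cumulative_buckets())
# [(0.1, 1), (0.5, 3), (1.0, 4), (inf, 5)]
print(request_duration_seconds.count)  # 5
```

The count gives you the number of requests, and the sum divided by the count gives the mean duration, all from one metric.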

But bear in mind that histograms and calculating quantiles can be expensive (compared to a simple increment of a counter!).

Also, I would avoid using a summary, because it locks you into picking a fixed aggregate. With a histogram you can change your mind later (e.g. switching from 99th-percentile to 95th-percentile).
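To illustrate why a histogram keeps your options open, the sketch below estimates any quantile after the fact from cumulative bucket counts. It uses linear interpolation inside the matching bucket, loosely in the spirit of what PromQL's `histogram_quantile` does; it is not Prometheus's actual implementation, and the bucket data is invented.

```python
import math


def estimate_quantile(q, cumulative_buckets):
    """Estimate the q-quantile (0 <= q <= 1) from cumulative buckets.

    cumulative_buckets is a list of (upper_bound, cumulative_count) pairs,
    sorted by upper bound, ending with a +Inf bucket.
    """
    total = cumulative_buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0
    for upper_bound, count in cumulative_buckets:
        if count >= rank:
            in_bucket = count - lower_count
            if math.isinf(upper_bound) or in_bucket == 0:
                # Cannot interpolate here; fall back to the lower bound.
                return lower_bound
            fraction = (rank - lower_count) / in_bucket
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Hypothetical request durations, in seconds.
buckets = [(0.1, 10), (0.5, 60), (1.0, 95), (math.inf, 100)]

p50 = estimate_quantile(0.50, buckets)  # roughly 0.42
p95 = estimate_quantile(0.95, buckets)  # roughly 1.0
```

Both estimates come from the same stored buckets, so you can switch percentile without touching the application. A summary would have fixed that choice at instrumentation time.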

How are names generated?

When Prometheus scrapes an instance, it automatically adds certain labels to the scraped time series:

job: The job name that the target belongs to.

instance: The host:port combination of the target’s URL that was scraped.

It also generates time series for each scrape:

up{job="<job-name>", instance="<instance-id>"}: 1 if the instance is healthy, i.e. reachable, or 0 if the scrape failed.

scrape_duration_seconds{job="<job-name>", instance="<instance-id>"}: duration of the scrape.

And a couple of others. The up time series is particularly useful for availability monitoring. For all other metrics, you decide the metric name and labels.

Metric Naming Best Practices

Names are vitally important. They convey meaning in moments of crisis. And because Prometheus is so permissive, organisations should agree on a naming convention so that:

You know what to call your metrics during development

Users can understand what your metric means with just a glance

Labels should be chosen so that they differentiate the context of the metric. Everyone should use the same convention.

Consistent Domain-based Prefixes

A prefix is the first word in a metric name, often called a namespace by client libraries. Choose a prefix that defines the domain of the measurement.

Examples:

HTTP related metrics should all have a prefix of http : http_request_duration_seconds

Application specific metrics should refer to a domain: users_total

Process level metrics are exported by many libraries by default: process_cpu_seconds_total

Consistent Units

Use base units, following SI (International System of Units) where one exists.

E.g.

Seconds

Bytes

Metres

Not milliseconds, megabytes, kilometres, etc.

A Single Unit Per Metric

Do not mix units within a metric. E.g.:

One instance that reports seconds with another reporting hours

Compound units, e.g. bytes/second. Use two metrics instead.

Suffix Should Describe the Unit

The suffix should name the unit, in plural form.

Examples:

http_request_duration_seconds

node_memory_usage_bytes

http_requests_total (for a unit-less accumulating count)

process_cpu_seconds_total (for an accumulating count with unit)
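A convention is easier to keep if it is checked automatically. Below is a small sketch of a linter you might run in CI. The regular expression is the classic character set that the Prometheus data model documents for metric names; the list of accepted suffixes is a hypothetical house rule, not something Prometheus enforces.

```python
import re

# Classic character set for metric names in the Prometheus data model.
METRIC_NAME_RE = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")

# Hypothetical house rule: names must end in one of these suffixes.
ACCEPTED_SUFFIXES = ("_seconds", "_bytes", "_metres", "_total")


def check_metric_name(name):
    """Return a list of problems with a metric name (empty if it is fine)."""
    problems = []
    if not METRIC_NAME_RE.fullmatch(name):
        problems.append("contains characters outside the allowed set")
    if not name.endswith(ACCEPTED_SUFFIXES):
        problems.append("does not end in an accepted unit suffix")
    return problems


print(check_metric_name("http_request_duration_seconds"))  # []
print(check_metric_name("http_request_duration_ms"))
# ['does not end in an accepted unit suffix']
```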

Mean the Same Thing

The metric name should mean the same thing across all label dimensions.

E.g.

The metric http_requests_total means the same thing, despite having different labels as below:

http_requests_total{status="404"}

http_requests_total{status="200"}

http_requests_total{status="200", path="/users/"}

Testing That Names Make Sense

One tip provided by the authors is very useful:

either the sum or average over all dimensions should be meaningful (though not necessarily useful)

E.g.

the capacity of various queues in one metric is good, while mixing the capacity of a queue with the current number of elements in the queue is not.

Generally, split metrics into single unit types.

Labelling Best Practices

Labels should be used to differentiate context. For example, since HTTP requests mean the same thing, they should be grouped within the same metric, irrespective of the service generating the metric. To differentiate this service from others we apply labels to denote this service’s context. For example:

api_http_requests_total - differentiate request types: type="create|update|delete"

api_request_duration_seconds - differentiate request stages: stage="extract|transform|load"

Do not put context in metric names as it will cause confusion. Also, when we perform aggregations, it’s really nice to see the aggregations for different contexts.
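To show what that buys you, here is a small sketch that sums one metric's samples by a chosen label, roughly what a PromQL query such as `sum by (type) (api_http_requests_total)` would give you. The `service` label and all of the values are invented for illustration.

```python
from collections import defaultdict

# Hypothetical samples of api_http_requests_total: (labels, value).
samples = [
    ({"service": "users", "type": "create"}, 10.0),
    ({"service": "users", "type": "delete"}, 2.0),
    ({"service": "orders", "type": "create"}, 7.0),
]


def sum_by(samples, label):
    """Sum sample values, grouped by the value of one label."""
    totals = defaultdict(float)
    for labels, value in samples:
        totals[labels.get(label, "")] += value
    return dict(totals)


print(sum_by(samples, "type"))     # {'create': 17.0, 'delete': 2.0}
print(sum_by(samples, "service"))  # {'users': 12.0, 'orders': 7.0}
```

Because the context lives in labels rather than in the metric name, the same data can be sliced by request type or by service without changing any instrumentation.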

1 Label === 1 Dimension

Each new key-value pair represents a new dimension. If you pick a key that has many values, this dramatically increases the amount of data that must be stored.

For example, DO NOT STORE:

IDs

Email addresses

Timestamps of any kind

Anything unbounded

Generally, the values for every label key must come from a bounded set.
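One way to enforce this in application code is to pass every label value through an allowlist before it reaches the metric. This is only a sketch of the idea; the `bounded_label` helper, the allowed set and the "other" fallback are all my own hypothetical choices.

```python
# Hypothetical set of values we are willing to store for the "type" label.
ALLOWED_TYPES = frozenset({"create", "update", "delete"})


def bounded_label(value, allowed, fallback="other"):
    """Return value if it is in the allowed set, otherwise the fallback."""
    return value if value in allowed else fallback


print(bounded_label("create", ALLOWED_TYPES))                 # create
print(bounded_label("user-1234@example.com", ALLOWED_TYPES))  # other
```

With this in place, the label can never take more than four distinct values, no matter what the application is fed.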

Resources

Much of this information comes from the excellent Prometheus documentation. But some advice is my own.