In part 1 of this series, I investigated using FaaS for scalable Prometheus rule evaluation. FaaS works well for rules driven by simple models, but as discussed in part 2, its performance characteristics impose hard constraints on more computationally intensive multi-variate models. In this post, I investigate using the data-sidecar service to execute multi-variate models against Prometheus data.

The Data

As in the previous posts, this example uses models that attempt to predict future cryptocurrency prices. The models are multi-variate LSTM networks trained and executed with TensorFlow. This post focuses on how to continuously execute the models in a production environment rather than on the models themselves; they are very similar to the ones described in part 2, so refer back to that post to learn how they work. The prediction data is available live at http://www.predictatron.net/. Select the “sidecar” option to view data from the models described in this post.

Predicted btc/usd exchange rate. Black series is predicted value, green series is the actual value.

data-sidecar

The data-sidecar service is a Go program developed at FreshTracks.io for analyzing Prometheus time series data. The sidecar continuously and efficiently queries Prometheus time series, performs data analysis at a specified interval, and then exposes the result as a Prometheus metrics endpoint so it can be scraped back into Prometheus.

Prediction metrics exposed by the sidecar
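Samples exposed this way follow the Prometheus text exposition format. As a minimal stdlib-only sketch of rendering one prediction sample the way a /metrics endpoint would serve it (the metric and label names here are hypothetical, not the sidecar's actual output):

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// formatMetric renders one sample in the Prometheus text exposition
// format: name{label="value",...} value\n
func formatMetric(name string, labels map[string]string, value float64) string {
	var b strings.Builder
	b.WriteString(name)
	if len(labels) > 0 {
		pairs := make([]string, 0, len(labels))
		for k, v := range labels {
			pairs = append(pairs, fmt.Sprintf("%s=%q", k, v))
		}
		sort.Strings(pairs) // deterministic label order
		b.WriteString("{" + strings.Join(pairs, ",") + "}")
	}
	fmt.Fprintf(&b, " %g\n", value)
	return b.String()
}
```

For example, `formatMetric("ft_predicted_value", map[string]string{"pair": "btc_usd"}, 7123.5)` produces a line Prometheus can scrape directly.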

FreshTracks created the data-sidecar to run proprietary models for adaptive thresholds, forecasting, and anomaly detection on Kubernetes cluster metrics. We can’t share these algorithms, so the open source code base contains only a couple of very simple adaptive threshold and anomaly detection methods.

TensorFlow in Python and Go

Our LSTM network is constructed with TensorFlow. The TensorFlow Go library is adequate for executing models, but it is difficult to use for constructing and training them because it does not implement the full TensorFlow API. Instead, the network is built, trained, and persisted with Keras and Python, similar to the models used in part 2. The trained network is persisted to disk to be executed by the data-sidecar at runtime. The model building and training code is available in the Predictatron repo.

Python code to build, train, and persist a TensorFlow graph in a format that data-sidecar can import.

The data-sidecar master branch does not contain the logic to execute TensorFlow models, so the examples in this post are based on this TensorFlow-enabled fork. The data-sidecar imports persisted TensorFlow models on start-up.

Go code for importing persisted TensorFlow graphs.

The data-sidecar continuously queries all series data tagged with the label ft_target="true" from a Prometheus instance, and executes the TensorFlow model against the series data whenever new datapoints are available. Model meta parameters such as the number of input and output steps are defined in the model’s params.json file.
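The query side of that loop can be approximated with nothing but the standard library. A sketch (not the fork's actual code) of extracting samples from a Prometheus /api/v1/query_range JSON response for series selected by {ft_target="true"}:

```go
package main

import (
	"encoding/json"
	"strconv"
)

// sample is one (timestamp, value) pair from a Prometheus range query.
type sample struct {
	Timestamp float64
	Value     float64
}

// parseQueryRange pulls the samples out of an /api/v1/query_range
// response body, keyed by series name. Prometheus encodes each pair
// as [<unix timestamp>, "<value as string>"].
func parseQueryRange(body []byte) (map[string][]sample, error) {
	var resp struct {
		Data struct {
			Result []struct {
				Metric map[string]string `json:"metric"`
				Values [][2]interface{}  `json:"values"`
			} `json:"result"`
		} `json:"data"`
	}
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	out := make(map[string][]sample)
	for _, r := range resp.Data.Result {
		name := r.Metric["__name__"]
		for _, v := range r.Values {
			ts, ok1 := v[0].(float64)
			str, ok2 := v[1].(string)
			if !ok1 || !ok2 {
				continue // skip malformed pairs
			}
			val, err := strconv.ParseFloat(str, 64)
			if err != nil {
				return nil, err
			}
			out[name] = append(out[name], sample{Timestamp: ts, Value: val})
		}
	}
	return out, nil
}
```

The resulting per-series slices are the input windows a model would score.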

Go code that executes a model against series data and exposes the result to a Prometheus endpoint.

Running the data-sidecar Service

The data-sidecar must be carefully configured for everything to work properly. Most importantly, it must be configured to use the same query step and time duration the model was trained with. In this example the models were trained on series containing datapoints at 1-minute intervals. Configure the following command-line options to match the model.

prom=http://prom.predictatron.net:9090 : Prometheus source

resolution=60 : time between datapoints (60 seconds)

lookback=60 : historic steps to include in initial query (60 minutes)

tfpath=/models/lstm/model-btc_usd-5m : location of persisted models
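Assuming these options are parsed with Go's standard flag package (the real data-sidecar's flag handling may differ), they map onto a config struct roughly like this:

```go
package main

import "flag"

// sidecarConfig mirrors the command-line options listed above.
type sidecarConfig struct {
	Prom       string // Prometheus source URL
	Resolution int    // seconds between datapoints
	Lookback   int    // historic steps in the initial query
	TFPath     string // directory of persisted TensorFlow models
}

// parseConfig builds a sidecarConfig from command-line arguments.
func parseConfig(args []string) (*sidecarConfig, error) {
	fs := flag.NewFlagSet("data-sidecar", flag.ContinueOnError)
	c := &sidecarConfig{}
	fs.StringVar(&c.Prom, "prom", "http://localhost:9090", "Prometheus source")
	fs.IntVar(&c.Resolution, "resolution", 60, "time between datapoints in seconds")
	fs.IntVar(&c.Lookback, "lookback", 60, "historic steps to include in initial query")
	fs.StringVar(&c.TFPath, "tfpath", "", "location of persisted models")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return c, nil
}
```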

Predictatron runs data-sidecar within a Docker container running on Google Compute Engine.

Dockerfile for Predictatron’s data-sidecar.

Results and Scaling

Running multi-variate models within data-sidecar is much more performant than the FaaS approach demonstrated in part 2. While I didn’t do any formal benchmarking, the data-sidecar handles many input series and many models within reasonable resource limits. The data-sidecar also scales horizontally in two dimensions: both input series and models can be sharded among different data-sidecar instances. This is the most promising approach to scaling Prometheus time series analysis, and the reason FreshTracks.io developed the data-sidecar.
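One simple way to realize that sharding (a sketch, not data-sidecar's actual scheme) is to hash each series or model name onto an instance, so every instance only queries and scores the names assigned to it:

```go
package main

import "hash/fnv"

// shardFor deterministically assigns a series (or model) name to one
// of n data-sidecar instances. Every instance runs the same code and
// keeps only the names that hash to its own index.
func shardFor(name string, n uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(name))
	return h.Sum32() % n
}
```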

The prediction data from this post is live at http://www.predictatron.net/. Select the “sidecar” option to view data from models described here.

Running Custom Models

The open-source version of data-sidecar described here does not yet have a plug-and-play model interface, but the TensorFlow example code used for this post should be a good starting point if you’re willing to write a little Go code. Create a Scorer struct to implement your model’s algorithm, and wire it up in main.go to execute against incoming series data. Then configure your Prometheus instance with a target at sidecar-url/metrics to scrape the resulting metrics back into Prometheus.
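As a sketch of that shape (the fork's actual interface and wiring in main.go may differ), a Scorer and a trivial implementation could look like:

```go
package main

import "errors"

// Scorer is a hypothetical interface shape for a custom model:
// it consumes a window of series values and emits one score.
type Scorer interface {
	Score(values []float64) (float64, error)
}

// MeanScorer is a trivial Scorer, here only to show the wiring;
// a real model would run inference instead of averaging.
type MeanScorer struct{}

func (MeanScorer) Score(values []float64) (float64, error) {
	if len(values) == 0 {
		return 0, errors.New("empty window")
	}
	var sum float64
	for _, v := range values {
		sum += v
	}
	return sum / float64(len(values)), nil
}
```

Whatever Score returns would then be exposed on the sidecar's metrics endpoint for Prometheus to scrape.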

Prometheus configuration for scraping data-sidecar.
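A minimal scrape job of that shape looks roughly like the following; the job name, target address, and port are placeholders, and the scrape interval should match the sidecar's resolution:

```yaml
scrape_configs:
  - job_name: 'data-sidecar'
    scrape_interval: 60s
    metrics_path: /metrics
    static_configs:
      - targets: ['data-sidecar.example.com:8077']
```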

Future improvements to data-sidecar may include a plug-and-play model interface to make it easier to execute custom models with minimal code changes, as well as supporting online models that continuously update in response to new data.