Kubernetes and custom metrics support

We have covered a lot so far; let's recap before moving on. We have several instances of the same application, and we'd like to add or remove instances based on the evolution of a shared metric exposed by the application: the consumer record-lag.

This looks like a perfect job for Kubernetes!

Kubernetes maintains the requested number of replicas for a given application. It can also natively scale (in or out) based on CPU usage or memory consumption. Fortunately, since version 1.6, it can also scale applications on custom metrics. This feature requires enriching the original Kubernetes APIs with additional adapters. Among the possible adapter implementations, we chose the Stackdriver-based one to create a bridge between Kubernetes custom metrics and our Kafka-Streams JMX metrics:

Fig 2: Exporting metrics from Kafka-Streams

We first expose the JMX metrics of the streaming application in Prometheus format. Each application instance has a prometheus-to-sd sidecar that scrapes the metrics and sends them to Stackdriver. Lags can now be plotted, but also queried by a metrics server. At this point, the custom-metrics-stackdriver-adapter metrics server feeds the Kubernetes master with the new custom metric values.
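Once the adapter serves the lag through the custom metrics API, a HorizontalPodAutoscaler can target it. A minimal sketch, assuming a Deployment named kos-stream; the metric name and threshold are illustrative, since the exact name depends on how the adapter exposes the Stackdriver metric:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: kos-stream
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: kos-stream
  minReplicas: 1
  maxReplicas: 4
  metrics:
    - type: Pods
      pods:
        # Scale out when the average lag per pod exceeds the target
        metricName: kafka_consumer_records_lag
        targetAverageValue: 1000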

Now let’s put all the pieces together.

Expose the JMX metrics in Prometheus format

Prometheus is an open-source monitoring and alerting toolkit and one of the first projects to join the CNCF. It defines an exposition format for metrics which is becoming a de-facto standard, and it is the format used in the rest of the experiment. To produce it, we use the jmx-exporter project to format the metrics from our application.

We add a few JVM parameters to the streaming app:

java -cp ... \
  -Dcom.sun.management.jmxremote \
  -Dcom.sun.management.jmxremote.port=7071 \
  -Dcom.sun.management.jmxremote.authenticate=false \
  -Dcom.sun.management.jmxremote.ssl=false \
  -Djava.rmi.server.hostname=127.0.0.1 \
  -javaagent:/<>/jmx_prometheus_<version>.jar=9001:/<>/config.yaml

This way, raw JMX metrics are exposed on port 7071, and the agent serves a formatted version over HTTP on port 9001. The config.yaml file describes which metrics are exposed.
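As a sketch, a jmx-exporter rule for the lag metric could look like the following; the exact MBean pattern depends on the Kafka client version, and the metric and label names here are assumptions, not the article's actual file:

rules:
  # Export records-lag for one partition of the input topic as a Prometheus gauge.
  # The block is duplicated for each partition of GAME-FRAME-RS.
  - pattern: kafka.consumer<type=consumer-fetch-manager-metrics, client-id=(.+), topic=GAME-FRAME-RS, partition=0><>records-lag
    name: kafka_consumer_records_lag_partition_0
    type: GAUGE
    labels:
      client_id: "$1"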

With this block of configuration we are able to export the kafka.consumer metrics of type consumer-fetch-manager-metrics that carry the records-lag attribute (for the input topic GAME-FRAME-RS). We duplicate this block of configuration for each partition of the input topic. Note that we assign the type GAUGE to this configuration; this is required. (complete file)

Building the Docker image of the streaming app

Metrics can now be queried over HTTP on a single machine in development mode.
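As a quick check, assuming the agent from the previous section runs locally, the endpoint can be probed with curl; the metric name and value below are illustrative and depend on the rules in config.yaml:

$ curl -s http://localhost:9001/metrics | grep records_lag
# TYPE kafka_consumer_records_lag_partition_0 gauge
kafka_consumer_records_lag_partition_0{client_id="kos-stream-1"} 42.0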

Finally, we need to package everything inside a Docker image to take advantage of this new feature inside a Kubernetes pod. The build tool used here is Gradle, and by adding a Docker plugin we can configure the project accordingly (a sketch follows the list below).

In a few lines we declare:

The Dockerfile to build

The entry point of the streaming app (Main class)

Name, version and repository where to upload the image
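The original build script is not reproduced here; as one plausible setup covering those three points, using the com.bmuschko.docker-java-application plugin (the plugin choice, main class, and registry are assumptions):

plugins {
    id 'application'
    id 'com.bmuschko.docker-java-application' version '3.6.2'
}

// Entry point of the streaming app (hypothetical class name)
mainClassName = 'com.example.kosstream.Main'

docker {
    javaApplication {
        // Base image of the generated Dockerfile
        baseImage = 'openjdk:8-jre'
        // Name, version and repository where to upload the image
        tag = "eu.gcr.io/my-project/kos-stream:${project.version}"
    }
}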

At the root of the container we add the following files:

$ tree -L 3 /
/
└── opt
    └── kos-stream
        ├── config.yaml
        └── jmx_prometheus_javaagent-0.3.1.jar

These two files are referenced from the JVM parameters (see the previous section).

Metrics aggregation to Stackdriver

Now that metrics are exposed on the address and port of a Kubernetes pod, the next step is to use prometheus-to-sd from the k8s-stackdriver project. To do so, we include its image in the streaming application pod as a sidecar. Its goal is to scrape the metrics and send them to Stackdriver. By doing so, metrics will be both persisted and displayed in a dashboard.
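Following the k8s-stackdriver README, the sidecar can be declared in the pod spec along these lines; the image version and source name are assumptions, and port 9001 matches the javaagent configured earlier:

containers:
  # ... the streaming application container goes here ...
  - name: prometheus-to-sd
    image: gcr.io/google-containers/prometheus-to-sd:v0.3.1
    command:
      - /monitor
      # Scrape the jmx-exporter endpoint of the main container
      - --source=kos-stream:http://localhost:9001
      # Publish under the Stackdriver custom metrics prefix
      - --stackdriver-prefix=custom.googleapis.com
      - --pod-id=$(POD_NAME)
      - --namespace-id=$(POD_NAMESPACE)
    env:
      - name: POD_NAME
        valueFrom:
          fieldRef:
            fieldPath: metadata.name
      - name: POD_NAMESPACE
        valueFrom:
          fieldRef:
            fieldPath: metadata.namespace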