Kubernetes InitContainers are a neat way to run arbitrary code before your container starts. They ensure that certain preconditions are met before your app is up and running. For example, they allow you to:

run database migrations with Django or Rails before your app starts

ensure a microservice or API you depend on is running
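As a rough illustration of both use cases, here is a minimal Pod sketch. The Pod name, the images, the `api` service and its `/healthz` path are all hypothetical placeholders, not taken from a real deployment. Init containers run one after another, and each must exit successfully before the next one starts.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web                        # hypothetical Pod name
spec:
  initContainers:
    # Runs the Django migrations; must exit 0 before the next init container starts.
    - name: migrate
      image: example/web:1.0       # hypothetical application image
      command: ['python', 'manage.py', 'migrate']
    # Polls a dependency until it answers; 'api:8080/healthz' is an assumed endpoint.
    - name: wait-for-api
      image: busybox:1.36
      command: ['sh', '-c', 'until wget -q -O /dev/null http://api:8080/healthz; do sleep 2; done']
  containers:
    # The app container only starts once both init containers have completed.
    - name: web
      image: example/web:1.0       # hypothetical application image
```

If either init container keeps failing, the `web` container is never started, which is exactly the situation worth alerting on.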

Unfortunately, InitContainers can fail, and when that happens you probably want to be notified, because your app will never start. kube-state-metrics exposes plenty of Kubernetes cluster metrics for Prometheus. Combining the two, we can monitor and alert whenever we discover container problems. Recently, a pull request was merged that adds InitContainer data.

The metric kube_pod_init_container_status_last_terminated_reason tells us why a specific InitContainer failed to run, for example because it timed out or ran into errors.

To use the InitContainer metrics, deploy Prometheus and kube-state-metrics. Then add the kube-state-metrics service as a target in your Prometheus scrape_configs so that all the cluster metrics are pulled into Prometheus:

    - job_name: 'kube-state-metrics'
      static_configs:
        - targets: ['kube-state-metrics:8080']

kube_pod_init_container_status_last_terminated_reason carries a reason label that can take one of five values:

Completed

OOMKilled

Error

ContainerCannotRun

DeadlineExceeded

We want to be alerted whenever this metric is scraped with a reason other than 'Completed', because that means an InitContainer has failed to run. Here is an example alerting rule:

    groups:
      - name: Init container failure
        rules:
          - alert: InitContainersFailed
            expr: kube_pod_init_container_status_last_terminated_reason{reason!="Completed"} == 1
            annotations:
              summary: '{{ $labels.container }} init failed'
              description: '{{ $labels.container }} has not completed init containers with the reason {{ $labels.reason }}'
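For Prometheus to evaluate the rule, the rule file has to be listed under rule_files in the main configuration. The following is a minimal sketch of how the pieces might fit together. The file path and the interval values are assumptions chosen for illustration, so adjust them to your setup.

```yaml
# prometheus.yml (sketch; path and intervals are assumed values)
global:
  scrape_interval: 30s        # how often targets are scraped
  evaluation_interval: 30s    # how often alerting rules are evaluated

rule_files:
  # Hypothetical location of the file holding the InitContainersFailed rule.
  - '/etc/prometheus/init-container-alerts.yml'

scrape_configs:
  - job_name: 'kube-state-metrics'
    static_configs:
      - targets: ['kube-state-metrics:8080']
```

Before reloading Prometheus, both files can be validated with `promtool check config prometheus.yml` and `promtool check rules /etc/prometheus/init-container-alerts.yml`.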

Happy monitoring!