Recently I wrote “Continuous profiling in Go with Profefe”, an article about the new shiny open source project I am contributing to.

TLDR: Profefe is a registry for pprof profiles. You can push them embedding an SDK in your application or you can write a collector (cronjob) that gets profiles and push the tar via the Profefe API. Side by side with the profile you have to send other information like:

Type: represents the profile type such as mutex, goroutines, CPU and so on

Service: identifies the source for this profile, for example, the binary name

InstanceID: identifies where it comes from, for example, pod name or server hostname

Labels: are optional key/value pairs that you can use at query time to filter profiles. If you are building the same service with two different Go versions to check for performance degradation you can label the profiles with go=1.13.4 for example.

The article has way more content but that’s enough. You can keep reading with only this information.

Kubernetes

As you know at InfluxData we use Kubernetes, our services already expose the pprof HTTP handler and we can not instrument all the services with the Profefe SDK, for those reasons we had to write our own collectors capable of getting pprof profiles via the Kubernetes API and to push them into Profefe. That’s why we decided to go with a different approach. I wrote a project called kube-profefe. It acts as a bridge between the Profefe API and Kubernetes. The repository provides two different binaries:

A kubectl plugin that you can install (even via krew) that servers useful utilities to interact with the profefe API (profefe at the moment does not have a CLI) and to capture profiles from running pod.

A collector that can run as a cronjob, it goes pod by pod looking for profiles to collect and it will push them to Profefe.

Architecture

In order to configure the collector or to capture profiles from a running container, it leverages pod annotations. Only the pods with the annotation pprof.com/enable=true will be taken into consideration from kube-profefe. Other annotations are optional or they have default values. This one is the unique one that has to be set to make kube-profefe aware of your pod.

The example above shows a Pod spec that enables profefe capabilities:

apiVersion : v1 kind : Pod metadata : name : influxdb-v2 annotations : " profefe.com/enable" : " true" " profefe.com/port" : " 9999" spec : containers : - name : influxdb image : quay.io/influxdb/influxdb:2.0.0-alpha ports : - containerPort : 9999

As you can see there are other annotations such as profefe.com/port by default is 6060. In this case it is pointed to 9999 because that’s where the pprof HTTP handler runs in InfluxDB v2. A full list of annotations is maintained in the project’s README.md.

There is not a lot more to know about the underling mechanism that enpowers kube-profefe, we are gonna deep dive on both components: the kubectl plugin and the collector.

Kubectl-profefe: the kubectl plugin

A kubectl plugin is nothing more than a binary located in your $PATH with the prefix name “kubectl-”. In my case the binary is released with the name kubectl-profefe, when located in your $PATH you will be able to run a command like:

$ kubectl profefe --help It is a kubectl plugin that you can use to retrieve and manage profiles in Go. Usage: kubectl-profefe [ flags] kubectl-profefe [ command ] Available Commands: capture Capture gathers profiles for a pod or a set of them. If can filter by namespace and via label selector. get Display one or many resources help Help about any command load Load a profile you have locally to profefe Flags: -A , --all-namespaces If present, list the requested object ( s ) across all namespaces. Namespace in current context is ignored even if specified with --namespace . --as string Username to impersonate for the operation --as-group stringArray Group to impersonate for the operation, this flag can be repeated to specify multiple groups. --cache-dir string Default HTTP cache directory ( default "/home/gianarb/.kube/http-cache" ) --certificate-authority string Path to a cert file for the certificate authority --client-certificate string Path to a client certificate file for TLS --client-key string Path to a client key file for TLS --cluster string The name of the kubeconfig cluster to use --context string The name of the kubeconfig context to use -f , --filename strings identifying the resource. -h , --help help for kubectl-profefe --insecure-skip-tls-verify If true , the server 's certificate will not be checked for validity. This will make your HTTPS connections insecure --kubeconfig string Path to the kubeconfig file to use for CLI requests. -n, --namespace string If present, the namespace scope for this CLI request -R, --recursive Process the directory used in -f, --filename recursively. Useful when you want to manage related manifests organized within the same directory. (default true) --request-timeout string The length of time to wait before giving up on a single server request. Non-zero values should contain a corresponding time unit (e.g. 1s, 2m, 3h). A value of zero means don' t timeout requests. ( default "0" ) -l , --selector string Selector ( label query ) to filter on, supports '=' , '==' , and '!=' . ( e.g. -l key1 = value1,key2 = value2 ) -s , --server string The address and port of the Kubernetes API server --token string Bearer token for authentication to the API server --user string The name of the kubeconfig user to use Use "kubectl-profefe [command] --help" for more information about a command.

This output should look very familiar to you, there are a lot of options usable with any other kubectl native command. Mainly around authentication: –user, –server, –kubeconfig, –client-certificate… Or around pod selection: -l, –selector, -n, –namespace, –all-namespaces. If you are curious about how to write a friendly kubectl plugin I wrote “kubectl flags in your plugin” check it out.

This plugin, even if it is not native, uses the same authentication mechanism in use from the kubectl so, where ever the kubectl works, this plugin should work as well.

The pod selectors -l, -n, for example, are useful when running the command:

$ kubectl profefe capture

Capture, as the name suggests, goes straight to one or more pods and it downloads or pushes to profefe various profiles. It is very flexible, you can capture pprof profiles from a specific pod (or multiple pods) by ID:

$ kubectl profefe capture <pod-id>,<pod-id>...

NB: just remember to use the namespace where the pods are running with the flag -n or –namespace.

You can use the pod selectors to collect multiple profiles:

$ kubectl profefe capture -n web

Captures profiles from all the pod with the pprof.com/enable=true annotation running in the pod namespace and it will store them under the /tmp directory. You can change the output directory with --output-dir . If you do not want to store them locally you can push them to profefe specifying its location via --profefe-hostport .

The are other combinations for the capture command and you can get profiles from profefe, I will leave the rest to you!

Hero image via Pixabay

Kprofefe: the collector

The main responsability for the collector is to make the continuous profiling magic to happen! It uses the same mechanism we already saw for the capture kubectl plugin but it is a single binary and it can run as a cronjob.

apiVersion: batch/v1beta1 kind: CronJob metadata: name: kprofefe-allnamespaces namespace: profefe spec: concurrencyPolicy: Replace jobTemplate: metadata: spec: template: spec: containers: - args: - --all-namespaces - --profefe-hostport - http://profefe-collector:10100 image: profefe/kprofefe:v0.0.8 imagePullPolicy: IfNotPresent name: kprofefe restartPolicy: Never serviceAccount: kprofefe-all-namespaces serviceAccountName: kprofefe-all-namespaces schedule: '*/10 * * * *' successfulJobsHistoryLimit: 3

You can run a single cronjob that will over all the pods across all the namespaces or you can deploy multiple cronjobs, playing with the label selector (-l) and the namespace selector (-n) you can configure the ownership for every running cronjob. The reasons to split in multiple cronjobs can be:

Scalability: one cronjob is not enough, so you can have one per namespace for example

Time segmentation: if you have a single cronjob it means that all the pods profiles will get captured with the same frequency, but you will may want to get high frequent profiles for a specific subset of applications and less dencity for others.

Documentation about “Label and Selector” for your reference.

Note: serviceAccount is required only if you have RBAC enabled (you should) because the collector needs access to Kubernetes API to list/view pods across all namespaces in this case.

Conclusion