Sun 13 November 2016

Prometheus is a monitoring toolkit. Let's set it up on Kubernetes and test how it works by scraping HTTP request metrics from a hello web application that runs in the same cluster.

First of all, we need a running Kubernetes cluster. It's easy to bootstrap one via Google Container Engine.

$ gcloud container clusters create my-k8

The command above might complain that a zone is not currently set. Choose one you like and try to create the cluster again.

$ gcloud compute zones list
$ gcloud config set compute/zone asia-east1-b

You should now have a Kubernetes cluster. If there are any issues, check out the Udacity course that helped me run Kubernetes in GKE.

First Start

Prometheus is available as a Docker image and can be run locally quickly.

$ docker run -p 9090:9090 prom/prometheus:v1.2.1

When the container has started, the Prometheus expression browser should be accessible at http://localhost:9090. Now let's achieve the same result with Kubernetes.

$ kubectl run prometheus-deployment --image=prom/prometheus:v1.2.1 --port=9090

The command above created a Kubernetes Deployment that runs the prometheus Docker image. By default, deployed applications are visible only inside the Kubernetes cluster. To see whether Prometheus started up, we can run a proxy between the terminal and the Kubernetes cluster.

$ kubectl proxy
Starting to serve on 127.0.0.1:8001

In another cloud shell, we can call the Kubernetes API.

$ kubectl get pods
NAME                                     READY     STATUS    RESTARTS   AGE
prometheus-deployment-2234044252-of29y   1/1       Running   0          8m
$ curl http://127.0.0.1:8001/api/v1/proxy/namespaces/default/pods/prometheus-deployment-2234044252-of29y:9090/metrics

Another option is port forwarding:

$ kubectl port-forward prometheus-deployment-2234044252-of29y 8080:9090
$ curl http://127.0.0.1:8080/metrics

Check the Prometheus container logs:

$ kubectl logs -f po/prometheus-deployment-2234044252-of29y
time="2016-10-29T05:02:38Z" level=info msg="Starting prometheus (version=1.2.1, branch=master, revision=dd66f2e94b2b662804b9aa1b6a50587b990ba8b7)" source="main.go:75"

And finally, we can run a shell inside the Pod's container and have a look at the Prometheus config file.

$ kubectl exec prometheus-deployment-2234044252-of29y -it /bin/sh
root$ cat /etc/prometheus/prometheus.yml

It's advisable to describe Deployments in configuration files, so we have better visibility and version control over our cluster. The following Deployment manifest is similar to what we achieved with the kubectl run command.

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: prometheus-deployment
spec:
  replicas: 1  # tells deployment to run 1 pod matching the template below
  template:  # create pods using pod definition in this template
    metadata:
      labels:  # these key value pairs will be attached to pods
        app: prometheus-server
    spec:
      containers:
      - name: prometheus
        image: prom/prometheus:v1.2.1
        ports:
        - containerPort: 9090  # port we open in the container

Let's delete the prometheus-deployment we created via the kubectl run command

$ kubectl delete deployment prometheus-deployment

and re-create the Deployment from a file (it is available in the git repository):

$ git clone https://github.com/marselester/prometheus-on-kubernetes.git
$ cd ./prometheus-on-kubernetes/
$ kubectl create -f kube/prometheus/deployment-v1.yml
$ kubectl get deployments
NAME                    DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
prometheus-deployment   1         1         1            0           40s

Prometheus Service

We have a Prometheus Pod running. Now we need a Kubernetes Service to let external clients access it.

$ kubectl expose deployment prometheus-deployment --type=NodePort --name=prometheus-service

The assigned port can be found in the NodePort line of the Service description:

$ kubectl describe service prometheus-service
# ...
NodePort:    <unset>    32514/TCP
# ...

We have exposed the Service on external port 32514 on all nodes in our cluster. Now create a firewall rule to allow external traffic.

$ gcloud compute firewall-rules create prometheus-nodeport --allow=tcp:32514

Let's use one of the external IPs of our Kubernetes cluster instances to see the Prometheus expression browser.

$ gcloud compute instances list

You should be able to access Prometheus at http://<EXTERNAL_IP>:32514. Let's delete the Service and create it from a Service config file.

$ kubectl delete service prometheus-service
$ kubectl create -f kube/prometheus/service-v1.yml

Since the Prometheus Pod exposes port 9090 and has the app: prometheus-server label, our config should be as follows:

apiVersion: v1
kind: Service
metadata:
  name: prometheus-service
spec:
  selector:  # exposes any pods with the following labels as a service
    app: prometheus-server
  type: NodePort
  ports:
  - port: 80  # this Service's port (cluster-internal IP clusterIP)
    targetPort: 9090  # pods expose this port
    # Kubernetes master will allocate a port from a flag-configured range (default: 30000-32767),
    # or we can set a specific port number (in our case).
    # Each node will proxy 32514 port (the same port number on every node) into this service.
    # Note that this Service will be visible as both NodeIP:nodePort and clusterIp:port
    nodePort: 32514

Prometheus Config

So far we've been using the default Prometheus config that ships with the Docker image. We will need to update it so Prometheus can collect metrics from our example app. Let's take the default config as a starting point and store it in a Kubernetes ConfigMap. The config can be copied from the running container or from the git repository.

$ kubectl exec prometheus-deployment-2234044252-of29y -it cat /etc/prometheus/prometheus.yml

Next we need to create a ConfigMap entry for the prometheus.yml file:

$ kubectl create configmap prometheus-server-conf --from-file=prometheus.yml=kube/prometheus/config-v1.yml

Now let's mount the prometheus-server-conf ConfigMap volume into our Prometheus Pod

$ kubectl apply -f kube/prometheus/deployment-v2.yaml

and store metrics in an emptyDir volume, so we don't lose them when a container in the Pod crashes.

$ kubectl apply -f kube/prometheus/deployment-v3.yml

Sending App Metrics

We have Prometheus running in Kubernetes, but we have no application to monitor. I wrote the hello-app/v1 web app that exposes a /hello HTTP endpoint. Björn Rabenstein explains in his talk how to instrument your code to expose metrics to Prometheus. Our application is not forced to use a Prometheus client to expose metrics. We can create the /metrics HTTP endpoint manually in the following text format:

# HELP http_requests_total Number of HTTP requests.
# TYPE http_requests_total counter
http_requests_total{code="200",method="get"} 2384

But it's much easier to use a library (see hello-app/v2).

import "github.com/prometheus/client_golang/prometheus/promhttp"
// ...
http.Handle("/metrics", promhttp.Handler())

Now the app has a /metrics endpoint with Go runtime metrics (number of goroutines, GC statistics). Our app exposes an important endpoint, /hello.

func helloHandler(w http.ResponseWriter, r *http.Request) {
	status := doSomeWork()
	w.WriteHeader(status)
	w.Write([]byte("Hello, World!\n"))
}

Let's instrument helloHandler() to count HTTP requests and measure their durations. First, we need to define the metrics.

import "github.com/prometheus/client_golang/prometheus"

var (
	// How often our /hello request durations fall into one of the defined buckets.
	// We can use default buckets or set ones we are interested in.
	duration = prometheus.NewHistogram(prometheus.HistogramOpts{
		Name:    "hello_request_duration_seconds",
		Help:    "Histogram of the /hello request duration.",
		Buckets: []float64{0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10},
	})
	// Counter vector to which we can attach labels. That creates many key-value
	// label combinations. So in our case we count requests by status code separately.
	counter = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Name: "hello_requests_total",
			Help: "Total number of /hello requests.",
		},
		[]string{"status"},
	)
)

// init registers Prometheus metrics.
func init() {
	prometheus.MustRegister(duration)
	prometheus.MustRegister(counter)
}

Second, measure the request duration in seconds and increment the counter in the helloHandler() function.

func helloHandler(w http.ResponseWriter, r *http.Request) {
	var status int

	// The deferred function runs when the handler returns, so it observes
	// the full request duration and the final status code.
	defer func(begun time.Time) {
		duration.Observe(time.Since(begun).Seconds())

		// hello_requests_total{status="200"} 2385
		counter.With(prometheus.Labels{
			"status": fmt.Sprint(status),
		}).Inc()
	}(time.Now())

	status = doSomeWork()
	w.WriteHeader(status)
	w.Write([]byte("Hello, World!\n"))
}

hello-app/v3 is used in further examples.

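The hand-written /metrics endpoint mentioned above can be sketched with the Go standard library alone. This is only an illustration and not the code of hello-app: the names requestsTotal, renderMetrics and metricsHandler are made up for this example, and for simplicity every /hello request is counted as a successful GET. The handlers are exercised in-process with httptest, so the program exits instead of listening on a port.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// requestsTotal counts /hello requests (hypothetical name for this sketch).
var requestsTotal uint64

// renderMetrics formats a counter value in the Prometheus text format.
func renderMetrics(total uint64) string {
	return "# HELP http_requests_total Number of HTTP requests.\n" +
		"# TYPE http_requests_total counter\n" +
		fmt.Sprintf("http_requests_total{code=\"200\",method=\"get\"} %d\n", total)
}

// helloHandler serves /hello and increments the counter atomically,
// because handlers may run concurrently.
func helloHandler(w http.ResponseWriter, r *http.Request) {
	atomic.AddUint64(&requestsTotal, 1)
	w.Write([]byte("Hello, World!\n"))
}

// metricsHandler serves the current counter value as plain text.
func metricsHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "text/plain; version=0.0.4")
	fmt.Fprint(w, renderMetrics(atomic.LoadUint64(&requestsTotal)))
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/hello", helloHandler)
	mux.HandleFunc("/metrics", metricsHandler)

	// Send three /hello requests through the mux without opening a socket.
	for i := 0; i < 3; i++ {
		mux.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest("GET", "/hello", nil))
	}

	// Read back what a scraper would see on /metrics.
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest("GET", "/metrics", nil))
	fmt.Print(rec.Body.String())
}
```

In a real service the mux would be passed to http.ListenAndServe. Even in this small form the bookkeeping for labels and concurrency is manual, which is why the client library is the easier route.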
Hello App Demo

The next step is to run the web app in Kubernetes. A Docker image of hello-app/v3 is available on Docker Hub.

$ kubectl apply -f kube/hello/deployment-v1.yml
$ kubectl port-forward hello-deployment-1471727270-eaknp 8000:8000
$ curl localhost:8000/hello
Hello, World!
$ curl localhost:8000/metrics
...
# HELP hello_request_duration_seconds Histogram of the /hello request duration.
# TYPE hello_request_duration_seconds histogram
hello_request_duration_seconds_bucket{le="0.01"} 0
hello_request_duration_seconds_bucket{le="0.025"} 0
hello_request_duration_seconds_bucket{le="0.05"} 0
hello_request_duration_seconds_bucket{le="0.1"} 1
hello_request_duration_seconds_bucket{le="0.25"} 1
hello_request_duration_seconds_bucket{le="0.5"} 1
hello_request_duration_seconds_bucket{le="1"} 1
hello_request_duration_seconds_bucket{le="2.5"} 1
hello_request_duration_seconds_bucket{le="5"} 1
hello_request_duration_seconds_bucket{le="10"} 1
hello_request_duration_seconds_bucket{le="+Inf"} 1
hello_request_duration_seconds_sum 0.083953974
hello_request_duration_seconds_count 1
# HELP hello_requests_total Total number of /hello requests.
# TYPE hello_requests_total counter
hello_requests_total{status="500"} 1

Creating the Service is similar to what we did before. We use NodePort 32515 here.

$ kubectl apply -f kube/hello/service-v1.yml
$ gcloud compute firewall-rules create hello-nodeport --allow=tcp:32515

Now it is possible to see the app's metrics from the Internet.

$ curl http://<EXTERNAL_IP>:32515/metrics
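The histogram output above can look odd at first: a single request of about 0.084 seconds shows up in every bucket from le="0.1" upwards. That is because Prometheus histogram buckets are cumulative, so each bucket counts all observations less than or equal to its upper bound. The following sketch reproduces that counting with plain Go. It is not how the client library is implemented, and cumulativeCounts is a name invented for this example.

```go
package main

import "fmt"

// buckets mirrors the upper bounds configured for hello_request_duration_seconds.
var buckets = []float64{0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10}

// cumulativeCounts returns, for each upper bound, the number of
// observations that are less than or equal to that bound.
func cumulativeCounts(bounds, observations []float64) []int {
	counts := make([]int, len(bounds))
	for _, v := range observations {
		for i, le := range bounds {
			if v <= le {
				counts[i]++
			}
		}
	}
	return counts
}

func main() {
	// The single request duration taken from hello_request_duration_seconds_sum.
	observations := []float64{0.083953974}

	counts := cumulativeCounts(buckets, observations)
	for i, le := range buckets {
		fmt.Printf("le=%v count=%d\n", le, counts[i])
	}
	// The implicit +Inf bucket always equals the total number of observations.
	fmt.Printf("le=+Inf count=%d\n", len(observations))
}
```

The buckets up to 0.05 stay at zero and every bucket from 0.1 upwards holds one observation, which matches the curl output.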

Hello Prometheus

Since the web app runs on Kubernetes, we can configure Prometheus to scrape metrics from the /metrics HTTP endpoint of the hello-app Pods.

global:
  scrape_interval: 5s
  evaluation_interval: 5s
scrape_configs:
- job_name: 'prometheus'
  static_configs:
  - targets: ['localhost:9090']
- job_name: 'hello'
  # The information to access the Kubernetes API to discover targets.
  kubernetes_sd_configs:
  - api_servers:
    - 'https://kubernetes.default.svc'
    # Prometheus assumes it is being run inside a Kubernetes pod.
    in_cluster: true
    # Only pods should be discovered.
    role: pod
  # Prometheus collects metrics from pods with "app: hello-server" label.
  # Prometheus gets 'hello_requests_total{status="500"} 1'
  # from hello:8000/metrics and adds "job" and "instance" labels, so it becomes
  # 'hello_requests_total{instance="10.16.0.10:8000",job="hello",status="500"} 1'.
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_label_app]
    regex: hello-server
    action: keep

Update the ConfigMap that holds the Prometheus config and re-create the Prometheus Pod so it picks up the changes.

$ kubectl create configmap prometheus-server-conf \
    --from-file=prometheus.yml=kube/prometheus/config-v2.yaml \
    -o yaml \
    --dry-run | kubectl replace -f -

Finally, the app's metrics should show up in the Prometheus expression browser at http://<EXTERNAL_IP>:32514. You can try querying hello_requests_total, which shows how many requests we've served since the beginning. Let's see how many requests we've served in the last 5 minutes, normalised per second (QPS), with the rate(hello_requests_total[5m]) query.

Here I gave an example of how to run Prometheus in a Kubernetes cluster and collect metrics from a simple web app. However, I recently came across CoreOS kube-prometheus, which makes it easier. Have a look at The Prometheus Operator: Managed Prometheus setups for Kubernetes for more details.
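To get a feeling for what rate(hello_requests_total[5m]) computes, here is a simplified sketch in Go. It sums the counter increases between consecutive samples in a window, treats a drop in value as a counter reset, and divides by the elapsed time. The real rate() function additionally extrapolates to the window boundaries, so its results can differ slightly. The sample type, the simpleRate function and the numbers are all made up for this example.

```go
package main

import "fmt"

// sample is one scraped counter value with its timestamp in seconds.
type sample struct {
	timestamp float64
	value     float64
}

// simpleRate returns the average per-second increase of a counter over
// the given samples, which must be ordered by timestamp.
func simpleRate(samples []sample) float64 {
	if len(samples) < 2 {
		return 0 // a rate needs at least two samples
	}
	var increase float64
	for i := 1; i < len(samples); i++ {
		delta := samples[i].value - samples[i-1].value
		if delta < 0 {
			// Counter reset: the app restarted and began counting from
			// zero, so the whole new value counts as the increase.
			delta = samples[i].value
		}
		increase += delta
	}
	elapsed := samples[len(samples)-1].timestamp - samples[0].timestamp
	if elapsed <= 0 {
		return 0
	}
	return increase / elapsed
}

func main() {
	// Hypothetical samples of hello_requests_total across a 5 minute window.
	window := []sample{
		{timestamp: 0, value: 100},
		{timestamp: 150, value: 160},
		{timestamp: 300, value: 250},
	}
	fmt.Printf("%.2f requests per second\n", simpleRate(window))
}
```

Here the counter grew by 150 requests in 300 seconds, which is 0.5 requests per second on average.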

