Most of the time, we scale our Kubernetes deployments based on metrics such as CPU or memory consumption, but sometimes we need to scale based on external metrics. In this post, I’ll guide you through setting up the Horizontal Pod Autoscaler (HPA) to scale on any Stackdriver metric; specifically, we’ll use the requests per second (RPS) measured by a Google Cloud HTTP/S Load Balancer.

Autoscaling Kubernetes Horizontal Pod Autoscaler with Stackdriver Metrics

Let’s Go!

First let’s create a new Google Kubernetes Engine (GKE) cluster:

gcloud beta container clusters create "hpa-with-stackdriver-metrics" --zone "us-central1-a" \
  --username "admin" \
  --cluster-version "1.10.7-gke.6" \
  --machine-type "n1-standard-1" \
  --image-type "COS" \
  --disk-type "pd-standard" \
  --disk-size "100" \
  --scopes "https://www.googleapis.com/auth/devstorage.read_only","https://www.googleapis.com/auth/logging.write","https://www.googleapis.com/auth/monitoring","https://www.googleapis.com/auth/servicecontrol","https://www.googleapis.com/auth/service.management.readonly","https://www.googleapis.com/auth/trace.append" \
  --num-nodes "3" \
  --enable-cloud-logging \
  --enable-cloud-monitoring \
  --addons HorizontalPodAutoscaling,HttpLoadBalancing \
  --enable-autoupgrade --enable-autorepair

Note the `--enable-cloud-monitoring` flag, which allows us to read metrics from Stackdriver Monitoring.

Deploy Custom Metrics Stackdriver Adapter

The custom metrics adapter is responsible for importing Stackdriver metrics into the Kubernetes API, which enables the HPA to consume these metrics and act upon them. You can see more details about that in the troubleshooting section below.

To grant GKE objects access to metrics stored in Stackdriver, you need to deploy the Custom Metrics Stackdriver Adapter in your cluster.

To run the Custom Metrics Adapter, you must first grant your user the ability to create the required authorization roles by running the following command:

kubectl create clusterrolebinding cluster-admin-binding \
  --clusterrole cluster-admin \
  --user "$(gcloud config get-value account)"

And now let’s deploy the actual adapter that will enable us to read metrics from Stackdriver:

kubectl create -f https://raw.githubusercontent.com/GoogleCloudPlatform/k8s-stackdriver/master/custom-metrics-stackdriver-adapter/deploy/production/adapter.yaml

Create a Deployment

Now, let’s deploy a simple nginx application that we will scale later on based on the RPS measured by the HTTP/S Load Balancer.

Create this file, deployment.yaml:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 1
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.8
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  type: NodePort
  ports:
  - port: 80
    protocol: TCP
  selector:
    app: nginx

Now let’s deploy it:

kubectl apply -f deployment.yaml

Create LoadBalancer Ingress

Create the ingress file, ingress.yaml:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: basic-ingress
spec:
  backend:
    serviceName: nginx
    servicePort: 80

And apply the ingress:

kubectl apply -f ingress.yaml

Create HorizontalPodAutoscaler object

This is where the magic happens. We use an external metric, with metricName:

loadbalancing.googleapis.com|https|request_count

Note: you can find the list of all Stackdriver metrics here or you can use the Metrics Explorer.

We should also use a metricSelector, to make sure we read only the metrics of our specific load balancer.

Let’s find our LB forwarding rule:

$ kubectl describe ingress basic-ingress
Name:             basic-ingress
Namespace:        default
Address:          35.190.3.165
Default backend:  nginx:80 (10.48.2.11:80)
Rules:
  Host  Path  Backends
  ----  ----  --------
  *     *     nginx:80 (10.48.2.11:80)
Annotations:
  backends:         {"k8s-be-32432--ffd629d77b6630de":"HEALTHY"}
  forwarding-rule:  k8s-fw-default-basic-ingress--ffd629d77b6630de
  target-proxy:     k8s-tp-default-basic-ingress--ffd629d77b6630de
  url-map:          k8s-um-default-basic-ingress--ffd629d77b6630de
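If you would rather not copy the forwarding rule name by hand, it can also be read from the Ingress object’s annotations. Below is a minimal Python sketch, assuming the input is the JSON printed by `kubectl get ingress basic-ingress -o json`. The helper name `find_forwarding_rule` is hypothetical, and the `ingress.kubernetes.io/` key prefix in the sample is an assumption; the helper accepts any prefix.

```python
import json
from typing import Optional


def find_forwarding_rule(ingress_json: str) -> Optional[str]:
    """Return the forwarding rule name from an Ingress object's annotations."""
    ingress = json.loads(ingress_json)
    annotations = ingress.get("metadata", {}).get("annotations") or {}
    for key, value in annotations.items():
        # Accept the bare key or any "<prefix>/forwarding-rule" key.
        if key == "forwarding-rule" or key.endswith("/forwarding-rule"):
            return value
    return None


# Trimmed, hand-written sample shaped like `kubectl get ingress -o json` output.
# The annotation key prefix is an assumption, not taken from the output above.
sample = json.dumps({
    "metadata": {
        "name": "basic-ingress",
        "annotations": {
            "ingress.kubernetes.io/forwarding-rule":
                "k8s-fw-default-basic-ingress--ffd629d77b6630de",
            "ingress.kubernetes.io/url-map":
                "k8s-um-default-basic-ingress--ffd629d77b6630de",
        },
    }
})

print(find_forwarding_rule(sample))
```

The helper returns None when no such annotation exists, which is what you would expect right after creating the Ingress, before the load balancer has been provisioned.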

Now we can add the label match to our config (notice the label “forwarding_rule_name”):

metricSelector:
  matchLabels:
    resource.labels.forwarding_rule_name: k8s-fw-default-basic-ingress--ffd629d77b6630de

The final file, hpa.yaml, will look like this:

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  minReplicas: 1
  maxReplicas: 5
  metrics:
  - external:
      metricName: loadbalancing.googleapis.com|https|request_count
      metricSelector:
        matchLabels:
          resource.labels.forwarding_rule_name: k8s-fw-default-basic-ingress--ffd629d77b6630de
      targetAverageValue: "1"
    type: External
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx

Notice that we have used targetAverageValue, which specifies how much of the metric’s total value each replica can handle. This is useful for metrics that describe work or a resource that can be divided between replicas. In our case, each replica can handle a single (i.e. 1) RPS. You should, of course, change this according to your needs.
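To make the effect of targetAverageValue concrete, here is a rough Python sketch of the arithmetic: divide the total metric value by the per-replica target, round up, and clamp to the minReplicas/maxReplicas bounds. The function name `desired_replicas` is hypothetical, and this is a simplification; the real controller also applies a tolerance band and scale-down delays, which are ignored here.

```python
import math


def desired_replicas(total_metric: float, target_average: float,
                     min_replicas: int, max_replicas: int) -> int:
    """Replicas needed so that total_metric / replicas <= target_average."""
    if target_average <= 0:
        raise ValueError("target_average must be positive")
    wanted = math.ceil(total_metric / target_average)
    # Clamp to the minReplicas / maxReplicas bounds used in the HPA spec.
    return max(min_replicas, min(max_replicas, wanted))


# Total RPS values are made up for illustration; the target of 1 RPS per
# replica and the 1..5 bounds mirror the HPA spec above.
for total_rps in (0.5, 2.4, 12.0):
    print(total_rps, "->", desired_replicas(total_rps, 1.0, 1, 5))
```

With a target of 1 RPS per replica, even light traffic pushes the deployment to its maximum, which is convenient for a demo but far too low for a real service.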

Let’s test everything

Let’s start by driving traffic to our load balancer.

As you can see from the output of the command above:

kubectl describe ingress basic-ingress

our Ingress public IP address is 35.190.3.165.

Now let’s start hitting that endpoint 🥊 with some requests:

while true ; do curl -Ss -k --write-out '%{http_code}\n' --output /dev/null http://35.190.3.165/ ; done

Now let’s see if our HorizontalPodAutoscaler is affected:

kubectl describe hpa nginx-hpa

At this point you might see some warnings, since the metric is not populated yet, but after a few minutes the Metrics section is populated:

Name:               nginx-hpa
Namespace:          default
Labels:             <none>
Annotations:        kubectl.kubernetes.io/last-applied-configuration={"apiVersion":"autoscaling/v2beta1","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"nginx-hpa","namespace":"default"},"spec":{"ma...
CreationTimestamp:  Wed, 31 Oct 2018 18:18:28 +0200
Reference:          Deployment/nginx
Metrics:            ( current / target )
  "loadbalancing.googleapis.com|https|request_count" (target average value):  1034m / 1
Min replicas:       1
Max replicas:       5
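The `1034m` in the Metrics line is a Kubernetes quantity in milli-units, i.e. 1.034. As a small illustration, this Python sketch converts such strings to decimal numbers. The helper name `parse_milli_quantity` is hypothetical, and it only handles plain numbers and the `m` suffix, not the other suffixes Kubernetes quantities can carry.

```python
from decimal import Decimal


def parse_milli_quantity(quantity: str) -> Decimal:
    """Convert a plain or milli-suffixed quantity string to a Decimal."""
    text = quantity.strip()
    if text.endswith("m"):
        # "1034m" means 1034 thousandths.
        return Decimal(text[:-1]) / Decimal(1000)
    return Decimal(text)


print(parse_milli_quantity("1034m"))  # current average per replica
print(parse_milli_quantity("1"))      # target average per replica
```

So the current average of 1.034 requests per second per replica is slightly above the target of 1.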

And in the “Events” section we can see:

Events:
  Type    Reason             Age  From                       Message
  ...
  Normal  SuccessfulRescale  2m   horizontal-pod-autoscaler  New size: 2; reason: external metric loadbalancing.googleapis.com|https|request_count(&LabelSelector{MatchLabels:map[string]string{resource.labels.forwarding_rule_name: k8s-fw-default-basic-ingress--ffd629d77b6630de,},MatchExpressions:[],}) above target

We have liftoff! 🚀

Troubleshooting:

An easy way to see whether the metric is being imported into the Kubernetes external metrics API is to browse the API manually. It will also help you check whether you have used the metricSelector correctly.

First, we run the Kubernetes proxy:

kubectl proxy --port=8080

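Then we can query the external metrics API from localhost. With the proxy port used above, the request path is the one reported in the selfLink field of the response:

http://localhost:8080/apis/external.metrics.k8s.io/v1beta1/namespaces/default/loadbalancing.googleapis.com%7Chttps%7Crequest_count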

And this is an excerpt of the result:

{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {
    "selfLink": "/apis/external.metrics.k8s.io/v1beta1/namespaces/default/loadbalancing.googleapis.com%7Chttps%7Crequest_count"
  },
  "items": [
    {
      "metricName": "loadbalancing.googleapis.com|https|request_count",
      "metricLabels": {
        "metric.labels.cache_result": "DISABLED",
        "resource.labels.backend_target_type": "BACKEND_SERVICE",
        "resource.labels.backend_name": "k8s-ig--ffd629d77b6630de",
        ...
        "resource.labels.forwarding_rule_name": "k8s-fw-default-basic-ingress--ffd629d77b6630de",
        ...
      },
      "timestamp": "2018-11-01T08:41:30Z",
      "value": "2433m"
    }
  ]
}
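For repeated checks, the same lookup can be scripted. The Python sketch below builds the proxy URL for a metric (URL-encoding the `|` characters as `%7C`, as in the selfLink above) and filters a response by label, mimicking what metricSelector.matchLabels does. The function names are hypothetical, passing the selector as a `labelSelector` query parameter is an assumption about the API, and the sample response is a hand-trimmed copy of the excerpt above rather than real output.

```python
import json
from typing import Dict, List, Optional
from urllib.parse import quote, urlencode


def external_metric_url(metric_name: str, namespace: str = "default",
                        base: str = "http://localhost:8080",
                        match_labels: Optional[Dict[str, str]] = None) -> str:
    """Build the external metrics API URL as served through `kubectl proxy`."""
    path = "/apis/external.metrics.k8s.io/v1beta1/namespaces/{}/{}".format(
        quote(namespace, safe=""), quote(metric_name, safe=""))
    if not match_labels:
        return base + path
    # Assumption: the API accepts a standard labelSelector query parameter.
    selector = ",".join(
        "{}={}".format(k, v) for k, v in sorted(match_labels.items()))
    return base + path + "?" + urlencode({"labelSelector": selector})


def matching_items(response_text: str,
                   match_labels: Dict[str, str]) -> List[dict]:
    """Keep only the items whose metricLabels contain every given label."""
    items = json.loads(response_text).get("items", [])
    return [
        item for item in items
        if all(item.get("metricLabels", {}).get(k) == v
               for k, v in match_labels.items())
    ]


metric = "loadbalancing.googleapis.com|https|request_count"
labels = {
    "resource.labels.forwarding_rule_name":
        "k8s-fw-default-basic-ingress--ffd629d77b6630de",
}
# Hand-trimmed sample based on the excerpt above (elided labels left out).
sample_response = json.dumps({
    "kind": "ExternalMetricValueList",
    "items": [{
        "metricName": metric,
        "metricLabels": dict(labels),
        "timestamp": "2018-11-01T08:41:30Z",
        "value": "2433m",
    }],
})

print(external_metric_url(metric))
print(len(matching_items(sample_response, labels)), "matching item(s)")
```

If the filter returns nothing for a real response, the label key or value in your metricSelector most likely does not match what the adapter exposes.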

Voilà!