(Update: EKS has since fixed HPA; see https://aws.amazon.com/blogs/opensource/horizontal-pod-autoscaling-eks/)

So EKS finally went GA, and it runs vanilla upstream Kubernetes!

… Not so much. Many users have noticed that the core feature that makes pods elastic and scalable, the Horizontal Pod Autoscaler (HPA), is broken. You might see this error if you try to use the (deprecated) Heapster-based HPA:

AbleToScale True SucceededGetScale the HPA controller was able to get the target’s current scale

ScalingActive False FailedGetResourceMetric the HPA was unable to compute the replica count: unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server could not find the requested resource (get pods.metrics.k8s.io)

and

Warning FailedGetResourceMetric 2m (x5270 over 1d) horizontal-pod-autoscaler unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server could not find the requested resource (get pods.metrics.k8s.io)

This is because the flag

--horizontal-pod-autoscaler-use-rest-clients

is set to true by default in Kubernetes 1.9+.

This is a good thing, because it forces you onto the new HPA, which supports custom metrics and lets you scale on things other than just CPU.

The problem is that with this flag set to true, the cluster needs the API aggregation layer enabled, which is what lets you register your own extension API servers. On EKS, aggregation is not enabled, so if you try to deploy metrics-server or a custom Prometheus-based metrics server you’ll get an error that looks like

Could not create the API server: configmaps "extension-apiserver-authentication" not found

So EKS basically locks you into aggregation-based HPA and then turns aggregation off, leaving EKS users stuck.

My team spoke to AWS and they said they don’t currently have a workaround. Great, so what do EKS users do? Move to GKE? Run all their pods at max scale, using up resources, requiring more nodes, and paying AWS more?

This really ticked me off so here’s a solution to get HPA working until AWS fixes EKS:

We made a custom HPA controller that keys off your existing HPA resources and scales your workloads up and down. Is it as good as native Kubernetes HPA? No. But it does scale up and down based on CPU, and it supports custom queries as well. Just run one per namespace you want to scale in and you’re ready to go.
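For readers wondering what "scale up and down based on CPU" boils down to, here is a minimal Go sketch of the proportional rule that the upstream Kubernetes HPA documents: desired replicas = ceil(current replicas × current metric / target metric), clamped to the HPA's min and max. This is not necessarily how our controller computes it; the function name `desiredReplicas`, its parameters, and the example numbers are all made up for illustration. The 0.1 tolerance mirrors the upstream default.

```go
package main

import (
	"fmt"
	"math"
)

// clamp keeps n inside [lo, hi].
func clamp(n, lo, hi int) int {
	if n < lo {
		return lo
	}
	if n > hi {
		return hi
	}
	return n
}

// desiredReplicas is a hypothetical helper, not the controller's real API.
// It applies the rule from the upstream HPA docs:
//   desired = ceil(current * currentUtil / targetUtil)
// and skips scaling when the ratio is within `tolerance` of 1.0.
func desiredReplicas(current int, currentUtil, targetUtil, tolerance float64, minR, maxR int) int {
	// Without a usable target or a running replica there is no ratio to
	// compute, so only enforce the bounds.
	if targetUtil <= 0 || current <= 0 {
		return clamp(current, minR, maxR)
	}
	ratio := currentUtil / targetUtil
	// Small deviations are ignored to avoid flapping.
	if math.Abs(ratio-1.0) <= tolerance {
		return clamp(current, minR, maxR)
	}
	desired := int(math.Ceil(float64(current) * ratio))
	return clamp(desired, minR, maxR)
}

func main() {
	// 4 pods averaging 90% CPU against a 60% target: scale up.
	fmt.Println(desiredReplicas(4, 90, 60, 0.1, 1, 10)) // 6
	// 4 pods averaging 52% against a 50% target: within tolerance, no change.
	fmt.Println(desiredReplicas(4, 52, 50, 0.1, 1, 10)) // 4
}
```

The clamp matters as much as the ratio: a bad metric reading can otherwise ask for an absurd replica count, which is one more reason to test the configuration before trusting any autoscaler in production.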

I’m new to Go, so comments and contributions are appreciated. We’re running this in lower environments now and plan to move it to production once it has been tested thoroughly, but please test it yourself and play with the configurations before you rely on it to keep your production systems up.

Comes with a little UI too: