We’ve recently written about how we’re using Microscaling on our production MicroBadger deployment — adjusting the number of containers in real time according to the amount of work that needs to be done. And we’ve also talked about how that’s making a significant difference to our AWS bill. Here’s a bit more detail on how that saving was achieved.

6 hours of microscaling

A classic microservices set-up

We have a few different containerized microservices in the MicroBadger system, but the ones we’re using Microscaling with, called size and inspector, both do the work of inspecting container images. Each listens on a queue for instructions about which images to inspect. The work is pretty bursty, largely due to refresh jobs.

We’re using SQS to communicate between our services. With SQS the receiving service polls for messages, and one thing we never got around to was moving from the default short poll to a long poll, which can wait up to 20 seconds for a message to arrive. SQS charges per API call, so every poll costs you a tiny, tiny amount, however long you wait and whether or not there’s a message to receive.
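To see why those empty polls add up, here’s a rough sketch of the arithmetic for a single idle worker. The one-second poll interval is an assumption for illustration, and the 20-second figure is simply the maximum wait SQS allows for a long poll:

```python
# Compare how many SQS ReceiveMessage calls an idle worker makes per day
# with the default short poll vs a 20-second long poll.

SECONDS_PER_DAY = 24 * 60 * 60


def polls_per_day(wait_seconds: float, poll_interval: float = 0.0) -> int:
    """Empty ReceiveMessage calls an idle worker issues in one day.

    Each call blocks for wait_seconds (the long-poll wait) and then returns
    empty; poll_interval is any extra sleep the worker adds between calls.
    """
    cycle = max(wait_seconds + poll_interval, 1.0)
    return int(SECONDS_PER_DAY // cycle)


short_poll = polls_per_day(wait_seconds=0, poll_interval=1.0)  # poll once a second
long_poll = polls_per_day(wait_seconds=20)                     # SQS max long poll

print(short_poll, long_poll)  # 86400 vs 4320 -- a 20x reduction in API calls
```

And since each instance polls independently, every extra running instance multiplies that figure again.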

Before we introduced Microscaling we were constantly running 10 instances of each of the two services. Each instance polled SQS on a regular basis — so the total amount of polling is proportional to the number of instances. A big proportion of those polls were empty.

Using engineering where it’s needed

Both the poll times, and smoothing out the refresh jobs, are examples where we could reduce our costs by applying some engineering effort. But for every organisation, engineering effort is limited, and there are always competing ideas clamouring for implementation. Why prioritize working on something that doesn’t really affect the user experience? Especially when we already had another solution in the form of our open-source Microscaling Engine.

Microscaling FTW

The Microscaling Engine looks at the length of the SQS queues for size and inspector work. If work is building up it adds more instances of the appropriate service; when queues start shrinking, those instances are removed.

We’re using a very simple add one / remove one algorithm to decide how many instances we need. The downside of this approach is a tendency to oscillate, adding and removing containers frequently, which is particularly visible in the graph of the size containers. But we’ve seen no bad side-effects from this oscillation in practice, so there’s no real need for anything more sophisticated.
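As a sketch of what that rule looks like (the function name, target queue length, and instance limits here are hypothetical, not the engine’s actual code):

```python
# A minimal sketch of an add one / remove one scaling rule.

def desired_instances(current: int, queue_length: int,
                      target: int = 50, minimum: int = 1, maximum: int = 10) -> int:
    """Change the instance count by at most one per decision."""
    if queue_length > target and current < maximum:
        return current + 1   # work is building up: add one instance
    if queue_length < target and current > minimum:
        return current - 1   # queue is shrinking: remove one instance
    return current

# When the queue length hovers around the target, the count oscillates:
counts, n = [], 3
for q in [60, 40, 60, 40, 60]:
    n = desired_instances(n, q)
    counts.append(n)
print(counts)  # [4, 3, 4, 3, 4]
```

That back-and-forth near the target is exactly the oscillation visible in the graphs.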

Do-it-yourself

The Microscaling code comes with plug-ins for scaling on Kubernetes, Docker and Mesos / Marathon, and for measuring work-to-do from NSQ and Azure Queues as well as SQS. It’s designed to be extensible and we’d love to see people contributing more plug-ins to suit their needs.
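To give a feel for the two plug-in points, here’s a hypothetical sketch in Python of what they abstract over: a metric that reports work-to-do, and a scheduler that changes instance counts. The interface and class names are illustrative only, not the engine’s actual Go interfaces:

```python
# Hypothetical sketch of the two plug-in abstractions: a work metric
# (e.g. SQS, NSQ, Azure Queues) and a scheduler (e.g. Kubernetes,
# Docker, Mesos/Marathon).

from abc import ABC, abstractmethod


class Metric(ABC):
    """Reports how much work is waiting, e.g. a queue length."""
    @abstractmethod
    def current_value(self) -> int: ...


class Scheduler(ABC):
    """Adjusts the running instance count for a service."""
    @abstractmethod
    def set_instances(self, service: str, count: int) -> None: ...


class InMemoryQueueMetric(Metric):
    """Toy metric backed by a fixed number, for demonstration."""
    def __init__(self, length: int) -> None:
        self.length = length

    def current_value(self) -> int:
        return self.length


class LoggingScheduler(Scheduler):
    """Toy scheduler that just records the requested counts."""
    def __init__(self) -> None:
        self.counts: dict[str, int] = {}

    def set_instances(self, service: str, count: int) -> None:
        self.counts[service] = count


# The engine's core loop only ever talks to these two interfaces,
# so supporting a new queue or platform means one new plug-in:
metric = InMemoryQueueMetric(length=75)
scheduler = LoggingScheduler()
if metric.current_value() > 50:
    scheduler.set_instances("size", 2)
print(scheduler.counts)  # {'size': 2}
```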

If you try out Microscaling for yourself, please do let us know how you get on!