Killing Kubernetes

How we shipped faster by rethinking our whole stack

I’ve decided that strong engineering strategy comes down to one piece of advice: “do less of everything > do more of the right things > ship faster.”

This is why we killed our Kubernetes stack.

As engineers, we’re judged largely by our technical expertise.

Understanding the latest hotness will earn you kudos with your peers and help you land your next job. “100 pods you say? Pah, we’re running 100,000 pods and have to use Amazon Snowball to move our data around!”

Do you really need it to ship your product though?

Sure, it doesn’t matter if it’s a side project. But if there’s real business impact and you’re making a technical choice that involves more moving parts, more ramp-up time for the rest of the team and more dependencies, then who are you helping?

Hacking is not the goal, it’s the process.

The real goal is shipping product to your customers. In fact, many tech companies use feature delivery as a key metric for performance reviews - so it should be a personal goal too.

Engineers and engineering leaders should be clear on what their goals are. Far too often though, pride takes over. Teams decide that because they can design their own infrastructure, they should. 🛠

“Our engineers are as good as Google’s, shit some of them are from Google! So we should definitely be doing X”.

Do you have the same business goals as Google though? Are you building products with the same requirements? Do you have a 10,000-person platform team?

If not - you might want to rethink your use of Kubernetes. We did, and it changed our ability to ship...

At Freetrade

...It was June 2018. There were (and still are) a mass of users who desperately wanted access to our zero fee investment app. There was a problem though - the team was small, and our initial implementation large. If we stood a chance of launching before the end of the year, things had to change - and fast.

One of the first things we did was audit our existing stack. What were the technology choices we’d made and why? Where could we cut the fat and how quickly? 🤔

Here’s what our setup roughly looked like in June:

Kubernetes configuration

Docker images

Kotlin services

Custom Gradle build scripts

Hazelcast

Cloud endpoints configuration

A deployment orchestration tool

Not a bad setup, you might think? Yep, you’d be right. Containers are awesome, container orchestration tooling - especially k8s - is also awesome, and autoscaling on GCP is even better! However, back then, we didn’t need it.

In fact, not only did we not need it - but if we’d launched with this stack with just two engineers (our launch team size!) I’m confident our customers would have been very unhappy.

In the end, here’s what our launch stack looked like:

Firebase functions

The perceptive among you might have noticed that the second architecture has fewer moving parts. What you might not know is that the second implementation was:

Immediately cheaper (even at 20k users, it’s still cheaper!)

Able to scale more quickly

Easier to debug

Easier to replicate for testing/disaster recovery

Backed by built-in logging support

Equipped with built-in autoscaling

Most importantly though, it was a stack we could launch with confidence, and maintain in production. 👌
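To make the contrast concrete, here’s a minimal sketch of the shape a serverless endpoint like this takes - hypothetical names and data, not Freetrade’s actual code. The business logic lives in a plain function, so it stays testable with no infrastructure at all; the platform wrapper is a one-liner.

```typescript
// Hypothetical example - not Freetrade's code. Shows the shape of a
// serverless endpoint: plain business logic, one line of wiring.

interface Quote {
  symbol: string;
  price: number;
}

// Stand-in for a real market data source.
const prices: Record<string, number> = { AAPL: 170.5, VOD: 0.72 };

// Pure business logic - trivially unit-testable, no servers required.
export function getQuote(symbol: string): Quote | null {
  const price = prices[symbol];
  return price === undefined ? null : { symbol, price };
}

// With the firebase-functions package installed, exposing it is just:
//
//   import * as functions from "firebase-functions";
//   export const quote = functions.https.onRequest((req, res) => {
//     const q = getQuote(String(req.query.symbol));
//     q ? res.json(q) : res.status(404).send("unknown symbol");
//   });
//
// Logging, autoscaling and HTTPS termination all come from the platform.
```

That last point is the whole argument in miniature: everything outside the pure function is someone else’s operational problem.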

A lot of engineers are going to read this post and say “K8s isn’t hard, I could have launched it on my own!”. Again, they’d be right.

However - I don’t believe they could have supported it as well. Moreover, I don’t believe they’d be launching additional features so quickly - or that new feature launches would be so smooth.

Other engineers are going to read this and say, “Yes, but this doesn’t work for our use case”. And you might also be right, or at least half right - e.g. if you have very specific performance needs, like machine learning, or a dependency with hardware requirements.

Ultimately it all depends on what your core competency is as a business. For us, it’s providing the best investment platform available to retail consumers.

Nobody pays us to run servers or tweak infrastructure and we’re confident that Google are (for now) better at it than we are.

As Freetrade continues to scale, there have definitely been opportunities beyond Google’s Firebase functions. We’ve leveraged Google Cloud Composer to simplify the logic we had in our data pipeline. We’re starting to use Google Firestore for its improved support for transactions.

And we’ve also had to stand up a GCE instance and a VPN for some upcoming features. However, we’ll continue to leverage low-friction, low-maintenance serverless offerings from cloud providers, as that’s what they excel in. We excel in building an investment app for our customers.

We’re now delivering an awesome, ever-iterating product to over 15,000 customers, with an architecture that can be pretty much understood by all the non-technical people in our wider team. 💪

There are instances where you need to own the entire stack, like mining bitcoin or processing ML data. But after nearly 20 years across tech companies, I know there are engineers doing things the hard way, and it’s stopping them from shipping.

So, basically: kill the ego, ship the product. 🙏