You’d be insane not to grow your startup on AWS

…and by “insane” I mean really, truly, what-was-I-thinking-was-I-out-of-my-mind? insane, not cool insane like the awe-inspiring Tesla insane button. Now, either you’re somewhat in agreement or you’re up-in-arms and ready to engage flamethrowers and lay down scorching arguments in the comments. Either way, follow along as I weave together the financial and technical story behind why I think life on AWS is so good for startups.

If you think you’ve got something new and special to share with the world in the form of a tech startup, you need to maximize the amount of energy you put into making your product as great as it can be.

If you’re wasting energy racking servers, scaling storage arrays, configuring routers, allocating bandwidth, understanding cooling, or hell, slogging diesel up flights of stairs to keep your servers running, you’re not maximizing the amount of energy you can instead direct towards adding unique value for your customers. That unique value is something customers pay for, hence you get more customers, hence you grow faster when you focus.

Don’t believe me? Look at these numbers from a recent roundup of 300+ SaaS software companies. On average, companies using third-party hosting like AWS are growing twice as fast as companies that self-host and are spending 25% less on hosting costs.* Also note the significant portion of companies that are planning on moving to AWS over the next three years. Those of you with itchy trigger fingers hovering over your flamethrowers should read that part again and let it sink in a bit.

(*) There’s one anomaly in that data, where companies with annual revenue between $10M and $15M hosting their own servers are spending just 4% of revenue. But look at what happens to companies that grow beyond that. My theory is that companies at that scale are perhaps experiencing some false economies that they end up paying for later.

We didn’t have the benefit of those stats back in 2009 when we launched our startup on a single AWS EC2 instance in US West (ok, that was slightly insane, but finances were ultra constrained back in the day). It was a bit of a leap of faith to go with AWS back then, given that I could have gone with a much cheaper virtual private server, or even slapped a physical server in an acquaintance’s colo closet.

We (my five co-founders and I) were first-time entrepreneurs, so I was scared as hell of making a bad choice and spending too much on hosting. But I figured the extra expense would pay off in the form of added agility down the road.

Fast forward to 2015 and in retrospect, things have turned out really well.

Our initial year one budget had us spending 6% of revenue on hosting. We’ve been continually under that for the last four years, with last year coming in around 2.7%. Moving into 2015-16 fiscal, we’re budgeting 3% given that our SOA efforts will result in allocating more instances during the transition.*

(*) As an aside, you shouldn’t focus too much on over-optimizing here. In the world of SaaS, you’re harnessing machine capability for the benefit of your customers. The more work you can get machines to do for your customers, the more valuable your product.

Back to the technical side of our story: by 2011 we were actually making money and had migrated that initial instance to a front-end ELB, a handful of application servers, and a dedicated database server. Those moves were made with zero impact on customers, and point-and-click ease on our side. But I was still scared as hell, just for different reasons. What if the Amazon data centre went down? What if our database instance got scheduled for a reboot?

This is where all the naysayers start arming their flamethrowers and shouting “if you hosted your own, that wouldn’t happen!” Except that they’re wrong. That stuff still happens even when you have your own infrastructure. The difference is that if you’re hosting your own, you have fewer tools for dealing with it, and you’ve likely succumbed to the false sense of security offered by the “all your eggs in one carefully handled basket” approach to availability.

In our case, we took Werner’s sage advice to launch in multiple availability zones one step further, and we fired up a replicated database over in another region (US East) along with some application servers.

We configured our databases for multi-master round-robin replication so that each coast had a near real-time copy of everything (we get sub-second replication latency on about 200 writes per second). We would have used RDS, but at that point in time, you couldn’t access the binary logs, and couldn’t configure replication manually.

I was lucky in that we had a natural partition in our application functionality such that I was able to point our application endpoints, via DNS, to either or both regions without the slightest disruption. This was a completely new capability for us, and we were able to leverage this to produce some great uptime numbers. Whenever we needed to do a database software update, or even an expensive database migration or database instance update, we were able to cut over to one region, perform and test the operation in the other region and then vice versa.

That setup gave us tons of operational flexibility for doing routine maintenance with zero downtime. On top of that, I was feeling pretty good when we responded to hurricanes and the like in 2012 by just doing a DNS update and watching our traffic redirect to the opposite coast.
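For the curious, here is a rough sketch of that one-region-at-a-time routine in Python. It is an illustration only, not our actual tooling: the region labels, the placeholder addresses, and the function names `dns_targets` and `rolling_maintenance_plan` are all made up for this example, and the real DNS update would go through whatever provider you use.

```python
from typing import Dict, FrozenSet, List, Optional, TypedDict

# Hypothetical endpoints: labels and addresses are placeholders
# (documentation-reserved IP ranges), not real infrastructure.
ENDPOINTS: Dict[str, str] = {
    "us-west": "198.51.100.10",
    "us-east": "203.0.113.10",
}


class Step(TypedDict):
    maintain: Optional[str]   # region taken out of DNS, or None when done
    serve_from: List[str]     # addresses DNS should answer with


def dns_targets(in_maintenance: FrozenSet[str] = frozenset()) -> List[str]:
    """Return the addresses DNS should point at, skipping drained regions."""
    targets = [ip for region, ip in ENDPOINTS.items()
               if region not in in_maintenance]
    if not targets:
        # Never drain every region at the same time.
        raise ValueError("refusing to take every region out of DNS at once")
    return targets


def rolling_maintenance_plan() -> List[Step]:
    """Drain one region, work on it, then repeat for the other region."""
    plan: List[Step] = []
    for region in ENDPOINTS:
        plan.append({"maintain": region,
                     "serve_from": dns_targets(frozenset({region}))})
    # Final step: both regions back in DNS.
    plan.append({"maintain": None, "serve_from": dns_targets()})
    return plan


if __name__ == "__main__":
    for step in rolling_maintenance_plan():
        print(step)
```

Each step in the plan names the region being worked on and the addresses that should stay in DNS while that happens. The guard in `dns_targets` is the important part: the whole trick only works if one coast is always serving traffic.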

By the time 2013 hit, we had a significant business on our hands, and I was worried more than ever about the availability of our customers’ pages. We had been doing manual failover up to that point, but if something happened in the middle of the night, we still had the potential of some significant downtime before someone would wake up and deal with things.

I’ll leave you hanging at this point in our story for now, and promise that in Part 2 I’ll dig into the technical side of things a bit further, explaining how we ended up enabling simultaneous operations in multiple regions and smoothing out deployments. I’ll also offer a glimpse into what we’re planning next.

Want to know more? Think you have a strong case for self-hosting? Let’s hear it in the comments.

– Carl Schmidt, CTO