Serverless architectures have been trending for the past few years, and many people see them as the future of backend infrastructure. Let’s talk about WTF serverless means and why your engineers might be talking about it. (Psst… I recommend reading my article on tech stacks before proceeding with this article.)

A bit of history

To understand serverless, you first need to understand the alternatives. While serverless is a huge shift in the way we manage infrastructure, in some ways it’s closer to how many non-technical people think the web works. Here’s a very quick history lesson of server management from the beginning of the Internet until now.

On Premise Server Farms

It shouldn’t surprise you to hear that for any webpage you visit and any app you use that connects to the Internet, there is at least one computer at the other end that is interacting with it (from here on out, I’ll just refer to applications, which include websites, mobile app backends, and anything else that runs on the Internet). In the beginning, it was pretty much that straightforward. Anyone could run a server from their home and configure it to run a web application. More common, though, were server farms: warehouses hosting racks upon racks of servers that would run people’s applications. Small applications would use a shared server that was also running other people’s applications. Medium-sized applications would use one or more dedicated servers. Large applications would typically run their own server farms for economic reasons or to have more control over the infrastructure. The size indicators here generally refer to web traffic: whether the app is being used by a handful of users per day or thousands of users per second.

Sounds great, let’s just do that

While on premise (i.e., self-hosted by the company using them) and shared servers work perfectly fine – and in fact are still popular today – they have some limitations. First of all, there is a long lead time in provisioning new hardware. As an application grows in popularity or complexity, it needs more servers to handle the load. A company that runs its own server farm needs to order and configure the hardware, which takes days or weeks. A company renting shared or dedicated servers from a third party might be able to acquire the new hardware faster, assuming the third party has some on hand, but they’d still have to set it up, which takes hours or days. This is especially problematic when you have unexpected bursts in traffic, such as a PR boost; if you aren’t able to plan ahead, your application could be slow or down in what should be a triumphant moment.

Second, you have to over-provision hardware to account for traffic fluctuations. One-time PR boosts aside, nearly all applications have natural ebbs and flows in their traffic, most often based on time of day or week. With server farms, you always have to have enough hardware for your peak minute; much of the day, you’ll be paying for servers that aren’t doing anything.

Finally, consider the maintenance. Server farms require high bandwidth Internet, electricity, cooling systems, and technicians to set up new servers and fix broken ones. That’s a lot of work once you have dozens, hundreds, or even thousands of servers. Then consider the rate at which the computing industry has evolved ever since it began. Servers are antiquated within just a few years and Internet services need to be upgraded frequently to stay competitive. On top of that, you have to deal with software maintenance. For performance and security reasons, you want those servers to be running the latest versions of their operating systems and other software. This is a lot of work. With a shared server, someone else is doing much of this work for you, but you’re paying a premium for them to do so, and you’re still mostly on your own on the software side.

Cloud Computing / Infrastructure as a Service (IaaS)

Then along came Amazon Web Services (AWS), Google Cloud, Azure, and a slew of other services that popularized cloud computing. Cloud servers are server farms taken to the next level. Huge technology companies like Amazon, Google, and Microsoft got really good at running hundreds of thousands of servers to support their own businesses, and decided to sell their expertise to others. They started running even more servers and renting them out to anyone who wanted them.

The key distinction between these services and the server farms of yore is that you can start and stop them programmatically, and you are billed by the hour or by the minute. In other words, you don’t have to call or email someone to get them to set up a server for you; you can do it at the click of a button or, with even less effort, by writing software that automatically starts and stops instances.

While this may not seem like a huge distinction, it was a game changer. Instead of planning weeks or months in advance for an anticipated traffic spike, you could write software that would detect or predict high traffic and start and configure servers automatically (this is known as auto-scaling, and all major cloud providers eventually started offering this as an out-of-the-box service), then shut them down and stop paying for them when traffic died down. You could automatically configure your servers so they were running your application and all of the supporting software rather than configuring them manually. Since you were charged by the minute or hour, you wouldn’t have to pay for all of your hardware in the middle of the night when nobody was using your application.
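For the curious, the heart of auto-scaling can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular cloud provider implements it: the function names, the per-server capacity, and the minimum and maximum server counts are all made up for the example.

```python
import math


def desired_servers(requests_per_minute, capacity_per_server,
                    min_servers=1, max_servers=20):
    """Return how many servers should be running for the current traffic.

    All numbers are hypothetical. Real auto-scalers also look at CPU load,
    trends over time, and cool-down periods.
    """
    # Servers come in whole units, so round up.
    needed = math.ceil(requests_per_minute / capacity_per_server)
    # Never drop below the minimum or exceed the budgeted maximum.
    return max(min_servers, min(needed, max_servers))


def scaling_action(current, desired):
    """Describe what the auto-scaler would do next."""
    if desired > current:
        return f"start {desired - current} server(s)"
    if desired < current:
        return f"stop {current - desired} server(s)"
    return "no change"


if __name__ == "__main__":
    # Afternoon spike: 4,500 requests/minute, each server handles 1,000.
    target = desired_servers(4500, 1000)
    print(target, scaling_action(current=2, desired=target))
    # Middle of the night: almost no traffic, so scale back down.
    target = desired_servers(50, 1000)
    print(target, scaling_action(current=5, desired=target))
```

In a real setup, a loop like this would run every minute or so and call the cloud provider to start or stop servers instead of just printing a message.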

Containers

As a last stop before serverless, container services such as Docker rose to prominence over the last several years. Containers were essentially created in order to modernize shared servers. They are a standardized mechanism for packaging all of your application’s configuration and deploying it to new servers automatically. Container services simplified configuration, deployment, and scaling of an application’s infrastructure. Containers are still relatively new and very popular; it remains to be seen whether history deems them to be a stop along the route to serverless or a long term tool in a suite of infrastructure options.

Are you finally going to tell me WTF serverless is?

You might have noticed that the word “server” was used repeatedly in every prior solution. You probably see where I’m going with this: serverless removes the need for engineers to think about servers entirely. Note that I didn’t say we’ve removed servers from the equation; they are still there in the background, but they are part of the infrastructure layer that your engineers don’t have to think about. Much like we depend on things like ISPs and energy companies to run our apps but don’t have to think about them very much, serverless allows us to treat servers the same way. Servers are absolutely necessary, but someone else is doing all the work of maintaining them for us.

What’s so bad about servers?

The main reason that giving up the notion of servers is attractive to engineers is that it takes a specialized skillset to manage them. Early startups need to either hire an Infrastructure Engineer (or someone with a similar title, such as Systems Engineer or DevOps Engineer), or make do with someone, usually a backend engineer, who doesn’t specialize in server management. The former approach is expensive (Infrastructure Engineers are among the highest paid in the field), and the latter approach will take significant time away from your other engineers that they could be spending on the application, and may lead to sub-standard production systems that are slower and have more frequent outages. Serverless doesn’t completely negate the need for such engineers, but it can let you do more with less.

Another reason engineers are intrigued by getting rid of servers is that they’re quite heavyweight. While cloud servers are certainly less cumbersome than server farms, each server or container can typically handle thousands of requests per minute, which you probably don’t need initially and don’t need during off peak hours. You can auto-scale the servers, but the granularity is large: you can’t spin up half a container. And while cloud servers spin up in a few minutes, that’s still a long time when you have a sudden traffic spike.

Functions as a Service (FaaS)

At long last, I’m finally going to tell you what serverless means. Serverless lets you run part or all of your backend… wait for it… without servers. Yes, that was a lot of buildup for something you could have inferred from the name.

The way this works is that the major cloud providers have introduced what are called “functions as a service” (FaaS), branded, for example, as AWS Lambda and Google Cloud Functions. Amazon and Google run the servers, and they give developers a way to upload their code without worrying about the underlying operating system and everything required to maintain it. Amazon and Google worry about that. Your engineers worry about coding your company’s application.
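To make this concrete, here is a minimal sketch of what one of these functions might look like, written in Python in the style of an AWS Lambda handler. The greeting logic and the assumption that the request arrives as an HTTP call with a "name" query parameter are invented for the example; a real function would contain your application's logic.

```python
import json


def handler(event, context):
    """Entry point the cloud provider calls once per incoming request.

    `event` describes the request and `context` carries runtime details.
    This sketch assumes an HTTP-style event with optional query parameters.
    """
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")

    # The return value is turned into an HTTP response by the platform.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


if __name__ == "__main__":
    # Simulate a request locally; in production the platform does this.
    print(handler({"queryStringParameters": {"name": "Ada"}}, None))
```

Notice what is missing: there is no code to start a web server, pick a port, or manage an operating system. The engineer writes the function, and the provider takes care of running it whenever a request arrives.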

Best of all, it scales infinitely. There are some caveats, but for most intents and purposes, engineers don’t have to worry about scaling (something we tend to worry about a lot). True, you can scale cloud servers infinitely too, but it requires quite a bit of setup to make it work correctly. Plus, servers are heavyweight; it costs a few dollars to run a server for an hour, whereas the price of a single AWS Lambda invocation is, at the time of this writing, $0.0000002 (plus a small charge for the computing time each invocation uses).

Serverless removes a lot of the system administration work that is needed to maintain servers. While you’re still probably going to need to hire Infrastructure Engineers eventually, your backend engineers should be able to handle things on their own for longer, and you’ll need fewer Infrastructure Engineers to handle the same workload down the road. You’re outsourcing much of the operational complexity – especially the parts that keep engineers up at night – to the experts at Google and Amazon.

Serverless Framework

Somewhat confusingly, “serverless” describes this broad approach toward application hosting without servers, while the capital-S “Serverless Framework” is a specific framework that facilitates working with serverless infrastructures. I’m personally a fan of the framework (aside from the name, which makes it impossible to search for on Google), as it solves many of the limitations of the serverless solutions the big vendors provide: it makes it possible to develop code locally, simplifies unit testing, and decreases vendor lock-in. There are other serverless frameworks, but I call this one out because it is the most popular as of this writing and because the name introduces confusion to conversations about serverless architectures.

So, should my team be using serverless?

While I’m a big fan of serverless architectures – I’m using one to build my current startup – they have some limitations that prevent them from being the holy grail in all situations.

First of all, using serverless means giving up control. While I wrote at length about the hassle involved in maintaining servers, skilled teams can use that control to their benefit. They might be able to optimize performance better than cloud functions or use the servers for something that cloud functions don’t support. Cloud functions were built for common use cases, but they may not work well or at all for some uncommon use cases. With serverless, you’re yielding control to Amazon or Google; if they don’t support your use case or if their performance isn’t good enough for you, you’re out of luck. Relatedly, each FaaS service is different, and choosing one means you’re committing to that vendor. People who are concerned about vendor lock-in will think twice about going serverless.

Another big consideration is cost. At low volume, serverless will undoubtedly be cheaper than the alternatives. That’s because without serverless, you have to have at least one server running at all times, even at times when nobody’s using it. You’re paying for the ability to handle thousands or millions of requests per hour even if you’re not getting nearly that much traffic. With serverless, you’re not paying for servers during the downtime; you just need to pay a fraction of a penny whenever your service is needed. However, if you are processing millions of requests per hour, servers can give you economies of scale that serverless cannot. Which is cheaper depends on a lot of factors, but servers give you a lot more options.
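If you want to play with the numbers yourself, the trade-off can be sketched as a back-of-the-envelope calculation in Python. Every price below is a hypothetical placeholder rather than a quote from any provider, and the model is deliberately crude: it ignores storage, bandwidth, engineering time, and the question of how many servers your traffic actually needs.

```python
HOURS_PER_MONTH = 730  # rough average number of hours in a month


def always_on_monthly_cost(hourly_server_price, servers=1):
    """Servers are billed for every hour they run, busy or idle."""
    return hourly_server_price * servers * HOURS_PER_MONTH


def pay_per_use_monthly_cost(requests_per_month, cost_per_request):
    """Serverless is billed only for the requests that actually arrive."""
    return requests_per_month * cost_per_request


def cheaper_option(requests_per_month, hourly_server_price,
                   cost_per_request, servers=1):
    """Return which approach costs less under this simplified model."""
    server_bill = always_on_monthly_cost(hourly_server_price, servers)
    serverless_bill = pay_per_use_monthly_cost(requests_per_month,
                                               cost_per_request)
    return "serverless" if serverless_bill < server_bill else "servers"


if __name__ == "__main__":
    # Hypothetical prices: $2.00 per server-hour, $0.000005 per request
    # (an assumed all-in figure that includes computing time).
    print(cheaper_option(1_000_000, 2.00, 0.000005))
    print(cheaper_option(1_000_000_000, 2.00, 0.000005, servers=3))
```

With these made-up prices, a small app comes out far cheaper on serverless, while a very busy one comes out cheaper on a handful of servers. Your engineers can plug in real prices from your provider to see where your application falls.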

Third, going serverless is a big decision that requires architecting the application in a different way. This isn’t a problem if you’re developing an application from scratch, but migrating an existing application from servers to serverless is a large undertaking. Some teams choose to move some of the workload to serverless and keep some of it using servers, either as a migration plan or as a long term split.

Finally, serverless is new and changing quickly. Working with new technologies excites many developers, but it poses some risks. Seasoned (a.k.a. jaded) engineers like me have been bitten more than once by jumping on the hottest new technology, only to abandon it at great cost later on. It’s possible that serverless is a passing fad and you’ll need to migrate off of it in a few years to keep up with the changing times. If it stays around, the quick iteration of the underlying systems may mean your developers will need to do more work to upgrade them.

A serverless architecture is not a silver bullet for all teams, and whether or not you should use it depends on a lot of factors. I hope this article gave you enough foundation to have an honest conversation with your engineers about whether it’s the right solution for your company.