One of the many benefits that serverless computing is supposed to offer over traditional, server-based solutions is reduced costs in building and running software systems. While using the serverless stack can offer substantial savings, it doesn't guarantee cheaper IT operations for all types of workloads. At times, it may even be more expensive compared to server deployments, particularly at scale.

Here's my review of the economics of serverless computing, including analyzing pricing models of several cloud services, real-world examples of the savings and insights and tips on how to keep the costs of running serverless applications under control.

Pay-per-use makes FaaS appear insanely cheap

Since Amazon Web Services (AWS) launched its function-as-a-service (FaaS) offering, AWS Lambda, serverless computing has been heralded as the next, natural step in the evolution of cloud computing. It has been called the next big thing that will revolutionize how we deliver and operate the software systems of the future. Because cost is a big reason for this enthusiasm, let's consider the pay-per-use pricing model that underpins most of the cloud provider services used to build serverless applications. This is the model used by FaaS services such as AWS Lambda and Azure Functions, and it is often one of the key arguments for adopting this new way of building cloud-native systems.

Pricing model for the function execution

When looking at the price per function invocation, currently at $0.0000002 for AWS Lambda and Azure Functions, it's very easy to get the impression that FaaS is incredibly cheap (20 cents for 1 million invocations). However, the price based on the number of invocations alone does not truly reflect the cost of providing this sort of service. In fact, it's not the main element in the total cost associated with FaaS compute.

The execution of functions takes up valuable compute resources, and both AWS and Azure charge additionally for the combination of allocated memory and the elapsed time of function execution (rounded up to the nearest 100ms). With the current AWS Lambda price at $0.00001667 for every GB-second used (Azure Functions cost $0.000016 for every GB-second), you can see how the cost mounts quickly.

Since the amount of allocated memory is configurable between 128 MB and 1.5 GB, the total cost of function execution will vary depending on the configuration, and the cost per 100ms of the execution time for the most powerful specification will be roughly 12 times more expensive than the basic 128 MB option.

Now, even considering the cost based on the compute resources used per invocation, AWS Lambda is still looking very cheap, and 1 million invocations with the average time of 500ms and 128 MB of available memory would only cost approximately $1.25. The same function would cost shy of $6 to be run continuously for the entire month (with each invocation taking 500ms to complete).

Or pay nothing to play with functions

Cloud providers such as AWS and Azure are trying to attract more people to give FaaS a try. They offer a hefty free tier with 1 million invocations and 400,000 GB-seconds free each month. The free tier alone provides enough execution seconds to keep a function using 128 MB of memory running around the clock for the entire month.

How does FaaS compare to IaaS?

It's definitely impressive how inexpensive FaaS compute is, but cloud providers have spoiled us with very cheap computing for years now. For instance, t2.nano, the smallest instance type of AWS's infrastructure-as-a-service (IaasS) offering EC2, costs only $4.25 (US East region) to keep it up for an entire month.

In fact, simple math shows that running a tiny EC2 instance would be cheaper than having a function running continuously for the entire month. I'd imagine the same would be true for larger EC2 instances, too, but that's not really the point. Nobody would use FaaS for compute-intensive workloads where some serious processing power is needed to munch through a lot of data.

Unpredictable load

The truth is that a lot of real-world workloads are exhibiting varied and sometimes unpredictable levels of load. For this reason, IaaS-based compute has to be over-provisioned so that resources are available to handle some moderate load fluctuations, and occasionally additional instances need to be provisioned when the more substantial demand spike arrives.

What are the cost implications of such a capacity-provisioning model?

It's impossible to reduce the cost below the minimal footprint; that would require not one but multiple load-balanced EC2 instances in different availability zones in order to deliver adequate availability.

Any increase in the number of required compute instances results in a shift to a higher per-hour cost, while the resource utilization goes down.

Disadvantage of provisioning based on compute instance level

The disadvantage of relying on provisioning based purely on compute instance level has been understood by the industry for a while. These days, most companies operating at a non-trivial scale would most likely be using some form of container-based capacity provisioning that allows greater utilization levels from the underlying compute infrastructure.

Cloud providers have also started offering reserved capacity options, where substantial savings can be gained by committing to the compute infrastructure for a year, or even a few years in advance. All of these factors and the constant drive toward cost reduction have quite often led to companies managing large estates of reserved compute, which is somewhat similar to the world before cloud computing.

By contrast, the on-demand compute capacity offered by FaaS only costs as much as it's being used, starting with close to nothing for lightweight workloads, while providing out-of-the-box scalability and availability across multiple availability zones.

Taking the real-world workload serverless

So far, I've been looking at FaaS on its own, but using functions alone is not enough to deliver any real-world architectures in the cloud. Even the most basic applications will require at least some means of exposing the functionality to the end user, perhaps as a web application or an API, as well as some form of data persistence.

The AWS Simple Monthly Calculator features one relatively common application, showing the cost breakdown for running the entire infrastructure stack for a month. The AWS 3-tier auto-scalable web application, written in Ruby on Rails, can serve 100,000 page views a month. The stack consists of one load balancer, two web servers, two application servers, and a highly available database server. The solution also uses DynamoDB, S3, Route53, and CloudFront and has been estimated to require 120 GB of data transfer each month.

The total cost of running this stack is $894.45 per month, of which EC2 compute accounts for $427.04, and RDS for further $337.88.

Save big with serverless

Consider how much could be saved if the IaaS compute were to be replaced by AWS Lambda, with API Gateway providing the HTTP facade. While conservatively assuming it takes 3 seconds on average to serve a page view based on the data from RDS or DynamoDB, we can calculate the total cost of serving 100,000 pages. Even with a generous 1 GB memory allocation and relatively sluggish 3-second processing time, the total cost for AWS Lambda would be a mere $5 (this is ignoring the free tier), and API Gateway would add another $11.15!

If we consider that for this very modest price the provider takes care of elasticity to accommodate any sudden spikes in demand (up to the default concurrent execution limit of 100, which can be raised by the AWS support), as well as offering the availability based on multiple availability zones (AWS doesn't state how many, but it would be reasonable to assume at least three in any region), the value gained from going serverless becomes very evident.

But wait ... can't you optimize the infrastructure?

Yes, it could be argued that the original infrastructure stack for the web application could be further optimized from the cost perspective, and opting for reserved instances could reduce the cost by about 75 percent. Or perhaps weaker compute instances could have been selected. The bottom line is this: Any such cost reduction would have resulted in either:

Making a long-term commitment to keep the solution online, or

Introducing potential capacity problems if the infrastructure is not fit for purpose.

Be sure to control your serverless costs

Despite the obvious cost savings that can be achieved by switching over to a serverless compute stack, there are a few "gotchas" to be mindful of. With some careful consideration and by following some good advice, these pitfalls can be overcome.

No guarantees here

First of all, using the serverless stack is not always guaranteed to be a money-saving option. There are certain workloads that require substantial compute resources, which makes the serverless model less cost-effective. I advise due diligence and experimentation before deciding to rework anything for the serverless stack. Functional and non-functional requirements should be carefully reviewed and validated against numerous limitations associated with the FaaS environment. The total cost of running the solution using the serverless stack should be calculated up front using the pricing details supplied by the cloud provider. A serverless cost calculator can be used for a quick comparison of FaaS offerings from different providers.

Beware of retries and overhead

Second, it's possible to rack up some serious expenses if functions are being retried by the FaaS environment because of failed invocations. This is particularly important for any streaming event sources such as AWS Kinesis, DynamoDB streams, or S3 notifications. If the function invocation fails, it will be retried for as long as the event is retained, which could be days. It's important to monitor the costs regularly or have automatic checks, such as CloudWatch alarms, in place to react to any unexpected sudden increases of the infrastructure costs.

Lastly, some services such as AWS Kinesis and DynamoDB don't follow the pay-per-use pricing model but instead require provisioned capacity, which can lead to some substantial overhead if not managed properly. Luckily, AWS allows the capacity to be adjusted via the SDK, so it's possible to create a solution that automatically increases and reduces the capacity based on the actual demand, using AWS Lambda functions reacting to CloudWatch metrics.

How much can you save?

While I've tried to highlight some obvious economic benefits of adopting serverless computing, the cost of running software systems is not the only reason to consider FaaS offerings. Building solutions based on small and focused units of business logic that can be quickly and cheaply delivered to the market and effortlessly managed and scaled based on the actual demand offers a massive competitive advantage in the marketplace. And frankly, such an advantage can't be easily quantified, but it has potential to bring disruption to cloud deployment models.

With all this data and advice, I hope to have tempted some of you to consider how your cloud architectures might benefit from serverless computing. I believe serious savings are possible by migrating some of the workloads commonly deployed on a traditional compute infrastructure to a serverless stack.

Keep learning