Welcome to “Serverless Superheroes”!

In this space, I chat with the toolmakers, innovators, and developers who are navigating the brave new world of “serverless” cloud applications.

In this edition, I chatted with Steven Faulkner, a senior software engineer at LinkedIn and the former director of engineering at Bustle. The following interview has been edited and condensed for clarity.

Forrest Brazeal: At Bustle, your previous company, I heard you cut your hosting costs by about forty percent when you switched to serverless. Can you speak to where all that money was going before, and how you were able to make that type of cost improvement?

Steven Faulkner: I believe 40% is where it landed. The initial results were even better than that. We had one service that was costing about $2500 a month and it went down to about $500 a month on Lambda.

Bustle is a media company — it’s got a lot of content, it’s got a lot of viral, spiky traffic — and so keeping up with that was not always the easiest thing. We took advantage of EC2 auto-scaling, and that worked … except when it didn’t. But when we moved to Lambda — not only did we save a lot of money, just because Bustle’s traffic is basically half at nighttime what it is during the day — we saw that serverless solved all these scaling headaches automatically.

On the flip side, did you find any unexpected cost increases with serverless?

There are definitely things that cost more or could be done cheaper not on serverless. When I was at Bustle they were looking at some stuff around data pipelines and settled on not using serverless for that at all, because it would be way too expensive to go through Lambda.

Ultimately, although hosting cost was an interesting thing out of the gate for us, it quickly became a relative non-factor in our move to serverless. It was saving us money, and that was cool, but the draw of serverless really became more about the velocity with which our team could develop and deploy these applications.

At Bustle, we only ever had one part-time “ops” person. With serverless, those responsibilities get diffused across our team, and that allowed us all to focus more on the application and less on how to get it deployed.

Any of us who’ve been doing serverless for a while know that the promise of “NoOps” may sound great, but the reality is that all systems need care and feeding, even ones you have little control over. How did your team keep your serverless applications running smoothly in production?

I am also not a fan of the term “NoOps”; it’s a misnomer and misleading for people. Definitely out of the gate with serverless, we spent time answering the question: “How do we know what’s going on inside this system?”

IOPipe was just getting off the ground at that time, and so we were one of their very first customers. We were using IOPipe to get some observability, then CloudWatch sort of got better, and X-Ray came into the picture which made things a little bit better still. Since then Bustle also built a bunch of tooling that takes all of the Lambda logs and data and does some transformations — scrubs it a little bit — and sends it to places like DataDog or to Scalyr for analysis, searching, metrics and reporting.

But I’m not gonna lie, I still don’t think it’s super great. It got to the point where it was workable and we could operate and not feel like we were always missing out on what was actually going on, but there’s a lot of room for improvement.

Another common serverless pain point is local development and debugging. How did you handle that?

I wrote a framework called Shep that Bustle still uses to deploy all of our production applications, and it handles the local development piece. It allows you to develop a NodeJS application locally and then deploy it to Lambda. It supported environment variables before Lambda had environment variables, brought some sanity to versioning, and used webpack to bundle — all the stuff that you don’t really want the everyday developer to have to worry about.

I built Shep in my first couple of months at Bustle, and since then, the Serverless Framework has gotten better. SAM has gotten better. The whole entire ecosystem has leveled up. If I was doing it today I probably wouldn’t need to write Shep. But at the time, that’s definitely what we had to do.

You’re putting your finger on an interesting reality with the serverless space, which is: it’s evolving so fast that it’s easy to create a lot of tooling and glue code that becomes obsolete very quickly. Did you find this to be true?

That’s extremely fair to say. I had a little Twitter thread around this a couple months ago, having a bit of a realization myself that Shep is not the way I would do deployments anymore. When AWS releases their own tooling, it always seems to start out pretty bad, so the temptation is to fill in those gaps with your own tool.

But AWS services change and get better at a very rapid rate. So I think the lesson I learned is lean on AWS as much as possible, or build on top of their foundation and make it pluggable in a way that you can just revert to the AWS tooling when it gets better.

Honestly, I don’t envy a lot of the people who sliced their piece of the serverless pie based on some tool they’ve built. I don’t think that’s necessarily a long term sustainable thing.

As I talk to developers and sysadmins, I feel like I encounter a lot of rage about serverless as a concept. People always want to tell me the three reasons why it would never work for them. Why do you think this concept inspires so much animosity and how do you try to change hearts and minds on this?

A big part of it is that we are deprecating so many things at one time. It does feel like a very big step to me compared to something like containers. Kelsey Hightower said something like this at one point: containers enable you to take the existing paradigm and move it forward, whereas serverless is an entirely new paradigm.

And so all these things that people have invented and invested time and money and resources in are just going away, and that’s traumatic, that’s painful. It won’t happen overnight, but anytime you make something that makes people feel like what they’ve maybe spent the last 10 years doing is obsolete, it’s hard. I don’t really know if I have a good way to fix that.

My goal with serverless was building things faster. I’m a product developer; that’s my background, that’s what I like to do. I want to make cool things happen in the world, and serverless allows me to do that better and faster than I can otherwise. So when somebody comes to me and says “I’m upset that this old way of doing things is going away”, it’s hard for me to sympathize.

It sounds like you’re making the point that serverless as a movement is more about business value than it is about technology.

Exactly! But the world is a big tent and there’s room for all kinds of stuff. I see this movement around OpenFaaS and the various Functions as a Service on Kubernetes and I don’t have a particular use for those things, but I can see businesses where they do, and if it helps get people transitioned over to serverless, that’s great.

So what is your definition of serverless, then?

I always joke that “cloud native” would have been a much better term for serverless, but unfortunately that was already taken. I think serverless is really about the managed services. Like, who is responsible for owning whether this thing that my application depends on stays up or not? And functions as a service is just a small piece of that.

The way I describe it is: functions as a service are cloud glue. So if I’m building a model airplane, well, the glue is a necessary part of that process, but it’s not the important part. Nobody looks at your model airplane and says: “Wow, that’s amazing glue you have there.” It’s all about how you craft something that works with all these parts together, and FaaS enables that.

And, as Joe Emison has pointed out, you’re not just limited to one cloud provider’s services, either. I’m a big user of Algolia with AWS. I love using Algolia with Firebase, or Netlify. Serverless is about taking these pieces and gluing them together. Then it’s up to the service provider to really just do their job well. And over time hopefully the providers are doing more and more.

We’re seeing that serverless mindset eat all of these different parts of the stack. Functions as a service was really a critical bit in order to accelerate the process. The next big piece is the database. We’re gonna see a lot of innovation there in the next year. FaunaDB is doing some cool stuff in that area, as is CosmosDB. I believe there is also a missing piece of the market for a Redis-style serverless offering, something that maybe even speaks Redis commands but under the hood is automatically distributed and scalable.

What is a legitimate barrier to companies that are looking to adopt serverless at this point?

Probably the biggest is: how do you deal with the migration of legacy things? At Bustle we ended up mostly re-architecting our entire platform around serverless, and so that’s one option, but certainly not available to everybody. But even then, the first time we launched a serverless service, we brought down all of our Redis instances — because Lambda spun up all these containers and we hit connection limits that you would never expect to hit in a normal app.

So if you’ve got something sitting on a mainframe somewhere that is used to only having 20 connections, and then you move some upstream service over to Lambda and suddenly it has 10,000 connections instead of 20, you’ve got a problem. If you’ve bought into service-oriented architecture as a whole over the last four or five years, then you might have a better time, because you can say “Well, all these things do is talk to each other via an API, so we can replace a single service with serverless functions.”

Any other emerging serverless trends that interest you?

We’ve solved a lot of the easy, low-hanging fruit problems with serverless at this point — like how you do environment variables, or how you’re gonna structure a repository and enable developers to quickly write these functions. We’re starting to establish some really good best practices.

What’ll happen next is we’ll get more iteration around architecture. How do I glue these four services together, and how do the Lambda functions look that connect them? We don’t yet have the Rails of serverless — something that doesn’t necessarily expose that it’s actually a Lambda function under the hood. Maybe it allows you to write a bunch of functions in one file that all talk to each other, and then use something like webpack that splits those functions and deploys them in a way that makes sense for your application.

We could even respond to that at runtime. You could have an application that’s actually looking at what’s happening in the code and saying: “Wow this one part of your code is taking a long time to run; we should make that its own Lambda function and we should automatically deploy that and set up this SNS trigger for you.” That’s all very pie in the sky, but I think we’re not that far off from having these tools.

Because really, at the end of the day, as a developer I don’t care about Lambda, right? I mean, I have to care right now because it’s the layer in which I work, but if I can move one layer up where I’m just writing business logic and the code gets split up appropriately, that’s real magic.