We started hitting a 30-second timeout on some of our Lambdas last week, but everything ran much faster when debugging locally. So I dug into the X-Ray traces to learn more and found two execution delays (red boxes below).

The first is the Lambda cold start setup, so this delay was only hit when a new execution environment was spun up after the old one had been recycled. The second was happening on every execution.

The Issue

After re-verifying that we weren’t seeing these 10-second delays locally, we were pretty sure it wasn’t anything in our code. So I reached out to AWS support, and they indicated there is a “penalty” when running a Lambda in a VPC and making network connections inside that VPC. Nicki Klein (AWS Tech Evangelist) provided the actual documentation:

When a Lambda function is configured to run within a VPC, it incurs an additional ENI start-up penalty. This means address resolution may be delayed when trying to connect to network resources.

https://docs.aws.amazon.com/lambda/latest/dg/vpc.html

The documentation is pretty vague, so it’s hard to tell whether this penalty affects only cold starts or more than that. Based on my performance analysis, it seems to be both.

Workarounds

For the cold start delay, a keep-warm trigger should suffice. This has been blogged about many times, so I won’t cover it in depth here.
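The basic idea is to have a scheduled CloudWatch Events rule ping the function every few minutes so its execution environment stays alive. Here’s a minimal sketch of the handler side; the `keepWarm` marker field is my own convention, not an AWS one — your scheduled rule just needs to send a payload the handler can recognize and short-circuit on.

```python
import json

def handler(event, context):
    # A scheduled CloudWatch Events rule can send a small marker payload,
    # e.g. {"keepWarm": true}. Short-circuit on it so the warming ping
    # stays cheap and never touches downstream resources.
    if isinstance(event, dict) and event.get("keepWarm"):
        return {"statusCode": 200, "body": "warm"}

    # ...normal request handling goes here...
    return {"statusCode": 200, "body": json.dumps({"ok": True})}
```

A rate of every 5–10 minutes is usually enough to keep one environment warm; it won’t help if your traffic needs many concurrent instances.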

To improve the second delay, Nicki suggested increasing the RAM allocated to the Lambda. At first I was irritated that I would have to pay more for my Lambdas just because I was using a VPC, but the results surprised me.

To test this, I took a set of integration tests that call my API (hosted on Lambda) about a dozen times. I waited about 7 minutes between runs and increased the RAM each time. (Changing the memory setting forces a new instance, and therefore a cold start.) For most memory levels, I ran the suite more than once over a span of about 3 hours.
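If you want to run a similar comparison yourself, a tiny timing harness is all it takes. This is a sketch, not my actual test suite; the commented-out URL is a placeholder for your own endpoint.

```python
import time
from statistics import mean

def time_calls(call, n=12):
    """Invoke `call` n times and return per-call durations in milliseconds."""
    durations = []
    for _ in range(n):
        start = time.perf_counter()
        call()
        durations.append((time.perf_counter() - start) * 1000)
    return durations

# Point this at your own API; the URL below is a placeholder.
# import urllib.request
# durations = time_calls(
#     lambda: urllib.request.urlopen("https://example.com/health").read())
# print(f"avg: {mean(durations):.1f} ms over {len(durations)} calls")
```

Run it once per memory setting (after a cold start) and compare the averages.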

As you might expect, the average API response time was faster with more RAM/CPU (CPU is allocated proportionally to RAM). What I didn’t expect was the cost impact of increasing the RAM. While Lambda’s rates increase linearly with RAM, the performance improvement does not (at least not in my case).

If we average the Lambda execution duration at each RAM level and calculate the cost of 10K executions based on the published rates, you can see that adding RAM can actually make the function cheaper to run. The extra RAM (and the CPU that comes with it) speeds up execution by more than the price increases, so a little more RAM makes a really big difference.
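The break-even math is easy to sketch. The rate below is Lambda’s published per-GB-second duration price at the time of writing; the durations in the example are illustrative only, not my measured numbers (request charges are a flat $0.20 per million, so I leave them out).

```python
GB_SECOND_RATE = 0.00001667  # USD per GB-second (Lambda's published duration rate)

def duration_cost(memory_mb, avg_duration_ms, executions=10_000):
    """Duration cost of `executions` runs (request charges excluded)."""
    # Lambda bills duration rounded up to the nearest 100 ms.
    billed_ms = -(-avg_duration_ms // 100) * 100
    gb_seconds = (memory_mb / 1024) * (billed_ms / 1000) * executions
    return gb_seconds * GB_SECOND_RATE

# Illustrative: 4x the RAM, but if the average execution drops from
# 2000 ms to 350 ms, the bigger function is cheaper per 10K executions.
print(duration_cost(256, 2000), duration_cost(1024, 350))
```

Notice that 4x the RAM at exactly 4x the speed would be cost-neutral; any speedup beyond that (or savings from the 100 ms billing rounding) is where the “faster is cheaper” effect comes from.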

(Notice the dual y-axis: the point where these lines cross doesn’t mean anything.)

Based on these estimates, we are going to try running our Lambdas with 1GB of RAM, which should be approximately 30% cheaper and 4x faster.

As an aside, the Max Memory Used reported for my function was 125MB while it was configured with 256MB, so I had plenty of memory headroom. These performance improvements must therefore come from the proportional increase in CPU.

Warning – this was all very informal testing/analysis. Your mileage may vary!

After doing all this research, I found others with similar results for CPU-bound scenarios: https://medium.com/@jconning/aws-lambda-faster-is-cheaper-6bf32f58d741

Long-Term Solution

AWS needs to find a better way to support Lambdas in a VPC. Most serverless applications need access to networked resources in a private VPC, so it is shortsighted to make customers pay this penalty for more private, more secure architectures.

There are rumors that improvements will be announced at re:Invent 2018, but we will see.

If you found this blog post helpful or have further questions/feedback please leave a comment below.

Happy Faster Lambda Executions Inside a VPC!