Custom Runtimes

The big news from AWS at their annual re:Invent conference in November 2018 was the introduction of custom runtimes for Lambda. This is indeed big, as it allows us to write Lambda functions in any programming language without the trickery of running a binary as a subprocess of one of the already supported runtimes (Node, Python, Go etc.). There is an official tutorial on how to write a runtime/lambda in bash, and some ready-to-use runtimes for C++ and Rust.

Let’s walk through what the runtime actually does and how that fits in with a compiled language like Haskell. The bash tutorial, and probably all the runtimes for dynamic/interpreted languages, take advantage of the fact that the runtime can load and run the handler’s code. While this is nice, since developers can focus on writing only handler code and let the runtime sit somewhere in the background, it probably accounts for some portion of the cold start. I imagine the bulk of that time is spent instantiating all the required infrastructure, but some of it goes into getting the runtime up and running, especially if the handler uses dynamically-loaded libraries itself.

The approach taken by the official runtimes for Rust and C++ is to put the whole runtime into a library and compile it along with the handler into one binary.

This might be somewhat confusing for people who already have some experience with writing lambdas, but it makes perfect sense. In this approach we simply pack everything that is needed to run our lambda into the binary (literally everything, but we will explore this later).

So let’s see if we can, step by step, create a lambda runtime with Haskell that will run natively on Lambda 🙂

Environment setup

To have this out of the way, let’s create a local dev environment. In order to run everything we want, we’ll need:

- stack (available from here)

- docker

- aws cli tool (only if we actually want to run the code on Lambda)

The first step is to create a stack project and configure it.

$ stack new hs-lambda

$ cd hs-lambda

$ stack config set resolver lts-12.14

Now onto proper Lambda stuff. We’ll start with the execution environment, which is:

Operating system — Amazon Linux

AMI — amzn-ami-hvm-2017.03.1.20170812-x86_64-gp2

Linux kernel — 4.14.77-70.59.amzn1.x86_64

AWS SDK for JavaScript — 2.290.0

SDK for Python (Boto 3) — boto3-1.7.74 botocore-1.10.74

For us the important part is that our compiled code will run on Amazon Linux 2017.03.1.20170812, so we would like that to be our compilation environment too, for maximum binary compatibility.

So let’s spin up a container based on AWS image:

$ docker run --rm --name $(basename $(pwd)) -v $(pwd):/mnt -ti amazonlinux:2017.03.1.20170812 /bin/bash

This will start a bash session and mount our current directory into /mnt directory inside the container. We would like to use stack here as well, so let's install it along with some dependencies:

bash-4.2# cd /mnt

bash-4.2# curl -sSL https://get.haskellstack.org/ | sh

bash-4.2# yum install -y gcc make xz libffi zlib zlib-devel gmp gmp-devel # suggested by stack install script

So far we have partially set up compilation and dev environments. Now, let’s finish setting up the dev environment. We will use our container only for final compilation and do everything else on our dev box. Let’s leave the container running and open another terminal window and cd to our hs-lambda directory.

Implementation

stack new created a basic project skeleton and we're ready to write some code. Let's open app/Main.hs file and go back to the docs to see what our runtime is supposed to do. For the purpose of this post we are going to create a bare-bones serverless echo server 🙂

First of all, this quote from AWS docs tells us how we can actually make our custom runtime work:

You can implement an AWS Lambda runtime in any programming language. A runtime is a program that runs a Lambda function handler method when the function is invoked. You can include a runtime in your function deployment package in the form of an executable file named bootstrap.

From the above we know that we need a binary named bootstrap somewhere in our lambda zip bundle and that this file should contain the runtime. Let's go further and examine all the steps that our runtime has to go through.

1. Retrieve settings — Read environment variables to get details about the function and environment.

- _HANDLER — The location of the handler, from the function configuration. The standard format is file.method, where file is the name of the file without an extension, and method is the name of a method or function that's defined in the file.

- LAMBDA_TASK_ROOT — The directory that contains the function code.

- AWS_LAMBDA_RUNTIME_API — The host and port of the runtime API.

See Environment Variables Available to Lambda Functions for a full list of available variables.

2. Initialize the function — Load the handler file and run any global or static code that it contains. Functions should create static resources like SDK clients and database connections once, and reuse them for multiple invocations.

3. Handle errors — If an error occurs, call the initialization error API and exit immediately.

We’ll look at those bullet points and see if all of them apply.

Our plan is to make a bare-bones, minimal lambda function with a custom runtime, so we will just put everything into our bootstrap binary. In that case we don't need to load any handlers, so we don't really need to read the _HANDLER and LAMBDA_TASK_ROOT variables. The only thing we will need is the value of AWS_LAMBDA_RUNTIME_API. Again, since we won't be loading any handlers, initialization of SDKs etc. can be done inside our bootstrap code. We should probably do some error handling, but since this is only a toy example, we'll skip it for simplicity's sake.

Ok, so far we only need to read one environment variable and run some code to initialize expensive things like DB connections, SDKs etc. Nothing hard.

When we’re done with initialization, we can move to the main loop of processing tasks. As usual, let’s take a step back and read the docs:

1. Get an event — Call the next invocation API to get the next event. The response body contains the event data. Response headers contain the request ID and other information.

2. Propagate the tracing header — Get the X-Ray tracing header from the Lambda-Runtime-Trace-Id header in the API response. Set the _X_AMZN_TRACE_ID environment variable with the same value for the X-Ray SDK to use.

3. Create a context object — Create an object with context information from environment variables and headers in the API response.

4. Invoke the function handler — Pass the event and context object to the handler.

5. Handle the response — Call the invocation response API to post the response from the handler.

6. Handle errors — If an error occurs, call the invocation error API.

7. Cleanup — Release unused resources, send data to other services, or perform additional tasks before getting the next event.

Now, this is easy. We simply:

make an HTTP request to one endpoint and receive an event

read some headers and set some environment variables

create some additional context

handle the event taking into account (or not) the context

produce an output and POST it via another HTTP request, be it a proper response or an error (different endpoints though)

cleanup if necessary

Some details about the API can be found here. In our example we’re making an echo server so we just need to get the event and post it back, easy. The code is straightforward:
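The gist that was embedded here did not survive the export. Below is a minimal sketch of what the echo loop can look like, assuming the http-conduit package for the HTTP calls (the original code may have used a different library, and real code would also call the error endpoints we are skipping):

```haskell
{-# LANGUAGE OverloadedStrings #-}

module Main where

import           Control.Monad         (forever)
import qualified Data.ByteString.Char8 as BS
import           Network.HTTP.Simple
import           System.Environment    (getEnv)

main :: IO ()
main = do
  -- Initialization: expensive, reusable things (DB connections,
  -- SDK clients, ...) would be created once, here.
  api <- getEnv "AWS_LAMBDA_RUNTIME_API"
  forever $ do
    -- Get the next event from the invocation API
    -- (blocks until an invocation arrives).
    nextReq <- parseRequest ("GET http://" <> api
                             <> "/2018-06-01/runtime/invocation/next")
    nextRes <- httpBS nextReq
    let event = getResponseBody nextRes
        -- The request ID for this invocation comes back in a header.
        reqId  = BS.unpack . head $
          getResponseHeader "Lambda-Runtime-Aws-Request-Id" nextRes
    -- Echo the event straight back via the invocation response API.
    respReq <- parseRequest ("POST http://" <> api
                             <> "/2018-06-01/runtime/invocation/"
                             <> reqId <> "/response")
    _ <- httpBS (setRequestBodyBS event respReq)
    pure ()
```

The handler logic lives in the body of the loop; for the echo server it is literally "POST back what you received".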

Compilation

Except for error handling (which we skip since this is just a toy), we have pretty much everything we need. This will not compile yet because we use some external libraries; we need to add them to our package.yaml file. Another small change is to rename our executable to bootstrap. We could do it manually later, but let's automate it. The relevant section of package.yaml should now look something like this:
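(The embedded snippet is gone from this export; the shape below is a hedged reconstruction, assuming the handler uses http-conduit and bytestring — match the dependency list to whatever your Main.hs actually imports.)

```yaml
executables:
  bootstrap:                 # renamed from the default hs-lambda-exe
    main: Main.hs
    source-dirs: app
    dependencies:
      - hs-lambda
      - bytestring           # assumed dependencies; adjust to your imports
      - http-conduit
```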

Now we’re ready to compile and run locally (this can take some time).

$ stack build

$ stack exec bootstrap

bootstrap: AWS_LAMBDA_RUNTIME_API: getEnv: does not exist (no environment variable)

It works! We got an error that we’re missing an env var but that’s ok for now. Let’s try to compile it in our container (this also may take a while).

bash-4.2# stack setup

bash-4.2# stack build

bash-4.2# ldd .stack-work/install/x86_64-linux/lts-12.14/8.4.3/bin/bootstrap

linux-vdso.so.1 => (0x00007ffe14dae000)

libm.so.6 => /lib64/libm.so.6 (0x00007fca5f2f1000)

libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fca5f0d5000)

libz.so.1 => /lib64/libz.so.1 (0x00007fca5eebf000)

librt.so.1 => /lib64/librt.so.1 (0x00007fca5ecb7000)

libutil.so.1 => /lib64/libutil.so.1 (0x00007fca5eab4000)

libdl.so.2 => /lib64/libdl.so.2 (0x00007fca5e8b0000)

libgmp.so.10 => /usr/lib64/libgmp.so.10 (0x00007fca5e63a000)

libc.so.6 => /lib64/libc.so.6 (0x00007fca5e26d000)

/lib64/ld-linux-x86-64.so.2 (0x00007fca5f5f3000)

Everything works, however there is one problem with our binary, highlighted by the ldd output: it is dynamically linked. This is good since it keeps the file size down, but on a system that we don't control, and can't install any libraries on, it's a problem. Remember how we had to install zlib and gmp? Those two will be missing on the VM that runs our lambda on AWS. There are two options: we can bundle the missing libraries and link our binary against those, or compile a statically-linked binary. Though the second approach will produce a bigger file, we will go ahead and use it because it makes the whole thing more robust.

In order to make it work, we will have to install static versions of some libraries, not only the missing ones. After some trial and error I found out that we only need three.

bash-4.2# yum install -y glibc-static gmp-static zlib-static

Now, there are a lot of resources on the internet about compiling a static binary with ghc/cabal/stack. I went with the approach that seemed cleanest and will probably make it into the tooling some day. When we ran stack build, it produced a hs-lambda.cabal file for us based on the contents of our package.yaml and stack.yaml. Stack will warn us that we shouldn't really mess with the .cabal file; in fact, it's not even tracked by git, since it landed in the .gitignore file created by stack new. However, for the time being, the best way to make static linking work is to add one line to the .cabal file in the executable section.
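(The snippet showing that line is missing from this export. The commonly used way to get GHC to link statically is an ld-options entry in the executable stanza; assuming the stanza stack generated, the addition looks roughly like this — the exact line the author used may differ:)

```
executable bootstrap
  main-is: Main.hs
  ...
  ld-options: -static
```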

Now we’ll recompile.

bash-4.2# stack clean

bash-4.2# stack build --ghc-options='-fPIC' # this will create a statically linked binary

bash-4.2# ldd .stack-work/install/x86_64-linux/lts-12.14/8.4.3/bin/bootstrap

not a dynamic executable

We’re done. As you can see, our binary is self-contained and should run on any Linux distro out there.

Deployment

Now we can go back to our dev box, package the whole thing up and deploy it.

$ zip -j function.zip .stack-work/install/x86_64-linux/lts-12.14/8.4.3/bin/bootstrap

adding: bootstrap (deflated 73%)

$ aws lambda create-function --function-name test-custom-runtime --zip-file fileb://function.zip --handler function.handler --runtime provided --role arn:aws:iam::123456789:role/aws-lambda-role

{
    "FunctionName": "test-custom-runtime",
    "FunctionArn": "arn:aws:lambda:eu-central-1:123456789:function:test-custom-runtime",
    "Runtime": "provided",
    "Role": "arn:aws:iam::123456789:role/aws-lambda-role",
    "Handler": "function.handler",
    "CodeSize": 4667617,
    "Description": "",
    "Timeout": 3,
    "MemorySize": 128,
    "LastModified": "2019-01-15T21:40:01.625+0000",
    "CodeSha256": "TMVO8YwRTW3tu9CKpT5ShPe+iVVemrq9xKW2iDdS9Lc=",
    "Version": "$LATEST",
    "TracingConfig": {
        "Mode": "PassThrough"
    },
    "RevisionId": "d8579e6e-bc80-4951-be5f-cabb5d93f0b6"
}

We’re ready to test it.

$ aws lambda invoke --function-name 'test-custom-runtime' --payload '{"text":"Hello"}' response.txt

{
    "StatusCode": 200,
    "ExecutedVersion": "$LATEST"
}

$ cat response.txt

{"text":"Hello"}

It works! As you can see, our lambda ran successfully and echoed our payload back to us.