Setting up an AWS Lambda Function

Once you have converted your model to ONNX, you can set up an AWS Lambda function to serve predictions with it. AWS Lambda currently requires that all of your code and the libraries it needs fit into a zip file of 50 MB or less. This is why we're serving predictions with Caffe2 rather than PyTorch: its footprint is much smaller.

PyTorch alone takes up over 300 MB of disk space, but Caffe2 and all of the necessary libraries fit into a 38 MB zip file.

If you want to skip the build step, you can download the prebuilt deps.zip file from the GitHub repo, add your own Python script, and run it on AWS Lambda. You can find it here: https://github.com/michaelulin/pytorch-caffe2-aws-lambda

To create your own zip file for AWS Lambda, launch a new EC2 instance based on the AWS Lambda AMI located here: http://docs.aws.amazon.com/lambda/latest/dg/current-supported-versions.html

This is the base image your Lambda function will run on, so you can test how the function will behave and which libraries and packages are available.
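If you'd rather script this step than use the console, a minimal boto3 sketch looks like the following. The AMI ID, key pair, and security group are placeholders; substitute the current Lambda AMI ID from the page linked above and your own account values.

import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')

# Launch a build instance from the Amazon Linux AMI that Lambda runs on.
# ImageId, KeyName and SecurityGroupIds are placeholders for your own values.
response = ec2.run_instances(
    ImageId='ami-XXXXXXXX',        # current AWS Lambda AMI from the link above
    InstanceType='t2.medium',      # a mid-sized type; compiling Caffe2 needs some CPU and memory
    KeyName='my-key-pair',
    SecurityGroupIds=['sg-XXXXXXXX'],
    MinCount=1,
    MaxCount=1,
)
print(response['Instances'][0]['InstanceId'])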

Once the instance is up, run the following script to install the necessary packages and libraries, build and install Caffe2, and add everything to a zip file.

# Install necessary packages and update libraries
sudo yum update -y
sudo yum -y upgrade
sudo yum -y groupinstall "Development Tools"

sudo yum install -y \
    automake \
    cmake \
    python-devel \
    python-pip \
    git

# Install necessary packages for Pillow. Not necessary if you don't need
# the Pillow library in Python for working with images
sudo yum install -y gcc zlib zlib-devel openssl openssl-devel
sudo yum install -y libjpeg-devel

# Install protobuf
git clone https://github.com/google/protobuf.git
cd protobuf
./autogen.sh
./configure
make
sudo make install
sudo ldconfig

# Install python virtualenv, set up a new environment and install
# necessary python packages
pip install virtualenv
virtualenv ~/env && cd ~/env && source bin/activate

pip install numpy
pip install --use-wheel --no-index -f http://dist.plone.org/thirdparty/ -U PIL --trusted-host dist.plone.org
pip install protobuf
pip install future
pip install requests
pip install onnx

cd ~

# Clone and install Caffe2 using cmake
mkdir cf2
git clone --recursive https://github.com/caffe2/caffe2.git && cd caffe2
mkdir build && cd build
cmake -DBUILD_SHARED_LIBS=OFF -DCMAKE_INSTALL_PREFIX="/home/ec2-user/cf2/" -DCMAKE_PREFIX_PATH="/home/ec2-user/cf2/" -DUSE_GFLAGS=OFF ..
make -j4
make install/fast

cd ~

# Clone and install ONNX-Caffe2. As noted above, a specific commit of
# ONNX-Caffe2 is required at this time for it to function properly
git clone --recursive https://github.com/onnx/onnx-caffe2
cd onnx-caffe2
git reset --hard f7509f293d781638ef14ac3d232de0c140ed8277
python setup.py install

cd ~

# Add python packages to zip file
for dir in $VIRTUAL_ENV/lib64/python2.7/site-packages \
           $VIRTUAL_ENV/lib/python2.7/site-packages
do
    if [ -d $dir ] ; then
        pushd $dir; zip -9 -q -r ~/deps.zip .; popd
    fi
done

# Add protobuf to zip file
cd protobuf
zip -9 -q -r ~/deps.zip python

cd ~

# Add Caffe2 to zip file
cd cf2
zip -9 -q -r ~/deps.zip caffe2

cd ~

# Add protobuf .so files to zip
mkdir local
mkdir local/lib
cp /usr/lib64/libprotobuf.so* local/lib/

zip -9 -q -r ~/deps.zip local/lib
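Before moving on, it's worth sanity-checking the archive. This short sketch, using only Python's standard library, confirms that deps.zip stays under Lambda's 50 MB direct-upload limit and spot-checks that the key directories made it in (the prefixes checked are assumptions based on the zip commands above):

import os
import zipfile

ZIP_PATH = os.path.expanduser('~/deps.zip')

# Lambda rejects deployment packages larger than 50 MB when uploaded directly
size_mb = os.path.getsize(ZIP_PATH) / (1024.0 * 1024.0)
print('deps.zip is %.1f MB' % size_mb)
assert size_mb < 50, 'zip file is too large to upload directly to Lambda'

# Spot-check that the key directories made it into the archive
with zipfile.ZipFile(ZIP_PATH) as z:
    names = z.namelist()
    for prefix in ('caffe2/', 'numpy/', 'local/lib/'):
        assert any(n.startswith(prefix) for n in names), prefix + ' is missing'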

After creating the zip file, you just need to add the Python script that AWS Lambda will run. You can use the following test script, which downloads the trained ONNX model from S3 and saves it to the /tmp space available to the Lambda function. Lambda provides 512 MB of /tmp space, so most trained models should fit.

import boto3
import os

# Check if model is available locally
# Download model from S3 if it is not already present
if not os.path.isfile('/tmp/model.proto'):
    s3 = boto3.client('s3')
    s3.download_file('mybucket', 'model.proto', '/tmp/model.proto')

# Load .so files before launching Caffe2
import ctypes

for d, dirs, files in os.walk(os.path.join(os.getcwd(), 'local', 'lib')):
    for f in files:
        if f.endswith('.a'):
            continue
        ctypes.cdll.LoadLibrary(os.path.join(d, f))

import numpy as np
import json
import onnx
import onnx_caffe2.backend as backend

# Load ONNX model
graph = onnx.load("/tmp/model.proto")

# Load model into Caffe2
model = backend.prepare(graph, device="CPU")


def handler(event, context):
    # Create dummy input for model
    x = np.random.randn(1, 3, 224, 224)

    # Get model output
    output = model.run(x.astype(np.float32))

    # Return results formatted for AWS API Gateway
    return {"statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps(str(output))}

This script loads the pre-trained ONNX model, prepares it with the Caffe2 backend, and runs a test prediction. The handler function is the one AWS Lambda invokes; its inputs arrive in the event variable. In this example, we generate a random NumPy array with the model's expected input shape, run it through the model, and return the prediction as a string. In a real deployment, you would fetch the input from the event variable instead.
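For example, if the function sits behind API Gateway and the client POSTs a JSON body containing the input tensor, the handler could look like this sketch, which would replace the dummy-input handler above. The 'input' field name is an assumption for illustration, not part of the original script.

def handler(event, context):
    # API Gateway delivers the request body as a JSON string
    body = json.loads(event['body'])

    # 'input' is a hypothetical field holding a nested list
    # shaped (1, 3, 224, 224), matching the model's expected input
    x = np.array(body['input'], dtype=np.float32)

    # Run the real input through the model
    output = model.run(x)

    return {"statusCode": 200,
            "headers": {"Content-Type": "application/json"},
            "body": json.dumps(str(output))}

For the rest of this walkthrough, we'll stick with the dummy-input version.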

Save this script and add it to the deps.zip file (or whatever your zip file is called). In this example, I’ve added the above script (test.py) to the deps.zip file via the following command:

zip -9 -q -r ~/deps.zip test.py

Once you’ve added your script to the zip file, save the zip file to S3 or your local computer so that you can upload it to AWS Lambda.
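If you'd rather script that step too, here's a minimal boto3 sketch that uploads the zip to S3 and creates the function from it. The bucket name, function name, and IAM role ARN are placeholders for your own values.

import boto3

# Upload the deployment package to S3
s3 = boto3.client('s3')
s3.upload_file('/home/ec2-user/deps.zip', 'mybucket', 'deps.zip')

# Create the Lambda function from the zip file in S3
lam = boto3.client('lambda')
lam.create_function(
    FunctionName='onnx-caffe2-predict',  # placeholder name
    Runtime='python2.7',                 # the runtime the zip was built for
    Role='arn:aws:iam::123456789012:role/my-lambda-role',  # placeholder role ARN
    Handler='test.handler',              # file test.py, function handler
    Code={'S3Bucket': 'mybucket', 'S3Key': 'deps.zip'},
    MemorySize=1536,
    Timeout=60,
)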