Creating a Highly Scalable Website Monitoring Service in Half an Hour Using AWS Lambda


Lambda functions are a truly awesome bit of technology. If you haven’t read my theory that serverless, and more specifically AWS Lambda, is going to take over the world in 2018, then check it out here:

It got a couple of readers and somehow got tagged as one of the most-read Medium articles on the 1st of January… must have been a slow news day, I guess.

What Can Serverless Really Do Though?

It’s all very well me evangelising these serverless technologies, but evangelising alone doesn’t show you their true potential.

So, in a bid to show you what is truly possible with Lambdas, I thought I would put together this how-to guide and show you what can really be done with the tech!

Introducing UpTimeGirl — My Latest Million Dollar Startup

In this article I am going to build a website uptime monitor. People fill in a form, and the service auto-magically creates a new Lambda function that checks the health of their site every 15 minutes and sends an email should the site die. I’m going to focus on the Lambda-oriented concepts and brush over things such as IAM and the setting up of roles.

Every good startup needs a catchy name, so we are going to call this the UpTimeGirl website monitoring service as a bit of a homage to Billy Joel.

Disclaimer: If you do make a million dollars from this idea I expect at least a beer!!

Requirements

As this is a software project, it needs a set of requirements. Otherwise, how will we be able to define any form of success for our project? As I want to build a world-class, multi-million dollar monitoring solution for all of my sites, I want it to meet the following requirements:

It MUST never go down itself.

It MUST be cheap to run. We need to maximise profits of course.

It MUST notify me whenever a site goes down, either through text or email.

It MUST scale to hundreds, if not thousands, of sites.

A fairly decent set of requirements for a project that we’ll be putting together in less than half an hour, I think you’ll agree.

The Frontend

The Frontend will be a simple VueJS application that will feature both an input box which will take new sites to monitor and a list of tiles showing the websites registered already.

Why VueJS? — I’m currently trying to learn it and thought why not?

This will just be a very simple frontend that will look like this:

If you want the source code for this you’ll be able to find it in my GitHub repo here: https://github.com/elliotforbes/uptimegirl

I’m currently just using a very simple HTTP GET request in order to trigger the Python API which constructs the lambda functions.

The API

The API will be written in Python and will use the Boto3 AWS SDK to interact with AWS’s various APIs. It will essentially take in a URL and a time_period, and dynamically generate and create a new Lambda function based on these parameters.

Disclaimer: I can guarantee this isn’t production ready.

In this chunk of code we have created a simple aiohttp REST API in Python 3.6. It features two main endpoints, get_functions and create_lambda_function, which can be accessed through /funcs and /create respectively.

In the get_functions() function we simply return a JSON response containing all of the functions currently registered in the Lambda service.
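As a rough sketch of the listing step (not the code from the repo), the helper below collects function names from a Lambda client. It assumes lambda_client is whatever boto3.client("lambda") returns; the name list_function_names is made up for illustration.

```python
def list_function_names(lambda_client):
    """Return the name of every Lambda function the client can see.

    Assumption: lambda_client behaves like boto3.client("lambda").
    list_functions is paginated, so we walk every page rather than
    trusting the first response to contain everything.
    """
    names = []
    paginator = lambda_client.get_paginator("list_functions")
    for page in paginator.paginate():
        for function in page.get("Functions", []):
            names.append(function["FunctionName"])
    return names
```

Inside an aiohttp handler, the result could then be returned with web.json_response({"functions": names}).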

The create_lambda_function() function is where it gets interesting. In this we create a dictionary object of all the environment variables we wish to pass in. In this case we are just passing in the URL that we wish to monitor. In order to somewhat ensure unique function names we then create an alphanumeric function name from the URL with a little bit of regex magic.
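The regex step might look something like the sketch below. The helper name function_name_from_url is hypothetical, and the 64-character cut-off reflects the limit Lambda applies when you pass a bare function name rather than a full ARN.

```python
import re

MAX_NAME_LENGTH = 64  # Lambda's limit for a bare function name


def function_name_from_url(url):
    """Reduce a URL to letters and digits so it can serve as a function name."""
    name = re.sub(r"[^A-Za-z0-9]", "", url)
    return name[:MAX_NAME_LENGTH]


print(function_name_from_url("https://www.example.com/status"))
# httpswwwexamplecomstatus
```

This only "somewhat" guarantees uniqueness, as the article says: a-b.com and ab.com collapse to the same name, so a real service would want a hash or an ID in there too.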

After we do this we then create our Code dictionary object, which simply passes in our lambda.zip file as bytes. We then pass both the env dict and the code dict into Boto3’s create_function() call, which should hopefully create our Lambda function for us.
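Here is a hedged sketch of that creation step. The keyword arguments are real create_function parameters, but the SITE_URL variable name, the handler path, the runtime and the 30-second timeout are my assumptions, not values taken from the repo.

```python
def build_create_function_kwargs(function_name, site_url, zip_bytes, role_arn,
                                 runtime="python3.12"):
    """Assemble the arguments for create_function.

    Assumptions (hypothetical): the env var is called SITE_URL and the
    zip contains lambda_function.py exposing lambda_handler.
    """
    return {
        "FunctionName": function_name,
        "Runtime": runtime,
        "Role": role_arn,  # IAM role the function runs as
        "Handler": "lambda_function.lambda_handler",
        "Code": {"ZipFile": zip_bytes},  # raw bytes of lambda.zip
        "Environment": {"Variables": {"SITE_URL": site_url}},
        "Timeout": 30,  # seconds; the default of 3 is tight for a slow site
    }


def create_monitor_function(lambda_client, function_name, site_url, zip_bytes, role_arn):
    """Create the function and return its ARN.

    lambda_client is assumed to be boto3.client("lambda").
    """
    kwargs = build_create_function_kwargs(function_name, site_url, zip_bytes, role_arn)
    response = lambda_client.create_function(**kwargs)
    return response["FunctionArn"]
```

The bytes themselves would come from something like open("lambda.zip", "rb").read().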

After this we then try to attach a CloudWatch Event trigger to our newly created Lambda function using the put_targets() function. We pass in the name of the 15-minute rule that we’ve already defined in the AWS console, along with the list of targets, in this case a single target.
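A sketch of the trigger wiring, assuming events_client is boto3.client("events") and lambda_client is boto3.client("lambda"). The add_permission call is my addition: when targets are attached through the API rather than the console, the rule generally needs explicit permission to invoke the function. The helper name and the StatementId format are made up.

```python
def attach_to_schedule(events_client, lambda_client, rule_name, function_name, function_arn):
    """Point an existing scheduled rule at a Lambda function.

    Returns True when put_targets reports no failed entries.
    """
    # Look up the ARN of the rule that was created by hand in the console.
    rule_arn = events_client.describe_rule(Name=rule_name)["Arn"]

    # Allow the rule to invoke this function (assumed necessary outside the console).
    lambda_client.add_permission(
        FunctionName=function_name,
        StatementId=f"{function_name}-schedule",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )

    # Register the function as a target of the rule: a list with a single entry.
    response = events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": function_name, "Arn": function_arn}],
    )
    return response.get("FailedEntryCount", 0) == 0
```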

Once this has all happily run we then return a simple non-JSON (I know… sorry) response.

In the GitHub repo I’ve included a Dockerfile that can be built and deployed on AWS ECS should you wish to go down that route. For now I’m keeping it all on localhost because there is still clearly a lot more development work to be done.

The Lambdas

The Lambda functions that we will be dynamically creating will essentially try to fetch whatever URL we pass into them using a simple GET request. Each one then runs every 15 minutes, based on the CloudWatch Event rule that we are using to trigger every function.

Should any of these scheduled requests fail, an event is automatically published to an SNS topic, which then sends an email and a text message to whoever is subscribed to that topic.

The Code

The code for our lambda function is incredibly simple. All it does is try to send a GET request to a URL which it picks up from an environment variable that is set when we dynamically create the function. If this request fails then it throws an exception.
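A minimal sketch of such a handler, using only the standard library so the zip needs no extra dependencies. The SITE_URL variable name and the check_site helper are assumptions for illustration; the repo may well use a different name or HTTP library.

```python
import os
import urllib.request


def check_site(url, opener=urllib.request.urlopen, timeout=10):
    """GET the URL and return its HTTP status code.

    urlopen raises for connection failures and for 4xx/5xx responses,
    which is exactly what we want: an uncaught exception marks the
    invocation as failed.
    """
    with opener(url, timeout=timeout) as response:
        return response.status


def lambda_handler(event, context):
    # SITE_URL is a hypothetical name for the env var set at creation time.
    url = os.environ["SITE_URL"]
    status = check_site(url)
    return {"url": url, "status": status}
```

The opener parameter is only there so the check can be exercised without touching the network.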

If we wanted our system to notify us of any error then we could do this in one of two ways. We could either extend our Lambda function with the code necessary for it to publish to an SNS topic, or we could set up a CloudWatch alarm that fires whenever one of our Lambda functions reports an error.

For simplicity’s sake we’ll go for the latter:
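If you would rather script the alarm than click through the console, a rough Boto3 equivalent might look like this. It assumes cloudwatch_client is boto3.client("cloudwatch") and that you want one alarm per function; the alarm name format and the helper name are made up.

```python
def create_error_alarm(cloudwatch_client, function_name, topic_arn, period_seconds=900):
    """Alarm when a function reports at least one error in a period.

    Assumption: one alarm per monitored function, notifying a single SNS topic.
    """
    alarm_name = f"{function_name}-errors"
    cloudwatch_client.put_metric_alarm(
        AlarmName=alarm_name,
        Namespace="AWS/Lambda",
        MetricName="Errors",
        Dimensions=[{"Name": "FunctionName", "Value": function_name}],
        Statistic="Sum",
        Period=period_seconds,  # 900s lines up with the 15-minute schedule
        EvaluationPeriods=1,
        Threshold=1,
        ComparisonOperator="GreaterThanOrEqualToThreshold",
        TreatMissingData="notBreaching",  # no data points means no errors
        AlarmActions=[topic_arn],  # SNS topic that emails/texts subscribers
    )
    return alarm_name
```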

And boom! We have an alerting system that will email us whenever our system is down.

The list of people subscribed to this SNS topic could be modified dynamically by the frontend but we need to leave some work for us to do once we get past the first funding round for our startup.
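For when that funding round lands, adding a subscriber is a single SNS call. This sketch assumes sns_client is boto3.client("sns"); the crude "contains an @" check and the helper name are purely illustrative.

```python
def subscribe_contact(sns_client, topic_arn, contact):
    """Subscribe an email address or phone number to the alert topic.

    Returns the protocol chosen and the subscription ARN from SNS.
    Email subscriptions stay pending until the recipient confirms.
    """
    # Naive guess: anything with an @ is an email, everything else a phone number.
    protocol = "email" if "@" in contact else "sms"
    response = sns_client.subscribe(
        TopicArn=topic_arn,
        Protocol=protocol,
        Endpoint=contact,
    )
    return protocol, response.get("SubscriptionArn")
```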

We Are In Production!

In less than half an hour we’ve been able to successfully construct an entire monitoring solution using a series of lambda functions that is now live and in production.

It is not only incredibly resilient, it also scales to potentially thousands of sites, and the cost of doing so is minimal because each of the Lambdas we are creating is so small.
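To put a rough number on "minimal", here is a back-of-the-envelope sketch. The invocation count follows directly from the 15-minute schedule; the price per million requests and the free monthly allowance are assumptions passed in as parameters, so check the current Lambda price list before trusting the result. Duration charges are ignored entirely.

```python
def monthly_invocations(sites, interval_minutes=15, days=30):
    """How many times the checks run in a month."""
    runs_per_day = (24 * 60) // interval_minutes  # 96 at 15-minute intervals
    return sites * runs_per_day * days


def monthly_request_cost(invocations, price_per_million=0.20, free_requests=1_000_000):
    """Request charges only; both default figures are assumed, not quoted."""
    billable = max(0, invocations - free_requests)
    return billable / 1_000_000 * price_per_million


print(monthly_invocations(1))     # 2880
print(monthly_invocations(1000))  # 2880000
```

Under those assumed prices, a thousand sites come to well under a dollar a month in request charges.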

The frontend and the API that orchestrates the addition of new sites through the creation of new Lambdas haven’t been deployed to ECS in this article, as I felt that was slightly out of scope, but there is a Dockerfile there should you want to make a start on this. If you were feeling really meta you could totally serve the frontend application from a Lambda function, front it with API Gateway, and then dice up the Python API into a series of Lambda endpoints, but I’ll leave that as a follow-up task for the reader!

There is still a hell of a lot that could be done to improve this, but I’m hoping this simple example helps to highlight the incredible power of Lambda functions and how they can be used. In a traditional application you would not only have had to write a hell of a lot more code, you would also have had to deploy it across multiple availability zones. These all represent extra complexity that would have taken far more time to build and to release.

Conclusion

Hopefully you found this article insightful and enjoyable! If you need any further assistance setting up these lambda functions then please feel free to leave a comment in the comments section below or tweet me: Elliot Forbes

I’m also on LinkedIn should you wish to connect!

For those that are following me regularly and enjoying my style of writing I’m publishing my second book through LeanPub titled: An Introduction to Cloud Development Through Story! If you fancy supporting both myself and the EFF check it out!
