The Use Case

A recent client needed their platform to send emails at scheduled date-times. The emails were sent in groups, and the client's third-party email API could not support the scheduling they needed. Although this article discusses email scheduling, the same design pattern works for other scheduled events, e.g. report generation, billing or database backups.

How Step Functions can help

Step Functions let us build Finite State Machine (FSM) workflows. These workflows can contain many steps and transitions (nodes and edges), or be a simple linear sequence of states. The state type of interest here is the Wait state (written WAIT in this article), which pauses for a given number of seconds or until a given date-time. We will use the second form: waiting until a given date-time.
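The date-time form of the Wait state expects an ISO 8601 timestamp such as 2016-03-14T01:59:00Z. As a minimal sketch (the helper name to_wait_timestamp is my own, not part of any AWS SDK), a Python datetime could be normalised to UTC and formatted like this:

```python
from datetime import datetime, timedelta, timezone


def to_wait_timestamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as a UTC ISO 8601 string.

    Hypothetical helper for illustration; it is not an AWS API.
    """
    if moment.tzinfo is None:
        # A naive datetime is ambiguous, so refuse it rather than guess.
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Example: 09:00 in a UTC+2 timezone becomes 07:00 UTC.
local_time = datetime(2030, 1, 1, 9, 0, tzinfo=timezone(timedelta(hours=2)))
print(to_wait_timestamp(local_time))  # 2030-01-01T07:00:00Z
```

Rejecting naive datetimes is a design choice in this sketch: it avoids silently scheduling an email in the wrong timezone.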

Billing of Step Functions

Serverless services are often billed for compute time, so a WAIT state that sits idle for days sounds like a bad idea. Luckily, Step Functions (in the Standard workflow type) are billed on the number of state transitions rather than on duration, which is great news for us.
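To make the billing model concrete, here is a rough back-of-the-envelope sketch. Every number in it is a placeholder assumption: the execution count, the transitions per execution and the price per 1,000 transitions should all be replaced with your own figures and the current AWS pricing for your region.

```python
def estimate_transition_cost(executions, transitions_per_execution, price_per_1000):
    """Estimate cost when billing is per state transition, not per second waited."""
    total_transitions = executions * transitions_per_execution
    return total_transitions / 1000 * price_per_1000


# Placeholder inputs (assumptions, not quoted prices).
monthly_cost = estimate_transition_cost(
    executions=10_000,            # scheduled emails per month
    transitions_per_execution=3,  # assumed count for a Wait -> Push workflow
    price_per_1000=0.025,         # assumed price in USD per 1,000 transitions
)
print(f"Estimated monthly cost: ${monthly_cost:.2f}")
```

Note that the length of the wait does not appear anywhere in the calculation, which is the point of the pattern.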

Our Step Function

When our Step Function is triggered, it transitions to the WAIT state, waits until the given date-time and then transitions to the next state.

The next state is called Push in our diagram because it triggers a Lambda function that pushes an event to an SQS queue. This is explained in the next section.

Defining this in CloudFormation is fairly verbose. Luckily there is a plugin for the Serverless Framework to allow Step Functions to be defined easily.

https://serverless.com/plugins/serverless-step-functions/

The definition of our FSM in the serverless.yml using the serverless-step-functions plugin looks as follows.
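The plugin's YAML maps onto the Amazon States Language, so the shape of the state machine can also be sketched as a Python dictionary. This is an illustration rather than the original serverless.yml: the state name WaitForDueDate, the function build_definition and the placeholder ARN are my assumptions, while the Push state and the dueDate key come from this article.

```python
import json


def build_definition(push_lambda_arn):
    """Build a two-state machine: wait until $.dueDate, then invoke a Lambda."""
    return {
        "Comment": "Wait until the due date, then push the event to SQS",
        "StartAt": "WaitForDueDate",
        "States": {
            "WaitForDueDate": {
                "Type": "Wait",
                # Read the timestamp from the execution input at runtime.
                "TimestampPath": "$.dueDate",
                "Next": "Push",
            },
            "Push": {
                "Type": "Task",
                "Resource": push_lambda_arn,
                "End": True,
            },
        },
    }


# Placeholder ARN for illustration only.
definition = build_definition(
    "arn:aws:lambda:eu-west-1:123456789012:function:push_email"
)
print(json.dumps(definition, indent=2))
```

Using TimestampPath rather than a fixed Timestamp is what lets every execution carry its own due date.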

What does our Step Function link with?

Our Step Function lets us trigger an action when a given date-time arrives, but it needs three more pieces: a Lambda (behind API Gateway) to start it, an SQS queue to allow retry logic, and finally a Lambda to send the email.

A request to schedule something will start in the top right of the above diagram — going through API Gateway to a Lambda function to trigger the Step Function invocation, giving a date-time to wait for and some data about the emails to send.

It will then be held in the WAIT step until the given date-time is reached. It will then transition to the next step, which is a Lambda function that pushes the event into an SQS queue.

This queue gives us retry logic and a dead-letter queue in case there is a failure with our third-party email provider.

The final Lambda function will take the event from SQS, use its content to know the context of the email and potentially pull more data from the application database before calling the third party to send the email.

Configuring the SQS Queue

The SQS Queue can be configured directly in the serverless.yml, with the Send Email Lambda configured to listen to events on the queue.

The lower resources section creates the SQS queue, which the events section of the email-sending Lambda can then reference via the GetAtt function.
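For readers who think more easily in data structures than in YAML, the same wiring can be sketched as Python dictionaries that mirror the CloudFormation resources and the function's event source. The logical names EmailQueue and EmailDeadLetterQueue and the retry count of 3 are assumptions for illustration, not values from the original configuration.

```python
import json

MAX_RECEIVE_COUNT = 3  # assumed number of delivery attempts before dead-lettering

# Mirrors the "resources" section: a main queue plus its dead-letter queue.
resources = {
    "EmailDeadLetterQueue": {"Type": "AWS::SQS::Queue"},
    "EmailQueue": {
        "Type": "AWS::SQS::Queue",
        "Properties": {
            "RedrivePolicy": {
                "deadLetterTargetArn": {
                    "Fn::GetAtt": ["EmailDeadLetterQueue", "Arn"]
                },
                "maxReceiveCount": MAX_RECEIVE_COUNT,
            }
        },
    },
}

# Mirrors the "events" section of the email-sending function.
send_email_events = [{"sqs": {"arn": {"Fn::GetAtt": ["EmailQueue", "Arn"]}}}]

print(json.dumps({"resources": resources, "events": send_email_events}, indent=2))
```

The redrive policy is what moves a message to the dead-letter queue once it has been received, and failed, the configured number of times.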

Now we have the Step Function to “wait” and the SQS queue trigger — let’s look at the flow of our event before looking at the code of the 3 Lambdas that tie it all together.

Event Flow

Our event can be as simple as the text of an email to send and the “to” and “from” email addresses — or as complex as a template ID and some sort of tag ID for people in our system who should receive the email. We’re going to stay simple in this article.

Our event is going to remain unchanged throughout the flow of the system and will represent our email to be sent as follows:
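As a hedged illustration of such an event, the sketch below uses the dueDate key that the WAIT state reads; the nested email object and its to, from, subject and body keys are assumed names chosen for this example, as is the validate_event helper.

```python
import json

# Illustrative event; only "dueDate" is a key named in this article.
email_event = {
    "dueDate": "2030-01-01T09:00:00Z",
    "email": {
        "to": "reader@example.com",
        "from": "news@example.com",
        "subject": "Your scheduled email",
        "body": "Hello from the future.",
    },
}


def validate_event(event):
    """Return the list of missing keys (empty when the event is usable)."""
    missing = [key for key in ("dueDate", "email") if key not in event]
    for key in ("to", "from", "subject", "body"):
        if key not in event.get("email", {}):
            missing.append(f"email.{key}")
    return missing


print(json.dumps(email_event, indent=2))
print(validate_event(email_event))  # []
```

Because the event is plain JSON, it can pass unchanged through API Gateway, the Step Function and SQS.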

The event flows from the schedule_email Lambda, through the Step Function FSM, then via the push_email Lambda into SQS, where it is finally processed by the send_email Lambda.

Getting Functional

Let’s look at the 3 functions that make up the flow of our scheduled events, in order of invocation. These are written in Python, but I’ve used the same approach in Node.js and (long story) PHP.

schedule_email

This function takes the event passed through from API Gateway and uses the AWS SDK (boto3 in Python) to start the execution of a Step Function, passing through the body of the event. The Lambda then responds with a 200 status code, and our Step Function waits in the background.
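A minimal sketch of what such a function could look like follows. It is not the original code: the STATE_MACHINE_ARN environment variable and the optional sfn_client parameter (added so the function can be exercised without AWS) are assumptions. The start_execution call is the boto3 Step Functions client method.

```python
import json
import os


def schedule_email(event, context, sfn_client=None):
    """Start a Step Function execution from an API Gateway proxy event."""
    if sfn_client is None:
        # Imported lazily so the sketch can be tested with a fake client.
        import boto3

        sfn_client = boto3.client("stepfunctions")

    # API Gateway delivers the body as a JSON string; parse it to fail early
    # on malformed input, then pass it on unchanged as the execution input.
    payload = json.loads(event["body"])

    sfn_client.start_execution(
        stateMachineArn=os.environ["STATE_MACHINE_ARN"],  # assumed variable name
        input=json.dumps(payload),
    )
    return {"statusCode": 200, "body": json.dumps({"scheduled": True})}
```

The execution input must contain the date-time key that the WAIT state reads, otherwise the execution will fail as soon as it starts.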

Note we’re telling the WAIT state to use the dueDate key from the event. This is configured in line 11 of the Step Function snippet in the Step Function section above.

push_email

The Push Email Lambda will then be triggered once the scheduled date-time has been reached. This function again uses the AWS SDK (boto3), this time to push the event into the SQS queue.
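A hedged sketch of this function is shown below. The EMAIL_QUEUE_URL environment variable and the injectable sqs_client parameter are assumptions made for illustration; send_message with QueueUrl and MessageBody is the boto3 SQS client call.

```python
import json
import os


def push_email(event, context, sqs_client=None):
    """Forward the Step Function state input to the SQS queue unchanged."""
    if sqs_client is None:
        # Imported lazily so the sketch can be tested with a fake client.
        import boto3

        sqs_client = boto3.client("sqs")

    # The Task state passes the execution input to the Lambda as a dict,
    # so it is serialised back to JSON for the message body.
    sqs_client.send_message(
        QueueUrl=os.environ["EMAIL_QUEUE_URL"],  # assumed variable name
        MessageBody=json.dumps(event),
    )
    return event
```

Returning the event keeps it visible as the state's output in the Step Functions execution history, which can help when debugging.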

The SQS queue has been added so that retry logic and a dead-letter queue can be configured.

send_email

This function takes the event data, passed through the steps above, from the SQS queue and instantiates an instance of the email-sending client from our email provider. The data from the event is used to build the email message, and once it has been sent the Lambda returns 200.
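The sketch below shows the general shape, with the provider deliberately left abstract. send_via_provider is a hypothetical stand-in for whichever email SDK you use, and the email keys (to, from, subject, body) are assumed names; the Records and body fields are how Lambda delivers SQS messages.

```python
import json


def send_via_provider(to, sender, subject, body):
    """Hypothetical stand-in for the email provider's client call."""
    raise NotImplementedError("replace with your email provider's SDK call")


def send_email(event, context, send=send_via_provider):
    """Send one email per SQS record in the incoming batch."""
    for record in event["Records"]:
        message = json.loads(record["body"])
        email = message["email"]  # assumed event shape
        # Any exception raised here fails the invocation, so SQS makes the
        # message visible again and eventually routes it to the dead-letter
        # queue if one is configured.
        send(
            to=email["to"],
            sender=email["from"],
            subject=email["subject"],
            body=email["body"],
        )
    return {"statusCode": 200}
```

Letting provider errors propagate, instead of catching them, is what hands retry handling over to the queue.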

Why?

There are several established ways of scheduling emails, from using the provider’s service to polling your own database. An odd requirement of our client meant that we could not use the email provider’s scheduling, and there are other events you may want to schedule that no provider handles for you.

On the database polling side, that is a well-established approach, and with a Lambda a cron is easy to set up. However, we found polling less elegant: it generated lots of log data for events that were not of interest, and the frequency of the cron would limit how precise our scheduling was.

Conclusion