Imagine a new user signs up for your service. You send an automated welcome message to your new user explaining how the service works. But what if your user struggles with the first steps? You want to send a second email with additional information. To abstract this a little bit, the following steps are needed:

1. Send a welcome message to the new user.
2. Wait some time.
3. Check if the user completed the initial steps.
   a. If yes, done.
   b. If no, continue.
4. Send a message with additional information to the new user.
5. Wait some time.
6. Check if the user completed the initial steps.
   a. If yes, done.
   b. If no, continue.
7. Send a message to the new user offering a Chime call.

This is nothing more than a state machine. It has a start (new user signed up) and an end (the last message was sent) and a few state transitions in between. With AWS Step Functions, you can implement a state machine. To do so, you have to translate the steps into the right format and implement the business logic. I will use AWS Lambda to implement the business logic in this post. Let’s get started.

Anatomy of a state machine in AWS Step Functions

A state machine in AWS Step Functions can take input data in JSON and consists of states:

There is one start state that gets the input when starting the state machine.

Each state can either be an end state or will point to the next state.

There are one or many end states.

A state is of a specific type.

By default, a state passes its input through unchanged as its output. Some state types change this.

In this example, four different state types are used, but there are many more. The four state types used are:

Type | Description
Task | Calls a Lambda function. The event of the Lambda function is the input of the state. By default, the output of the Lambda function is the output of the state. If the Lambda function fails, it can be retried.
Wait | Waits for a specific amount of time in seconds. You are not billed for the waiting time.
Choice | So far, a state has only one next state. But sometimes you need to make a choice (e.g., if the user completed initial steps, then ..., else ...). Depending on a precondition, you can have several next states.
Succeed | Indicates a successful end of a state machine.

Now, you have to map the engagement steps from the introduction to states.

Example state machine in AWS Step Functions

The start state is SendMessage1.

Id | Type | Description | Next
SendMessage1 | Task | Send a welcome message to the new user. | Wait1
Wait1 | Wait | Wait some time. | FetchActivityCount1
FetchActivityCount1 | Task | Fetch number of activities the new user performed. | CheckActivityCount1
CheckActivityCount1 | Choice | Did the user complete the initial steps? | If yes, then Done, else SendMessage2
SendMessage2 | Task | Send a message with additional information to the new user. | Wait2
Wait2 | Wait | Wait some time. | FetchActivityCount2
FetchActivityCount2 | Task | Fetch number of activities the new user performed. | CheckActivityCount2
CheckActivityCount2 | Choice | Did the user complete the initial steps? | If yes, then Done, else SendMessage3
SendMessage3 | Task | Send a message to the new user offering a Chime call. | Done
Done | Succeed | Done. | -
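To make the table more tangible, here is a minimal local sketch that treats it as a transition table and walks it in plain Node.js. It is not how Step Functions executes a state machine; the walk function, the yes/no fields, and the rule that "completed the initial steps" means activities > 0 are assumptions made for illustration only.

```javascript
// Transition table mirroring the state table above.
// Assumption: Choice states go to "yes" when activities > 0, else to "no".
const states = {
  SendMessage1: { type: 'Task', next: 'Wait1' },
  Wait1: { type: 'Wait', next: 'FetchActivityCount1' },
  FetchActivityCount1: { type: 'Task', next: 'CheckActivityCount1' },
  CheckActivityCount1: { type: 'Choice', yes: 'Done', no: 'SendMessage2' },
  SendMessage2: { type: 'Task', next: 'Wait2' },
  Wait2: { type: 'Wait', next: 'FetchActivityCount2' },
  FetchActivityCount2: { type: 'Task', next: 'CheckActivityCount2' },
  CheckActivityCount2: { type: 'Choice', yes: 'Done', no: 'SendMessage3' },
  SendMessage3: { type: 'Task', next: 'Done' },
  Done: { type: 'Succeed' },
};

// Walks the table from startAt and returns the ids of all visited states.
// activityCounts fakes the result of each FetchActivityCount task, in order.
function walk(startAt, activityCounts) {
  const visited = [];
  const counts = [...activityCounts];
  let data = {};
  let current = startAt;
  while (current !== undefined) {
    const state = states[current];
    visited.push(current);
    if (state.type === 'Succeed') break;
    if (current.startsWith('FetchActivityCount')) {
      // Pretend the task returned the next fake count (0 if none is left).
      data = { activities: counts.length > 0 ? counts.shift() : 0 };
    }
    if (state.type === 'Choice') {
      current = data.activities > 0 ? state.yes : state.no;
    } else {
      current = state.next;
    }
  }
  return visited;
}

// User is active after the first message: the flow ends early.
console.log(walk('SendMessage1', [1]).join(' -> '));
// User never becomes active: all three messages are sent.
console.log(walk('SendMessage1', [0, 0]).join(' -> '));
```

The first call stops at Done right after CheckActivityCount1, while the second one passes through every state in the table.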

Now, the state machine is defined. Are you surprised by the states FetchActivityCount1 and CheckActivityCount1? The step "Check if the user completed the initial steps" was translated into two states:

Task FetchActivityCount1: Fetch number of activities the new user performed.

Choice CheckActivityCount1: Did the user complete the initial steps?

The reason for this is that a state can either do something (like getting the number of activities performed by the user from the database) or it can make a choice. You cannot do both in a single state. Also, the Lambda function cannot perform that choice for you. Only the state machine can make a choice based on input data.
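The split between "fetch data" and "choose" can be sketched locally. In the snippet below, the rule object borrows the field names Variable, NumericGreaterThan, Next, and Default from the Amazon States Language, but chooseNext is a hypothetical, heavily simplified evaluator, and the threshold (more than 0 activities counts as "completed") is an assumption.

```javascript
// A Choice state in the shape used by the Amazon States Language.
// Assumption: the user "completed the initial steps" when activities > 0.
const checkActivityCount1 = {
  Type: 'Choice',
  Choices: [
    { Variable: '$.activities', NumericGreaterThan: 0, Next: 'Done' },
  ],
  Default: 'SendMessage2',
};

// Hypothetical evaluator: supports only top-level "$.field" paths and the
// NumericGreaterThan comparison. The real service supports far more.
function chooseNext(choiceState, input) {
  for (const rule of choiceState.Choices) {
    const field = rule.Variable.replace(/^\$\./, '');
    const value = input[field];
    if (typeof value === 'number' && value > rule.NumericGreaterThan) {
      return rule.Next;
    }
  }
  // No rule matched: fall back to the default transition.
  return choiceState.Default;
}

// The task only produces data such as { activities: 1 };
// the decision is made from that data by the Choice rule.
console.log(chooseNext(checkActivityCount1, { activities: 1 })); // Done
console.log(chooseNext(checkActivityCount1, { activities: 0 })); // SendMessage2
```

The point of the sketch is the division of labor: the task returns plain data, and the transition is derived from that data by a declarative rule rather than by code inside the Lambda function.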

Now, the business logic (states of type Task) needs to be implemented.

Implementing tasks

A task can either call a Lambda function or an activity. If your business logic cannot be implemented with Lambda, you can fall back to activities. I will not cover activities in this example.

Send welcome message

Here is a dummy implementation in Node.js that fails 30% of the time to demonstrate how retries work.



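module.exports.handler = (event, context, cb) => {
  // Log the incoming event (the input of the state) for debugging.
  console.log(JSON.stringify(event));
  if (Math.random() < 0.3) {
    // Fail roughly 30% of the time so that retries can be observed.
    cb(new Error('error happened'));
  } else {
    // Succeed with an empty result.
    cb(null, {});
  }
};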
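To get a feeling for what a retry means for a callback-style handler like this one, here is a small local simulation. In Step Functions, retries are configured on the Task state and carried out by the service, not by your code; invokeWithRetry and flakyHandler below are hypothetical helpers that exist only for this sketch.

```javascript
// Hypothetical local helper: calls a callback-style Lambda handler and
// retries on error, making at most maxAttempts calls in total.
// done receives (err, result, attempts).
function invokeWithRetry(handler, event, maxAttempts, done) {
  const attempt = (n) => {
    handler(event, {}, (err, result) => {
      if (!err) return done(null, result, n);
      if (n >= maxAttempts) return done(err, undefined, n);
      attempt(n + 1);
    });
  };
  attempt(1);
}

// Deterministic stand-in for the random dummy handler:
// fails on the first two calls, then succeeds.
let calls = 0;
const flakyHandler = (event, context, cb) => {
  calls += 1;
  if (calls <= 2) {
    cb(new Error('error happened'));
  } else {
    cb(null, {});
  }
};

invokeWithRetry(flakyHandler, { userId: 'example' }, 3, (err, result, attempts) => {
  if (err) {
    console.log(`gave up after ${attempts} attempts: ${err.message}`);
  } else {
    console.log(`succeeded after ${attempts} attempts`);
  }
});
```

With a limit of three attempts, the stand-in handler succeeds on the last allowed call. With a limit of two, the error would be passed on to the caller instead.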



Fetch number of activities

Here is a dummy implementation in Node.js that fails 30% of the time. When it succeeds, it reports in about half of the cases that the user did not complete any activities.



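module.exports.handler = (event, context, cb) => {
  // Log the incoming event (the input of the state) for debugging.
  console.log(JSON.stringify(event));
  if (Math.random() < 0.3) {
    // Fail roughly 30% of the time so that retries can be observed.
    cb(new Error('error happened'));
  } else {
    // Report 0 or 1 activities, depending on the parity of the timestamp.
    cb(null, { activities: Date.now() % 2 });
  }
};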



So far, the state machine exists only as a table, not in a machine-readable format. You will change this in the next section.

Translate the state machine to JSON

State machines are defined in a JSON document like this:

{
  "Comment": "AWS Step Functions Example",
  "StartAt": "SendMessage1",
  "Version": "1.0",
  "States": {

  }
}



The StartAt property defines the first state in the state machine. Let’s see how states are defined.
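Because StartAt and every transition refer to states by name, a quick local check can catch names that are not defined yet. The sketch below runs against the skeleton from above with the states still left out; findMissingStates is a hypothetical helper, not part of any AWS SDK, and it only looks at StartAt, Next, Default, and the Next of Choice rules.

```javascript
// The skeleton from above as a JavaScript object, states still left out.
const definition = {
  Comment: 'AWS Step Functions Example',
  StartAt: 'SendMessage1',
  Version: '1.0',
  States: {},
};

// Hypothetical helper: collects state names that are referenced by StartAt,
// Next, Default, or a Choice rule's Next but are not defined in States.
function findMissingStates(def) {
  const referenced = new Set([def.StartAt]);
  for (const state of Object.values(def.States)) {
    if (state.Next) referenced.add(state.Next);
    if (state.Default) referenced.add(state.Default);
    for (const rule of state.Choices || []) {
      if (rule.Next) referenced.add(rule.Next);
    }
  }
  return [...referenced].filter((name) => !(name in def.States));
}

// The skeleton references SendMessage1 but does not define it yet.
console.log(findMissingStates(definition));
```

Once all states from the table are filled in, the same check should return an empty list.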