Canary What?

In software testing, a canary (also called a canary test) is a push of programming code changes to a small number of end-users who have not volunteered to test anything. The goal of a canary test is to make sure code changes are transparent and work in a real-world environment.

Canary tests, which are often automated, are run after testing in a sandbox environment has been completed. Because the canary is only pushed to a small number of users, its impact is relatively small should the new code prove to be buggy, and changes can be reversed quickly.

Source: Quora
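To make the definition above concrete, here is a minimal sketch of how a small, stable subset of users could be selected for a canary release. It is only an illustration of the idea, not how any particular platform does it: the function name `in_canary_group`, the user ID format, and the 5% figure are all assumptions made for the example.

```python
import hashlib


def in_canary_group(user_id: str, canary_percent: float) -> bool:
    """Return True if this user should receive the canary build.

    Hashing the user ID gives a stable bucket in [0, 10000), so the same
    user always lands in the same group for a given percentage.
    """
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 10_000
    # canary_percent=5 selects buckets 0..499, i.e. roughly 5% of users.
    return bucket < canary_percent * 100


if __name__ == "__main__":
    users = [f"user-{i}" for i in range(1000)]  # hypothetical user IDs
    canary_users = [u for u in users if in_canary_group(u, 5)]
    print(f"{len(canary_users)} of {len(users)} users get the canary")
```

With this kind of scheme, setting the percentage back to 0 sends everyone to the stable version again, which is the "changes can be reversed quickly" part of the definition.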

CloudWatch Synthetics (Open Preview)

CloudWatch (CW) Synthetics supports monitoring of your REST APIs, URLs, and website content, checking for unauthorized changes from phishing, code injection, and cross-site scripting. CloudWatch Synthetics runs tests on your endpoints every minute, 24x7, and alerts you when your application endpoints don’t behave as expected. These tests can be customized to check for availability, latency, transactions, broken or dead links, step-by-step task completions, page load errors, load latencies for UI assets, complex wizard flows, or checkout flows in your applications. You can also use CloudWatch Synthetics to isolate alarming application endpoints and map them back to underlying infrastructure issues to reduce mean time to resolution.

Under the hood, AWS uses Lambda and a Node.js runtime with Puppeteer.

To start with, you can rely on the following blueprints:

Simple Heartbeat monitoring

API Canary

Broken link checker

GUI workflow builder (like Selenium)

Scheduling:

Run once

Every minute

Every 5 minutes

Every hour

Required Permissions

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "iam:CreateRole",
        "iam:CreatePolicy",
        "iam:AttachRolePolicy"
      ],
      "Resource": [
        "arn:aws:iam::*:role/service-role/CloudWatchSyntheticsRole*",
        "arn:aws:iam::*:policy/service-role/CloudWatchSyntheticsPolicy*"
      ]
    }
  ]
}

As you can see in this policy, the permissions required to use this service amount to nothing less than full administrator access to the AWS account. One good point is that AWS has added a warning about this in the documentation.

Thresholds

A CW alarm is automatically created when you create a new canary test. Don’t forget to update the CW alarm with an SNS topic so that you are notified when alarm thresholds are crossed.
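If you would rather script that alarm update than click through the console, the regular CloudWatch alarm API can do it (this is the alarm API, not a Synthetics one). Below is a rough sketch using boto3, which is a third-party package you would need to install. The alarm name and topic ARN you pass in are yours to supply; the helper names `with_sns_action` and `attach_topic` are made up for this example. Because PutMetricAlarm overwrites the whole alarm definition, the sketch copies the existing settings before adding the topic.

```python
# Parameters that PutMetricAlarm accepts and DescribeAlarms may return.
PUT_ALARM_KEYS = (
    "AlarmName", "AlarmDescription", "OKActions", "InsufficientDataActions",
    "MetricName", "Namespace", "Statistic", "ExtendedStatistic", "Dimensions",
    "Period", "Unit", "EvaluationPeriods", "DatapointsToAlarm", "Threshold",
    "ComparisonOperator", "TreatMissingData",
    "EvaluateLowSampleCountPercentile", "Metrics", "ThresholdMetricId",
)


def with_sns_action(alarm: dict, topic_arn: str) -> dict:
    """Build PutMetricAlarm arguments from a DescribeAlarms entry,
    keeping the existing settings and adding the SNS topic as an action."""
    params = {key: alarm[key] for key in PUT_ALARM_KEYS if key in alarm}
    actions = list(alarm.get("AlarmActions", []))
    if topic_arn not in actions:
        actions.append(topic_arn)
    params["AlarmActions"] = actions
    params["ActionsEnabled"] = True
    return params


def attach_topic(alarm_name: str, topic_arn: str) -> None:
    """Look up an existing alarm and re-save it with the SNS topic attached."""
    import boto3  # third-party AWS SDK, imported here so the helper above has no dependency

    cloudwatch = boto3.client("cloudwatch")
    alarms = cloudwatch.describe_alarms(AlarmNames=[alarm_name])["MetricAlarms"]
    if not alarms:
        raise LookupError(f"No metric alarm named {alarm_name!r}")
    cloudwatch.put_metric_alarm(**with_sns_action(alarms[0], topic_arn))
```

Calling `attach_topic` with the name of the alarm created for your canary and the ARN of your SNS topic needs AWS credentials with permission to describe and put alarms. Treat it as a starting point and check the resulting alarm in the console.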

Create your first test

Enable Thresholds

Mandatory if you want to link CW Alarm and get notifications.

Create your first Canary test

If you choose Heartbeat monitoring, you just have to fill in the name of your test and the target website (prefer HTTPS).
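To give an idea of what a heartbeat check boils down to, here is a small standalone sketch using only the Python standard library. It is not the code the blueprint generates (that runs on Node.js with Puppeteer, as mentioned earlier); the names `check_heartbeat` and `is_healthy`, the 5-second latency limit, and the "2xx or 3xx means healthy" rule are assumptions chosen for the illustration.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Optional


@dataclass
class HeartbeatResult:
    url: str
    status: Optional[int]  # None when no HTTP response was received
    latency_ms: float
    healthy: bool


def is_healthy(status: Optional[int], latency_ms: float,
               max_latency_ms: float = 5000.0) -> bool:
    """A response counts as healthy if it is 2xx/3xx and fast enough."""
    return (status is not None
            and 200 <= status < 400
            and latency_ms <= max_latency_ms)


def check_heartbeat(url: str, timeout_s: float = 10.0,
                    max_latency_ms: float = 5000.0) -> HeartbeatResult:
    """Fetch the URL once and record its status code and latency."""
    started = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout_s) as response:
            status = response.status
    except urllib.error.HTTPError as err:
        status = err.code  # the server answered, but with a 4xx/5xx code
    except OSError:
        status = None  # DNS failure, refused connection, timeout, ...
    latency_ms = (time.monotonic() - started) * 1000
    return HeartbeatResult(url, status, latency_ms,
                           is_healthy(status, latency_ms, max_latency_ms))
```

Calling `check_heartbeat("https://example.com")` (a placeholder URL) returns a `HeartbeatResult` whose `healthy` flag you could feed into whatever alerting you use. The managed service adds the scheduling, screenshots, and metrics on top of this basic idea.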

Automation

Terraform

This new service is in preview, so not yet available in Terraform.

CloudFormation

This new service is in preview, so not yet available in CloudFormation.

API / CLI

Not yet. Dudes.

It could probably be discovered by inspecting the calls made by the AWS Console, but… I don’t have time to do this.

Pricing

$0.0014 per canary run in eu-west-1

Pricing Examples

You run a single test every 5 minutes:

8,928 canary runs x 0.0014 USD = 12.499 USD / month

5 alarms = 5 x 0.10 USD = 0.50 USD / month

You run a single test every minute:

44,640 canary runs x 0.0014 USD = 62.496 USD / month

5 alarms = 5 x 0.10 USD = 0.50 USD / month
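The arithmetic behind these examples can be reproduced with a few lines of Python. This is a sketch under the same assumptions as the examples above: a 31-day month, the eu-west-1 price of 0.0014 USD per run, and 0.10 USD per alarm per month. The function names are made up for the example.

```python
from decimal import Decimal

PRICE_PER_RUN_USD = Decimal("0.0014")  # eu-west-1 price quoted above
PRICE_PER_ALARM_USD = Decimal("0.10")  # per alarm per month, as in the examples


def runs_per_month(interval_minutes: int, days: int = 31) -> int:
    """Number of canary runs in a month for a fixed schedule."""
    return days * 24 * 60 // interval_minutes


def monthly_cost(interval_minutes: int, alarms: int = 5,
                 days: int = 31) -> Decimal:
    """Canary runs plus alarms, in USD per month."""
    runs = runs_per_month(interval_minutes, days)
    return runs * PRICE_PER_RUN_USD + alarms * PRICE_PER_ALARM_USD


if __name__ == "__main__":
    for interval in (5, 1):
        print(f"every {interval} min: {runs_per_month(interval)} runs, "
              f"{monthly_cost(interval)} USD / month")
```

Using `Decimal` instead of floats keeps the totals exact: 12.4992 + 0.50 = 12.9992 USD for the 5-minute schedule, and 62.496 + 0.50 = 62.996 USD for the 1-minute schedule.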

Conclusion

This new service fills a gap in the AWS monitoring offering. You now have plenty of different services (Logs Insights, Metrics, Logs, ServiceLens, X-Ray, Contributor Insights) to cover all of your needs for production workloads without having to subscribe to third-party tools.

It could be expensive for personal use or for monitoring a really simple, personal use case. But for an enterprise customer, this price is not a big deal compared with the ability to have an all-in-one monitoring stack on AWS.

If you want to go deeper with your existing Selenium tests, you can check out AWS Device Farm, which released a managed Desktop Browser Testing service (Selenium) a few days ago.

Tip: Prefer running your canaries from a different region than your actual production workload (Captain Obvious? :|) to properly monitor your endpoints and cover a “Region Outage”.

Other options: Checkly, Datadog, UptimeRobot, Oh Dear

That’s all folks!

zoph.