Many moons ago, I was at a meetup in San Francisco and someone presented a meme that looked like this:

I thought the speaker was crazy (and it was so long ago I can’t remember who it was). But the meme stuck!

Over the past few years, I’ve seen how cloud applications are becoming complex, dynamic systems. They’re:

Complex, with many dependencies on other cloud applications (e.g., databases and search), cloud services, and deployment environments

Dynamic, with many services being updated asynchronously, such that the application itself is ever-changing

In this world, traditional strategies for pre-production testing (e.g., setting up a staging environment) become cumbersome. Staging environments are no longer reasonable facsimiles of production, and “it worked in staging” doesn’t necessarily give developers the confidence to deploy to production.

A new approach to testing has started to emerge: safely testing in production. This usually involves combining:

L7 traffic shaping, e.g., canary releases or traffic shadowing

Observability, built around comparing metrics between the old and new versions of a given service

Instrumentation to pinpoint failures, e.g., distributed tracing and logging

Automated continuous deployment pipelines, so changes can be pushed to production quickly

Fast rollback (usually via L7 routing), so if there’s a problem, switching back to an old version is virtually instantaneous
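
To make the pieces above concrete, here is a minimal sketch of how a canary release, a metrics comparison, and a fast rollback fit together. It is a toy in-process model, not Ambassador's API: the `CanaryRouter` and `VersionStats` names, the 5% error-rate margin, and the 50-request minimum are all assumptions chosen for illustration. In a real deployment the weighting would live in the L7 proxy's routing configuration and the metrics would come from your monitoring system.

```python
import random
from dataclasses import dataclass, field


@dataclass
class VersionStats:
    """Request and error counts observed for one version of a service."""
    requests: int = 0
    errors: int = 0

    @property
    def error_rate(self) -> float:
        return self.errors / self.requests if self.requests else 0.0


def _empty_stats() -> dict:
    return {"stable": VersionStats(), "canary": VersionStats()}


@dataclass
class CanaryRouter:
    """Hypothetical weighted router; stands in for L7 traffic shaping."""
    canary_weight: float  # fraction of traffic sent to the canary, 0.0 to 1.0
    stats: dict = field(default_factory=_empty_stats)

    def pick_version(self, rng: random.Random) -> str:
        # Send roughly `canary_weight` of requests to the new version.
        return "canary" if rng.random() < self.canary_weight else "stable"

    def record(self, version: str, ok: bool) -> None:
        # Observability: track outcomes per version so they can be compared.
        s = self.stats[version]
        s.requests += 1
        if not ok:
            s.errors += 1

    def should_roll_back(self, max_extra_error_rate: float = 0.05,
                         min_requests: int = 50) -> bool:
        # Compare old vs. new only once the canary has seen enough traffic.
        canary = self.stats["canary"]
        if canary.requests < min_requests:
            return False
        extra = canary.error_rate - self.stats["stable"].error_rate
        return extra > max_extra_error_rate

    def roll_back(self) -> None:
        # Fast rollback: stop routing any traffic to the canary.
        self.canary_weight = 0.0
```

A deployment pipeline built on this idea would start with a small `canary_weight`, call `record` for each response, and call `roll_back` as soon as `should_roll_back` returns true. Because rollback is only a routing change, no redeploy is needed to return to the old version.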

We’re seeing application developers use Ambassador to implement safe testing in production, so we’ll be documenting some of these practices in new sections of the documentation over the coming months. We’re also working on improving Ambassador’s capabilities in these areas as we continue to push forward on the Ambassador roadmap. Stay tuned!