How to implement feature flags and A|B testing

04/04/2017

Finally, continuous integration, testing, and delivery are part of your organization’s engineering process. You understand the cloud and big data, and you’ve begun to introduce monitoring into your workflows. Your environments are built on IaC (Infrastructure as Code), and reliability seems to be significantly improved. Your engineering teams are getting comfortable with containers. But you know you’ve only just begun.

You’ve taken the first big leap, but the road ahead offers even more opportunities, such as:

Faster innovation

Faster delivery

Ability to react to feedback or change

Continuously deliver value

Ask your customer what they want

One way to keep up the pace is to use feature flags, particularly for A/B testing.

In this first part of the series, we’re going to cover an introduction. In subsequent parts, we’ll focus on “what” and “how” we implemented feature flags and A/B testing.

What are feature flags and A/B testing?

A feature flag (or feature toggle) allows you to turn features (sub-sections) of your application on or off at a moment’s notice. (Read more about feature flags here: Feature flags, Toggles, Controls.)

It’s that simple. With a small change, a feature flag enables you to transform your delivery processes: you can drive customer feedback, test new mechanisms with less engineering impact, and release software faster, with less risk and greater control over who has access to a feature.
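To make the idea concrete, here is a minimal sketch of a feature flag guarding a code path. The `FLAGS` dictionary, the `new_navigation` flag name, and both functions are hypothetical names chosen for illustration; a real system would typically load flag state from configuration or a flag service rather than a module-level dictionary.

```python
# Hypothetical in-memory flag store (an assumption for this sketch).
FLAGS = {"new_navigation": False}


def is_enabled(flag_name: str) -> bool:
    """Return the flag's state; unknown flags default to off."""
    return FLAGS.get(flag_name, False)


def render_navigation() -> str:
    # The new code path ships with the deployment but stays dark
    # until the flag is flipped.
    if is_enabled("new_navigation"):
        return "new navigation pane"
    return "classic navigation pane"


print(render_navigation())       # classic navigation pane
FLAGS["new_navigation"] = True   # flip the flag; no redeploy needed
print(render_navigation())       # new navigation pane
```

Defaulting unknown flags to off is a deliberate choice in this sketch: a typo in a flag name hides a feature rather than exposing one.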

A/B testing (sometimes called split testing) can be viewed as an experiment led by a hypothesis. You compare different versions by showing variants (let's call them A and B) to different users with common attributes, to determine which one performs better. For example, a stakeholder could ask:

Which one of two navigation pages will result in higher user satisfaction?

Do users prefer to have the navigation pane on the right (A) or left (B)?

Run the A/B test in production, using feature flags, to test the hypothesis. In this example, 80% of the users prefer the navigation pane on the left, so option (B) is more popular and wins!
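One common way to run such an experiment is to assign each user to a variant deterministically, so the same user always sees the same version, and then compare an outcome per variant. The sketch below assumes a hash-based 50/50 split and a simple "satisfied or not" signal; `assign_variant`, `satisfaction_rate`, the experiment name, and the sample data are all made up for illustration and are not taken from any particular product.

```python
import hashlib
from collections import Counter


def assign_variant(user_id: str, experiment: str) -> str:
    """Deterministically place a user in variant A or B (roughly 50/50)."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"


def satisfaction_rate(results: list[tuple[str, bool]]) -> dict[str, float]:
    """Given (variant, satisfied) pairs, return the satisfied share per variant."""
    shown = Counter(variant for variant, _ in results)
    happy = Counter(variant for variant, satisfied in results if satisfied)
    return {variant: happy[variant] / shown[variant] for variant in shown}


# The same user always lands in the same variant for a given experiment.
print(assign_variant("user-42", "nav-pane-side"))

# Illustrative feedback only, not real measurements.
sample = [("A", True), ("A", False), ("B", True), ("B", True)]
print(satisfaction_rate(sample))  # {'A': 0.5, 'B': 1.0}
```

Including the experiment name in the hash keeps assignments independent across experiments, so a user who sees variant A in one test is not automatically in variant A of the next.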

Why should we care about feature flags (FF)?

Feature flags give you the power to reduce risk, iterate quicker, and get more control by separating feature rollout from code deployment.

Capabilities you can achieve with feature-flag-driven development:

Separation of feature rollout from code deployment: Spend less time addressing merge conflicts and refactoring old code. Spend more time delivering value to your users.

A/B testing: Get feedback from your users in production using experiments.

Mitigate risk: Gradually reveal a feature (10%, 20%, …). Remove a feature without the need to re-deploy, roll back, or hotfix. Use canary releases.

Iterate more quickly: Wrap and deploy features, even if they are “half-baked”.

Segmentation: Turn on a feature for a subset of users. Manage plans, for example, community, normal, or premium. Allow users to opt in to experience the latest features. Test in production. Control culturally dependent features.

Collaboration: Branching in code enables teams to work on the code mainline instead of creating separate feature branches.
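The gradual reveal and segmentation capabilities above can be sketched together. This example assumes each flag is described by a small dictionary with a rollout percentage and a set of plans that always get the feature; the `bucket` and `is_enabled_for` functions and the flag layout are hypothetical, not the API of any specific tool.

```python
import hashlib


def bucket(user_id: str, flag_name: str) -> int:
    """Map a user to a stable bucket in the range 0-99 for this flag."""
    digest = hashlib.sha256(f"{flag_name}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def is_enabled_for(user_id: str, plan: str, flag: dict) -> bool:
    # Segmentation: users on an allowed plan always get the feature.
    if plan in flag.get("plans", ()):
        return True
    # Gradual rollout: everyone else is admitted by bucket number.
    return bucket(user_id, flag["name"]) < flag.get("percent", 0)


flag = {"name": "new_navigation", "percent": 10, "plans": {"premium"}}
users = [f"user-{i}" for i in range(1000)]
enabled = sum(is_enabled_for(u, "community", flag) for u in users)
print(f"{enabled} of {len(users)} community users see the feature")
```

Because a user's bucket never changes, raising `percent` from 10 to 20 only adds users; nobody who already has the feature loses it. Setting `percent` to 0 removes the feature for non-premium users without a redeploy.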



Here are a few solutions to help you implement your first feature flag (listed in no specific order)

Case studies

Microsoft

Have you ever wondered how the Visual Studio Team Services teams are able to roll out tons of new features every 3 weeks? In this detailed article, Buck Hodges (Director of Engineering for Visual Studio Team Services) shares insight into these topics:

goals to decouple deployment and exposure

implementing feature flags (with lots of code samples)

creating a staged roll-out process

Facebook

A couple of years ago, Facebook came up with an in-house implementation of feature flags. Facebook calls their tool “Gatekeeper” because it controls consumer access to each new feature.

Other examples

Flickr, Twitter, and Instagram

Feature flags come at a cost!

Technical debt in code

Feature flags add complexity to your code. You’ll need a robust engineering process and mature life-cycle management, with policies, conventions, and cleanup rules for flags that are no longer needed.
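One possible cleanup convention is to record an owner and a removal date for every flag, and to report flags that have outlived that date. The sketch below assumes such a registry exists; `FlagRecord`, `stale_flags`, and the sample entries are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class FlagRecord:
    name: str
    owner: str
    remove_by: date  # agreed date for deleting the flag and its old code path


def stale_flags(registry: list[FlagRecord], today: date) -> list[str]:
    """Return the names of flags that are past their agreed removal date."""
    return sorted(flag.name for flag in registry if flag.remove_by < today)


# Made-up registry entries for illustration.
registry = [
    FlagRecord("new_navigation", "web-team", date(2017, 3, 1)),
    FlagRecord("beta_search", "search-team", date(2017, 9, 1)),
]
print(stale_flags(registry, today=date(2017, 4, 4)))  # ['new_navigation']
```

A check like this could run in a build pipeline so that expired flags surface as warnings before they accumulate.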

Cultural shift

Feature flags and A/B testing involve a cultural transition that affects all related parties in the application lifecycle management (ALM) process and beyond.

FF at scale

Feature flags become difficult to manage on an enterprise scale. It’s easy to manage one feature flag by modifying a configuration file. Tracking and synchronizing multiple feature flags can be challenging.

Performance and scale in mind

Poor feature flag implementations can introduce a performance penalty. Consider an in-memory store, such as Redis, for flag state and user data, instead of configuration files and traditional databases.
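The underlying idea is to keep flag lookups off the slow path. As a rough sketch, the class below serves lookups from process memory and refreshes from a slower backing store at most once per time-to-live. It does not talk to Redis; `CachedFlags`, `load_flags_from_store`, and the 30-second default are assumptions made for illustration, and the loader stands in for whatever store you actually use.

```python
import time
from typing import Callable


class CachedFlags:
    """Serve flag lookups from memory, reloading at most once per TTL."""

    def __init__(
        self,
        load: Callable[[], dict[str, bool]],
        ttl_seconds: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load
        self._ttl = ttl_seconds
        self._clock = clock
        self._flags: dict[str, bool] = {}
        self._expires_at = float("-inf")  # forces a load on first use

    def is_enabled(self, name: str) -> bool:
        now = self._clock()
        if now >= self._expires_at:
            # Cache expired: fetch a fresh snapshot from the backing store.
            self._flags = dict(self._load())
            self._expires_at = now + self._ttl
        return self._flags.get(name, False)


def load_flags_from_store() -> dict[str, bool]:
    # Hypothetical stand-in for a database, file, or remote store read.
    return {"new_navigation": True}


flags = CachedFlags(load_flags_from_store, ttl_seconds=30.0)
print(flags.is_enabled("new_navigation"))  # True
```

The trade-off is staleness: a flag change can take up to one TTL to reach every process, so a shorter TTL means faster rollbacks at the cost of more reads.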

Building vs. buying

Companies that have built internal feature flagging tools (e.g. Microsoft, Google, Facebook, and Flickr) have dedicated large teams of engineers and DevOps experts to build and maintain the platform. See building vs. buying to make the right choice for your organization.

Where do we stand with feature flags and A|B testing?

We’ve come to the conclusion that we need to invest in feature flags and A|B testing to be able to react to feedback and change, continuously deliver value, and determine user preferences through controlled experiments.

However, based on our research and our experience of rolling our own custom solution, we recommend that you explore a SaaS solution. We, for example, cannot afford to develop, and especially to maintain, a custom solution. It’s not a scenario that suits our community- and volunteer-driven program.

Watch this space for part 2 of our A|B testing investigations, and good luck with your DevOps journey!

Resources

Abhishek Tiwari blog, Buck Hodges blog, DZone, Feature Flags, Toggles, Controls, James McKay blog, LaunchDarkly, Martin Fowler, and Optimizely.