https://www.scalawilliam.com/1609/feature-switches-agile-scala-jmx/

Remember the last time a critical service in production worked incorrectly and you reverted your changes, or restarted the services to pick up new configuration? Neither do I.

In one of my recent contracts, 10 days before production deployment I began to work on a feature that would actually take 2–3 months. The management had not accounted for the complexity of the task, but no worries: we have Agile and Play Scala at our disposal.

Once you know that you won’t make a deadline, all risk is gone, and only certainty is left.

I agreed with the project manager to implement the simplest, dumbest, safest thing there was, so at least we would not miss the deadline. It would take 1 day to get into master.

This would be our basic interface:
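Something along these lines, as a minimal sketch. The name `Feature`, the `process` method and the `String` in, `Option[String]` out signature are assumptions made for illustration, not the project's actual code.

```scala
// Hypothetical interface: every implementation of the feature, however
// simple or complex, is called through this one method.
trait Feature {
  /** Some(result) when this implementation can handle the input, None when it cannot. */
  def process(input: String): Option[String]
}
```

Returning an `Option` is one way to let an implementation say "I can't handle this" without throwing, which makes falling back to a simpler implementation easy later on.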

And a very dumb implementation in Scala, called ‘StaticFeature’
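A sketch of just how dumb it can be, using an assumed `Feature` interface. The fixed `"default"` answer is a placeholder for whatever known-safe output was agreed with the business.

```scala
// Hypothetical interface shared by all implementations of the feature.
trait Feature {
  def process(input: String): Option[String]
}

// Ignores the input entirely and always returns the same safe answer,
// so there is nothing in here that can go wrong at runtime.
object StaticFeature extends Feature {
  val DefaultResult = "default" // placeholder value, an assumption
  def process(input: String): Option[String] = Some(DefaultResult)
}
```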

Knowing that management would keep asking for updates on the final deliverable, I agreed with the Product Owner that it made more sense to first implement something simpler that at least partially met the requirement and took less time. It took 2 weeks and we were able to deploy it to production quite easily. If anything went wrong, we’d be able to switch back to what we had agreed on earlier (i.e. ‘StaticFeature’).

Welcome to ‘AdvancedFeature’, which is 10x more complicated than ‘StaticFeature’, is tested, and can fall back to ‘StaticFeature’ when it is unable to fit the bill.
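The fall-back part could look roughly like this. It is a sketch under assumptions: the `Feature` interface is hypothetical, and the "real" logic is replaced by a stand-in that parses an integer and doubles it, since the actual XML handling is not something I can reproduce here.

```scala
import scala.util.Try

// Hypothetical interface shared by all implementations of the feature.
trait Feature {
  def process(input: String): Option[String]
}

object StaticFeature extends Feature {
  def process(input: String): Option[String] = Some("default") // placeholder answer
}

// Tries its own logic first and hands the input to `fallback`
// whenever that logic fails or has no answer.
final class AdvancedFeature(fallback: Feature) extends Feature {
  // Stand-in for the real processing: parse an integer and double it.
  private def attempt(input: String): Option[String] =
    Try((input.trim.toInt * 2).toString).toOption

  def process(input: String): Option[String] =
    attempt(input).orElse(fallback.process(input))
}
```

With `new AdvancedFeature(StaticFeature)`, input the advanced logic understands gets the advanced answer, and anything else quietly gets the static one.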

But what if, rather than failing and falling back, ‘AdvancedFeature’ returned incorrect output? Because the input XML is very loose, there was no guarantee that everything would work as expected, and none of it could be easily unit tested.

So how do we switch back? Revert the code in Git? Redeploy a configuration change? Come on! We have the JVM at our disposal. It has something called JMX (“Java Management Extensions”), which gives you a way to manage your application at runtime. You can do metrics, rebind ports, change configuration options, change logging verbosity, run diagnostics: all sorts of things. So here’s how you do it:

First you write a generic interface, a “Management Bean”:
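A sketch of such an interface. Only `setFeatureLevel` is named in this post; the trait name `FeatureLevelMBean` and the matching getter are assumptions. JMX derives a read/write attribute called `FeatureLevel` from a `getFeatureLevel`/`setFeatureLevel` pair.

```scala
// A Scala trait with only abstract methods compiles to a plain Java
// interface, which is what JMX expects of a management interface.
trait FeatureLevelMBean {
  def getFeatureLevel: Int               // exposed as readable attribute "FeatureLevel"
  def setFeatureLevel(level: Int): Unit  // makes the attribute writable
}
```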

Then you write an implementation which registers this component to the JMX registry and allows its methods to be called remotely:
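Roughly like the sketch below. The object name `com.example:type=FeatureLevel`, the class `FeatureLevel` and the helper `FeatureLevelJmx` are hypothetical. Extending `StandardMBean` is a choice made for this sketch, not necessarily what the project did: it names the management interface explicitly instead of relying on the `Foo`/`FooMBean` naming convention.

```scala
import java.lang.management.ManagementFactory
import javax.management.{ObjectName, StandardMBean}

trait FeatureLevelMBean {
  def getFeatureLevel: Int
  def setFeatureLevel(level: Int): Unit
}

// Holds the current level; StandardMBean exposes it through FeatureLevelMBean.
final class FeatureLevel(initial: Int)
    extends StandardMBean(classOf[FeatureLevelMBean]) with FeatureLevelMBean {
  // volatile: written by a JMX thread, read by request-handling threads
  @volatile private var level: Int = initial
  def getFeatureLevel: Int = level
  def setFeatureLevel(newLevel: Int): Unit = { level = newLevel }
}

object FeatureLevelJmx {
  // Hypothetical object name; pick your own domain and type.
  val Name = new ObjectName("com.example:type=FeatureLevel")

  // Registers a new bean with the JVM's platform MBean server and returns it,
  // so application code can read the level directly.
  def register(initial: Int): FeatureLevel = {
    val bean = new FeatureLevel(initial)
    ManagementFactory.getPlatformMBeanServer.registerMBean(bean, Name)
    bean
  }
}
```

The application keeps the returned `FeatureLevel` and calls `getFeatureLevel` on each request; any JMX client that sets the `FeatureLevel` attribute changes what the next request sees.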

And now when your app runs, you open up Java Mission Control (jmc in a UNIX shell), connect to the app, and change values immediately (pressing Return calls the setFeatureLevel method).

Of course, in production you might want something easier to use. For that I used Java 8’s JavaScript interpreter, jjs, to connect to the remote running process and change the configuration value from a shell script.
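The jjs script itself was JavaScript; the same standard JMX calls can be sketched in Scala, to stay in one language here. This is not the script that was used. The object name and the `FeatureLevelClient` helper are hypothetical, and the URL format assumes the target JVM was started with the usual `com.sun.management.jmxremote.port` setting.

```scala
import javax.management.{Attribute, MBeanServerConnection, ObjectName}
import javax.management.remote.{JMXConnectorFactory, JMXServiceURL}

object FeatureLevelClient {
  // Hypothetical name; it must match whatever the application registered.
  val Name = new ObjectName("com.example:type=FeatureLevel")

  def getLevel(conn: MBeanServerConnection): Int =
    conn.getAttribute(Name, "FeatureLevel").asInstanceOf[Int]

  // Writing the attribute invokes setFeatureLevel in the target JVM.
  def setLevel(conn: MBeanServerConnection, level: Int): Unit =
    conn.setAttribute(Name, new Attribute("FeatureLevel", Int.box(level)))

  // Opens a connection to a remote JVM, runs `f`, and always closes the connection.
  def withRemote[A](host: String, port: Int)(f: MBeanServerConnection => A): A = {
    val url = new JMXServiceURL(s"service:jmx:rmi:///jndi/rmi://$host:$port/jmxrmi")
    val connector = JMXConnectorFactory.connect(url)
    try f(connector.getMBeanServerConnection) finally connector.close()
  }
}
```

A call such as `FeatureLevelClient.withRemote("app-host", 9010)(FeatureLevelClient.setLevel(_, 1))` would then drop the running process to level 1; the host and port are made up.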

Now the business had the certainty that, instead of being stuck with incorrect behaviour, they would get a lesser version that behaves as expected.

Final implementation, ‘ComplexFeature’:
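A rough sketch of how it might slot in above the other two, with the active level deciding where the chain starts. Everything here is an assumption: the `Feature` interface, the `LeveledFeature` wrapper, and the numbering 1 = ‘StaticFeature’, 2 = ‘AdvancedFeature’, 3 = ‘ComplexFeature’.

```scala
import scala.util.Try

// Hypothetical interface shared by all implementations of the feature.
trait Feature {
  def process(input: String): Option[String]
}

// `byLevel` lists implementations from simplest (level 1) to most capable.
// `level` is read on every call, so a change made at runtime applies to the next request.
final class LeveledFeature(level: () => Int, byLevel: Vector[Feature]) extends Feature {
  def process(input: String): Option[String] = {
    // Implementations allowed at the current level, most capable first.
    val allowed = byLevel.take(level().max(1)).reverse
    allowed.iterator
      .map(f => Try(f.process(input)).toOption.flatten) // an exception counts as "no answer"
      .collectFirst { case Some(result) => result }     // stop at the first answer
  }
}
```

In this arrangement `level` would be something like `() => bean.getFeatureLevel`, reading the value that JMX controls.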

This was the most complicated by far, and the business wanted an A/B-style rollout of the feature across several production environments: run it on the first environment for a few weeks, then on the second, then the third, and finish with the fourth. Should anything go wrong, we resort to the lesser versions by changing the level at runtime. Easy & convenient.

This approach was successful. Ops loved it. PO loved it.

If feature C started producing incorrect results, it would fall back to B. If feature B started producing incorrect results, it would fall back to A. Ops were able to switch between these levels at runtime. Ops were able to deploy the same code to different production environments and A/B test the changes. Ops and Test were able to deploy the same code to production, staging and test environments and verify the behaviour of every single feature. Ops were able to monitor behaviour and the rates of fall back on DataDog.
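To watch fall-back rates you first need to count the fall-backs. A minimal sketch of one way to do that follows; the `Feature` interface and `CountingFallback` wrapper are hypothetical, and shipping the numbers to a metrics system is left out rather than guessed at.

```scala
import java.util.concurrent.atomic.LongAdder

// Hypothetical interface shared by all implementations of the feature.
trait Feature {
  def process(input: String): Option[String]
}

// Wraps a primary implementation and its fallback, counting how often
// the primary had no answer (exceptions are not handled in this sketch).
final class CountingFallback(primary: Feature, fallback: Feature) extends Feature {
  private val calls = new LongAdder
  private val fallbacks = new LongAdder

  def process(input: String): Option[String] = {
    calls.increment()
    primary.process(input).orElse {
      fallbacks.increment() // only reached when the primary returned None
      fallback.process(input)
    }
  }

  /** Fraction of calls that fell back; 0.0 before the first call. */
  def fallbackRate: Double = {
    val n = calls.sum()
    if (n == 0) 0.0 else fallbacks.sum().toDouble / n
  }
}
```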

Something now is almost always better than nothing now.

But everything now is better than something now, so choose Scala & JDK 8: they have it all.


—

Update, 4 Nov 2016: A reader asked a very good question:

So what are the downsides and upsides compared to, say, database switches?

My answer: