At Moxio we recently started an experiment we called 'Review Roulette': a process of randomized code reviews. We believe this emphasizes code reviews as a means of bidirectional learning, helps onboard junior developers, and thus improves upon our previous review 'policy'. In this post I would like to sketch the background of this experiment, explain the idea of Review Roulette and present some preliminary results.

Our old situation

We already did a fair share of code reviews before the introduction of Review Roulette. About four years ago we set up a ReviewBoard server on our local network and started using it to discuss and review code. Reviews would be done in the tool, taking the discussion offline if desired. We had no formal policy about what and how to review: putting changes up for review was opt-in, and the author chose the reviewer(s) and the timing (pre- or post-commit). Sometimes a review request was just a rough idea for discussion; at other times there would be a full implementation to verify.

This approach made it easy to start doing code reviews and ensured that their introduction was more or less resistance-free. It also kept the review burden and bureaucratic overhead low, making it easy to do small fixes and improvements (in the spirit of the Boy Scout Rule) without having to go through a formal review process.

However, such a lax structure also had its disadvantages. In practice, opt-in code reviews meant that most changes posted for review were the more difficult or 'controversial' ones, bound to spark a fair amount of discussion (which they regularly did). This gave code reviews too much of an air of cumbersome discussions and hard decisions. Moreover, changes that the author thought to be straightforward were usually not reviewed, even though someone else on the team might have spotted an overlooked difficult case. Finally, the free choice of reviewers caused most reviews to be assigned to a fixed group of 2-5 more senior developers, creating a sort of unofficial hierarchy of reviewers.

Why do code reviews?

In order to understand the undesirable effects of this situation, let's take a step back and look at the reasons to do code reviews. The most obvious motivation is to find and correct bugs. As Steve McConnell notes in his book Code Complete, research has shown code reviews to achieve a 45%-70% defect detection rate, more than any type of automated testing. A second reason is that code reviews can improve design and thus make code more maintainable.

It is not only the code that benefits from reviews. When done properly, code reviews can also make team members learn from each other and let them become familiar with parts of the code base that they do not work on often. Note that this works both ways: just as the original author can benefit from the feedback given by the reviewer, the reviewer can learn from the way the author has approached the problem at hand and discover methods and classes they were unfamiliar with. This all fosters shared code ownership and continuous learning, and makes code reviews a great way to onboard new or junior developers.

Our old review 'policy' missed some opportunities regarding this aspect of knowledge sharing. The de facto review hierarchy meant that juniors did not often get the opportunity to review (and thus learn from) code written by more senior developers. To emphasize code reviews as a means of bidirectional learning (as opposed to verification and supervision) requires a sense of reciprocity. Not reviewing changes that the author considered straightforward also missed out on opportunities for knowledge exchange, as a reviewer may see improvements that the author did not consider or (conversely) learn something new from the author's approach. Although we think that reviewing every single change would consume too much time and work counterproductively, we decided that all changes should at least be eligible for review.

Review Roulette

The considerations above brought us to the idea of Review Roulette:

Every week, every developer is assigned one random commit (from the previous week) written by a random colleague to review, to the extent possible.

The randomness aspect eliminates any kind of fixed reviewer hierarchy and ensures reciprocity, while also making every commit eligible for review, precluding any assumptions about what other team members could learn or teach about a given subject. Another essential aspect is the clause "to the extent possible". With Review Roulette it is unavoidable that sooner or later someone will be assigned to review a difficult commit in an unfamiliar part of the code base. While that person will probably not understand everything, and will not be able to review the change as thoroughly as someone else would, they can usually still give useful remarks on certain parts of it, point out low-level logic errors and suggest changes to naming and documentation that would make the code easier to understand for an 'outsider'. Just let them indicate what they could and could not review. A fresh pair of eyes often yields very useful insights that someone knee-deep in that codebase would not notice. This creates circumstances in which every developer can participate: if you write code, you're in.

We agreed to give this approach a try as an experiment, adding to (not replacing) the code reviews that we already did. To keep a limit on the time invested, especially for larger commits assigned to a reviewer not familiar with that area, we decided to spend no more than an hour per week on our Review Roulette. In a couple of hours I put together a simple script to retrieve all commits from the week before, filter out things like merge commits, generate a random assignment of reviewers to commits and post the review requests to our ReviewBoard server. We run the script every Monday and do the reviews during the following week. An evaluation is planned two months into the experiment, at which point we will decide whether to continue Review Roulette and whether any adjustments are necessary.
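The core of such a script is the random assignment itself. The sketch below shows one way this step could look in Python; the function name, the data shapes and the sample commits are my own illustrations (not our actual script), and the surrounding work of fetching last week's commits, filtering merge commits and posting to ReviewBoard is omitted. The only invariant it enforces is that nobody is assigned their own commit, and that a developer without a suitable commit is simply skipped ("to the extent possible"):

```python
import random


def assign_reviews(commits, rng=random):
    """Randomly assign each developer one commit written by a colleague.

    `commits` is a list of (commit_id, author) tuples from the past week.
    Returns a dict mapping each reviewer to one commit_id, such that no
    one reviews their own commit and no commit is assigned twice.
    Developers for whom no suitable commit remains are skipped.
    """
    developers = sorted({author for _, author in commits})
    pool = list(commits)
    rng.shuffle(pool)

    assignment = {}
    for reviewer in developers:
        # Take the first pooled commit not written by the reviewer.
        for i, (commit_id, author) in enumerate(pool):
            if author != reviewer:
                assignment[reviewer] = commit_id
                pool.pop(i)
                break
    return assignment


# Hypothetical sample data: commit ids and their authors.
commits = [
    ("a1f3", "alice"), ("b2e4", "bob"),
    ("c3d5", "carol"), ("d4c6", "alice"),
]
for reviewer, commit_id in sorted(assign_reviews(commits).items()):
    print(f"{reviewer} reviews {commit_id}")
```

A greedy pass over a shuffled pool like this can occasionally leave a developer without an assignment even when a full matching exists; for a handful of developers per week that is acceptable, and retrying with a fresh shuffle is an easy remedy.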

Preliminary results

At the time of writing, we have been trying out Review Roulette for six weeks, so there has been no real evaluation yet (I will blog about that later - UPDATE: I have). Still, there are some personal observations that I would already like to share at this point:

The vibe around Review Roulette seems to be mostly positive. People are enthusiastic about doing code reviews and sharing views about development.

Most changes, even the ones that seem fairly trivial at first sight, turn out to have something interesting in them if you (are forced to) look closely enough. Even in reviews of simple one-line commits I have seen useful discussions about the surrounding code, UX matters or testing styles arise. It seems that team members really see the reviews as an opportunity to learn and teach and try to get the best out of this interaction.

Some sort of filter on the list of commits eligible for review is still necessary. I have seen an occasion where someone was asked to review a new import of a compiled binary file. Oops!

For myself, I noticed that doing Review Roulette made me take more care with the commit messages I write. I often found myself thinking "If this commit were chosen for review, would my colleague understand it?" and deciding that some more context would be useful.

Overall, I would highly recommend trying out Review Roulette with your team! It is a great chance to deepen everyone's knowledge of the codebase and let both the code and the developers benefit from bidirectional learning.