Gerrit: Google-style code review meets git

Gerrit, a Git-based system for managing code review, is helping to spread the popular distributed revision control system into Android-using companies, many of which have heavy quality assurance, management, and legal processes around software. HTC, Qualcomm, TI, Sony Ericsson, and Android originator Google are all running Gerrit, project leader Shawn Pearce said in a talk at the October 2009 GitTogether event, hosted at Google in Mountain View.

The Gerrit story starts with the progressive escape of an in-house Google process and tool. Google requires code review for any change to company code or configuration files; there are a few exceptions, but those are subject to review after deployment. The code review process started out using lots of email, but for the past several years it has been automated. When Guido van Rossum, creator of the Python language, began working at Google in 2005, he started developing a tool, in Python naturally, to coordinate code reviews. The result, called Mondrian, lets users view the proposed change as a side-by-side comparison, and participate in comment threads attached anywhere in the code under review. An overview page shows a to-do list of incoming changes to review and reviewers' comments. Van Rossum presented Mondrian at a public talk in 2006 (video).

Mondrian has been a huge success inside Google, Pearce said. "Almost every engineer uses this as their daily thing." But Mondrian is heavily dependent on Google's internal infrastructure, including the in-house Bigtable non-relational table store and the proprietary Perforce revision control system. Google is a huge Perforce shop, and has built its own highly-customized IT infrastructure, including Perforce-dependent tools.

The first step in making a Mondrian-style tool available to a wider audience was van Rossum's 2008 release of Rietveld, which uses Subversion instead of Perforce, and the public interfaces of Google App Engine instead of Google internals. Rietveld is named for modern architect Gerrit Rietveld. As Google began the Android project, though, developers demanded a Mondrian-like tool for their codebase, which is tracked with Git. Google App Engine was a deal-breaker, because mobile hardware vendors working on Android-based products maintain internal repositories, and won't rely on an outside service.

Shawn Pearce, who previously reimplemented git in Java as JGit, and is now at Google, took on the project; the result is Gerrit Code Review, now used to track public proposed changes to Android. Android's applications are written in Java, so writing the new tool in that language should make it more accessible to would-be contributors among Android developers.

Gerrit runs a copy of the Mina SSH daemon, along with JGit, which is now maintained as part of the Eclipse EGit project. Although the combination is slower than original git over OpenSSH, it's fast enough for the Android developers. "The entire Android team uses this as their interface to Git," Pearce said. The server-side dependencies are Tomcat and an SQL database, which so far can be MySQL, PostgreSQL, or H2. Gerrit uses OpenID for authentication by default, but can be configured to use HTTP basic (or digest) authentication, or Siteminder, a single-sign-on system from Computer Associates.

On the UI side, Gerrit uses Google Web Toolkit, an Apache-licensed project that compiles Java to JavaScript with AJAX functionality. The UI has a few tiny Flash widgets for convenience (to copy Git command lines to the clipboard, for example), but Flash is not required. A user who prefers not to use the web interface can also ssh to the Gerrit server to execute commands.

Gerrit doesn't enforce any particular processes to make git look more like the centralized revision control systems that spawned Mondrian and Rietveld. A Gerrit-using developer has a full git install and can still do distributed revision control tricks, such as cherry-picking from a newer upstream release. Gerrit just guards access to its own repository. A developer can set up a git repository with "origin" pointing back to an ssh:// URL on the Gerrit server, and do something like centralized development, or do "drive-by" interactions with a Gerrit server like any other Git repository.

To propose a change for approval through Gerrit, a developer must start a branch in git for that change. Each change, and each iteration of a reworked change, becomes a new branch. In order to preserve information among successive versions of the same work, Gerrit includes a git hook to apply a "Change-Id" line to commit messages. After doing a git push to the Gerrit server, the developer can come back to the web dashboard and see the status of the pending change, then request a code review. Alternatively, a wrapper called Repo lets the developer specify a reviewer on the command line when doing the push.
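The effect of the "Change-Id" hook can be sketched in a few lines of Python. This is an illustration only, not the hook that ships with Gerrit: the function name ensure_change_id, the seed parameter, and the assumed trailer format (the letter "I" followed by 40 hex digits) are all assumptions made for the example. The point it shows is that a message without an ID gets one, while a reworked commit that already carries an ID keeps it.

```python
import hashlib
import re

# Assumed trailer format for this sketch: "Change-Id: I" plus 40 hex digits.
CHANGE_ID_RE = re.compile(r"^Change-Id: I[0-9a-f]{40}$", re.MULTILINE)


def ensure_change_id(message: str, seed: str) -> str:
    """Return the commit message with a Change-Id line, adding one if missing.

    `seed` is a hypothetical source of uniqueness (author, timestamp, and so
    on); it is only used when a new ID has to be generated.
    """
    if CHANGE_ID_RE.search(message):
        # A reworked commit keeps its ID so the versions stay linked.
        return message
    digest = hashlib.sha1((seed + "\n" + message).encode("utf-8")).hexdigest()
    body = message.rstrip("\n")
    return f"{body}\n\nChange-Id: I{digest}\n"
```

Calling the function a second time on its own output returns the message unchanged, which is the property that lets successive versions of a change share one identity.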

Once a reviewer is lined up, Gerrit starts sending email, giving both the URL for the Gerrit page and a git command line for the reviewer to pull the change. On the change page, a reviewer can see the change side-by-side with the original or as a diff, and add review comments anywhere in the code along with a "cover sheet" message. Approval has multiple levels, with configurable access to the range that a reviewer can apply. Typically, an individual developer would be able to apply -1 or +1, which are "prefer you don't submit this" and "I like it," and some would have access to the -2 "do not submit" and +2 "Approved" levels. The web interface is not required; a reviewer can ssh to the Gerrit server to approve or reject a change.
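The approval levels described above can be modeled roughly as follows. This is a sketch under stated assumptions, not Gerrit's implementation: the Reviewer class, cast_vote, and is_submittable are hypothetical names, and the submit rule used here (at least one +2 and no -2) is an assumption that goes beyond what the talk described.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reviewer:
    """A reviewer and the range of scores that reviewer may apply."""
    name: str
    min_vote: int
    max_vote: int


def cast_vote(votes: dict, reviewer: Reviewer, value: int) -> None:
    """Record a vote, rejecting values outside the reviewer's allowed range."""
    if not reviewer.min_vote <= value <= reviewer.max_vote:
        raise PermissionError(f"{reviewer.name} may not vote {value:+d}")
    votes[reviewer.name] = value


def is_submittable(votes: dict) -> bool:
    """Assumed rule: some reviewer gave +2 and nobody gave -2."""
    values = list(votes.values())
    return any(v == 2 for v in values) and not any(v == -2 for v in values)
```

In this model an ordinary developer would be created as Reviewer("dev", -1, 1), so an attempt to vote +2 raises PermissionError, while a maintainer created with the full range of -2 to 2 can approve or block the change.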

A rejected and reworked change with a proper "Change-Id" line preserves Gerrit metadata, and the reviewer can see his or her original comments and the submitter's replies, join an existing comment thread on the previous, rejected version, or start new comment threads anywhere in the new version. If the change is not accepted, the new version has to be a new branch.
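One way to picture how a "Change-Id" line ties successive versions together is to group pushed commits by that line. The sketch below is illustrative only: group_by_change and the "unlinked:" key prefix are hypothetical, and the real server keeps this information in its database rather than recomputing it from commit messages.

```python
import re
from collections import defaultdict

# Matches a "Change-Id" line and captures the identifier that follows it.
TRAILER_RE = re.compile(r"^Change-Id: (\S+)$", re.MULTILINE)


def group_by_change(commits):
    """Group (commit_id, message) pairs, given in push order, by Change-Id.

    Commits without a Change-Id line cannot be linked to an earlier
    version, so each one is filed under its own "unlinked:" key.
    """
    changes = defaultdict(list)
    for commit_id, message in commits:
        match = TRAILER_RE.search(message)
        key = match.group(1) if match else f"unlinked:{commit_id}"
        changes[key].append(commit_id)
    return dict(changes)
```

Two commits that carry the same ID end up in one list, in the order they were pushed, which is what allows review comments on the rejected version to be shown alongside the new one.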

Kernel developer David Brown, at the Qualcomm Innovation Center, uses Git and Gerrit with his team. "The biggest complaint people have so far about Gerrit is people have to be constantly rebasing their changes," he said. However, the company has an extensive review process in order to make anything available under a free software license, and Gerrit streamlines the process of approving changes for the people who are authorized to check outgoing code. "The biggest thing that's changed since last year is Gerrit. The second biggest thing that's changed since last year is Gerrit," Brown said. But, he added, doing things the Gerrit way does work. "Most people learn a really small subset of git, I mean a really really small subset of git," he said.

Gerrit can be set up to automatically enforce some policies. "There's a lot of different work models people want," Pearce said. For example, Gerrit can be set up to enforce a check for a signed contributor agreement. The public Gerrit instance for Android enforces the contributor agreement requirement for all modules except the kernel, where only a "Signed-off-by" line is required. Gerrit can be integrated with a bug tracking system (BTS), but the integration is still based on site-specific tricks, since everyone is on a different bug tracker and nobody seems to like theirs very much. Besides better BTS integration, Pearce is looking at ways to store Gerrit metadata in git. "We'd like to do all the things that Gerrit does, offline," he said. "The fact that it doesn't work offline is a bug."
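The contributor-agreement policy described for the public Android instance amounts to a simple per-project rule. The following is a minimal sketch of that rule, assuming a hypothetical project name "kernel" and a hypothetical may_accept function; it is not taken from Gerrit's code or configuration.

```python
def may_accept(project: str, author_signed_agreement: bool, commit_message: str) -> bool:
    """Apply the policy from the article to a single incoming commit.

    The project name "kernel" is an assumption made for this example.
    """
    if project == "kernel":
        # The kernel only requires a "Signed-off-by" line in the message.
        return any(
            line.startswith("Signed-off-by: ")
            for line in commit_message.splitlines()
        )
    # Every other module requires a signed contributor agreement.
    return author_signed_agreement
```

A site with a different work model would swap in its own rule here, which is the sort of flexibility Pearce's comment about work models points to.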

The Android developers are still figuring out how to connect with upstream. Staging maintainer Greg Kroah-Hartman plans to drop Android drivers from drivers/staging as of 2.6.33, as "no one wants to maintain them and help get them merged into the kernel," he said in email. Behind the apparent driver slowness are substantial corporate culture changes, though, with both Qualcomm and TI starting programs to manage outgoing code. Qualcomm is the lead sponsor of Code Aurora Forum, and TI is behind OmapZoom.org. In the potential minefield that is the mobile industry, with considerations such as not offending carrier partners, securely supporting third-party applications, deploying codecs and GUI code without patent troubles, and complying with radio regulations, Gerrit seems to be a needed focus for gatekeeping efforts.