Distributed bug tracking


It is fair to say that distributed source code management systems are taking over the world. There are plenty of centralized systems still in use, but it is a rare project which would choose to adopt a centralized SCM in 2008. Developers have gotten too used to the idea that they can carry the entire history of their project on their laptop, make their changes, and merge with others at their leisure.

But, while any developer can now commit changes to a project while strapped into a seat in a tin can flying over the Pacific Ocean, that developer generally cannot simultaneously work with the project's bug database. Committing changes and making bug tracker changes are activities which often go together, but bug tracking systems remain strongly in the centralized mode. Our ocean-hopping developer can commit a dozen fixes, but updating the related bug entries must wait until the plane has landed and network connectivity has been found.

There are a number of projects out there which are trying to change this situation through the creation of distributed bug tracking systems. These developments are all in a relatively early state, but their potential - and limitations - can be seen.

One of the leading projects in this area is Bugs Everywhere, which has recently moved to a new home with Chris Ball as its new maintainer. Bugs Everywhere, like the other systems investigated by your editor, tries to work with an underlying distributed source code management system to manage the creation and tracking of bug entries. In particular, Bugs Everywhere creates a new directory (called .be) in the top level of the project's directory. Bugs are stored as directories full of text files within that directory, and the whole collection is managed with the underlying SCM.
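To make the layout concrete, here is a minimal sketch of a bug database kept as directories of small text files under .be. It is not Bugs Everywhere's actual code: the function names and the one-field-per-file names ("summary", "severity") are assumptions made for illustration. Only the .be directory, the randomly-generated identifiers, and the comments/<id>/body path shape come from what the tool was seen to produce.

```python
import tempfile
import uuid
from pathlib import Path


def create_bug(project_root, summary, severity="minor"):
    """Create one bug as a directory of text files under .be/ (hypothetical layout)."""
    bug_dir = Path(project_root) / ".be" / str(uuid.uuid4())
    bug_dir.mkdir(parents=True)
    # Assumed file names: one small text file per field.
    (bug_dir / "summary").write_text(summary + "\n")
    (bug_dir / "severity").write_text(severity + "\n")
    return bug_dir


def add_comment(bug_dir, text):
    """Store a comment in its own randomly-named directory, so that two
    branches adding comments never touch the same file."""
    comment_dir = Path(bug_dir) / "comments" / str(uuid.uuid4())
    comment_dir.mkdir(parents=True)
    body = comment_dir / "body"
    body.write_text(text + "\n")
    return body


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    bug = create_bug(root, "Crash on startup", severity="serious")
    add_comment(bug, "Reproduced on the flight over.")
    # Every path printed here is a new file the SCM must be told about.
    for path in sorted(root.rglob("*")):
        if path.is_file():
            print(path.relative_to(root))
```

Because each bug and each comment lives under a fresh identifier, independent additions in different branches produce disjoint sets of files, which the underlying SCM can merge without help.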

The advantages to an approach like this are clear. The bug database can now be downloaded along with the project's code itself. It can be branched along with the code; if a particular branch contains a fix for a bug, it can also contain the updated bug tracker entry. That, in turn, ensures that the current bug tracking information will be merged upstream at exactly the same time as the fix itself. Contemporary projects are characterized by large numbers of repositories and branches, each of which can contain a different set of bugs and fixes; distributing the bug database into these repositories can only help to keep the code and its bug information consistent everywhere.

There are also some disadvantages to this scheme, at least in its current form. Changes to bug entries don't become real until they are committed into the SCM. If a bug is fixed, committing the fix and the bug tracker update at the same time makes sense; in cases where one is trying to add comments to a bug as part of an ongoing conversation, the required commit is just more work to do. The fact that, in git at least, one must explicitly add any new files created by the bug tracker (which have names like 12968ab9-5344-4f08-9985-ef31153e504f/comments/97f56c43-4cf2-4569-9ef4-3e8f2d9eb1fe/body) does not help the situation.

Beyond that, tracking bugs this way creates two independent sets of metadata - the bug information itself, and whatever the developer added when committing changes. There is currently no way of tying those two metadata streams together. Then, there is the issue of merging. Bugs Everywhere appears to reflect some thought about this problem; most changes involve the creation of new, (seemingly) randomly-named files which will not create conflicts at merge time. It did not take long, however, for your editor to prove that changing the severity of a bug in two branches and merging the result creates a conflict which can only be resolved by hand-editing the bug tracker's files. Said files are plain text, but that is less comforting than one might think.
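The severity conflict is easy to understand if one thinks of a bug as a set of named fields and applies an ordinary three-way merge to each field. The sketch below does exactly that. It is an illustration only, not how Bugs Everywhere or any SCM actually merges: the function name and the dictionary representation are assumptions.

```python
def merge_fields(base, ours, theirs):
    """Three-way merge of bug fields held as plain dictionaries.

    Returns (merged, conflicts). A field conflicts when both sides changed
    it away from the common ancestor and disagree on the new value.
    """
    merged = {}
    conflicts = []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:
            value = o          # both sides agree (or neither changed it)
        elif o == b:
            value = t          # only their side changed the field
        elif t == b:
            value = o          # only our side changed the field
        else:
            conflicts.append(key)
            value = o          # keep ours provisionally; a human must decide
        if value is not None:  # None means the field is absent or deleted
            merged[key] = value
    return merged, conflicts


if __name__ == "__main__":
    base = {"severity": "minor", "status": "open"}
    ours = {"severity": "serious", "status": "open"}
    theirs = {"severity": "critical", "status": "closed"}
    print(merge_fields(base, ours, theirs))
```

In this example the status change merges cleanly because only one branch touched it, while the severity field is reported as a conflict: there is no mechanical way to choose between "serious" and "critical", which is why a person ends up editing the tracker's files.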

All of this can make distributed bug tracking look like a source of more work for developers, which is not the path to world domination. What is needed, it seems, is a combination of more advanced tools and better integration with the underlying SCM. Bugs Everywhere, by trying to work with any SCM, risks not being easily usable with any of them.

A project which is trying for closer integration is ticgit, which, as one might expect, is based on git. Ticgit takes a different approach, in that there are no files added to the project's source tree, at least not directly; instead, ticgit adds a new branch to the SCM and stores the bug information there. That allows the bug database to travel with the source (as long as one is careful to push or pull the ticgit branch!) while keeping the associated files out of the way. Ticgit operations work on the git object database directly, so there is no need for separate commit operations. On the other hand, this approach loses the ability to have a separate view of the bug database in each branch; the connection between bug fixes and bug tracker changes has been made weaker. This is something which can be fixed, and it would appear (from comments in the source) that dealing with branches is on the author's agenda.

Ticgit clearly has potential, but even closer integration would be worthwhile. Wouldn't it be nice if a git commit command would also, in a single operation, update the associated entry in the bug database? Interested developers could view a commit which is alleged to fix a bug without the need for anybody to copy commit IDs back and forth. Reverting a bugfix commit could automatically reopen the bug. And so on. In the long run, it is hard to see how a truly integrated, distributed bug tracker can be implemented independently of the source code management system.
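What such integration might look like can be sketched in a few lines. The code below is purely hypothetical: neither git nor ticgit provides a "Fixes-bug:" trailer or these functions, and the in-memory dictionary merely stands in for a real bug database. It shows only the bookkeeping that a commit hook could do, namely recording the fixing commit on the bug and reopening the bug when that commit is reverted.

```python
import re

# Hypothetical commit-message trailer; not a convention git or ticgit defines.
TRAILER = re.compile(r"^Fixes-bug:[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def bugs_fixed_by(message):
    """Return the bug IDs named in 'Fixes-bug:' lines of a commit message."""
    return TRAILER.findall(message)


def apply_commit(bugs, commit_id, message, reverted=False):
    """Update bug entries for one commit; a revert reopens what it had fixed.

    Returns the list of bug IDs that were actually updated.
    """
    touched = []
    for bug_id in bugs_fixed_by(message):
        bug = bugs.get(bug_id)
        if bug is None:
            continue  # unknown ID: leave the database alone
        if reverted:
            bug["status"] = "open"
            bug["fixed_by"] = None
        else:
            bug["status"] = "fixed"
            bug["fixed_by"] = commit_id  # no copying of commit IDs by hand
        touched.append(bug_id)
    return touched


if __name__ == "__main__":
    bugs = {"12968ab9": {"status": "open", "fixed_by": None}}
    message = "Avoid crash on empty input\n\nFixes-bug: 12968ab9\n"
    apply_commit(bugs, "abc123", message)
    print(bugs)
    apply_commit(bugs, "abc123", message, reverted=True)
    print(bugs)
```

The hard part is not this bookkeeping but getting the SCM to run it at commit, merge, and revert time in every repository, which is the argument for building the tracker into the SCM rather than beside it.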

There are some other development projects in this area, including:

Scmbug is a relatively advanced project which aims "to solve the integration problem once and for all." It is not truly a distributed bug tracker, though; it depends on hooks into the SCM which talk to a central server. Regardless, this project has done a significant amount of thinking about how bug trackers and source code management systems should work together.

DisTract is a distributed bug tracker which works through a web interface. To that end, it uses a bunch of Firefox-specific JavaScript code to run local programs, written in Haskell, which manipulate bug entries stored in a Monotone repository. Your editor confesses that he did not pull together all of the pieces needed to make this tool work.

DITrack is a set of Python scripts for manipulating bug information within a Subversion repository. It is meant to be distributed (and, eventually, "backend-agnostic"), but its use of Subversion limits how distributed it can be for now.

Ditz is a set of Ruby scripts for manipulating bug information within a source code management system; it has no knowledge of the SCM itself.

As can be seen, there is no shortage of work being done in this area, though few of these projects have achieved a high level of usability. Only Scmbug has been widely deployed so far. A few of these projects have the potential to change the way development is done, though, once various integration and user interface issues are addressed.

There is one remaining problem, though, which has not been touched upon yet. A bug tracker serves as a sort of to-do list for developers, but there is more to it than that. It is also a focal point for a conversation between developers and users. Most users are unlikely to be impressed by a message like "set up a git repository and run these commands to file or comment on a bug." There is, in other words, value in a central system with a web interface which makes the issue tracking system accessible to a wider community. Any distributed bug tracking system which does not facilitate this wider conversation will, in the end, not be successful. Creating a distributed tracker which also works well for users could be the biggest challenge of them all.

