Always-releasable Debian

Debian 7.0 ("Wheezy") was released on May 5, at which point preparations started for 8.0 ("Jessie"). But the freeze period in the Wheezy development cycle (during which only release-critical (RC) bugs are meant to be fixed) took longer than average: ten months rather than the usually-expected six. That led to some understandable self-reflection about the development and release process, including a proposal from Lars Wirzenius and Russ Allbery to adopt several changes meant to keep the testing branch in an always-releasable state.

A modest proposal

A shorter release cycle would benefit many groups: users would get updated packages more quickly, the release team would feel less swamped by the stack of RC bugs, and developers would feel less frustrated about the process. The crux of the proposal is to keep the packages in testing as free of RC bugs as possible, akin to maintaining a releasable trunk branch in an individual software project. Wirzenius and Allbery suggest four changes to bring this about:

Making an attitude shift, so that maintainers of individual packages view making Debian releases as part of their job, not just the job of the release team.

Keeping RC bugs out of testing.

Using automatic testing and/or continuous integration to catch problems earlier.

Limiting the number of packages treated as "core" packages that can hold up a release.

The attitude shift is a nebulous task, naturally; few on the debian-devel mailing list actually think of releases as the concern of the release team alone. But the gist is that individual package maintainers may need to change the way they handle certain changes, such as transitioning a fundamental package like glibc to a new version, which can have far-reaching effects on other packages.

Keeping RC bugs out of testing is a goal that could involve several possible steps. Wirzenius and Allbery note that in the current release model, "right after the release we open the floodgates, and large number of new packages and versions enter testing. The bug count sky-rockets, but we don't care a lot about that until the next freeze gets closer." They suggest removing RC-buggy packages as soon as possible, including dependent packages, devoting more developer-power to fixing RC bugs in those core packages that cannot be removed (e.g., GCC), and having bug-fix-only "mini-freezes" if the RC count rises above a particular threshold.
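The removal and mini-freeze ideas can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the function names, the dependency map, and the threshold are assumptions made for illustration, not part of the proposal or of any existing Debian tool. The sketch assumes that a non-core package with an RC bug is removed along with everything that depends on it, while a core package is never removed and is instead flagged as needing attention.

```python
from collections import deque


def plan_removals(depends, rc_buggy, core):
    """Return (to_remove, needs_fixing) for a hypothetical testing branch.

    depends:  dict mapping package name -> list of packages it depends on
    rc_buggy: set of packages that currently have RC bugs
    core:     set of packages that may never be removed
    """
    # Invert the dependency map so dependents can be found quickly.
    rdeps = {}
    for pkg, deps in depends.items():
        for dep in deps:
            rdeps.setdefault(dep, set()).add(pkg)

    to_remove = set()
    needs_fixing = {p for p in rc_buggy if p in core}
    queue = deque(p for p in rc_buggy if p not in core)

    while queue:
        pkg = queue.popleft()
        if pkg in to_remove:
            continue
        if pkg in core:
            # A core package that depends on a removed package cannot be
            # dropped, so it is flagged for developer attention instead.
            needs_fixing.add(pkg)
            continue
        to_remove.add(pkg)
        # Anything depending on a removed package must go as well.
        queue.extend(rdeps.get(pkg, ()))

    return to_remove, needs_fixing


def mini_freeze_needed(rc_count, threshold):
    """Assumed policy: freeze when the RC count rises above the threshold."""
    return rc_count > threshold


# Example with made-up package names (only "gcc" is treated as core).
depends = {
    "app": ["libfoo"],
    "plugin": ["app"],
    "tool": ["gcc"],
    "libfoo": [],
    "gcc": [],
}
to_remove, needs_fixing = plan_removals(depends, {"libfoo", "gcc"}, {"gcc"})
```

In this example, the RC bug in the made-up `libfoo` package takes `app` and `plugin` out of testing with it, while `gcc` stays in and is flagged for fixing, so `tool` is unaffected.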

On the automatic testing front, they suggest setting up a suite of "reference installations" aimed at common Debian deployment scenarios (mail server, LAMP server, desktop machine, etc.) to determine which packages ought to be considered "core" and which ought to be considered optional. The project could then use these reference systems as targets for continuous integration and testing. They note that Debian already has several automatic testing tools (such as lintian, piuparts, adequate, and autopkgtest), but these tools do not currently guide the progress of a package into testing:

Imagine a continuous integration system for Debian: for every new package upload to unstable, it builds and tests all the reference installations. If all builds succeed, and all tests pass, the package can be moved into testing at once. When you, a developer, upload a new package, you get notified about test results, and testing migration, within minutes. The number of packages in Debian, and the amount of churn in unstable, makes this not quite possible to achieve without massive amounts of hardware. However, we can get close: instead of testing each package separately, we can test together all the packages that have been uploaded to unstable since the previous run, and mostly this will be a fairly small number of packages.
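The batching idea in that passage can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `run_batch` function, the upload timestamps, and the `passes` callback are all hypothetical, and the rule that a failed batch blocks every package in it is an assumption rather than something the proposal specifies.

```python
def run_batch(uploads, last_run, reference_installs, passes):
    """Test every package uploaded since the previous run as one batch.

    uploads:            dict mapping package name -> upload time (any
                        comparable value, e.g. a Unix timestamp)
    last_run:           time of the previous test run
    reference_installs: names of the reference installations to build
    passes:             hypothetical callback (install, batch) -> bool
    Returns (batch, migrate), where migrate is True only if the batch is
    non-empty and every reference installation passes.
    """
    batch = sorted(name for name, when in uploads.items() if when > last_run)
    if not batch:
        return batch, False
    migrate = all(passes(install, batch) for install in reference_installs)
    return batch, migrate


# Example with made-up data: "c" breaks the LAMP reference installation.
uploads = {"a": 5, "b": 12, "c": 15}
installs = ["mail-server", "lamp-server", "desktop"]


def fake_passes(install, batch):
    return not (install == "lamp-server" and "c" in batch)


batch, migrate = run_batch(uploads, 10, installs, fake_passes)
```

Here packages `b` and `c` form the batch, and because one reference installation fails, neither migrates. A real system would also need some way to work out which package in a failing batch was at fault, a question this sketch leaves open.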

Holger Levsen has already been working on a continuous integration system for Debian at jenkins.debian.net, Wirzenius and Allbery observe, and there are quite a few useful tests already available in piuparts; it will just take some additional effort to build them into a reliable automatic testing framework.

The great debate

On the whole, most of the people on debian-devel seem supportive of the goal—which is to say they agree that the lengthy Wheezy freeze was less than ideal and that the process can be improved. Almost everyone who replied to Wirzenius and Allbery's email is in favor of increased automated testing and continuous integration. There is less unanimity on the question of limiting the number of packages that constitute the essential core of a release, however, and about how attitude changes may or may not affect the process.

For example, Paul Wise thought that the point about the attitude shift "essentially comes down to 'people don't do enough work and we want them to do more'," which does not account for several factors that impact developers' individual workloads, including available time, knowledge, confidence, and motivation level. Neil Williams replied that the real intent was to solve a somewhat different problem: "people are not getting the work done, let's break down the problems and make working on them easier or the solutions more obvious."

On the other hand, Michael Gilbert felt that there was not a fundamental problem with the methodology used in the Wheezy release cycle, and that it in fact reflected a better bug-squashing effort:

The primary problems with this cycle were that there were something like 400 or 500 extra rc bugs due to a concerted effort to report all serious issues found via piuparts, and then the existential problem of not enough rc squashers, which in and of itself is not all that rewarding. You address the former with the more automated testing comment below. The latter could possibly be addressed by bring in more DDs, and that means doing better with -mentors.

Allbery replied that the project says almost the same thing after every release, and that it needs to "be sure that we're not just trying the same thing over and over again and expecting different results." Ultimately, however, Gilbert's suggestion for handling the increased RC bug count was more automated testing, so regardless of whether the increased RC bug count is intrinsically good or intrinsically bad, handling it better involves the same solution.

Where there is less consensus at present is on the subject of limiting the number of packages regarded as "core" components of a release—and pulling non-core components that introduce RC bugs (although the offending package would be re-admitted to testing once its RC bugs had been fixed). In their original proposal, Wirzenius and Allbery noted that making a core/non-core distinction could have negative side-effects, such as causing buggy packages to miss the release. Perhaps Debian could introduce new packages in subsequent point releases, they said—an idea which always-releasable testing makes possible.

But several participants felt that the existing process already sifts packages by importance, at least as "importance" is defined in practice. Vincent Bernat said: "If a package is important, an RC bug will get noticed and someone will step to fix the RC bug or ask for a delay. This avoids unnecessary debate on what is important and what is not and people fighting to get their packages in the right list." Helmut Grohne worried that frequently adding and removing packages from testing would destabilize it as much as or more than the RC bugs introduced by those packages. Perhaps, he said, simply making the removal notification (which precedes any actual package removal) higher-profile would goad the developer into fixing the bug faster.

It is also not clear who would make the core/non-core determination; such a call could easily be a source of controversy. Wirzenius and Allbery assured everyone in their proposal that the non-core designation is not a black mark, but considering the practical effect it has—namely, that the release will not wait for the package to get fixed—it has the potential to spawn disagreement. Not everyone was sold on the "reference installations" idea, either; Thomas Goirand said that the suggested installation types (e.g., web server, mail server) denote packages that are pretty stable already, and not the sources of problems in Wheezy. "If you want to find out which tests would help, you would have to conduct a better analysis of the problems we had during the freeze, IMO."