Much ado about debugging


Recently, an interaction problem between systemd and the kernel was reported. After a calm discussion, developers of both projects found ways in which behavior could be improved and set about coding up the solutions. The technical press was filled with glowing reports on another success of collaborative problem solving... or, perhaps, most of the preceding text is entirely fictional and the systemd "debug flag" problem spiraled out of control in several ways at once.

Actually, that description is not entirely fantasy, if one looks at the problem the right way. It turned out that systemd was using the debug argument from the kernel command line to turn on much of its own debugging output. As Linus Torvalds noted, that is exactly how this flag was intended to be used. But a mistake in the systemd camp caused an assertion to fire, generating so much output that the system could not finish booting. After some discussion, a couple of decisions were made:

Systemd will stop logging through the kernel once the journald logging daemon is available; that will cause much of that output to be directed elsewhere. There are also patches floating around to cause systemd to recognize systemd.debug, rather than plain debug, as the signal to turn on its own debugging options. If merged into systemd, this change will make it easier to turn on kernel debug output without also enabling systemd's output (something which is already possible, but not in The Way Kernel Developers Have Always Done It).

The kernel developers have realized that it should not be possible to incapacitate a system by logging too much data from user space. Consequently, some sort of rate limiting will be applied to the /dev/kmsg interface. The proper nature of that limiting and how it will be controlled are still under discussion, but chances are good that some sort of change will find its way into the 3.15 kernel.
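To make the rate-limiting idea concrete, here is a minimal sketch of one common approach: allow a fixed burst of messages per time interval and drop the rest. It is an illustration only, not the kernel's patch; the exact form of the kernel-side limiting was still undecided at the time of writing. The class name BurstLimiter and its interval, burst, and clock parameters are all hypothetical.

```python
import time


class BurstLimiter:
    """Allow at most `burst` messages per `interval` seconds; drop the rest.

    Hypothetical illustration of interval/burst rate limiting; it does not
    reproduce any actual kernel implementation.
    """

    def __init__(self, interval, burst, clock=time.monotonic):
        self.interval = interval    # length of one accounting window, in seconds
        self.burst = burst          # messages allowed within one window
        self.clock = clock          # injectable time source, useful for testing
        self.window_start = None    # start time of the current window
        self.emitted = 0            # messages allowed in the current window
        self.dropped = 0            # total messages suppressed so far

    def allow(self):
        """Return True if a message may be logged now, False if it is dropped."""
        now = self.clock()
        # Open a new window on first use or once the old one has expired.
        if self.window_start is None or now - self.window_start >= self.interval:
            self.window_start = now
            self.emitted = 0
        if self.emitted < self.burst:
            self.emitted += 1
            return True
        self.dropped += 1
        return False
```

A logging path would call allow() before accepting each message from user space. With a limiter like this, a runaway process can only fill its quota for the current window, so the log stays usable for everybody else, and the dropped counter lets the system report how many messages were suppressed.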

In other words, appropriate fixes are being applied on both sides to prevent this kind of problem from recurring. So a reasonable observer might well wonder why the technical press is full of headlines like "Linus Torvalds suspends key Linux developer" and "Open war in Linux world." That comes down to less-than-optimal behavior on both sides of the fence, and even worse behavior in the press.

When Borislav Petkov first encountered this problem, he filed a bug against systemd, asking that its behavior be changed. A little over one hour later, systemd developer Kay Sievers closed it as "NOTABUG," saying that the behavior was expected and that the kernel is not the sole keeper of the debug flag: "Generic terms are generic, not the first user owns them." A lengthy back-and-forth followed, with developers reopening the bug and Kay closing it several times. Eventually the discussion spilled over onto the linux-kernel list when Steven Rostedt proposed hiding the debug flag from user space entirely.

Shockingly, the move to linux-kernel did little to calm the conversation. Eventually Linus announced that he was not interested in accepting any patches from Kay until Kay's pattern of behavior (as seen by Linus) changed. It didn't take that long, though, for things to calm down and for various developers to start looking at real solutions to the problem. As of this writing, that thread has been silent for a few days.

In other words, what we have here is a story that has been seen many times over. A problem turns up that reveals suboptimal behavior by two interacting pieces of software. Developers for both projects are slow to acknowledge that they could be doing things better and point fingers at the other camp. Certain high-profile community members known for their occasionally over-the-top rhetoric live up to their reputations. But once people have some time (measured in hours) to calm down, the problems are fixed and everybody moves on.

That, alas, is not a story that plays well in much of the press. So, instead, various reporters tried to inflate it into some sort of spectacular showdown. The development community was not portrayed in a good light, and perhaps some of that was even deserved. But what was really conveyed by all those articles was that, after all these years, much of the technical press still has a poor (at best) understanding of how free software development communities work.

Proprietary software tends not to be followed by stories like this because the inevitable politics, profanity, and chair-throwing are kept behind closed doors and firewalls. We, instead, do most of it in the open — though flying furniture still tends to be an exceptional occurrence. These events can be fun to watch from a suitable distance and with enough popcorn. But they mean less than the hidden corporate disagreements that we never hear about — and much less than the public accomplishments that we almost never hear about. The 3.15 merge window, ongoing while this debate was happening, has seen (as of this writing) the merging of well over 10,000 changesets from 1100 developers, most of whom are working together smoothly. But none of the press accounts mentioned that.

That's just life in the free software world. Or almost anywhere else, for that matter; where there are people, there will be misunderstandings, blowups, and the occasional failure to immediately recognize a problem. Somehow, we manage to muddle through anyway and create lots of high-quality free software. But that is so normal and mundane that it doesn't qualify for consideration as news.

