By any measure, the PulseAudio sound server is an important part of most Linux desktops. But, like other free-software projects, PulseAudio is understaffed and relies on volunteers to maintain it. An effort by the most active maintainer, Tanu Kaskinen, to use the Patreon platform to help fund PulseAudio maintenance makes one wonder how much of the free software we depend on is similarly suffering.

It was not that long ago that even critical internet infrastructure was largely being ignored by the companies that relied on it. The Heartbleed fiasco made it obvious that OpenSSL was not thriving with mostly volunteer maintainers. Heartbleed led to the formation of the Core Infrastructure Initiative (CII), which targets critical infrastructure projects (OpenSSL, OpenSSH, GnuPG, and so on). Initially, the CII provided grants to projects to help fund their development and maintenance, and evidently still does, though it is moving toward using threat modeling to target projects for security auditing according to the FAQ.

That work is great, but it is limited by a number of factors: funding and the interests of its members, primarily. Few of the companies involved have much, if any, interest in the Linux desktop. Some might argue that there aren't any companies with that particular interest, though that would be disingenuous. In any case, though, desktop Linux is a community-supported endeavor, at least more so than server or cloud Linux, which likely means some things are slipping through the cracks.

Kaskinen left his job in 2015 to be able to spend more time on PulseAudio (and some audio packages that he maintains for OpenEmbedded). For the last four months or so, he has been soliciting funds on Patreon. Unlike Kickstarter and other similar systems, Patreon is set up to provide ongoing funding, rather than just a chunk of money for a particular feature or project. Donors pledge a monthly amount to try to support someone's work going forward.

So far, Kaskinen has attracted 18 patrons who provide $77 per month, which approximately covers his rent according to the Patreon page. His needs are rather modest, as he is looking for $340 per month. [Update: These numbers are based on a misunderstanding, see the first comment below.] Patrons get immediate access to his monthly reports, while others must wait a few weeks (reminiscent of a certain weekly publication perhaps). Once they are freely available, the reports are published on his blog. He is actively looking for other reward ideas for those who donate at more than the $1 per month minimum.

A look through the reports gives the impression of an active maintainer working on bugs, reviewing patches, answering questions on IRC, writing documentation, and so on. Much of that could have been done in his "spare time", though presumably at a much slower rate. In the meantime, Kaskinen is burning through his savings to help support the many users of PulseAudio. It is undoubtedly the plight of many maintainers, though most probably just try to fit that work around their day job, rather than to try to do it full time.

There are a number of companies and groups that use PulseAudio as part of their Linux distribution. That starts with the traditional distributions, such as Fedora, Red Hat Enterprise Linux, Ubuntu, SUSE, and openSUSE, but goes beyond that. Mobile and automobile-focused distributions, such as Tizen, GENIVI, and Automotive Grade Linux, also use (or can use) the audio server. One would think they and others might have an interest in having a full-time PulseAudio maintainer.

It is a dilemma for the free-software world. Our projects will be much better off with full-time maintainers, and lots of projects already have that thanks to various companies in our community, but what should be done for projects that fall through the cracks? Though it is early going for Kaskinen, it is hard to see Patreon-based campaigns being the ultimate solution, though they can certainly help.

The free-software community itself, or at least the individuals who make up a large chunk of it, seems unlikely to be able to solve this problem directly. While that may be unfortunate at some level, the reality is that millions of folks, all over the world, with varying income levels and even awareness of how the software they use comes about, probably cannot be relied upon to directly fund these kinds of projects. For that, it will take organizations and/or companies to help identify, and ultimately fund, maintenance of critical desktop infrastructure.

Plenty of that infrastructure is being funded, of course. The major desktop environments have companies or groups of companies that employ the maintainers, developers, and others for those projects. Web browsers are in good shape, overall, as are some of the office and productivity suites. But there is a second tier of applications (and plumbing in the case of PulseAudio) that may not be receiving the attention it deserves and, perhaps in some cases, requires.

The urgency that Heartbleed provided is probably never going to occur in the Linux desktop realm, however. There is less monoculture and vastly less of an installed base. Android is, of course, a much bigger target, but has a large company behind it and doesn't use much from desktop Linux (other than the kernel).

The kernel model for maintainers seems to work quite well, overall. Companies employ maintainers of various subsystems to, essentially, continue maintaining. Those companies get the benefit of having those people on their staff as well as the benefit of a better-maintained kernel for themselves and others. But the kernel is unique; other parts of our free-software desktop infrastructure are not so centrally placed, thus not so well-maintained.

One hopes that Kaskinen can find enough patrons to meet his modest needs to continue with his work. But it would be better still if we could find a way as a community to make it possible for maintainers (and others) to do their work without giving up all of their free time—or their savings.


getrandom()

The GNU C library (glibc) 2.25 release is expected to be available at the beginning of February; among the new features in this release will be a wrapper for the Linux getrandom() system call. One might well wonder why it is only appearing in this release, given that kernel support arrived with the 3.17 release in 2014 and that the glibc project is supposed to be more receptive to new features these days. A look at the history of this particular change highlights some of the reasons why getting new features into glibc is still hard.

Glibc remains a conservative project. There are a number of good reasons for that, but it does mean that developers proposing new features tend to run into roadblocks; that has certainly happened with getrandom(). The kernel's random number subsystem maintainer, Ted Ts'o, has been known to complain about the delay in support for this system call; he has suggested that "maybe the kernel developers should support a libinux.a library that would allow us to bypass glibc when they are being non-helpful." Peter Gutmann resorted to channeling Sir Humphrey Appleby when describing the glibc project's approach to getrandom(). But what really caused the delay here?

Glibc bug 17252, requesting the addition of getrandom(), was filed in August 2014, five days after the 3.17 kernel release. Glibc developer Joseph Myers responded twice in the following six months, suggesting that, if anybody wanted getrandom() in glibc, they would need to go onto the project's mailing list and work to drive the development forward. The first reason for the delay is thus simple: nobody stepped up to do the work.

One might wonder why it took so long for somebody to come along and implement a simple system-call wrapper. In its essence, the code that will appear in the 2.25 release is:

    /* Write LENGTH bytes of randomness starting at BUFFER.  Return 0 on
       success and -1 on failure.  */
    ssize_t
    getrandom (void *buffer, size_t length, unsigned int flags)
    {
      return SYSCALL_CANCEL (getrandom, buffer, length, flags);
    }

Such a function does not seem particularly hard to write. The original patch for getrandom() support, finally posted by Florian Weimer in June 2016, was rather more complicated than that, though. Weimer, knowing that the glibc project is conservative and wants the library to work in almost all situations, attempted to cover every base he could think of. So the patch included documentation updates, test programs, and several other details that, in turn, led to a number of sticking points that surely slowed the eventual acceptance of the patch.

The first obstacle, though, had little to do with the patch itself; it was, instead, brought about by the project's reluctance to add wrappers for Linux-specific system calls at all. Glibc does not see itself as a Linux-specific project, so it naturally prefers standardized interfaces that can be supported on all systems. The project has sporadically discussed its policy around Linux-specific calls over the last couple of years. In 2015, Myers described it as:

The result is a de facto status of "syscall wrappers present for almost all syscalls added up to Linux 3.2 / glibc 2.15 but for nothing added since then", which certainly doesn't make sense.

A draft policy for Linux-specific wrappers has existed since about then but, lacking consensus in a strongly consensus-oriented project, it has never achieved any sort of official status. Thus, even though this policy states that system-call wrappers should be added by default in the absence of reasons to the contrary, Roland McGrath responded to the initial patch posting with a terse message saying: "You need to start with rationale justifying the new nonstandard API and why it belongs in libc." That justification was not hard, given that a number of projects have been asking for this wrapper, and that adding the BSD getentropy() interface on top of it is easily done, but this challenge foreshadowed much of what was to come.
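As a rough illustration of why layering the BSD interface on top of the wrapper is "easily done", here is a minimal sketch. It is not the glibc implementation; the name my_getentropy() is hypothetical (chosen to avoid clashing with the real function), and it assumes a glibc new enough to provide getrandom() in <sys/random.h>. The 256-byte limit and the EIO error follow the documented BSD getentropy() behavior.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/random.h>   /* getrandom(); assumes glibc 2.25 or later */
#include <sys/types.h>

/* Hypothetical stand-in for getentropy(): fill BUFFER with exactly LENGTH
   random bytes.  Returns 0 on success, -1 with errno set on failure. */
int my_getentropy(void *buffer, size_t length)
{
    unsigned char *p = buffer;

    /* The BSD interface refuses requests larger than 256 bytes. */
    if (length > 256) {
        errno = EIO;
        return -1;
    }

    while (length > 0) {
        ssize_t n = getrandom(p, length, 0);
        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted by a signal: try again */
            return -1;           /* errno was set by getrandom() */
        }
        p += n;
        length -= (size_t)n;
    }
    return 0;
}
```

The loop is what turns "may return fewer bytes than requested" into the all-or-nothing semantics that getentropy() callers expect.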

A trickier question was: what should glibc do when running on pre-3.17 kernels (or non-Linux kernels) that lack getrandom() support? The initial patch included a set of emulation functions so that getrandom() calls would always work; they would read the data from /dev/random or /dev/urandom as appropriate. Doing so involved keeping open file descriptors to those devices (lest later calls fail if the application does a chroot()). But using file descriptors in libraries is always fraught with perils; applications may have their own ideas of which descriptors are available, or may simply run a loop closing all descriptors. So the code took pains to use high-numbered descriptors that applications presumably don't care about, and it used fstat() to ensure that the application had not closed and reopened its descriptors between calls.
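The fstat() check can be sketched roughly as follows. This is not the code from Weimer's patch, which is not reproduced here; struct cached_fd and both function names are hypothetical, and the sketch simply assumes that comparing the device and inode numbers recorded at open time is enough to notice that a descriptor number now refers to a different file. The high-numbered-descriptor part of the scheme is left out.

```c
#include <fcntl.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical record of a descriptor a library keeps open across calls. */
struct cached_fd {
    int   fd;
    dev_t dev;   /* identity of the file at the time it was opened */
    ino_t ino;
};

/* Open PATH read-only and remember which file the descriptor refers to.
   Returns 0 on success, -1 on failure. */
int cached_open(struct cached_fd *c, const char *path)
{
    struct stat st;

    c->fd = open(path, O_RDONLY);
    if (c->fd < 0)
        return -1;
    if (fstat(c->fd, &st) < 0) {
        close(c->fd);
        c->fd = -1;
        return -1;
    }
    c->dev = st.st_dev;
    c->ino = st.st_ino;
    return 0;
}

/* True if the descriptor is still open and still names the same file;
   false if the application closed it or reused the number for something else. */
bool cached_still_valid(const struct cached_fd *c)
{
    struct stat st;

    if (c->fd < 0 || fstat(c->fd, &st) < 0)
        return false;
    return st.st_dev == c->dev && st.st_ino == c->ino;
}
```

A library would call cached_still_valid() before each use and reopen the device when the check fails, which hints at how much bookkeeping the emulation approach required.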

This usage of file descriptors drew a number of comments; it is something that glibc tries to avoid whenever possible. After some discussion, it was concluded that glibc should provide only a wrapper for the system call, without emulation. If an application calls getrandom() on a kernel where that system call is not supported, the glibc wrapper will simply fail with ENOSYS (returning -1 with errno set accordingly) and it will be up to the application to use a fallback. That decision removed a fair amount of code and one obstacle to merging.
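What "up to the application" might look like is sketched below. The helper name fill_random() is hypothetical, and falling back to /dev/urandom is only one possible policy (an application might prefer to refuse to run instead); the sketch assumes a glibc that provides getrandom() in <sys/random.h>.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/random.h>   /* getrandom(); assumes glibc 2.25 or later */
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: fill BUF with LEN random bytes, preferring
   getrandom() and falling back to /dev/urandom when the kernel lacks
   the system call.  Returns 0 on success, -1 on failure. */
int fill_random(void *buf, size_t len)
{
    unsigned char *p = buf;
    size_t done = 0;
    int fd;

    while (done < len) {
        ssize_t n = getrandom(p + done, len - done, 0);
        if (n >= 0) {
            done += (size_t)n;
            continue;
        }
        if (errno == EINTR)
            continue;            /* interrupted by a signal: retry */
        if (errno == ENOSYS)
            break;               /* no getrandom() in this kernel */
        return -1;               /* some other failure */
    }
    if (done == len)
        return 0;

    /* Fallback path for kernels without the system call. */
    fd = open("/dev/urandom", O_RDONLY);
    if (fd < 0)
        return -1;
    while (done < len) {
        ssize_t n = read(fd, p + done, len - done);
        if (n > 0) {
            done += (size_t)n;
        } else if (n < 0 && errno == EINTR) {
            continue;
        } else {
            close(fd);
            return -1;
        }
    }
    close(fd);
    return 0;
}
```

Note that the fallback opens and closes the device on every call, so it avoids the long-lived descriptors that made the in-library emulation contentious, at the cost of failing inside a chroot() that lacks /dev/urandom.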

In writing the patch, Weimer worried that there may be a number of applications out there with their own function called getrandom(), which may or may not provide the same interface and semantics as the glibc version. The prospect was especially troubling because a getrandom() call that does not actually return random data may not cause any visible problems in the application at all, until some attacker notices this behavior and exploits it. So he employed a bunch of macro and symbol-versioning trickery to detect and prevent confusion over which getrandom() function to use.

This feature, too, was unpopular; glibc does not normally add extra layers of protection around its symbols in this way. The tricks made it impossible to take the address of the function, among other things. After extensive discussion, Weimer backed down and removed the interposition protection, but he clearly was not entirely happy about it.

The most extensive argument, though, was over whether getrandom() should be a thread cancellation point. In other words, what should happen if pthread_cancel() is called on a thread that is currently blocked in getrandom()? The original patch did make getrandom() into a cancellation point; it still behaves that way in the version merged for 2.25, but it had to survive a lot of argument to get there.

Weimer wanted getrandom() to be a cancellation point because the system call can block indefinitely, even if it almost never blocks at all. The Python os.urandom() episode showed that this blocking can, in rare situations, cause real problems. So, he said, it should be possible for a cancellation-aware program to respond to an overly slow getrandom() call.

The objections here seemed to be, for the most part, objections to cancellation points in general. It is true that cancellation points are problematic in a number of ways. To the implementation issues one can add the fact that most programs are not cancellation-aware and may not respond well to a thread cancellation in an unexpected place. A version of getrandom() that adds a new cancellation point could thus lead to unfortunate behavior. Additionally, getrandom() is supposed to always succeed; the possibility of cancellation adds a failure mode that is not a part of the system call itself.

On the other hand, Carlos O'Donell argued that getrandom() is analogous to read() and thus should behave the same way; read() is a cancellation point. The argument went back and forth over months, and included detours into whether there should be a separate getrandom_nocancel() function or an additional "cancellation point please" argument to getrandom(). In the end, getrandom() remained an unconditional cancellation point. The BSD-compatible getentropy() implementation included in the patch is not a cancellation point, though.
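To make the read() analogy concrete, the following sketch shows what being a cancellation point means in practice. Since getrandom() almost never blocks, the example uses read() on an empty pipe as a stand-in for a blocked call; the function names are hypothetical, and the program needs to be built with thread support (for example with -pthread).

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Thread body: block in read() on a pipe that nobody ever writes to. */
static void *blocked_reader(void *arg)
{
    int fd = *(int *)arg;
    char byte;

    if (read(fd, &byte, 1) < 0)   /* cancellation point */
        return NULL;
    return NULL;                  /* not reached while the pipe stays empty */
}

/* Start a thread that blocks in read(), cancel it, and report whether the
   cancellation took effect.  Returns true if the thread was cancelled. */
bool cancel_blocked_read(void)
{
    int pipefd[2];
    pthread_t thread;
    void *result = NULL;
    bool cancelled = false;

    if (pipe(pipefd) != 0)
        return false;

    if (pthread_create(&thread, NULL, blocked_reader, &pipefd[0]) == 0) {
        /* With the default deferred cancellation, the request is acted
           upon when the thread is in (or next enters) a cancellation point. */
        pthread_cancel(thread);
        if (pthread_join(thread, &result) == 0)
            cancelled = (result == PTHREAD_CANCELED);
    }

    close(pipefd[0]);
    close(pipefd[1]);
    return cancelled;
}
```

If read() were not a cancellation point, the pthread_join() call would hang forever; that is the situation Weimer wanted to avoid for a thread stuck in getrandom().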

With these issues resolved, the conversation came to a close on December 12 when getrandom() and getentropy() were merged into the glibc repository. A feature that has been shipping in the Linux kernel for over two years will finally be available to application developers without the need to create special system-call wrappers. Now all that's left is all the other Linux-specific system calls that still lack glibc wrappers.
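For reference, the sort of special wrapper that applications have had to carry in the meantime looks something like the sketch below. The name my_getrandom() is hypothetical; the code assumes kernel headers recent enough to define SYS_getrandom and uses the generic syscall() function, which, unlike the glibc wrapper, is not a cancellation point.

```c
#define _GNU_SOURCE        /* for the syscall() declaration */
#include <stddef.h>
#include <sys/syscall.h>   /* SYS_getrandom */
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical application-level wrapper: invoke the getrandom() system
   call directly.  Returns the number of bytes written, or -1 with errno
   set (ENOSYS on kernels older than 3.17). */
ssize_t my_getrandom(void *buffer, size_t length, unsigned int flags)
{
    return (ssize_t)syscall(SYS_getrandom, buffer, length, flags);
}
```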
