Network access by Debian package builds


The Debian project is known for strictly adhering to its various internal rules and guidelines; a package that some developers feel is in violation of the Debian Free Software Guidelines, for example, is sure to spawn considerable debate, regardless of its popularity or its origin. Recently, a Debian package maintainer realized that a particular package violated a guideline prohibiting network access during the build process, thus sparking a discussion about that guideline and what its ultimate purpose is.

On September 7, Vincent Bernat wrote to the debian-devel list, noting that the python-asyncssh package, when building, runs a unit test that attempts a DNS lookup. That would seem to violate section 4.9 of the Debian packaging policy, which states "For packages in the main archive, no required targets may attempt network access."

The issue was reported by Chris Lamb in a bug tagged as "serious." Fixing that violation, Bernat said, is simple enough—just disable the test in question—but Bernat wondered whether or not the policy rule is genuinely useful. Since the test originates in the upstream project, Debian would have to carry the patch indefinitely, adding what might be deemed considerable overhead for little real gain.

Furthermore, the test in question performs a lookup for a host named fail, which is expected not to succeed, and the purpose of the test is to ensure that the package handles the lookup failure gracefully. The build, Bernat said, works in an isolated network namespace and the package builds reproducibly with or without network access. Consequently, he asked for feedback on ignoring the violation in this case:

I have the impression that enforcing every word of the policy in the hard sense can bring endless serious bugs. [...] I appear as a bad maintainer because I don't feel this is an important bug. Any thoughts?

Various responses to Bernat's question suggested alternatives to removing the test, but much of the discussion focused on the purpose of the no-network-access rule. Christian Seiler, among others, said the rule was intended to prevent information leaks, which a DNS lookup would certainly cause.

Paul Wise, however, felt that the rule was intended to ensure that nothing outside of the local build environment had any impact on the result of the build process. Steve Langasek concurred with that viewpoint, noting that:

If your package requires the network to build, we have a hard time auditing to make sure that the package actually contains the source for what's built. While some failures may "just" be test cases, it's better to enforce a blanket policy that packages should build without a connection to the public Internet rather than waste time figuring out which failures "really" impact the package contents.

It would appear that there are more than a few packages in the archive that do violate the no-network-access rule. Christoph Biedl said "a certain package (name withheld) did a *lot* of DNS traffic in the test suite, so far nobody has shown concerns." And the test in python-asyncssh, as written, is problematic: if there is a host named fail in the local DNS zone, the lookup will succeed. While unlikely, this is possible; a better test would be to look up a hostname that is guaranteed to be nonexistent by IANA rules (such as one under the reserved .invalid top-level domain). In addition to being a better test of lookup-failure handling, that change would avoid the risk of an information leak through the lookup request (or, at least, reduce the risk, depending on the behavior of the nameserver).
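As a rough illustration of that suggestion, a lookup-failure test might look something like the sketch below. This is not the actual python-asyncssh test; the helper resolve_or_none() and the hostname no-such-host.invalid are hypothetical names chosen for the example. Note that the resolver may still send the query off the machine, so this only narrows the problem rather than removing the network access.

```python
import socket


def resolve_or_none(hostname):
    """Return the sorted addresses for hostname, or None if the lookup fails.

    Hypothetical helper for illustration; not taken from python-asyncssh.
    """
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # Resolution failed (nonexistent name, or no resolver reachable).
        return None
    return sorted({info[4][0] for info in infos})


def test_lookup_failure_is_handled():
    # Names under .invalid are reserved and should never resolve, unlike a
    # bare "fail", which could exist in a local DNS zone.
    assert resolve_or_none("no-such-host.invalid") is None


test_lookup_failure_is_handled()
```

Because the failure path is the same whether the name does not exist or no resolver can be reached, a test of this shape should behave identically with and without network access.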

Russ Allbery contended, though, that there is no genuinely important information leak anyway. He made that comment in reply to Thomas Goirand, who suggested that the issue of attempting network access was valid, but questioned whether "serious" was an appropriate severity level:

I don't think it is a so big issue if a package is doing some network operation, but doesn't fail building if there's no Internet connectivity. The only problem (as Christian mentioned) would be a privacy concern in some cases. In such a case, the severity would be "important", but not "serious" (ie: probably not serious enough to be an RC bug), and it'd be nice if the subject of the bug was reflecting the privacy concern rather than the "no network during build" policy thing (though I can imagine it'd be harder to file the bug).

Others on the list, starting with Gregor Hermann, suggested revising the wording of the rule itself. Allbery proposed two rules, one saying that the package build "must not fail when it doesn't have network access" and another that warns against leaking privacy-related information; several similar variations arose from other participants in the thread. But Adam Borowski replied that attempts to distinguish between different types of network usage are, ultimately, doomed to fail, making such an effort pointless:

As there's no way to distinguish such details automatically, and as data/privacy leaks can be quite surprising, I'd strongly prefer the nice, simple rule of "no attempt to access outside network, period". If _some_ network accesses are allowed, we can't easily spot the bad ones. With the current wording of the policy, iptables ... -j LOG is all you need for a QA check. I'd amend the policy to say explicitly "localhost doesn't count as network, DNS lookup do".
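The simple rule Borowski describes could be expressed as a small classifier over destinations collected from such a log. The following is only a sketch of that idea, not an existing Debian QA tool: the function names are hypothetical, and treating the literal name localhost as local is an assumption made for the example.

```python
import ipaddress


def counts_as_network(destination):
    """Return True if a logged destination should be flagged.

    Loopback addresses do not count as network access; anything else,
    including a hostname that would need a DNS lookup, does.
    """
    if destination == "localhost":
        # Assumption: the literal name "localhost" is treated as local.
        return False
    try:
        addr = ipaddress.ip_address(destination)
    except ValueError:
        # Not an IP literal, so resolving it would take a DNS lookup.
        return True
    return not addr.is_loopback


def flag_violations(destinations):
    """Return the destinations that would violate the simple rule."""
    return [d for d in destinations if counts_as_network(d)]


print(flag_violations(["127.0.0.1", "::1", "localhost", "192.0.2.10", "fail"]))
```

The point of the sketch is that the rule needs no judgment about intent: a destination is either loopback or it is flagged.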

Borowski reiterated the privacy-leak angle, saying that even innocent-looking DNS lookups violate "the Dissident Test"—that is, a user performing the build in some location where state-sponsored surveillance is a threat could put themselves at risk of investigation.

But not everyone found the "Dissident Test" argument persuasive. Allbery, in particular, contended that the lookup of a well-known hostname did not reveal significant personally identifiable data, saying:

If you are a dissident building software in an environment where even a DNS query might give away your activity, you seriously need to be using an isolated container or other precautions. It is completely unreasonable and unrealistic to expect all Debian source packages to meet this standard, even if we were trying (which we're not; we've had software that does DNS queries during the build in Debian for twenty years and no one has ever noticed before now), to a level of confidence that a dissident with this type of safety concern would need.

Moreover, Allbery continued, the entire issue is somewhat overblown:

I don't think this argument passes the sniff test for conversations with upstream. We already have enough issues with upstream over licensing, where we've decided that our very aggressive stance is worth the effort. Please let's not pick fights that *aren't* worth the effort and will cause upstream to look at us like we're paranoid nit-pickers. This sort of thing is really bad for cooperation with other projects.

Goirand concurred with that sentiment, noting that Debian already has a contentious relationship with some upstream projects. Zlatan Todoric also agreed, saying "I also feel that we are losing too much energy on this and this is not sustainable long term, nor fun."

The discussion eventually tapered off without a firm conclusion as to whether Debian policy should be amended, and with no guidelines for a broad approach to assessing future network accesses that occur during package builds. It seems that the status quo will remain in place, then. Each package maintainer will have to individually assess any problematic network-access attempts in build targets, and some of those access attempts may survive for some time if they are determined neither to pose a serious privacy risk nor to affect the result of the build.