Dirty COW and clean commit messages

Did you know...? LWN.net is a subscriber-supported publication; we rely on subscribers to keep the entire operation going. Please help out by buying a subscription and keeping LWN on the net.

We live in an era of celebrity vulnerabilities; at the moment, an unpleasant kernel bug called "Dirty COW" (or CVE-2016-5195) is taking its turn on the runway. This one is more disconcerting than many due to its omnipresence and the ease with which it can be exploited. But there is also some unhappiness in the wider community about how this vulnerability has been handled by the kernel development community. It may well be time for the kernel project to rethink its approach to serious security problems.

Dirty COW (which, naturally, has its own logo and web page, though this one is a bit on the satirical side) is a race condition in the kernel's memory-management subsystem. By timing things right, a local attacker can exploit the copy-on-write mechanism to turn a read-only mapping of a file into a writable mapping; with that, a file that should not be writable can be written to. It doesn't take much imagination to see how the ability to overwrite files could be used to escalate privileges in any of a number of ways.

The core code with the vulnerability is present in every Linux system (excluding, perhaps, no-MMU systems, which lack the sort of protection that has been defeated here in the first place). The known exploit depends on access to /proc/self/mem , which is not universally available, but, even on systems that do not provide that access, there are other ways to exploit the bug. The exploit happily bypasses almost every hardening technique out there; strong sandboxing with mechanisms like seccomp() might slow it down, but counting on that is probably not wise either. The only real protection is to upgrade to a kernel containing the fix.

Given that the nature of the vulnerability was known when the fix was applied, one might think that the development community would go out of its way to inform its users of the nature of the problem. The actual fix, when applied to the mainline, was grouped with a set of "cleanups"; Linus's merge commit for that set mentioned this fix at the very end: "Additionally, there's a fix for an ancient bug related to FOLL_FORCE and FOLL_WRITE by me." The fix itself is not much more illuminating:

This is an ancient bug that was actually attempted to be fixed once (badly) by me eleven years ago in commit 4ceb5db9757a ("Fix get_user_pages() race for write access") but that was then undone due to problems on s390 by commit f33ea7f404e5 ("fix get_user_pages bug").

The fix was (properly) rushed out into a set of stable updates, but there was no mention of the vulnerability there either. So users of the kernel are entirely reliant on others to inform them that an important vulnerability has been fixed and that they should really be updating their systems.

There is nothing new about this practice; Linus and others have long had a habit of, at best, neglecting to mention vulnerabilities that have been fixed in released kernels. There are a number of reasons given for operating this way, starting with a general disdain for the "security circus" and the industry that lives on responding to yesterday's vulnerabilities. Every kernel release fixes a great many serious bugs, they say, some of which certainly have security implications that nobody has (publicly) noticed yet. Highlighting specific vulnerabilities only draws attackers' attention to them while glossing over the fact that the only way to get all of the important fixes is to run the latest releases. Security bugs are just bugs, and we fix them like every other bug.

Everything expressed in that viewpoint could be said to be entirely true (though it glosses over the new vulnerabilities that can also come with the latest releases), but it is, in your editor's view, insufficient to justify this practice.

One of the core tenets of kernel development is that every change must describe the problem that is being solved and what the impact of the change will be. The core SubmittingPatches document, to which all developers are referred, says:

Describe user-visible impact. Straight up crashes and lockups are pretty convincing, but not all bugs are that blatant. Even if the problem was spotted during code review, describe the impact you think it can have on users.

Core reviewer Andrew Morton asks developers to add information about user-visible impacts so often (example) that one assumes he has it bound to a special key sequence. The idea that a patch should describe what it does, and why, is deeply ingrained in the community; it's the only way that we can judge patches properly, and it's the only way that developers will know why a change was made when they encounter it in the repository years later. It's also the only way for users and distributors to know how important a particular patch is known to be. Failing to document the security impact of patches violates that rule at a time when that information is needed the most.

When, for example, a kernel bug can cause random corruption and loss of data within a filesystem, the fix describes the problem and its consequences. That way, users and distributors understand the importance of the change. Security bugs are bugs too, and their consequences should be spelled out as well. The fact that developers are not always aware that a bug has security implications does not excuse omitting that information when it is available.

It has been a long time since all but the laziest of attackers grepped the changelogs for explicit mention of vulnerabilities. As a rule, they are far more interested in the changes that introduce the vulnerabilities in the first place, or those that quietly fix vulnerabilities that the original developer might not even realize are there. Suppressing information about security fixes may keep vulnerabilities out of the awareness of those who are not actively looking for them — the users who may be exploited via those vulnerabilities — but it doesn't fool the exploit creators.

What suppressing security-related information does do is this: it calls into question the community's seriousness about dealing with security issues in general. When users are not told about the known effects of bugs, they rightly assume that they will be unaware of other silently fixed problems. So they can never know that they have the fixes for all known security issues unless they run the absolute latest releases. Staying on the leading edge is not an option for a huge portion of our user base, so they are forced to live with the worry that important fixes have been hidden from them. That is not a situation that engenders confidence in the kernel in general.

To summarize: the suppression of information about the security implications of a change runs counter to the kernel community's principles, makes life harder for our users and distributors, does little or nothing to slow down attackers, and hurts the kernel's reputation in general. A practice that might have made sense (but probably didn't) against script kiddies in the early 1990s is clearly out of place in 2016. Your editor humbly suggests to the kernel development community that the time has long since come to reconsider how security-related patches are handled.

