Compile-time stack validation

LWN.net needs you! Without subscribers, LWN would simply not exist. Please consider signing up for a subscription and helping to keep LWN publishing

An occasionally heard horror story about the kernel development community concerns developers who are told that, in order to get their code upstream, they must first invest considerable effort into fixing a related subsystem. As with many such stories, this is not an experience many kernel developers have had, but there is also a grain of truth behind it. The ongoing live-patching effort, and the extra work that has been required to push that work forward, is a case in point.

Live patching's rough patch

In one sense, the live-patching work has been quiet for much of this year; when LWN last looked at this work in February, the core code had been merged, but the "consistency model" code remained out-of-tree. This code's job is to ensure that a patch is only applied to a live kernel if it is safe to do so; that job includes checking to be sure that the affected functions are not executing at the time the patch is applied. Without this assurance, only relatively trivial patches can be applied with any degree of safety. This is important: the appeal of live patching is the ability to avoid rebooting, so a patch application that crashes the kernel (or, worse, results in data corruption) defeats the whole purpose.

One way of ensuring that a given function is not executing is to freeze all processes on the system, then examine the call stack of each to see which functions are active at the time. This is the approach that was taken when the kpatch and kGraft consistency models were unified in the February patch set. That work ran into strong opposition at the time for a simple reason: the information in the kernel's call stack is often not reliable. The biggest culprit here is assembly-language code, which can easily dispense with the call-stack discipline observed by code compiled from C. The results are often observed by kernel developers; stack traces from kernel crashes are often unreliable, making it hard to determine the sequence of calls that led to the problem.

It's one thing for an unreliable stack trace to make kernel developers scratch their heads more; it's another if that information can fool a live-patching utility into applying a patch at an inopportune time. The risk of that happening was deemed high enough to block the merging of the proposed consistency code. This code, it was said, could only be used if kernel stack traces were known to be 100% reliable.

At the time, 100% reliable stack traces were not widely seen as an attainable goal. It is certainly possible to fix up all of the assembly code that does not set up proper stack frames (assuming it could all be found), but, since nothing in the kernel's normal operation depends on good call-stack information, there was nothing preventing things from breaking again at any time. In the absence of some sort of ongoing assurance that the kernel's call stack will always remain valid, it is hard to be confident that a live-patching system won't do the wrong thing.

Validating the call stack

Some developers might have given up at this point. Josh Poimboeuf, instead, set out to find a way to make the call stack valid at all times and keep it that way; the result is the "compile-time stack metadata validation" patch set, in its 13th revision as of this writing. This work adds a new tool (called stacktool ) that checks the entire kernel as part of the build process to be sure that all code obeys the rules for maintaining the call stack.

The rules are, for the most part, relatively straightforward. For example, every function in assembly code must be marked as a callable function (using the ELF function type). There are some handy macros ( ENTRY and ENDPROC ) that do this annotation now, but not all assembly functions use them. A clear sign that the rules are not being followed is a ret instruction outside of a function block, so stacktool will complain about those.

The primary source of call-stack problems is assembly code that calls another function (possibly a C function) without setting up a new stack frame first. Such calls work, but they will trip up code that is trying to make sense out of the call stack. The validation tool checks to make sure that function calls are surrounded by the appropriate frame-maintenance code. There are currently assembly macros to do this work, but they are unused; Josh's patch renames them to FRAME_BEGIN and FRAME_END and puts them into use. Versions of these macros for inline assembly in C code have also been added; they can be found in <asm/frame.h> .

There are also some rules about dynamic jumps; for the most part, they are only allowed as part of a C switch statement. The one exception is "sibling calls," where the end of one function jumps to the beginning of another and the frame pointer hasn't changed. These rules make it possible for stacktool to follow the control flow in all cases and ensure that the call stack is always maintained.

If the STACK_VALIDATION configuration option is set, stacktool will be run on the kernel's object files as part of the build process. This pass, Josh says, causes a kernel build to take about three seconds longer (he doesn't say whether that's a kernel with a focused configuration or a distribution kitchen-sink configuration). Three seconds is probably an acceptable delay, even for impatient kernel developers, but Josh suggests that some optimization work could probably reduce that figure anyway.

What might be harder for developers to get used to are the complaints emitted by stacktool when it finds a problem. Such complaints go out as warnings in the current patch set, but the intent is to turn them into hard errors once most of the current problems have been fixed. Even if a given developer doesn't enable stack validation, others will, so changes that break the call stack will be returned for repairs in short order. The included documentation file includes examples of the types of errors that may be indicated and how to respond to them.

The current version of the patch set only supports the x86_64 architecture; evidently provisions have been made for adding other architectures, but the nature of the task ensures that a lot of the work will have to be done over again to support something else. Even with a single supported architecture, though, the stack validation work should help to bring an end to the long era where stack traces could not really be trusted. That is good for live patching, but any developer trying to figure out an oops will also benefit from this work. The live-patching developers may not have wanted to take this digression, but the kernel as a whole will be better off as a result of it.

