SGX: when 20 patch versions aren't enough

Please consider subscribing to LWN Subscriptions are the lifeblood of LWN.net. If you appreciate this content and would like to see more of it, your subscription will help to ensure that LWN continues to thrive. Please visit this page to join up and keep LWN on the net.

Intel's " Software Guard Extensions " (SGX) feature allows the creation of encrypted "enclaves" that cannot be accessed from the rest of the system. Normal code can call into an enclave, but only code running inside the enclave itself can access the data stored there. SGX is pitched as a way of protecting data from a hostile kernel; for example, an encryption key stored in an enclave should be secure even if the system as a whole is compromised. Support for SGX has been under development for over three years; LWN covered it in 2016. But, as can be seen from the response to the latest revision of the SGX patch set , all that work has still not answered an important question: what protects the kernel against a hostile enclave?

The proposed API for creating and controlling enclaves is complex, so one would expect it to come with comprehensive documentation. The actual API documentation turns out to be a little sparse, though. One starts by opening /dev/sgx/enclave ; there are no privilege checks in the kernel, so the ability to open and act upon this file is determined solely by its permission bits. The SGX_IOC_ENCLAVE_CREATE ioctl() command will begin the process of setting up an enclave in the system. Each page of code or data must then be added with a separate SGC_IOC_ENCLAVE_ADD_PAGE call; the contents of those pages will be encrypted by the processor so that they will be unreadable outside of the enclave. When that process is complete, the enclave is completed with an SGX_IOC_ENCLAVE_INIT operation. At that point, the system loses its ability to manipulate the contents of the enclave; it can call into the enclave to ask for services, but cannot read or modify any of the data stored therein.

After 20 revisions of the patch set over three years, the authors of this work (which was posted by Jarkko Sakkinen) might well be forgiven for thinking that it must be about ready for merging. This posting evoked a new round of opposition, though, that seems clear to delay things for at least a couple more rounds.

The most vocal critic is Greg Wettstein, who has clearly been working with SGX and Intel's out-of-tree driver for some time. His complaints put off some developers with their tone and verbosity, and not all of them were seen as being entirely valid. He was, for example, unhappy that the user-space API has changed from previous versions of the patch set, breaking his current code. But, since this functionality has never been supported in a mainline kernel, there was little sympathy on offer. Some of his other observations, though, needed to be taken more seriously.

When SGX support was first proposed for Linux in 2016, one of its "features" was that only code that had been signed by Intel would be accepted into an enclave. This restriction was less than popular at the time by virtue of the fact that it essentially guaranteed that enclaves would be restricted to running binary blobs. It was made clear that, as long as Intel retained control over which code could run under SGX, support would not be merged into the kernel. Since then, Intel has added "flexible launch control" on some CPUs, which removes this restriction. Now, it seems, things may be a little bit too open.

The core of Wettstein's main complaint is that it is now possible for anybody who can open /dev/sgx/enclave to create and launch an enclave. In theory, that ability would do little for an attacker, since there is little that can actually be done inside an enclave. Any code running inside is restricted to what is available in the enclave itself; there is no ability to call outside code, to make system calls, or even access to facilities like timers. But, as Wettstein pointed out, it has been demonstrated [PDF] that code running within an enclave is able to carry out a number of cache-based, information-exfiltration attacks, even against code running in other enclaves.

Many of these attacks, of course, can be run by code outside of an enclave as well. But running inside of an enclave changes the picture significantly, since the host system has no way to know what that code is doing. Code hiding within an enclave cannot be monitored, profiled, or examined; for an attacker, an enclave is a convenient shadow in which to lurk while trying to exploit various types of information-disclosure vulnerabilities. The fact that one might normally expect the permissions on /dev/sgx/enclave to restrict access to root does little to improve this scenario: remember that the whole purpose of SGX is to defend against a compromised host.

Wettstein's message included a proposed solution, in the form of an interface to the SGX launch control mechanism. The system administrator could configure, at system-initialization time, a set of keys that would be recognized as valid for the signing of enclave contents; only properly signed enclaves could then be launched. Once the set of keys has been established, it can be rendered immutable. A sufficiently advanced attack against the kernel could perhaps circumvent this restriction, but it raises the bar considerably.

This proposal doesn't appear likely to get far; see, for example, Andy Lutomirski's criticism of both the code and the policies that it implements. If the sort of launch control envisioned by Wettstein is to be implemented, Lutomirski said, it should be based more firmly in the kernel. He thought that this feature, should it ever be implemented, could be added after the initial SGX support goes upstream. When pressed by Wettstein, though, Lutomirski did agree that a related problem exists:

There are many, many Linux systems that enforce a policy that *all* executable text needs to come from a verified source. On these systems, you can't mmap some writable memory, write to it, and then change it to executable. (Obviously, JITs either don't work or need special permissions on these systems.) Unless I'm missing it, the current SGX API is entirely incompatible with this model -- the host process supplies text *bytes* to the kernel, and the kernel merrily loads those bytes into executable enclave memory. Whoops!

The restriction mentioned here is typically enforced by a Linux security module (LSM) such as SELinux. With an appropriate policy loaded, the LSM will prohibit the enabling of execute permission on any memory that has ever been mapped writable. With that restriction in place, executable code can only come from the filesystem, which can be verified using a number of mechanisms built into the kernel. The SGX API bypasses all of this, though, allowing a process to run any code it wants as long as it is inside an enclave.

This problem is seen as being a bit of a show-stopper; changing SGX so that it plays well with security modules could require API changes, so it really needs to happen before the code goes upstream. Lutomirski proposed a solution where, rather than passing individual pages into an enclave, user space would pass a descriptor for a file containing the enclave data; security modules and the integrity subsystem would then be given a chance to examine the situation and allow or deny the operation. Unsurprisingly, some of the developers involved were less than happy about making more changes, but the development community is likely to stand firm on this one.

That last point was driven home by Linus Torvalds, who noted that Intel's transactional memory feature turned out to be more useful to attackers than to anybody else. SGX, he said, might turn out in a similar way, so "the patches to enable it should make damn sure that the upsides actually outweigh the downsides". At a minimum, making LSM support work properly would seem to be an important part of providing the assurance that Torvalds is asking for.

Thus, it would seem, even 20 revisions are not going to be enough for the SGX feature. Security technologies are not easy to get right in the best of times; mechanisms that have to play well with other security features are certain to be even harder. It seems likely that, as processors — and the security-related mechanisms they provide — become more complex, the discussions around how they are to be supported in the kernel will become more difficult.

