Subject Re: [RFC 09/10] x86/enter: Create macros to restrict/unrestrict Indirect Branch Speculation From David Woodhouse <> Date Mon, 22 Jan 2018 16:27:06 +0000 On Sun, 2018-01-21 at 14:27 -0800, Linus Torvalds wrote:

> On Sun, Jan 21, 2018 at 2:00 PM, David Woodhouse <dwmw2@infradead.org> wrote:

> >>

> >> The patches do things like add the garbage MSR writes to the kernel

> >> entry/exit points. That's insane. That says "we're trying to protect

> >> the kernel". We already have retpoline there, with less overhead.

> >

> > You're looking at IBRS usage, not IBPB. They are different things.

>

> Ehh. Odd intel naming detail.

>

> If you look at this series, it very much does that kernel entry/exit

> stuff. It was patch 10/10, iirc. In fact, the patch I was replying to

> was explicitly setting that garbage up.

>

> And I really don't want to see these garbage patches just mindlessly

> sent around.



I think we've covered the technical part of this now, not that you like

it — not that any of us *like* it. But since the peanut gallery is

paying lots of attention it's probably worth explaining it a little

more for their benefit.



This is all about Spectre variant 2, where the CPU can be tricked into

mispredicting the target of an indirect branch. And I'm specifically

looking at what we can do on *current* hardware, where we're limited to

the hacks they can manage to add in the microcode.



The new microcode from Intel and AMD adds three new features.



One new feature (IBPB) is a complete barrier for branch prediction.

After frobbing this, no branch targets learned earlier are going to be

used. It's kind of expensive (order of magnitude ~4000 cycles).



The second (STIBP) protects a hyperthread sibling from following branch

predictions which were learned on another sibling. You *might* want

this when running unrelated processes in userspace, for example. Or

different VM guests running on HT siblings.



The third feature (IBRS) is more complicated. It's designed to be

set when you enter a more privileged execution mode (i.e. the kernel).

It prevents branch targets learned in a less-privileged execution mode,

BEFORE IT WAS MOST RECENTLY SET, from taking effect. But it's not just

a 'set-and-forget' feature, it also has barrier-like semantics and

needs to be set on *each* entry into the kernel (from userspace or a VM

guest). It's *also* expensive. And a vile hack, but for a while it was

the only option we had.



Even with IBRS, the CPU cannot tell the difference between different

userspace processes, and between different VM guests. So in addition to

IBRS to protect the kernel, we need the full IBPB barrier on context

switch and vmexit. And maybe STIBP while they're running.



Then along came Paul with the cunning plan of "oh, indirect branches

can be exploited? Screw it, let's not have any of *those* then", which

is retpoline. And it's a *lot* faster than frobbing IBRS on every entry

into the kernel. It's a massive performance win.



So now we *mostly* don't need IBRS. We build with retpoline, use IBPB

on context switches/vmexit (which is in the first part of this patch

series before IBRS is added), and we're safe. We even refactored the

patch series to put retpoline first.



But wait, why did I say "mostly"? Well, not everyone has a retpoline

compiler yet... but OK, screw them; they need to update.



Then there's Skylake, and that generation of CPU cores. For complicated

reasons they actually end up being vulnerable not just on indirect

branches, but also on a 'ret' in some circumstances (such as 16+ CALLs

in a deep chain).



The IBRS solution, ugly though it is, did address that. Retpoline

doesn't. There are patches being floated to detect and prevent deep

stacks, and deal with some of the other special cases that bite on SKL,

but those are icky too. And in fact IBRS performance isn't anywhere

near as bad on this generation of CPUs as it is on earlier CPUs

*anyway*, which makes it not quite so insane to *contemplate* using it

as Intel proposed.



That's why my initial idea, as implemented in this RFC patchset, was to

stick with IBRS on Skylake, and use retpoline everywhere else. I'll

give you "garbage patches", but they weren't being "just mindlessly

sent around". If we're going to drop IBRS support and accept the

caveats, then let's do it as a conscious decision having seen what it

would look like, not just drop it quietly because poor Davey is too

scared that Linus might shout at him again. :)



I have seen *hand-wavy* analyses of the Skylake thing that mean I'm not

actually lying awake at night fretting about it, but nothing concrete

that really says it's OK.



If you view retpoline as a performance optimisation, which is how it

first arrived, then it's rather unconventional to say "well, it only

opens a *little* bit of a security hole but it does go nice and fast so

let's do it".



But fine, I'm content with ditching the use of IBRS to protect the

kernel, and I'm not even surprised. There's a *reason* we put it last

in the series, as both the most contentious and most dispensable part.

I'd be *happier* with a coherent analysis showing Skylake is still OK,

but hey-ho, screw Skylake.



The early part of the series adds the new feature bits and detects when

it can turn KPTI off on non-Meltdown-vulnerable Intel CPUs, and also

supports the IBPB barrier that we need to make retpoline complete. That

much I think we definitely *do* want. There have been a bunch of us

working on this behind the scenes; one of us will probably post that

bit in the next day or so.



I think we also want to expose IBRS to VM guests, even if we don't use

it ourselves. Because Windows guests (and RHEL guests; yay!) do use it.



If we can be done with the shouty part, I'd actually quite like to have

a sensible discussion about when, if ever, we do IBPB on context switch

(ptraceability and dumpable have both been suggested) and when, if

ever, we set STIPB in userspace.







[unhandled content-type:application/x-pkcs7-signature]