Subject: Re: Avoid speculative indirect calls in kernel
From: Tom Lendacky
Date: Thu, 4 Jan 2018 14:00:11 -0600



On 1/4/2018 10:15 AM, David Woodhouse wrote:

> On Thu, 2018-01-04 at 15:29 +0000, Woodhouse, David wrote:

>>

>>> With the GCC -mindirect-branch=thunk-external support, and microcode,

>>> Xen will make a boot-time choice between using Retpoline, Lfence (which

>>> is the better AMD option, and more performant than retpoline), or IBRS

>>> on Skylake and newer processors where it is strictly necessary, as well

>>> as using IBPB whenever available.

>>

>> I need to pull in the AMD lfence alternative for retpoline, giving us a

>> 3-way choice of the existing retpoline thunk, "lfence; jmp *%\reg", and

>> a bare "jmp *%\reg".

>

> I think I can abuse X86_FEATURE_SYSCALL for that, right? So it would

> look something like this:

>

> --- a/arch/x86/lib/retpoline.S

> +++ b/arch/x86/lib/retpoline.S

> @@ -12,7 +12,7 @@

>

> ENTRY(__x86.indirect_thunk.\reg)

> CFI_STARTPROC

> - ALTERNATIVE "call 2f", __stringify(jmp *%\reg), X86_BUG_NO_RETPOLINE

> + ALTERNATIVE_2 "call 2f", __stringify(lfence;jmp *%\reg), X86_FEATURE_SYSCALL, __stringify(jmp *%\reg), X86_BUG_NO_RETPOLINE

> 1:

> lfence

> ASM_UNREACHABLE

>

>

> However, I would very much like to see a categorical statement from AMD

> that the lfence is sufficient in all cases. Remember, Intel were saying

> that too for a while, before finding that it was not *quite* good

> enough.



Yes, lfence is sufficient. As long as the target is in the register

before the lfence and we jump through the register all is good, i.e.:



Include a dispatch serializing instruction after the load of an indirect

branch target. For instance, change this code:



1: jmp *[rax] ; jump to address pointed to by RAX



To this:



1: mov rax, [rax] ; load target address from the memory RAX points to

2: lfence ; dispatch serializing instruction

3: jmp *rax



The processor will stop dispatching instructions until all older

instructions have returned their results and are capable of being retired

by the processor. At this point the branch target will be in the general

purpose register (rax in this example) and available at dispatch for

execution such that the speculative execution window is not large enough

to be exploited.
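
For reference, a rough sketch of the same three-step sequence written in
the AT&T syntax that the retpoline.S thunk quoted above uses. This assumes
the pointer to the target starts out in %rax; it is only an illustration
of the ordering (load, then lfence, then jump through the register), not
the actual thunk code:

    1: mov (%rax), %rax    # load target address into the register
    2: lfence              # dispatch serializing instruction
    3: jmp *%rax           # indirect jump through the register

The point is only that the load has to complete before the lfence, so
that the jump target is already in the register when the jmp is
dispatched.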



Thanks,

Tom






