Date: Fri, 26 Feb 2016 20:59:59 +0100
Subject: Re: BUG: unable to handle kernel paging request from pty_write [was: Linux 4.4.2]
From: Robert Święcki

2016-02-26 20:44 GMT+01:00 Linus Torvalds <torvalds@linux-foundation.org>:



>> I've contacted Robert Święcki (who found the microcode problem) in
>> case he wants to weigh in in this thread. He was talking to some AMD
>> people, but I don't know exactly who.

>
> And since it's looking increasingly likely that it's the same issue,
> I'm adding Robert here explicitly to the cc so that he sees the
> thread...



Thx,



Some data I was able to gather:



It happens only with the 0x06000832 ucode, on Piledriver-based CPUs: i.e.
newer AMD FX, and the Opteron x300 series (4300, 6300, etc.).



In ~80% of cases the visible effect is an incorrect RSP, leading to bad
'rets' into kernel data/bss or to stack-protector faults. But there are
also more elusive effects, like registers being cleared before they are
used in indirect memory fetches.



I can trigger it from within a qemu guest (as non-root), causing a bad RIP
in the host kernel. When testing, in a couple (maybe 2) of the roughly 30
oopses I saw, I was able to set RIP to user-space addresses mapped in the
guest. It depends greatly on timing, but I think that with some more
effort, and by populating the kernel stack with guest addresses, it'd be
possible to create a more reliable qemu-guest-to-host ring0 escape.



I CC'd some AMD engineers from this list, and one of them replied with
"We are working on the final testing of a new microcode patch to
replace 0x06000832."
but without specifying any erratum number or an ETA for the new ucode.



For now I can only suggest not using 0x06000832 if possible (i.e. if it's
not embedded in the BIOS). I tested a few versions from
http://www.amd64.org/microcode.html and only this version seemed
vulnerable.
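
To see which microcode revision a machine is currently running, something
like the following could be used. It is only a minimal sketch, assuming a
Linux x86 system whose /proc/cpuinfo has a "microcode" field with a hex
value; the function names are illustrative and not part of any existing
tool. It only compares the revision number and does not check the CPU model.

```python
# Sketch only: assumes Linux x86, where /proc/cpuinfo lists one
# "microcode : 0x..." line per logical CPU. Names here are hypothetical.

AFFECTED_REVISION = 0x06000832  # the revision reported as problematic above


def microcode_revisions(cpuinfo_text):
    """Return the set of microcode revisions (as ints) found in cpuinfo text."""
    revisions = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "microcode":
            # int(..., 16) accepts values with or without a leading "0x"
            revisions.add(int(value.strip(), 16))
    return revisions


def runs_affected_microcode(cpuinfo_text):
    """True if any logical CPU reports the affected revision."""
    return AFFECTED_REVISION in microcode_revisions(cpuinfo_text)


def main(path="/proc/cpuinfo"):
    try:
        with open(path) as f:
            text = f.read()
    except OSError as err:
        print("cannot read %s: %s" % (path, err))
        return
    for rev in sorted(microcode_revisions(text)):
        print("loaded microcode revision: 0x%08x" % rev)
    if runs_affected_microcode(text):
        print("this is the revision reported as affected (0x06000832)")


if __name__ == "__main__":
    main()
```

Note that /proc/cpuinfo prints the revision without the leading zero
(0x6000832), which is why the values are compared as integers and not as
strings.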



PS. There's a bug described on the VMware pages -
https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2061211
- which looks very similar to this problem (it affects Opteron 6300, which
is Piledriver-based), and it was "somehow" patched by VMware in their
kernel. It points to AMD erratum #815 -
http://support.amd.com/TechDocs/48063_15h_Mod_00h-0Fh_Rev_Guide.pdf -
but I cannot tell whether it's really the same problem, or whether it
can somehow be bypassed on the kernel side.



--

Robert Święcki



