Intel Meltdown bug mitigation in master

Meltdown is an Intel-specific bug. AMD is immune. Some ARM CPUs might also be vulnerable, but DragonFly doesn't run on ARM so...

Meltdown is basically a FULL KERNEL MEMORY disclosure bug. An unprivileged user program can essentially discern the contents of all of kernel memory on an Intel CPU. The bug works because Intel CPUs will do speculative reads across protection domains, allowing the user program to massage the memory and branch prediction cache to cause a speculative read of kernel memory (even though it crosses the protection domain) followed by a speculative conditional execution. Timing can then be used to scan for and distinguish the contents of kernel memory.

DragonFlyBSD master now has a commit to fix this issue. It is not considered 100% tested yet, but it is in the tree and has been tested fairly well.

Unfortunately, the only mitigation possible is to remove the kernel memory mappings from the user MMU map, which means that every single system call and interrupt (and the related return to userland later) must reload the MMU twice. This will add 150ns - 250ns of overhead to every system call and interrupt. System calls usually have an overhead of only 100ns, so now it will be 250ns - 350ns of overhead on Intel CPUs.

The mitigation is automatically enabled on Intel CPUs, and disabled on AMD CPUs. A new sysctl can be used to manually enable or disable the mitigation:

    sysctl machdep.isolated_user_pmap

-- PERFORMANCE EFFECTS ON SYSTEMS --

Nominal program execution (e.g. compiles, utilities, etc.) will lose around 5% of its performance with this mitigation. Not too bad.

Any system-call-heavy or interrupt-heavy program will lose between 10% and 30% of its performance. This can include databases, high-speed storage operations, very high-speed network operations (e.g. 10GbE or faster), and virtualized operations.

-- ADDITIONAL WORK --

I will again look into using PCID to further mitigate the problem.
We currently do not use PCID because it doesn't really improve performance. But when this mitigation is enabled, PCID might reduce the impact somewhat. Linux kernel programmers are saying that using PCID can reduce the impact by 50% (e.g. a 5% - 30% performance loss becomes 3% - 15%). But it should be noted that Linux's mitigation is a bit more involved than ours, so it is unclear whether the same optimization will improve DragonFlyBSD's performance when running with this mitigation.

I should note that we kernel programmers have spent decades trying to reduce system call overheads, so to be sure, we are all pretty pissed off at Intel right now.

Intel's press releases have also been HIGHLY DECEPTIVE. In particular, they are starting to talk up 'microcode updates', but those are mitigations for the Spectre bug, not for the Meltdown bug. Spectre is another bug, far more difficult to exploit than Meltdown. It leaks information from other processes or the kernel by getting those other processes or the kernel to do speculative reads and executions that are partially steered by the originating user process. Spectre does NOT involve a protection domain violation like Meltdown, so the Meltdown mitigation cannot mitigate Spectre.

These bugs (both Meltdown and Spectre) really have to be fixed in the CPUs themselves. Meltdown is the 1000 pound gorilla. I won't be buying any new Intel chips that require the mitigation. I'm really pissed off at Intel.

-- DRAGONFLY-STABLE --

This work is now in master. It needs significantly more testing before I can move it to -stable, and I'm not even sure I CAN move it to -stable easily. I will be looking into that on the weekend.

-Matt
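
As a rough back-of-the-envelope check on the overhead figures in this post, the sketch below adds the quoted 150ns - 250ns to the 100ns baseline and estimates how much throughput one core would lose at a given rate of system calls plus interrupts. The function names and the example event rates are hypothetical and chosen only for illustration; they are not measurements from DragonFly, and the model ignores secondary effects such as extra TLB misses after each MMU reload.

```python
# Back-of-the-envelope model of the mitigation's cost.
# The 100ns baseline and the 150-250ns added overhead are the figures quoted
# in the post; the names and event rates are hypothetical examples.

BASE_SYSCALL_NS = 100
ADDED_NS_RANGE = (150, 250)


def mitigated_syscall_ns(base_ns=BASE_SYSCALL_NS, added_ns=ADDED_NS_RANGE):
    """Return the (low, high) per-system-call overhead with the mitigation on."""
    low, high = added_ns
    return base_ns + low, base_ns + high


def throughput_loss(events_per_sec, added_ns):
    """Estimate the fraction of throughput lost on one core.

    events_per_sec: system calls plus interrupts per second before mitigation.
    added_ns: extra time per event in nanoseconds.
    One second of work stretches to (1 + extra) seconds, so the fraction of
    throughput lost is extra / (1 + extra).
    """
    extra = events_per_sec * added_ns * 1e-9
    return extra / (1.0 + extra)


if __name__ == "__main__":
    print(mitigated_syscall_ns())  # (250, 350)
    for rate in (250_000, 1_000_000):  # assumed example rates
        for added in ADDED_NS_RANGE:
            loss = throughput_loss(rate, added)
            print(f"{rate} events/s, +{added}ns each: {loss:.1%} throughput lost")
```

Under these assumptions, a program only lands in the double-digit loss range once it is making on the order of a million kernel entries per second per core, which is why the heavy hit falls on system-call-heavy and interrupt-heavy workloads.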