From: Linus Torvalds <>
Date: Tue, 21 Nov 2017 05:25:06 -1000
Subject: Re: [GIT PULL] usercopy whitelisting for v4.15-rc1

[ This turned longer than it should have. I guess jet lag is a good thing ]



On Tue, Nov 21, 2017 at 3:48 AM, Jason A. Donenfeld <Jason@zx2c4.com> wrote:

>

> It might be news to you that actually some security people scoff at

> other security peoples' obsession with "security bugs".



Heh. I'm not actually surprised. It's just that the public "look at

this security bug" ones are the ones you see.



> The security industry is largely obsessed by finding (and

> selling/using/patching/reporting/showcasing/stockpiling/detecting/stealing)

> these "dangerous/useful" variety of bugs. And this obsession is

> continually fulfilled because bugs keep happening -- which is just the

> nature of software development -- and so this "security bug"

> infatuation continues. It even leads to people upset with you that you

> don't care about CVEs and so forth, because they're so fixated on

> individual bugs and their security impact.



Agreed.



> So what's the

> alternative to obsessing over each individual software bug?

>

> In the context of the kernel, the solution from Spender and Pipacs,

> and more recently "adopted" by Kees and his KSPP project, has been to

> try to eliminate the "security utility" of bugs.



And the thing is, I obviously agree very much about the whole "let's

have multiple layers of security even within the kernel, so that

random individual bugs don't end up being so exploitable". Bugs will

happen, let's aim to limit their damage.



To turn them "benign" in your words.



So I should be thrilled pink about the hardening efforts, right?



Well, I would - except for what "benign" means in that context, and

how security people have very different expectations from users - and

how those are both different from developers.



From a security standpoint, when you find an invalid access, and you

mitigate it, you've done a great job, and your hardening was

successful and you're done. "Look ma, it's not a security issue any

more", and you can basically ignore it as "just another bug" that is

now in a class that is no longer your problem.



So to you, the big win is when the access is _stopped_. That's the end

of the story from a security standpoint - at least if you are one of

those bad security people who don't care about anything else.



But from a developer standpoint, things _really_ are not done. Not

even close. From a developer standpoint, the bad access was just a

symptom, and it needs to be reported, and debugged, and fixed, so that

the bug actually gets corrected.



So from a developer standpoint, the end point of hardening is just

the starting point, and when _you_ think you're done, we're really

only getting started.



And from a _user_ standpoint, it's something else altogether. For a

user, pretty much EVERY SINGLE TIME, it wasn't actually a security

attack at all, it was just a latent bug that got exposed. And the

keyword here is that it was _latent_, and things used to work, and the

hardening patch did something - probably fairly drastic - to turn it

from "dangerous" to "benign" from a security perspective.



So from a user standpoint, the hardening was just a big nasty

annoyance, and probably made their workflow _break_, without actually

helping their case at all, because they never really saw the original

bug as a problem to begin with.



Notice? BIG disconnect in what "hardening" means for three groups, and

in particular, the number one rule of kernel development is that "we

don't break users".



Because without users, your program is pointless, and all the

development work you've done over decades is pointless.



.. and security is pointless too, in the end.



Now, the thing that annoys me and that makes me so _angry_ about this,

is that it shouldn't need to be that huge a disconnect.



It shouldn't need to be a big issue, because pretty much all the work

done for hardening should be able to actually make both the developers

and the users _happier_, instead of just making their lives miserable.



But that does mean that the hardening people need to really see past

that "endpoint" that they were looking at.



For a developer, the hardening effort _could_ be a great boon, in that

it could show nasty bugs early, it could make them easier to report,

and it could add a lot of useful information to that report that makes

them easier to fix too.



And from a user perspective, the hardening work shouldn't have to mean

"the latent bug that I didn't care about now screwed me over and is an

overt bug for me". It might not help the user directly, but if a year

from now, the latent bug that made their machine occasionally go all

wonky is fixed, the hardening effort did end up helping them too.



But what do we need for this to actually happen?



As a developer, I do want the report. But if you killed the user

program in the process, I'm actually _less_ likely to get the report,

because the latent access was most likely in some really rare and

nasty case, or we would have found it already. In the kernel, there's

a high likelihood that it was in a driver, for example. Maybe an

unusual ioctl() that is not getting a huge amount of attention,

because it's one driver among thousands, and it's probably not used

every time anyway. But because it's the kernel, and because it's a

driver, it's quite likely that killing the offender will do bad things

to various random locks that were held, or maybe it happens in an

interrupt and the whole machine is now dead if we're unlucky because

there really were some very core locks being held.



And as a user, my unhappiness is obvious. You don't even have to kill

the machine and make it hard to report to make a user unhappy, just

"the new kernel didn't work for me" will make that user skittish about

upgrading the kernel at all.



And the fix really looks fairly straightforward:



- when adding hardening features, you as a security person should

always see that hardening to be the _endpoint_, but not the immediate

goal.



- when adding hardening features, the first step should *ALWAYS* be

"just report it". Not killing things, not even stopping the access.

Report it. Nothing else.
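
As a rough illustration of the "just report it" step, here is a minimal userspace sketch in C. It is not kernel code and not the usercopy whitelisting implementation: `harden_mode`, `harden_reports`, and `checked_copy` are hypothetical names invented for this example. It assumes the destination object is large enough to hold `len` bytes and that only its first `allowed` bytes are whitelisted for copying.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical policy knob; illustrative names, not a kernel API. */
enum harden_mode {
    HARDEN_REPORT_ONLY, /* first step: log the violation, let the access go on */
    HARDEN_ENFORCE      /* later step: log the violation and refuse the access */
};

/* Number of violations reported so far. */
static unsigned long harden_reports;

/*
 * Copy len bytes from src to dst, where only the first `allowed` bytes of
 * dst are whitelisted. The caller guarantees dst can physically hold len
 * bytes (assumption of this sketch).
 *
 * Returns 0 if the copy was performed, -1 if it was refused.
 */
static int checked_copy(enum harden_mode mode, void *dst, size_t allowed,
                        const void *src, size_t len)
{
    assert(dst != NULL && src != NULL);

    if (len > allowed) {
        /* Always report, with enough detail to debug the caller. */
        harden_reports++;
        fprintf(stderr,
                "hardening: copy of %zu bytes exceeds whitelisted %zu bytes (%s)\n",
                len, allowed,
                mode == HARDEN_ENFORCE ? "refused" : "report only");

        /* Only the enforcing mode changes behaviour for the user. */
        if (mode == HARDEN_ENFORCE)
            return -1;
    }

    memcpy(dst, src, len);
    return 0;
}
```

In `HARDEN_REPORT_ONLY` mode the caller sees exactly the behaviour it saw before the check existed, and the only new thing is the report. Switching to `HARDEN_ENFORCE` is a separate, later decision.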



"Do no harm" should be your mantra for any new hardening work.



And that "do no harm" may feel antithetical to the whole point. You go

"but that doesn't work - then the bug still exists".



But remember - keep your eye on the endpoint, and that this is just

the first step. You need to not piss off users, and you need to not

piss off developers.



Because if you as a security person just piss off users, and piss off

developers, I'm not going to take your work, and I'm going to call you

a bad security person.



Because in the end, those users really do matter. Without those users,

your system may be "secure", but all your security work was still just

masturbation. You didn't do anything useful at all in the end.



So if hardening people can learn to "always report first", then I

think we can all work together.



But that really does mean that you don't start killing processes until

after you've shown that "look, the code to report these things has

been there for months, can we start doing more drastic things now?"



And remember: it's not that the code is months old. It's that the

code has been RUN BY USERS for months. If it's been in your tree, or

in grsecurity for five years, that doesn't mean a thing. It only means

that hardly anybody actually ever ran it.



If it's been on a random cellphone for a few months, and real users

used it, and had facebook and candy crush running on it, that's a

pretty different deal.



If it's been in a released kernel for a year, and Ubuntu and Fedora

and SuSE had it enabled, and there aren't reports, that's a big deal.



In contrast, if it's been on your server farm for three months, that

means _nothing_. You have pretty much zero coverage of the driver

situation, or of the random apps that people run.



See?



All I need is that the whole "let's kill processes" mentality goes

away, and that people acknowledge that the first step is always "just

report".



Do no harm.



Please.



Linus



