From: Linus Torvalds
Date: Thu, 6 Jul 2017 09:34:14 -0700
Subject: Re: [RFC][PATCH] exec: Use init rlimits for setuid exec

On Wed, Jul 5, 2017 at 9:32 PM, Kees Cook <keescook@chromium.org> wrote:

> In an attempt to provide sensible rlimit defaults for setuid execs, this
> inherits the namespace's init rlimits:



Yeah, so I have to admit to hating this patch.



As already mentioned by others, it's not only not clear that we want
to do this on every setuid exec, it's also not clear that init is the
right source of limits, or even which limits we'd want to copy.



I can easily see init setting an rlimit for its own use and then, when
it goes through the fork/exec process, setting up some other rlimit
for whatever it is going to run. You'd presumably want that for any
non-system thing, so it's actually fairly natural to do it for system
things too. So it's not at all obvious that "init" itself would run
with some generic "system limits".



So to me this feels like a bad hack that was brought on by this
particular attack.



I'd much rather see something like



(a) minimal: just use our existing default stack (and stack _only_)
limit value for suid binaries that actually get extra permissions: {
_STK_LIM, RLIM_INFINITY }.



or



(b) fancier: per-namespace defaults that can be explicitly set by
something, and enabled individually.



or



(c) perhaps encourage people to annotate their suid binaries with
initial resource requirements (and for stack, I mean the existing
GNU_STACK ELF annotation in particular).



For an example of (a), that existing _STK_LIM define is what the
kernel defaults to, and it's an 8MB stack. And looking at my Fedora
install, I see that the default user rlimit is 8MB for the stack.



Is that just coincidence, or is that just a sign of "nobody ever even
modifies the default value"? So (a) feels like "nobody really cares,
and 8MB is fine, and nobody even bothers changing it - just do the
minimal thing".
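
As a rough userspace illustration of (a), and not kernel code: the
sketch below resets only the stack limit to { _STK_LIM, RLIM_INFINITY }
when an exec gains privileges, and leaves it alone otherwise.
STK_LIM_DEFAULT mirrors the kernel's 8MB _STK_LIM value; the helper
names and the gains_privilege flag are made up for the example.

```c
#include <sys/resource.h>

/* Mirrors the kernel's _STK_LIM (8MB); redefined here because the
 * sketch does not assume the kernel define is visible to userspace. */
#define STK_LIM_DEFAULT (8UL * 1024 * 1024)

/* Hypothetical helper: if the exec gains privileges, replace the stack
 * limit with { _STK_LIM, RLIM_INFINITY }; otherwise leave it untouched. */
void suid_stack_rlimit(struct rlimit *stack, int gains_privilege)
{
    if (!gains_privilege)
        return;
    stack->rlim_cur = STK_LIM_DEFAULT;
    stack->rlim_max = RLIM_INFINITY;
}

/* Returns 1 if this process's soft stack limit equals the 8MB default,
 * 0 if it differs, -1 if getrlimit() fails. */
int stack_limit_is_default(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    return rl.rlim_cur == STK_LIM_DEFAULT;
}
```

Calling stack_limit_is_default() on a given install is one quick way
to check whether anybody changed the default there.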



As to (b), we could just have that whole INIT_RLIMITS per-namespace,
but only enable the stack limit by default. But then system admins
could change those limits and enable/disable individual rlimits to be
used by suid binaries. That feels like the "give the admin tools"
approach.
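
To make (b) a bit more concrete, here is a minimal userspace sketch of
what such a per-namespace table could look like: one default per
rlimit plus a bitmask saying which defaults are actually applied, with
only the stack bit set initially. The struct, the function names and
SKETCH_NLIMITS are all hypothetical; nothing like this exists in the
kernel as far as this mail says.

```c
#include <sys/resource.h>

#define STK_LIM_DEFAULT (8UL * 1024 * 1024)  /* same value as _STK_LIM */
#define SKETCH_NLIMITS  16                   /* assumed table size */

/* Hypothetical per-namespace table of suid defaults. */
struct ns_suid_rlimits {
    struct rlimit defaults[SKETCH_NLIMITS];
    unsigned long enabled;  /* bit n set => defaults[n] is applied */
};

/* Start with everything unlimited and only the stack limit enabled. */
void ns_suid_rlimits_init(struct ns_suid_rlimits *ns)
{
    for (int i = 0; i < SKETCH_NLIMITS; i++) {
        ns->defaults[i].rlim_cur = RLIM_INFINITY;
        ns->defaults[i].rlim_max = RLIM_INFINITY;
    }
    ns->defaults[RLIMIT_STACK].rlim_cur = STK_LIM_DEFAULT;
    ns->enabled = 1UL << RLIMIT_STACK;
}

/* Overwrite only the limits the admin has enabled; leave the rest of
 * the task's limits as they were inherited. */
void ns_suid_rlimits_apply(const struct ns_suid_rlimits *ns,
                           struct rlimit task[SKETCH_NLIMITS])
{
    for (int i = 0; i < SKETCH_NLIMITS; i++)
        if (ns->enabled & (1UL << i))
            task[i] = ns->defaults[i];
}
```

An admin enabling another limit would then amount to filling in
defaults[n] and setting bit n in enabled.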



And (c) would be the sane option, and what we already do for things
like GNU_STACK to enable/disable executable stacks. It really feels
like allowing the GNU_STACK segment to contain stack rlimit override
information would be the perfect tool for binaries to say "Yeah, I
need more stack than _STK_LIM".
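
For reference, a small sketch of how the existing annotation can be
read: it looks for the PT_GNU_STACK program header in an in-memory
64-bit ELF image and reports whether the stack is marked executable
(PF_X in p_flags). The function names are made up, the image is
assumed to be in host byte order, and treating p_memsz as the place
for a stack size override is only an assumption for illustration, not
something this mail specifies.

```c
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 and fills *out if a PT_GNU_STACK header is found, 0 if the
 * image has none, -1 if the image is not a well-formed 64-bit ELF. */
int find_gnu_stack(const unsigned char *image, size_t len, Elf64_Phdr *out)
{
    Elf64_Ehdr eh;
    Elf64_Phdr ph;

    if (len < sizeof(eh))
        return -1;
    memcpy(&eh, image, sizeof(eh));
    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_phentsize != sizeof(Elf64_Phdr) ||
        eh.e_phoff > len)
        return -1;

    for (size_t i = 0; i < eh.e_phnum; i++) {
        size_t off = (size_t)eh.e_phoff + i * sizeof(Elf64_Phdr);

        /* Reject program headers that run past the end of the image. */
        if (off > len || len - off < sizeof(Elf64_Phdr))
            return -1;
        memcpy(&ph, image + off, sizeof(ph));
        if (ph.p_type == PT_GNU_STACK) {
            *out = ph;
            return 1;
        }
    }
    return 0;
}

/* The existing use of the annotation: PF_X requests an executable stack. */
int stack_is_executable(const Elf64_Phdr *ph)
{
    return (ph->p_flags & PF_X) != 0;
}
```

A loader following (c) would combine this with (a): use _STK_LIM
unless the binary's annotation asks for more.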



So I see many different approaches (that could be combined: I like
combining (a) and (c), for example), and absolutely none of them
involve the random "take some values from init".



And yes, a large part of this may be that I no longer feel like I can
trust "init" to do the sane thing. You all presumably know why.



Linus



