arc4random in libressl 2.0.2

There seems to be a series of public critiques of the portable libressl, i'll follow these trends here. (Shouldn't have.. further discussions cannot be followed this way).

TL;DR: the fixes in 2.0.2 are wrong questionable progress of a kind.

The 2.0.2 release added further entropy source to getentropy_fallback: the base addresses of the shared objects in the application are hashed into the entropy pool using the non-standard dl_iterate_phdr api. It is still an open question whether using the fallback entropy sources is reasonable practice: they may be observable to an attacker by other means, so it is not clear how many secret bits are added and the application does not know when the fallback path was taken.

The most significant change in 2.0.2 is a workaround in arc4random to avoid information leak to a forked child via the internal state of the prng. The issue has a detailed explanation in the blog post where it was reported.

The workaround is registering a callback with pthread_atfork (or __register_atfork) that sets a global flag in the child so the prng is reseeded next time arc4random is called. Slightly simplified version of the arc4random code from libressl:

/* Marked MAP_INHERIT_ZERO, so zero'd out in fork children. */ static struct { ... size_t rs_count; /* bytes till reseed */ } *rs; /* Preserved in fork children. */ static struct { ... u_char rs_buf[RSBUFSZ]; } *rsx; static void forkhandler(void) { if (rs) rs->rs_count = 0; } #ifdef __GLIBC__ extern void *__dso_handle; #endif static void init(...) { ... if (!rs) { if ((rs = mmap(...)) == MAP_FAILED) abort(); #ifdef MAP_INHERIT_ZERO if (minherit(rs, sizeof(*rs), MAP_INHERIT_ZERO) == -1) abort(); #else #ifdef __GLIBC__ __register_atfork(0, 0, forkhandler, __dso_handle); #else pthread_atfork(0, 0, forkhandler); #endif #endif } ... } static void stir(void) { if (getentropy(p, n) == -1) raise(SIGKILL); if (!rs) init(p, n); ... } ... static void random_u32(uint32_t *val) { ... stir_if_needed(); ... memcpy(val, keystream, sizeof(*val)); ... } static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER; uint32_t arc4random(void) { uint32_t val; pthread_mutex_lock(&mtx); random_u32(&val); pthread_mutex_unlock(&mtx); return val; }

pthread_atfork (and __register_atfork)

This article used to have an elaborate description about the brokenness of forkhandler registered by pthread_atfork. Matthew Dempsky (OpenBSD developer) showed me in a mail that i was wrong, so this part got updated including an explanation why i still think that using pthread_atfork is problematic. (2014-07-17)

Original (wrong) argument (abbreviated):

A multi-threaded application may call fork if it only calls async-signal-safe functions in the child. An application may call fork from an async signal handler with the same constraint: the code running in the child must be async-signal-safe. So the forkhandler registered by pthread_atfork (or __register_atfork) must be async-signal-safe otherwise previously correct applications will invoke undefined behaviour with the new libressl version.

forkhandler is not async-signal-safe, because it accesses global state (that is not sig_atomic_t or lock-free atomic). The accessed objects (rs and rs->rs_count) are even protected by a pthread mutex so it seems it would be a data-race calling such a function concurrently with another thread violating thread-safety.

Correction:

forkhandler is formally not async-signal-safe but it is safe to use here:

the rs object is constant because it is only modified before the forkhandler is registered thus there is no data race or async-signal-safety issue. (Note: ISO C and POSIX says that even constant objects have unspecified value in a signal handler, which is an old bug in the standard, in practice this was never an issue, see ISO C DR461, POSIX #728, ISO C++ n3910)

the rs->rs_count object may be concurrently modified when fork is called from another thread (or signal handler), but the child after fork is a single threaded process that cannot affect the parent and thus there is no data race. The semantics of code running after fork is not specified by ISO C (C has no fork), in POSIX the requirement for a multi-threaded process is that the child must be async-signal-safe. In practice the situation after fork in a child is similar to an async signal handler that never returns. So rs->rs_count has unspecified value in the child, but it is only written to, this would not be safe in a signal handler when it returns, but safe after fork.

But..

I still think this is not a happy ending: pthread_atfork is a broken API. POSIX is already on the way to deprecate this function and adds warnings about issues with multi-threaded usage and about the async-signal-safety requirement for child handlers as discussed.

It is hard to use safely from a thread-safe library, because

global state access in the library around fork is problematic (libressl handles this fine if my updated analysis is correct),

and it requires global state management in the C runtime (this is a libc issue and may observably affect applications using the new libressl with pthread_atfork)

Examples where pthread_atfork can cause C runtime issues:

application dlopening and dlclosing a library that registers callbacks with pthread_atfork (without the application's knowledge). On some (arguably broken) implementations this can crash the application at the next fork when non-existent callbacks are invoked. Theo de Raadt (OpenBSD developer) notes that in case of glibc (on Linux), libressl uses __register_atfork which handles the dlclose issue, but on OSX there is no known alternative. (2014-07-17)

On some (arguably broken) implementations pthread_atfork breaks the async-signal-safety of fork. (pthread_atfork has to mask all signals when it accesses global state and guard against concurrent access from multiple threads as well and allow the registration of new handlers from an atfork handler, these are non-trivial to do correctly, indeed no libc gets them right to my knowledge) (2014-07-21).

So i still stand by my previous conclusion, the leaked prng information should not be worked around, instead applications using fork in unsafe ways should be fixed:

a library cannot easily hide its secrets if the application uses fork unreasonably and

an application cannot know all the secrets in all the different libraries it uses so trying workarounds like RAND_poll to hide them is futile: after fork a lot of unwanted information will be exposed to the child anyway.

The only reasonable solution is to deprecate fork in the multi-threaded world (and use posix_spawn instead when possible), unsafe usage of fork in applications makes safe library development impossible.

The other conclusion is that library safety is hard to guarantee when mutable global state is involved: without the global random state there would be no safety problem here.

failure handling

The library does not report resource allocation failures, instead it aborts so it cannot be used in a robust application that is expected to work in low-memory situations.

It also crashes if entropy gathering fails. (uses SIGKILL instead of abort to avoid secrets showing up in a core dump, note that secrets maybe possible to protect with MADV_DONTDUMP madvise flag in recent kernels and keep in memory with mlock, but in general this should not be the concern of a library).

pthread_atfork and __register_atfork return value is not checked (so if the libc could not store the handler it will be silently omitted unlike in the case of the failure of minherit).

I don't think there is a perfect solution given the api requirements, in general hiding possibly failing syscalls behind apis without error return is not a good design. Maybe a modern TLS library should be designed from scratch instead of trying to clean up OpenSSL?

2014-07-16,2014-07-17,2014-07-21 - nsz at port70.net