November 30, 2017 posted by Kamil Rytarowski

During the past month I've finished my work on TSan for NetBSD/amd64. A few minor issues remain, but the sanitizer is already stable and suitable for real applications. I was able to build real applications like LLDB against TSan and use it to find real threading problems.

Stabilizing and fixing TSan was challenging because several unrelated types of issues were intermixed, producing one big, seemingly random breakage that was difficult to analyze. Software debuggers still need more work on threaded programs, so this was a chicken-and-egg problem: debugging the debugging utilities.

Corrections

Most of the corrections were in TSan-specific and Common Sanitizer code. There was also one fix in LSan.

TSan: on_exit()/atexit(3)/__cxa_atexit()

There are different function types for the same purpose: to execute a callback function on thread or process termination. The existing code in TSan wasn't compatible with the NetBSD Operating System:

on_exit() - This function is Linux-specific, I've disabled it for NetBSD.

atexit(3) - It was reimplemented by TSan using __cxa_atexit(), but in a way that is incompatible with NetBSD. TSan was attempting to register a wrapper callback through __cxa_atexit() with the second argument as a function pointer and the third argument (the Dynamic Shared Object pointer) set to NULL. This approach is not portable and it broke on NetBSD, so I had to add a new implementation based on a stack (LIFO container).

Every function registered with atexit(3) is intercepted by TSan; the sanitizer pushes it onto the local LIFO container and passes its own wrapper function to the system. When the OS executes a callback, it calls the wrapper, which pops the originally saved function pointer from the stack and executes it.

__cxa_atexit() - This callback shared TSan internals with atexit(3) and is functional on NetBSD.

To verify the changes, I've added a new test named atexit3, which checks that the atexit(3) callbacks execute in the correct order.
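
The LIFO scheme described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: every `sketch_`-prefixed name, the fixed-size array, and the two demo callbacks are hypothetical and not taken from the TSan sources, and the sketch omits the locking a real runtime would need.

```c
#include <stdlib.h>

#define SKETCH_MAX_CALLBACKS 32

typedef void (*sketch_exit_callback)(void);

/* Local LIFO container holding the callbacks the program asked for. */
static sketch_exit_callback sketch_saved[SKETCH_MAX_CALLBACKS];
static int sketch_saved_count;

/* Wrapper handed to the system: pops the most recently saved callback
 * and runs it. An empty stack is treated as a no-op. */
static void sketch_atexit_wrapper(void) {
    if (sketch_saved_count == 0)
        return;
    sketch_exit_callback fn = sketch_saved[--sketch_saved_count];
    fn();
}

/* Stand-in for the interceptor: save the user callback locally and
 * register only the wrapper with the real atexit(3).
 * Returns 0 on success, -1 if the stack is full or registration fails. */
static int sketch_atexit(sketch_exit_callback fn) {
    if (sketch_saved_count == SKETCH_MAX_CALLBACKS)
        return -1;
    if (atexit(sketch_atexit_wrapper) != 0)
        return -1;
    sketch_saved[sketch_saved_count++] = fn;
    return 0;
}

/* Demo callbacks that record the order in which they run. */
static int sketch_run_order[2];
static int sketch_run_count;
static void sketch_first(void)  { sketch_run_order[sketch_run_count++] = 1; }
static void sketch_second(void) { sketch_run_order[sketch_run_count++] = 2; }
```

Registering `sketch_first` and then `sketch_second` makes the wrapper run `sketch_second` first, which matches the reverse-registration order that atexit(3) guarantees.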

TSan: _lwp_exit()

To detect a thread's termination, the TSan interceptors relied on a destructor callback registered with pthread_key_create(3). This approach was broken on NetBSD for two reasons.

We cannot register it during early libc and libpthread(3) bootstrap, as the system functions need to initialize first.

The execution of the destructor callbacks is not the last event during the termination of a POSIX thread.

I first looked for a mechanism to defer the destructor callback registration to a later libc initialization stage, similar to constructor sections. This turned out to be suboptimal, because it resulted in further breakage: the NetBSD implementation of POSIX thread termination notifies the parent thread (the one waiting in join) and afterwards still attempts to acquire a mutex. TSan assumed that no thread-specific function, such as a mutex acquisition, would be called at that point, and had already destroyed part of the thread-specific data needed to trace such events. I've switched the detection of POSIX thread termination to the interception of the _lwp_exit(2) call, as it is truly the last interceptable function on NetBSD; it detaches the low-level thread entity (LWP) that is the kernel context of a POSIX thread.
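
For readers unfamiliar with the original mechanism, here is a minimal sketch of a thread-exit hook built on a pthread_key_create(3) destructor. It illustrates the older approach only, not the _lwp_exit(2) interception that replaced it; all `demo_` names are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static pthread_key_t demo_exit_key;
static pthread_once_t demo_exit_key_once = PTHREAD_ONCE_INIT;
static atomic_int demo_threads_finished;

/* Runs while the thread is terminating. As the text notes, this is
 * not necessarily the very last step of thread teardown. */
static void demo_on_thread_exit(void *value) {
    (void)value;
    atomic_fetch_add(&demo_threads_finished, 1);
}

static void demo_create_exit_key(void) {
    pthread_key_create(&demo_exit_key, demo_on_thread_exit);
}

/* Thread body: the destructor only fires for a non-NULL key value. */
static void *demo_thread_body(void *arg) {
    pthread_once(&demo_exit_key_once, demo_create_exit_key);
    pthread_setspecific(demo_exit_key, arg);
    return NULL;
}

/* Starts one thread, joins it, and returns the number of destructor
 * calls observed so far (or -1 on error). */
static int demo_run_thread(void) {
    pthread_t t;
    static int token = 1;
    if (pthread_create(&t, NULL, demo_thread_body, &token) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;
    return atomic_load(&demo_threads_finished);
}
```

The destructor gives the runtime a hook late in the thread's life, but anything the threading library does after the destructors have run stays invisible to it.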

TSan: Thread Joined vs Thread Exited

Correcting the detection of thread termination caused a new problem: a race between two event notifications that happen at the same time:

Thread A sleeps waiting for joining of thread B.

Thread B wakes thread A notifying it as joinable.

Thread B terminates calling _lwp_exit().

TSan traces both events, joining and exiting, and they must be intercepted in the order of exiting followed by joining (unless the thread is marked as detached and is never joined).

This problem has been analyzed and fixed by introducing atomic waiters in the low-level parts (not exposed to TSan or other sanitizers): ThreadRegistry::JoinThread now busy-waits until ThreadRegistry::FinishThread signals that it has finished executing. This approach has proven stable, and no failures have been observed so far. There was a minor breakage on ppc64-linux, where this change introduced an infinite freeze, but it was caused by an unrelated problem, and the faulty test was switched from failing to unsupported.
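
The ordering guarantee can be sketched with C11 atomics. This is a simplified illustration, not the ThreadRegistry code: the record layout and all `demo_` names are assumptions made for the example.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical per-thread record; not the real ThreadRegistry layout. */
struct demo_thread_record {
    atomic_int finished;  /* becomes 1 once the exit event is recorded */
    int exit_events;      /* bookkeeping written by the exiting thread */
};

/* Called by the exiting thread: record the exit, then publish it. */
static void demo_finish_thread(struct demo_thread_record *r) {
    r->exit_events++;
    atomic_store_explicit(&r->finished, 1, memory_order_release);
}

/* Called by the joining thread: busy-wait until the exit event has
 * been published, so "exited" is always observed before "joined". */
static int demo_join_thread(struct demo_thread_record *r) {
    while (!atomic_load_explicit(&r->finished, memory_order_acquire))
        sched_yield();
    return r->exit_events;
}

static void *demo_worker(void *arg) {
    demo_finish_thread(arg);
    return NULL;
}

/* Returns the number of exit events seen by the joiner, or -1 on error. */
static int demo_run(void) {
    struct demo_thread_record r;
    pthread_t t;
    atomic_init(&r.finished, 0);
    r.exit_events = 0;
    if (pthread_create(&t, NULL, demo_worker, &r) != 0)
        return -1;
    int seen = demo_join_thread(&r);
    pthread_join(t, NULL);
    return seen;
}
```

The release store paired with the acquire load ensures that the joiner sees the bookkeeping written before the flag was set.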

Sanitizers: GetTls

I've implemented initial support for determining whether a memory buffer is allocated as Thread-Local Storage. The current approach reuses the FreeBSD code; it is subject to future improvement to make it more generic and aware of dynamically allocated TLS vectors (for example, after dlopen(3)).

Sanitizers: Handling NetBSD specific indirection of libpthread functions

I've corrected handling of three libpthread(3) functions on NetBSD:

pthread_mutex_lock(3),

pthread_mutex_unlock(3),

pthread_setcancelstate(3).

Code outside libpthread(3) uses the libc symbols:

__libc_mutex_lock,

__libc_mutex_unlock,

__libc_thr_setcancelstate.

The threading library (libpthread(3)) defines strong aliases:

__strong_alias(__libc_mutex_lock,pthread_mutex_lock)

__strong_alias(__libc_mutex_unlock,pthread_mutex_unlock)

__strong_alias(__libc_thr_setcancelstate,pthread_setcancelstate)

As a result, these functions were invisible to sanitizers on NetBSD. I've introduced interception of the libc-specific functions and added them as NetBSD-specific aliases for the common pthread(3) functions.

NetBSD needs to intercept both functions, as the regularly named ones are used internally in libpthread(3).
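
To show why one function body can be reachable under two symbol names, here is a small sketch using the GCC/Clang `alias` attribute on an ELF target as a stand-in for NetBSD's `__strong_alias` macro. The `demo_` names and the `int` "mutex" are hypothetical and chosen only to keep the example self-contained.

```c
/* Assumes GCC or Clang on an ELF target; all names are hypothetical. */
static int demo_lock_calls;

/* The "regularly named" function. */
int demo_pthread_mutex_lock(int *mutex) {
    demo_lock_calls++;
    *mutex = 1;
    return 0;
}

/* A second symbol name bound to the same function body, similar in
 * spirit to __strong_alias(__libc_mutex_lock, pthread_mutex_lock). */
int demo_libc_mutex_lock(int *mutex)
    __attribute__((alias("demo_pthread_mutex_lock")));
```

A caller that binds to `demo_libc_mutex_lock` never references the `demo_pthread_mutex_lock` symbol, so a tool that hooks only one of the two names misses the callers of the other.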

Sanitizers: Adding DemangleFunctionName for backtracing on NetBSD

NetBSD uses indirection for old threading functions for historical reasons. The mangled names are an internal implementation detail and should not be exposed even in backtraces.

__libc_mutex_init -> pthread_mutex_init

__libc_mutex_lock -> pthread_mutex_lock

__libc_mutex_trylock -> pthread_mutex_trylock

__libc_mutex_unlock -> pthread_mutex_unlock

__libc_mutex_destroy -> pthread_mutex_destroy

__libc_mutexattr_init -> pthread_mutexattr_init

__libc_mutexattr_settype -> pthread_mutexattr_settype

__libc_mutexattr_destroy -> pthread_mutexattr_destroy

__libc_cond_init -> pthread_cond_init

__libc_cond_signal -> pthread_cond_signal

__libc_cond_broadcast -> pthread_cond_broadcast

__libc_cond_wait -> pthread_cond_wait

__libc_cond_timedwait -> pthread_cond_timedwait

__libc_cond_destroy -> pthread_cond_destroy

__libc_rwlock_init -> pthread_rwlock_init

__libc_rwlock_rdlock -> pthread_rwlock_rdlock

__libc_rwlock_wrlock -> pthread_rwlock_wrlock

__libc_rwlock_tryrdlock -> pthread_rwlock_tryrdlock

__libc_rwlock_trywrlock -> pthread_rwlock_trywrlock

__libc_rwlock_unlock -> pthread_rwlock_unlock

__libc_rwlock_destroy -> pthread_rwlock_destroy

__libc_thr_keycreate -> pthread_key_create

__libc_thr_setspecific -> pthread_setspecific

__libc_thr_getspecific -> pthread_getspecific

__libc_thr_keydelete -> pthread_key_delete

__libc_thr_once -> pthread_once

__libc_thr_self -> pthread_self

__libc_thr_exit -> pthread_exit

__libc_thr_setcancelstate -> pthread_setcancelstate

__libc_thr_equal -> pthread_equal

__libc_thr_curcpu -> pthread_curcpu_np

This demangling also fixes several tests that expect the regular pthread(3) function names.
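
A lookup of this kind can be sketched as a small table search. The function name `demo_demangle_function_name` and the table layout are hypothetical, and only a subset of the pairs listed above is included to keep the example short.

```c
#include <stddef.h>
#include <string.h>

struct demo_name_pair {
    const char *internal;     /* libc-internal symbol name */
    const char *public_name;  /* regular pthread(3) name */
};

/* Subset of the mappings listed above. */
static const struct demo_name_pair demo_names[] = {
    {"__libc_mutex_init", "pthread_mutex_init"},
    {"__libc_mutex_lock", "pthread_mutex_lock"},
    {"__libc_mutex_unlock", "pthread_mutex_unlock"},
    {"__libc_cond_wait", "pthread_cond_wait"},
    {"__libc_rwlock_rdlock", "pthread_rwlock_rdlock"},
    {"__libc_thr_keycreate", "pthread_key_create"},
    {"__libc_thr_once", "pthread_once"},
    {"__libc_thr_curcpu", "pthread_curcpu_np"},
};

/* Returns the public name for a known internal symbol, or the input
 * pointer unchanged when there is no mapping. */
static const char *demo_demangle_function_name(const char *name) {
    size_t count = sizeof(demo_names) / sizeof(demo_names[0]);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(name, demo_names[i].internal) == 0)
            return demo_names[i].public_name;
    }
    return name;
}
```

Returning the input unchanged for unknown names keeps the helper safe to call on every frame of a backtrace.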

TSan: Handling NetBSD specific indirection of libpthread functions

I've corrected handling of libpthread(3) functions in TSan/NetBSD:

pthread_cond_init(3),

pthread_cond_signal(3),

pthread_cond_broadcast(3),

pthread_cond_wait(3),

pthread_cond_destroy(3),

pthread_mutex_init(3),

pthread_mutex_destroy(3),

pthread_mutex_trylock(3),

pthread_rwlock_init(3),

pthread_rwlock_destroy(3),

pthread_rwlock_rdlock(3),

pthread_rwlock_tryrdlock(3),

pthread_rwlock_wrlock(3),

pthread_rwlock_trywrlock(3),

pthread_rwlock_unlock(3),

pthread_once(3).

Code outside libpthread(3) uses the libc symbols that are prefixed with __libc_, for example: __libc_cond_init.

As a result, these functions were invisible to sanitizers on NetBSD. I've intercepted the libc-specific functions and added them as NetBSD-specific aliases for the common pthread(3) functions.

NetBSD needs to intercept both functions, as the regularly named ones are used internally in libpthread(3).

TSan: Correcting NetBSD support in pthread_once(3)

The pthread_once(3)/NetBSD type is built with the following structure:

struct __pthread_once_st { pthread_mutex_t pto_mutex; int pto_done; };

I've set the pto_done position as shifted by __sanitizer::pthread_mutex_t_sz from the beginning of the pthread_once struct.

This corrects deadlocks when the pthread_once(3) function is used.
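
The offset computation can be illustrated as follows. The struct is a mirror of the layout quoted above under a hypothetical `demo_` name, and `sizeof(pthread_mutex_t)` stands in for `__sanitizer::pthread_mutex_t_sz`; the example assumes no padding is inserted between the two fields.

```c
#include <pthread.h>
#include <stddef.h>

/* Mirror of the quoted layout, under a demo name. */
struct demo_pthread_once_st {
    pthread_mutex_t pto_mutex;
    int pto_done;
};

/* Locate pto_done by skipping over the leading mutex, the way a tool
 * has to when it only knows the size of the mutex and not the full
 * structure definition. */
static int *demo_once_done(void *once) {
    return (int *)((char *)once + sizeof(pthread_mutex_t));
}
```

Reading the flag from the start of the object instead would inspect the mutex bytes, so the state of the once control would be misjudged.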

Sanitizers: Plug dlerror() leak for swift_demangle

InitializeSwiftDemangler() attempts to resolve the swift_demangle symbol. If the symbol is not available, we observe a leak of the dlerror() message.
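
One common way to deal with a pending error message after a failed lookup is to consume it with dlerror(3). The sketch below shows that general pattern for an optional symbol; it is not a copy of the sanitizer code, and `demo_lookup_optional` is a hypothetical helper. Some systems need `-ldl` at link time.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Looks up an optional symbol in the running program. On failure the
 * pending error message is consumed so that it does not linger. */
static void *demo_lookup_optional(const char *symbol) {
    void *self = dlopen(NULL, RTLD_LAZY);  /* handle for the main program */
    if (self == NULL) {
        (void)dlerror();
        return NULL;
    }
    void *addr = dlsym(self, symbol);
    if (addr == NULL)
        (void)dlerror();  /* clear the message left by the failed lookup */
    dlclose(self);
    return addr;
}
```

After a failed call to this helper, a subsequent dlerror(3) call reports no pending error.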

LSan: Detecting thread's termination

LSan had the same problem as the one analyzed in TSan; I've fixed it by switching to the _lwp_exit(2) approach.

Sanitizers: Handling symbol renaming of sigaction on NetBSD

For historical and compat reasons, NetBSD uses the __sigaction14 symbol name for the sigaction(2) function.

I've renamed the interceptors and users of sigaction to sigaction_symname and I've reused it in the code base.
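
The idea of a per-platform symbol name can be sketched with a preprocessor macro. The `demo_`/`DEMO_` names and the stringification helpers are hypothetical and serve only to make the selected name observable; this is not the sanitizer implementation.

```c
#include <string.h>

/* Pick the platform's real symbol name once, in one place. */
#if defined(__NetBSD__)
#define demo_sigaction_symname __sigaction14
#else
#define demo_sigaction_symname sigaction
#endif

/* Two-step stringification so the macro is expanded before quoting. */
#define DEMO_STR_(x) #x
#define DEMO_STR(x) DEMO_STR_(x)

/* Returns the symbol name an interceptor would hook on this platform. */
static const char *demo_sigaction_symbol(void) {
    return DEMO_STR(demo_sigaction_symname);
}

/* Returns 1 when the platform symbol differs from the public name. */
static int demo_sigaction_is_renamed(void) {
    return strcmp(demo_sigaction_symbol(), "sigaction") != 0;
}
```

Code that refers to the macro instead of a hard-coded name keeps working on platforms where the symbol is versioned.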

TSan: Correcting mangled_sp on NetBSD/amd64

I've fixed the LongJmp(3) function on NetBSD and pointed it to the correct location of the RSP (stack pointer) register on NetBSD/amd64.

TSan: Supporting the setjmp(3) family of functions on NetBSD/amd64

I've added support for handling the setjmp(3)/longjmp(3) family of functions on NetBSD/amd64.

There are three types of them on NetBSD:

setjmp(3) / longjmp(3)

sigsetjmp(3) / siglongjmp(3)

_setjmp(3) / _longjmp(3)

Due to historical and compat reasons the symbol names are mangled:

setjmp -> __setjmp14

longjmp -> __longjmp14

sigsetjmp -> __sigsetjmp14

siglongjmp -> __siglongjmp14

_setjmp -> _setjmp

_longjmp -> _longjmp

This leads to symbol renaming in the existing codebase.

There is no such symbol as __sigsetjmp/__longsetjmp on NetBSD so it has been disabled.

Additionally, I've added a comment that a GNU-style executable stack note is not needed on NetBSD. The stack is not executable without it.

TSan: Deferring StartBackgroundThread() and StopBackgroundThread()

NetBSD cannot spawn new POSIX thread entities during the early libc and libpthread initialization stage. I've deferred this to the point of intercepting the first pthread_create(3) call.
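
Deferring work to the first intercepted pthread_create(3) call can be sketched with pthread_once(3). All `demo_` names are hypothetical, and the "background thread" is reduced to a counter so that the example stays short.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_int demo_background_starts;
static pthread_once_t demo_background_once = PTHREAD_ONCE_INIT;

/* Placeholder for starting the runtime's helper thread. A real
 * runtime would spawn its background thread here. */
static void demo_start_background_thread(void) {
    atomic_fetch_add(&demo_background_starts, 1);
}

/* Stand-in for the pthread_create(3) interceptor: the deferred start
 * happens exactly once, on the first call, when threading is known
 * to be initialized. */
static int demo_pthread_create(pthread_t *t, void *(*fn)(void *), void *arg) {
    pthread_once(&demo_background_once, demo_start_background_thread);
    return pthread_create(t, NULL, fn, arg);
}

static void *demo_noop(void *arg) {
    return arg;
}

/* Creates and joins one thread through the wrapper; returns how many
 * times the background start has run (or -1 on error). */
static int demo_spawn_and_join(void) {
    pthread_t t;
    if (demo_pthread_create(&t, demo_noop, NULL) != 0)
        return -1;
    if (pthread_join(t, NULL) != 0)
        return -1;
    return atomic_load(&demo_background_starts);
}
```

A program that never creates a thread never triggers the deferred start in this sketch.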

This is the last change that makes Thread Sanitizer functional on NetBSD/amd64 without downstream patches.

Final TSan results

Results for the check-tsan test-target.

********************
Testing Time: 64.91s
********************
Failing Tests (5):
    ThreadSanitizer-x86_64 :: dtls.c
    ThreadSanitizer-x86_64 :: ignore_lib5.cc
    ThreadSanitizer-x86_64 :: ignored-interceptors-mmap.cc
    ThreadSanitizer-x86_64 :: mutex_lock_destroyed.cc
    ThreadSanitizer-x86_64 :: vfork.cc

  Expected Passes    : 290
  Expected Failures  : 1
  Unsupported Tests  : 83
  Unexpected Failures: 5

These results show that all crucial issues are now fixed, and this sanitizer can be used to trace real software. The remaining problems are minor and are scheduled to be fixed in the future:

signal_block.cc - there is a race; the test passes only intermittently.

dtls.c - it looks like dynamically allocated TLS vectors are missing on the NetBSD side.

vfork.cc - the test exercises UB; it looks like NetBSD behaves the same way as Linux does, yet the test is failing.

mutex_lock_destroyed.cc - it is based on UB as implemented on Linux.

The other tests fail in similarly rare scenarios, such as massive mmap(2) calls that seem to overflow the shadow.

LLVM JIT

As noted in the previous reports, there is an ongoing effort to improve NetBSD compatibility with the existing Just-In-Time frameworks in LLVM. Over the past month the existing code has been adjusted to the point where it passes all existing LLVM tests of JIT code on NetBSD under PaX MPROTECT.

Scudo hardened allocator

I've added initial support for NetBSD in the Scudo hardened allocator. I keep this code locally in pkgsrc-wip/compiler-rt-netbsd.

More work is needed in order to correct the known failures in tests. These are largely caused by the fact that Scudo was a Linux-only feature and the existing tests depend on GLIBC specific internals. They need to be adapted for the default NetBSD allocator (jemalloc(3)).

********************
Testing Time: 5.40s
********************
Failing Tests (32):
    Scudo-i386 :: double-free.cpp
    Scudo-i386 :: interface.cpp
    Scudo-i386 :: memalign.c
    Scudo-i386 :: mismatch.cpp
    Scudo-i386 :: options.cpp
    Scudo-i386 :: overflow.c
    Scudo-i386 :: preload.cpp
    Scudo-i386 :: quarantine.c
    Scudo-i386 :: realloc.cpp
    Scudo-i386 :: rss.c
    Scudo-i386 :: secondary.c
    Scudo-i386 :: sizes.cpp
    Scudo-i386 :: valloc.c
    Scudo-x86_64 :: alignment.c
    Scudo-x86_64 :: double-free.cpp
    Scudo-x86_64 :: interface.cpp
    Scudo-x86_64 :: malloc.cpp
    Scudo-x86_64 :: memalign.c
    Scudo-x86_64 :: mismatch.cpp
    Scudo-x86_64 :: options.cpp
    Scudo-x86_64 :: overflow.c
    Scudo-x86_64 :: preload.cpp
    Scudo-x86_64 :: quarantine.c
    Scudo-x86_64 :: random_shuffle.cpp
    Scudo-x86_64 :: realloc.cpp
    Scudo-x86_64 :: rss.c
    Scudo-x86_64 :: secondary.c
    Scudo-x86_64 :: sized-delete.cpp
    Scudo-x86_64 :: sizes.cpp
    Scudo-x86_64 :: threads.c
    Scudo-x86_64 :: valloc.c

  Expected Passes    : 8
  Unexpected Failures: 32

Plans for the next milestone

The next goal is to finish MSan and switch back to LLDB restoration for tracing single threaded programs.

The TSan corrections indirectly increased the number of passing MSan tests. I'm going to solve the detected problems, and thanks to the experience gained with the other sanitizers, the MSan issues no longer seem as challenging as they did before finishing TSan.

********************
Testing: 0 .. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
Testing Time: 30.91s
********************
Failing Tests (69):
    MemorySanitizer-x86_64 :: allocator_returns_null.cc
    MemorySanitizer-x86_64 :: backtrace.cc
    MemorySanitizer-x86_64 :: c-strdup.c
    MemorySanitizer-x86_64 :: chained_origin.cc
    MemorySanitizer-x86_64 :: chained_origin_empty_stack.cc
    MemorySanitizer-x86_64 :: chained_origin_limits.cc
    MemorySanitizer-x86_64 :: chained_origin_memcpy.cc
    MemorySanitizer-x86_64 :: chained_origin_with_signals.cc
    MemorySanitizer-x86_64 :: check_mem_is_initialized.cc
    MemorySanitizer-x86_64 :: death-callback.cc
    MemorySanitizer-x86_64 :: dlopen_executable.cc
    MemorySanitizer-x86_64 :: dso-origin.cc
    MemorySanitizer-x86_64 :: dtls_test.c
    MemorySanitizer-x86_64 :: dtor-base-access.cc
    MemorySanitizer-x86_64 :: dtor-bit-fields.cc
    MemorySanitizer-x86_64 :: dtor-derived-class.cc
    MemorySanitizer-x86_64 :: dtor-multiple-inheritance-nontrivial-class-members.cc
    MemorySanitizer-x86_64 :: dtor-multiple-inheritance.cc
    MemorySanitizer-x86_64 :: dtor-trivial-class-members.cc
    MemorySanitizer-x86_64 :: dtor-vtable-multiple-inheritance.cc
    MemorySanitizer-x86_64 :: dtor-vtable.cc
    MemorySanitizer-x86_64 :: fork.cc
    MemorySanitizer-x86_64 :: ftime.cc
    MemorySanitizer-x86_64 :: getaddrinfo-positive.cc
    MemorySanitizer-x86_64 :: getaddrinfo.cc
    MemorySanitizer-x86_64 :: getc_unlocked.c
    MemorySanitizer-x86_64 :: heap-origin.cc
    MemorySanitizer-x86_64 :: icmp_slt_allones.cc
    MemorySanitizer-x86_64 :: iconv.cc
    MemorySanitizer-x86_64 :: ifaddrs.cc
    MemorySanitizer-x86_64 :: insertvalue_origin.cc
    MemorySanitizer-x86_64 :: mktime.cc
    MemorySanitizer-x86_64 :: mmap.cc
    MemorySanitizer-x86_64 :: msan_copy_shadow.cc
    MemorySanitizer-x86_64 :: msan_dump_shadow.cc
    MemorySanitizer-x86_64 :: msan_print_shadow.cc
    MemorySanitizer-x86_64 :: msan_print_shadow2.cc
    MemorySanitizer-x86_64 :: origin-store-long.cc
    MemorySanitizer-x86_64 :: param_tls_limit.cc
    MemorySanitizer-x86_64 :: print_stats.cc
    MemorySanitizer-x86_64 :: pthread_getattr_np_deadlock.cc
    MemorySanitizer-x86_64 :: pvalloc.cc
    MemorySanitizer-x86_64 :: readdir64.cc
    MemorySanitizer-x86_64 :: realloc-large-origin.cc
    MemorySanitizer-x86_64 :: realloc-origin.cc
    MemorySanitizer-x86_64 :: report-demangling.cc
    MemorySanitizer-x86_64 :: scandir.cc
    MemorySanitizer-x86_64 :: scandir_null.cc
    MemorySanitizer-x86_64 :: select_float_origin.cc
    MemorySanitizer-x86_64 :: select_origin.cc
    MemorySanitizer-x86_64 :: sem_getvalue.cc
    MemorySanitizer-x86_64 :: signal_stress_test.cc
    MemorySanitizer-x86_64 :: sigwait.cc
    MemorySanitizer-x86_64 :: stack-origin.cc
    MemorySanitizer-x86_64 :: stack-origin2.cc
    MemorySanitizer-x86_64 :: strerror_r-non-gnu.c
    MemorySanitizer-x86_64 :: strlen_of_shadow.cc
    MemorySanitizer-x86_64 :: strndup.cc
    MemorySanitizer-x86_64 :: textdomain.cc
    MemorySanitizer-x86_64 :: times.cc
    MemorySanitizer-x86_64 :: tls_reuse.cc
    MemorySanitizer-x86_64 :: tsearch.cc
    MemorySanitizer-x86_64 :: tzset.cc
    MemorySanitizer-x86_64 :: unaligned_read_origin.cc
    MemorySanitizer-x86_64 :: unpoison_string.cc
    MemorySanitizer-x86_64 :: use-after-dtor.cc
    MemorySanitizer-x86_64 :: use-after-free.cc
    MemorySanitizer-x86_64 :: wcsncpy.cc

  Expected Passes    : 38
  Expected Failures  : 1
  Unsupported Tests  : 24
  Unexpected Failures: 69

This work was sponsored by The NetBSD Foundation.

The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL, and chip in what you can:

http://netbsd.org/donations/#how-to-donate