LLVM, the compiler infrastructure used to build Android, contains multiple components that perform static and dynamic analysis. One set of these components that has been used extensively when analyzing Android is the sanitizers, specifically AddressSanitizer, UndefinedBehaviorSanitizer, and SanitizerCoverage. These sanitizers are compiler-based instrumentation components contained in compiler-rt that can be used during development and testing to flush out bugs and make Android better. The sanitizers currently available in Android can discover and diagnose many memory misuse bugs and instances of undefined behavior, and can provide code coverage metrics to help ensure that your test suite is as complete as possible.

This blog post details the internals of the current Android sanitizers (AddressSanitizer, UndefinedBehaviorSanitizer, and SanitizerCoverage) and shows how they can be used within the Android build system.

AddressSanitizer

AddressSanitizer (ASan) is a compiler-based instrumentation capability that allows for runtime detection of many types of memory errors in C/C++ code. In Android, the checks for the following classes of memory errors have been tested:

Out-of-bounds accesses to heap, stack and globals

Use-after-free

Use-after-return (runtime flag ASAN_OPTIONS=detect_stack_use_after_return=1)

Use-after-scope (clang flag -fsanitize-address-use-after-scope)

Double-free, invalid free

Android allows for full build instrumentation by ASan, and also allows for ASan instrumentation at the app level through asanwrapper. Instructions for both instrumentation techniques can be found on source.android.com.

AddressSanitizer is built upon two high-level concepts. The first is instrumentation of all memory-related function calls, including alloca, malloc, and free, with information to track memory allocation, deallocation, and usage statistics. This instrumentation allows ASan to detect invalid memory usage bugs, including double-free, use-after-scope, use-after-return, and use-after-free. The second is padding: ASan can also detect reads and writes that occur outside the bounds of defined memory regions by padding all allocated memory buffers and variables. If a read or write to this padding region occurs, ASan catches it and outputs information useful for diagnosing the memory violation. This padding is known as poisoned memory in ASan terms. Here is an example of what poisoned memory padding looks like with stack-allocated variables:

Figure 1. Example of ASan-instrumented stack variables with an int8_t array of 8 elements, a uint32_t, and an int8_t array of 16 elements. The memory layout after compiling with ASan is on the right, with padding between each variable. For each stack variable, there are 32 bytes of padding before and after the variable. If the object size of a variable is not 32 bytes, then an additional 32 - n bytes of padding are inserted, where n is the object size.

ASan uses shadow memory to keep track of which bytes are normal memory and which bytes are poisoned memory. Each shadow byte describes a group of application bytes: the group can be completely normal (shadow byte is 0), completely poisoned (high bit of the shadow byte is set), or partially addressable, where only the first k bytes are unpoisoned (shadow byte value is k). If the program accesses a byte that shadow memory marks as poisoned, ASan aborts the program and outputs information useful for debugging, including the call stack, the shadow memory map, the type of memory violation, what was read or written, the PC that caused the violation, and the memory contents.

AddressSanitizer: heap-buffer-overflow on address 0xe6146cf3 at pc 0xe86eeb3c bp 0xffe67348 sp 0xffe66f14
WRITE of size 39 at 0xe6146cf3 thread T0
    #0 0xe86eeb3b (/system/lib/libclang_rt.asan-arm-android.so+0x64b3b)
    #1 0xaddc5d27 (/data/simple_test_fuzzer+0x4d27)
    #2 0xaddd08b9 (/data/simple_test_fuzzer+0xf8b9)
    #3 0xaddd0a97 (/data/simple_test_fuzzer+0xfa97)
    #4 0xaddd0fbb (/data/simple_test_fuzzer+0xffbb)
    #5 0xaddd109f (/data/simple_test_fuzzer+0x1009f)
    #6 0xaddcbfb9 (/data/simple_test_fuzzer+0xafb9)
    #7 0xaddc9ceb (/data/simple_test_fuzzer+0x8ceb)
    #8 0xe8655635 (/system/lib/libc.so+0x7a635)

0xe6146cf3 is located 0 bytes to the right of 35-byte region [0xe6146cd0,0xe6146cf3)
allocated by thread T0 here:
    #0 0xe87159df (/system/lib/libclang_rt.asan-arm-android.so+0x8b9df)
    #1 0xaddc5ca7 (/data/simple_test_fuzzer+0x4ca7)
    #2 0xaddd08b9 (/data/simple_test_fuzzer+0xf8b9)

SUMMARY: AddressSanitizer: heap-buffer-overflow (/system/lib/libclang_rt.asan-arm-android.so+0x64b3b)
Shadow bytes around the buggy address:
  0x1cc28d40: fa fa 00 00 00 00 07 fa fa fa fd fd fd fd fd fd
  0x1cc28d50: fa fa 00 00 00 00 07 fa fa fa fd fd fd fd fd fd
  0x1cc28d60: fa fa 00 00 00 00 00 02 fa fa fd fd fd fd fd fd
  0x1cc28d70: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x1cc28d80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
=>0x1cc28d90: fa fa fa fa fa fa fa fa fa fa 00 00 00 00[03]fa
  0x1cc28da0: fa fa 00 00 00 00 07 fa fa fa 00 00 00 00 03 fa
  0x1cc28db0: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fa
  0x1cc28dc0: fa fa 00 00 00 00 00 02 fa fa fd fd fd fd fd fd
  0x1cc28dd0: fa fa 00 00 00 00 00 02 fa fa fd fd fd fd fd fd
  0x1cc28de0: fa fa 00 00 00 00 00 02 fa fa fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
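The shadow byte encoding can be read directly off the report above: the 35-byte region is covered by the shadow bytes 00 00 00 00 03 (four fully addressable groups of 8 bytes plus 3 addressable bytes), and the faulting address sits at offset 3 in the last group, just past the addressable bytes. The following is a minimal sketch of that check over a toy shadow array. The function name access_is_poisoned and the array-based model are assumptions made for illustration; this is not ASan's runtime code or API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: one shadow byte describes 8 application bytes,
 * matching the legend in the report. */
#define SHADOW_GRANULARITY 8

/* Hypothetical helper (not an ASan API). Decides whether an access of
 * `size` bytes at application offset `addr` touches poisoned memory.
 * Precondition: the access stays within a single 8-byte group. */
bool access_is_poisoned(const int8_t *shadow, size_t addr, size_t size) {
    size_t offset = addr % SHADOW_GRANULARITY;
    assert(size >= 1 && offset + size <= SHADOW_GRANULARITY);

    int8_t k = shadow[addr / SHADOW_GRANULARITY];
    if (k == 0)
        return false;            /* all 8 bytes are addressable */
    if (k < 0)
        return true;             /* high bit set: fully poisoned */
    return offset + size > (size_t)k;  /* only the first k bytes are valid */
}
```

With a toy shadow array of fa fa 00 00 00 00 03 fa, a 35-byte region starts at application offset 16. A one-byte access at offset 50 (the last byte of the region) is allowed, while a one-byte access at offset 51 is flagged, which corresponds to the "0 bytes to the right of 35-byte region" line in the report.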

More information on what each part of the report means, and how to make it more user-friendly, can be found on the LLVM website and on GitHub.

Sometimes, the bug discovery process can appear to be non-deterministic, especially when bugs require special setup or more advanced techniques, such as heap priming or race condition exploitation. Many of these bugs are not immediately apparent, and could surface thousands of instructions away from the memory violation that was the actual root cause. Because ASan instruments all memory-related functions and pads data with areas that cannot be accessed without triggering an ASan-related callback, memory violations are caught the instant they occur, instead of waiting for a crash-inducing corruption. This is extremely useful in bug discovery and root cause diagnosis. In addition, ASan is an extremely useful tool for fuzzing, and has been used in many fuzzing efforts on Android.
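To make "caught the instant they occur" concrete, here is a hypothetical sketch of the kind of code that could produce a report shaped like the one above: a 39-byte write into a 35-byte heap region. The function names and the size constant are invented for illustration and are not taken from the fuzzer in the report. Without ASan, the unchecked version may silently corrupt the heap and crash much later, if at all; with ASan, the overflowing memcpy itself is reported.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define REGION_SIZE 35  /* size of the heap region in the report */

/* Deliberately buggy: `len` is never checked against REGION_SIZE, so a
 * call with len = 39 writes 4 bytes past the end of the allocation. */
char *copy_unchecked(const char *src, size_t len) {
    char *region = malloc(REGION_SIZE);
    if (region == NULL)
        return NULL;
    memcpy(region, src, len);  /* overflows when len > REGION_SIZE */
    return region;
}

/* Fixed variant: rejects inputs that do not fit in the region. */
char *copy_checked(const char *src, size_t len) {
    if (len > REGION_SIZE)
        return NULL;
    char *region = malloc(REGION_SIZE);
    if (region == NULL)
        return NULL;
    memcpy(region, src, len);
    return region;
}
```

Calling copy_unchecked with a length of 39 is the case an ASan build would flag; copy_checked returns NULL for the same input instead of writing out of bounds.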

UBSan

UndefinedBehaviorSanitizer (UBSan) performs compile-time instrumentation to check for various types of undefined behavior. Device manufacturers can include it in their test builds by including LOCAL_SANITIZE:=default-ub in their makefiles or default-ub: true in the sanitize block of blueprint files. While UBSan can detect many undefined behaviors, Android's build system directly supports:

bool

integer-divide-by-zero

return

returns-nonnull-attribute

shift-exponent

unreachable

vla-bound

UBSan's integer overflow checking is also used in Android's build system. In addition to signed-integer-overflow, UBSan supports unsigned-integer-overflow, which is not technically undefined behavior but is included in the sanitizer. These checks can be enabled in makefiles by setting LOCAL_SANITIZE to signed-integer-overflow, unsigned-integer-overflow, or the combination flag integer, which enables signed-integer-overflow, unsigned-integer-overflow, integer-divide-by-zero, shift-base, and shift-exponent. They can be enabled in blueprint files by setting misc_undefined to the desired flag. These UBSan targets, especially unsigned-integer-overflow, are used extensively in the mediaserver components to eliminate latent integer overflow vulnerabilities.

By default, UBSan on Android aborts the program when undefined behavior is encountered. However, starting in October 2016, UBSan on Android has an optional runtime library that gives more detailed error reporting, including the type of undefined behavior encountered and the file and source line where it occurred.

In Android.mk files, this is enabled with:

LOCAL_SANITIZE:=unsigned-integer-overflow signed-integer-overflow
LOCAL_SANITIZE_DIAG:=unsigned-integer-overflow signed-integer-overflow

And in Android.bp files, it is enabled with:

sanitize: {
    misc_undefined: [
        "unsigned-integer-overflow",
        "signed-integer-overflow",
    ],
    diag: {
        misc_undefined: [
            "unsigned-integer-overflow",
            "signed-integer-overflow",
        ],
    },
},

Here is an example of the information provided by the UBSan runtime library:

external/icu/icu4c/source/common/ucnv.c:1193:23: runtime error: unsigned integer overflow: 4291925010 + 2147483647 cannot be represented in type 'unsigned int'
external/icu/icu4c/source/common/cstring.c:288:16: runtime error: unsigned integer overflow: 0 - 1 cannot be represented in type 'uint32_t' (aka 'unsigned int')
external/harfbuzz_ng/src/hb-private.hh:894:16: runtime error: unsigned integer overflow: 72 - 55296 cannot be represented in type 'unsigned int'
external/harfbuzz_ng/src/hb-set-private.hh:82:24: runtime error: unsigned integer overflow: 32 - 562949953421312 cannot be represented in type 'unsigned long'
system/keymaster/authorization_set.cpp:500:37: runtime error: unsigned integer overflow: 6843601868186924302 * 24 cannot be represented in type 'unsigned long'
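Each message above reports an arithmetic result that does not fit in the operand type. As a rough illustration of the condition being flagged, the sketch below writes the equivalent checks by hand for addition, subtraction, and multiplication. The helper names are invented for this example, and the sketch assumes a 64-bit 'unsigned long', as the last message suggests; it is not how UBSan is implemented, since the sanitizer inserts its checks at compile time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if a + b cannot be represented in uint32_t. */
bool add_wraps_u32(uint32_t a, uint32_t b) {
    return a > UINT32_MAX - b;
}

/* True if a - b cannot be represented in uint32_t (result below zero). */
bool sub_wraps_u32(uint32_t a, uint32_t b) {
    return b > a;
}

/* True if a * b cannot be represented in uint64_t. */
bool mul_wraps_u64(uint64_t a, uint64_t b) {
    return a != 0 && b > UINT64_MAX / a;
}
```

Feeding in the operands from the log (4291925010 + 2147483647, 0 - 1, 72 - 55296, and 6843601868186924302 * 24) makes each helper return true. A check of this kind is also one way to fix a reported site when the wraparound is unintended.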

SanitizerCoverage

The sanitizer tools have a very simple code coverage tool built in. SanitizerCoverage allows for code coverage at the call level, basic block level, or edge level. These can be used as a standalone instrumentation technique or in conjunction with any of the sanitizers, including AddressSanitizer and UndefinedBehaviorSanitizer. To use the new guard-based coverage, set -fsanitize-coverage=trace-pc-guard. This causes the compiler to insert a call to __sanitizer_cov_trace_pc_guard(&guard_variable) on every edge. Each edge has its own uint32_t guard_variable. In addition, a module constructor that calls __sanitizer_cov_trace_pc_guard_init(uint32_t* start, uint32_t* stop) is generated. All the __sanitizer_cov_ functions should be provided by the user. You can follow the example in the "Tracing PCs with guards" section of the Clang SanitizerCoverage documentation.

In addition to control flow tracing, SanitizerCoverage allows for data flow tracing. This is activated with -fsanitize-coverage=trace-cmp and is implemented by instrumenting all switch and comparison instructions with __sanitizer_cov_trace_* functions. Similar functionality exists for integer division and GEP instructions, activated with -fsanitize-coverage=trace-div and -fsanitize-coverage=trace-gep respectively. This is an experimental interface, is not thread-safe, and could change at any time; however, it is available and functional in Android builds.
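As a minimal sketch of what user-provided callbacks for guard-based coverage can look like, the code below numbers every guard at startup and counts each edge the first time it is hit. The two __sanitizer_cov_ function names and signatures are the ones the compiler expects; the counters and the coverage_edges_hit accessor are assumptions added for this example. The file containing the callbacks should itself be compiled without coverage instrumentation.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t guard_count;  /* number of guards initialised so far */
static uint32_t edges_hit;    /* number of distinct edges executed */

/* Called once per module with the range of that module's guards.
 * Assigns each guard a unique non-zero id. */
void __sanitizer_cov_trace_pc_guard_init(uint32_t *start, uint32_t *stop) {
    if (start == stop || *start != 0)
        return;  /* empty range, or this range was already initialised */
    for (uint32_t *guard = start; guard < stop; guard++)
        *guard = ++guard_count;
}

/* Called on every instrumented edge. Zeroing the guard after the first
 * hit means each edge is counted only once. */
void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
    if (*guard == 0)
        return;
    edges_hit++;
    *guard = 0;
}

/* Hypothetical accessor for this sketch, not part of the interface. */
uint32_t coverage_edges_hit(void) {
    return edges_hit;
}
```

In an instrumented build the compiler generates the calls, so a test harness would only need to read coverage_edges_hit() after running the code under test. Disabling a guard by setting it to zero is a design choice here; a tracer that needs hit counts would leave the guard intact and increment a per-edge counter instead.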
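The data flow callbacks follow the same pattern of user-provided functions. The sketch below assumes the 32-bit comparison and division hooks, __sanitizer_cov_trace_cmp4 and __sanitizer_cov_trace_div4, as described in the Clang SanitizerCoverage documentation, and records how close the operands of a comparison came to being equal, which is the kind of signal a fuzzer can use. The bookkeeping variables and accessor functions are invented for illustration, and, like the interface itself, the sketch is not thread-safe.

```c
#include <assert.h>
#include <stdint.h>

static uint64_t comparisons_seen;
static unsigned closest_distance = 33;  /* above any 32-bit bit distance */
static uint64_t zero_divisors_seen;

/* Counts the set bits in x. */
static unsigned popcount32(uint32_t x) {
    unsigned bits = 0;
    while (x != 0) {
        x &= x - 1;  /* clear the lowest set bit */
        bits++;
    }
    return bits;
}

/* Called for each instrumented 32-bit comparison. Tracks the smallest
 * number of differing bits seen between the two operands. */
void __sanitizer_cov_trace_cmp4(uint32_t arg1, uint32_t arg2) {
    unsigned distance = popcount32(arg1 ^ arg2);
    comparisons_seen++;
    if (distance < closest_distance)
        closest_distance = distance;
}

/* Called with the divisor of each instrumented 32-bit division. */
void __sanitizer_cov_trace_div4(uint32_t val) {
    if (val == 0)
        zero_divisors_seen++;
}

/* Hypothetical accessors for this sketch. */
unsigned trace_closest_distance(void) { return closest_distance; }
uint64_t trace_comparisons_seen(void) { return comparisons_seen; }
uint64_t trace_zero_divisors_seen(void) { return zero_divisors_seen; }
```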

During a coverage sanitizer session, the coverage information is recorded in two files, a .sancov file, and a sancov.map file. The first contains all instrumented points in the program, and the other contains the execution trace represented as a sequence of indices into the first file. By default, these files are stored in the current working directory, with one created for each executable and shared object that ran during the execution.

Conclusion

ASan, UBSan, and SanitizerCoverage are just the beginning of LLVM sanitizer use in Android. More LLVM Sanitizers are being integrated into the Android build system. The sanitizers described here can be used as a code health and system stability mechanism and are even currently being used by Android Security to find and prevent security bugs!