Posted by Jeff Vander Stoep, Android Security & Privacy Team and Chong Zhang, Android Media Team

Android Q Beta versions are now publicly available. Among the various new features introduced in Android Q are some important security hardening changes. While exciting new security features are added in each Android release, hardening generally refers to security improvements made to existing components.

When prioritizing platform hardening, we analyze data from a number of sources including our vulnerability rewards program (VRP). Past security issues provide useful insight into which components can use additional hardening. Android publishes monthly security bulletins which include fixes for all the high/critical severity vulnerabilities in the Android Open Source Project (AOSP) reported through our VRP. While fixing vulnerabilities is necessary, we also get a lot of value from the metadata - analysis on the location and class of vulnerabilities. With this insight we can apply the following strategies to our existing components:

Contain: isolating and de-privileging components, particularly ones that handle untrusted content. This includes: Access control: adding permission checks, increasing the granularity of permission checks, or switching to safer defaults (for example, default deny). Attack surface reduction: reducing the number of entry/exit points (i.e. principle of least privilege). Architectural decomposition: breaking privileged processes into less privileged components and applying attack surface reduction.

Mitigate: Assume vulnerabilities exist and actively defend against classes of vulnerabilities or common exploitation techniques.

Here’s a look at high severity vulnerabilities by component and cause from 2018:

Most of Android’s vulnerabilities occur in the media and bluetooth components. Use-after-free (UAF), integer overflows, and out of bounds (OOB) reads/writes comprise 90% of vulnerabilities with OOB being the most common.

A Constrained Sandbox for Software Codecs

In Android Q, we moved software codecs out of the main mediacodec service into a constrained sandbox. This is a big step forward in our effort to improve security by isolating various media components into less privileged sandboxes. As Mark Brand of Project Zero points out in his Return To Libstagefright blog post, constrained sandboxes are not where an attacker wants to end up. In 2018, approximately 80% of the critical/high severity vulnerabilities in media components occurred in software codecs, meaning further isolating them is a big improvement. Due to the increased protection provided by the new mediaswcodec sandbox, these same vulnerabilities will receive a lower severity based on Android’s severity guidelines.

The following figure shows an overview of the evolution of media services layout in the recent Android releases.

Prior to N, media services are all inside one monolithic mediaserver process, and the extractors run inside the client.

In N, we delivered a major security re-architect, where a number of lower-level media services are spun off into individual service processes with reduced privilege sandboxes. Extractors are moved into server side, and put into a constrained sandbox. Only a couple of higher-level functionalities remained in mediaserver itself.

In O, the services are “treblized,” and further deprivileged that is, separated into individual sandboxes and converted into HALs. The media.codec service became a HAL while still hosting both software and hardware codec implementations.

In Q, the software codecs are extracted from the media.codec process, and moved back to system side. It becomes a system service that exposes the codec HAL interface. Selinux policy and seccomp filters are further tightened up for this process. In particular, while the previous mediacodec process had access to device drivers for hardware accelerated codecs, the software codec process has no access to device drivers.

With this move, we now have the two primary sources for media vulnerabilities tightly sandboxed within constrained processes. Software codecs are similar to extractors in that they both have extensive code parsing bitstreams from untrusted sources. Once a vulnerability is identified in the source code, it can be triggered by sending a crafted media file to media APIs (such as MediaExtractor or MediaCodec). Sandboxing these two services allows us to reduce the severity of potential security vulnerabilities without compromising performance.

In addition to constraining riskier codecs, a lot of work has also gone into preventing common types of vulnerabilities.

Bound Sanitizer

Incorrect or missing memory bounds checking on arrays account for about 34% of Android’s userspace vulnerabilities. In cases where the array size is known at compile time, LLVM’s bound sanitizer (BoundSan) can automatically instrument arrays to prevent overflows and fail safely.

BoundSan instrumentation

BoundSan is enabled in 11 media codecs and throughout the Bluetooth stack for Android Q. By optimizing away a number of unnecessary checks the performance overhead was reduced to less than 1%. BoundSan has already found/prevented potential vulnerabilities in codecs and Bluetooth.

More integer sanitizer in more places

Android pioneered the production use of sanitizers in Android Nougat when we first started rolling out integer sanization (IntSan) in the media frameworks. This work has continued with each release and has been very successful in preventing otherwise exploitable vulnerabilities. For example, new IntSan coverage in Android Pie mitigated 11 critical vulnerabilities. Enabling IntSan is challenging because overflows are generally benign and unsigned integer overflows are well defined and sometimes intentional. This is quite different from the bound sanitizer where OOB reads/writes are always unintended and often exploitable. Enabling Intsan has been a multi year project, but with Q we have fully enabled it across the media frameworks with the inclusion of 11 more codecs.

IntSan Instrumentation

IntSan works by instrumenting arithmetic operations to abort when an overflow occurs. This instrumentation can have an impact on performance, so evaluating the impact on CPU usage is necessary. In cases where performance impact was too high, we identified hot functions and individually disabled IntSan on those functions after manually reviewing them for integer safety.

BoundSan and IntSan are considered strong mitigations because (where applied) they prevent the root cause of memory safety vulnerabilities. The class of mitigations described next target common exploitation techniques. These mitigations are considered to be probabilistic because they make exploitation more difficult by limiting how a vulnerability may be used.

Shadow Call Stack

LLVM’s Control Flow Integrity (CFI) was enabled in the media frameworks, Bluetooth, and NFC in Android Pie. CFI makes code reuse attacks more difficult by protecting the forward-edges of the call graph, such as function pointers and virtual functions. Android Q uses LLVM’s Shadow Call Stack (SCS) to protect return addresses, protecting the backwards-edge of control flow graph. SCS accomplishes this by storing return addresses in a separate shadow stack which is protected from leakage by storing its location in the x18 register, which is now reserved by the compiler.

SCS Instrumentation

SCS has negligible performance overhead and a small memory increase due to the separate stack. In Android Q, SCS has been turned on in portions of the Bluetooth stack and is also available for the kernel. We’ll share more on that in an upcoming post.

eXecute-Only Memory

Like SCS, eXecute-Only Memory (XOM) aims at making common exploitation techniques more expensive. It does so by strengthening the protections already provided by address space layout randomization (ASLR) which in turn makes code reuse attacks more difficult by requiring attackers to first leak the location of the code they intend to reuse. This often means that an attacker now needs two vulnerabilities, a read primitive and a write primitive, where previously just a write primitive was necessary in order to achieve their goals. XOM protects against leaks (memory disclosures of code segments) by making code unreadable. Attempts to read execute-only code results in the process aborting safely.

Tombstone from a XOM abort

Starting in Android Q, platform-provided AArch64 code segments in binaries and libraries are loaded as execute-only. Not all devices will immediately receive the benefit as this enforcement has hardware dependencies (ARMv8.2+) and kernel dependencies (Linux 4.9+, CONFIG_ARM64_UAO). For apps with a targetSdkVersion lower than Q, Android’s zygote process will relax the protection in order to avoid potential app breakage, but 64 bit system processes (for example, mediaextractor, init, vold, etc.) are protected. XOM protections are applied at compile-time and have no memory or CPU overhead.

Scudo Hardened Allocator

Scudo is a dynamic heap allocator designed to be resilient against heap related vulnerabilities such as:

Use-after-frees: by quarantining freed blocks.

Double-frees: by tracking chunk states.

Buffer overflows: by check summing headers.

Heap sprays and layout manipulation: by improved randomization.

Scudo does not prevent exploitation but rather proactively manages memory in a way to make exploitation more difficult. It is configurable on a per-process basis depending on performance requirements. Scudo is enabled in extractors and codecs in the media frameworks.

Tombstone from Scudo aborts

Contributing security improvements to Open Source

AOSP makes use of a number of Open Source Projects to build and secure Android. Google is actively contributing back to these projects in a number of security critical areas:

Thank you to Ivan Lozano, Kevin Deus, Kostya Kortchinsky, Kostya Serebryany, and Mike Antares for their contributions to this post.