C++ programs using exceptions are problematic for binary lifters. The non-local control-flow “throw” and “catch” operations that appear in C++ source code do not map neatly to straightforward binary representations. One could allege that the compiler, runtime, and stack unwinding library collude to make exceptions work. We recently completed our investigation into exceptions and can claim beyond a reasonable doubt that McSema is the only binary lifter that correctly lifts programs with exception-based control flow.

Our work on McSema had to bridge the semantic gap between a program’s high-level language semantics and its binary representation, which required a complete understanding of how exceptions work under the hood. This post is organized into three sections: first, we are going to explain how C++ exceptions are handled in Linux for x86-64 architectures and explain core exception handling concepts. Second, we will show how we used this knowledge to recover exception information at the binary level. And third, we will explain how to emit exception information for the LLVM ecosystem.

A Short Primer in C++ Exception Handling

In this section, we will use a small motivating example to demonstrate how C++ exceptions work at the binary level and discuss exception semantics for Linux programs running on x86-64 processors. While exceptions work differently on different operating systems, processors, and languages, many of the core concepts are identical.

Exceptions are a programming language construct that provides a standardized way to handle abnormal or erroneous situations. They work by automatically redirecting execution flow to a special handler, called the exception handler, when such an event occurs. Using exceptions, it is possible to be explicit about the ways in which operations can fail and how those failures should be handled. For example, some operations like object instantiation and file processing can fail in multiple ways. Exception handling allows the programmer to handle these failures in a generic way for large blocks of code, instead of manually verifying each individual operation.

Exceptions are a core part of C++, although their use is optional. Code that may fail is surrounded in a try {…} block, and the exceptions that may be raised are caught via a catch {…} block. Signalling of exceptional conditions is triggered via the throw keyword, which raises an exception of a specific type. Figure 1 shows a simple program that uses C++ exception semantics. Try building the program yourself (clang++ -o exception exception.cpp) or look at the code in the Compiler Explorer.

#include <cstdlib>    // std::atoi
#include <iostream>
#include <new>        // std::bad_alloc
#include <stdexcept>
#include <vector>

int main(int argc, const char *argv[]) {
  std::vector<int> myVector(10);
  // Assumes at least one argument is supplied on the command line.
  int var = std::atoi(argv[1]);
  try {
    if (var == 0) {
      throw std::runtime_error("Runtime error: argv[1] cannot be zero.");
    }
    if (argc != 2) {
      throw std::out_of_range("Supply one argument.");
    }
    myVector.at(var) = 100;  // throws std::out_of_range if var >= 10
    while (true) {
      new int[100000000ul];  // leaks on purpose until allocation fails
    }
  } catch (const std::out_of_range& e) {
    std::cerr << "Out of range error: " << e.what() << '\n';
    return 1;
  } catch (const std::bad_alloc& e) {
    std::cerr << "Allocation failed: " << e.what() << '\n';
    return 1;
  } catch (...) {
    std::cerr << "Unknown error.\n";
    return 1;
  }
  return 0;
}

This simple program can explicitly throw std::runtime_error and std::out_of_range exceptions based on input arguments. It also implicitly throws the std::bad_alloc exception when it runs out of memory. The program installs three exception handlers: one for std::out_of_range, one for std::bad_alloc, and a catch-all handler for generic unknown exceptions. Run the following sample inputs to trigger the three exceptional conditions:

Scenario 1: ./exception 0
Unknown error.
The program does not accept `0` as its argument and throws std::runtime_error, which only the catch-all handler matches.

Scenario 2: ./exception 1 2
Out of range error: Supply one argument.
The program expects exactly one argument. When more than one is supplied, it throws std::out_of_range. An out-of-bounds index such as ./exception 10 reaches the same handler through myVector.at(), with a libstdc++ message that begins with vector::_M_range_check.

Scenario 3: ./exception 1
Allocation failed: std::bad_alloc
For a single valid argument other than `0`, the program allocates large blocks of memory in a loop until an allocation fails. Failures like this happen at runtime and may go unnoticed. The memory allocator signals them by throwing std::bad_alloc, which lets the program terminate safely.

Let’s look at the same program at the binary level. Compiler Explorer shows the binary code generated by the compiler for this program. The compiler translates each throw statement into a pair of calls to libstdc++ functions (__cxa_allocate_exception and __cxa_throw) that allocate the exception structure and start the process of cleaning up local objects in the scopes leading up to the exception, which is known as stack unwinding (see lines 40-48 in Compiler Explorer).
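To make the lowering of throw concrete, the sketch below performs the same two runtime calls by hand through the Itanium C++ ABI header <cxxabi.h>. It is an illustration under assumptions, not the exact code a compiler emits: the helper names throw_by_hand, catch_by_hand, and destroy_runtime_error are hypothetical, and the example assumes a Linux toolchain using libstdc++ or a compatible ABI library.

```cpp
#include <cassert>
#include <cxxabi.h>   // abi::__cxa_allocate_exception, abi::__cxa_throw
#include <new>
#include <stdexcept>
#include <string>
#include <typeinfo>

// Hypothetical destructor thunk: the runtime calls it when the exception
// object is no longer needed.
static void destroy_runtime_error(void *object) {
  static_cast<std::runtime_error *>(object)->~runtime_error();
}

// Hand-written approximation of: throw std::runtime_error(message);
[[noreturn]] void throw_by_hand(const char *message) {
  assert(message != nullptr);
  // Step 1: ask the runtime for storage that outlives this stack frame.
  void *storage = abi::__cxa_allocate_exception(sizeof(std::runtime_error));
  // Step 2: construct the exception object inside that storage.
  new (storage) std::runtime_error(message);
  // Step 3: hand the object, its type_info, and its destructor to the
  // runtime, which starts stack unwinding and never returns here.
  abi::__cxa_throw(storage,
                   const_cast<std::type_info *>(&typeid(std::runtime_error)),
                   destroy_runtime_error);
}

// Hypothetical helper: an ordinary catch clause receives the object.
std::string catch_by_hand(const char *message) {
  try {
    throw_by_hand(message);
  } catch (const std::runtime_error &e) {
    return e.what();
  }
}
```

Calling catch_by_hand("boom") returns the message through a regular catch clause, which shows that the handler cannot tell a hand-built exception from one raised by the throw keyword.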

Stack unwinding: Removes the stack frames of exited functions from the process stack.

The catch statements are translated into code that handles the exception and performs clean-up operations, called the landingpad. The compiler generates an exception table that ties together everything the runtime needs to dispatch exceptions, including exception type, associated landing pad, and various utility functions.

landingpad: User code intended to catch an exception. It gains control from the exception runtime via the personality function, and either merges into the normal user code or returns to the runtime by resuming or raising a new exception.

When an exception occurs, the stack unwinder cleans up previously allocated variables and calls the catch block. The unwinder:

Calls the libstdc++ personality function. First, the stack unwinder calls a special function provided by libstdc++ called the personality function. The personality function will determine whether the raised exception is handled by a function somewhere on the call stack. In high-level terms, the personality function determines whether there is a catch block that should be called for this exception. If no handler can be located (i.e. the exception is unhandled), the personality function terminates the program by calling std::terminate.

Cleans up allocated objects. To cleanly call the catch block, the unwinder must first clean up (i.e. call destructors for each allocated object) after every function called inside the try block. The unwinder will iterate through the call stack, using the personality function to identify a cleanup method for each stack frame. If there are any cleanup actions, the unwinder calls the associated cleanup code.

Executes the catch block. Eventually the unwinder will reach the stack frame of the function containing the exception handler, and execute the catch block.

Releases memory. Once the catch block completes, a cleanup function will be called again to release memory allocated for the exception structure.
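The order of these steps can be observed from ordinary C++. The following minimal sketch records when destructors and the catch block run while an exception crosses two stack frames. The Tracer type and the function names are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

using Log = std::vector<std::string>;

// Hypothetical local object whose destructor records when cleanup runs.
struct Tracer {
  Log &log;
  std::string name;
  Tracer(Log &l, std::string n) : log(l), name(std::move(n)) {}
  ~Tracer() { log.push_back("cleanup " + name); }
};

void inner(Log &log) {
  Tracer t(log, "inner");
  throw std::runtime_error("boom");  // unwinding starts in this frame
}

void outer(Log &log) {
  Tracer t(log, "outer");
  inner(log);  // the exception passes through this frame
}

// Returns the sequence of events seen while the exception propagates.
Log unwind_order() {
  Log log;
  try {
    outer(log);
  } catch (const std::runtime_error &e) {
    log.push_back(std::string("catch ") + e.what());
  }
  return log;
}
```

The returned log is "cleanup inner", "cleanup outer", "catch boom": the cleanup code of each frame runs innermost first, and the catch block executes only after every frame between the throw and the handler has been cleaned up.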

For the curious, more information is available in the comments and source code for libgcc’s stack unwinder.

personality function: A libstdc++ function called by the stack unwinder. It determines whether there is a catch block for a raised exception. If none is found, the program is terminated with std::terminate.

Recovering Exception Information

Recovering exception-based control flow is a challenging proposition for binary analysis tools like McSema. The fundamental data is difficult to assemble, because exception information is spread throughout the binary and tied together via multiple tables. Using exception data to recover control flow is hard, because operations that affect flow, like stack unwinding, calls to personality functions, and exception table decoding, happen outside the purview of the compiled program.

Here’s a quick summary of the end goal. McSema must identify every basic block that may raise an exception (i.e. the contents of a try block) and associate it with the appropriate exception handler and cleanup code (i.e. the catch block or landing pad). This association will then be used to re-generate exception handlers at the LLVM level. To associate blocks with landing pads, McSema parses the exception table to provide these mappings.

We’re going to go into some detail about the exception table. It’s important to understand, because this is the main data structure that allows McSema to recover exception-based control flow.

The Exception Table

The exception table provides language runtimes the information to support exceptions. It has two levels: the language-independent level and the language-specific level. Locating stack frames and restoring them is language agnostic, and is therefore stored in the independent level. Identifying the frame that handles the exceptions and transferring control to it is language dependent, so this is stored in the language-specific level.

Language-Independent Level

The table is stored in special sections in the binary called .eh_frame and .eh_frame_hdr. The .eh_frame section contains one or more call frame information records encoded in the DWARF debug information format. Each frame information record contains a Common Information Entry (CIE) record, followed by one or more Frame Descriptor Entry (FDE) records. Together they describe how to unwind the caller based on the current instruction pointer. More details are described in the Linux Standard Base documentation.
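As a small illustration of where this data lives at runtime, the sketch below asks the dynamic loader for the PT_GNU_EH_FRAME program header of every loaded object. On Linux that segment covers .eh_frame_hdr, whose first byte is a version number. This assumes glibc's dl_iterate_phdr from <link.h>; the EhFrameHdr struct and the function names are hypothetical, and this is not how McSema itself locates the sections, since McSema works on the binary file during CFG recovery.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <link.h>   // dl_iterate_phdr, ElfW, PT_GNU_EH_FRAME (glibc, Linux only)
#include <string>
#include <vector>

// Hypothetical record describing one loaded .eh_frame_hdr.
struct EhFrameHdr {
  std::string object;      // empty for the main executable
  std::uintptr_t address;  // runtime address of .eh_frame_hdr
  std::size_t size;        // size of the segment in memory
  unsigned version;        // first byte of the header
};

// Callback invoked by the loader once per loaded object.
static int collect_eh_frame_hdr(struct dl_phdr_info *info, std::size_t,
                                void *data) {
  auto *found = static_cast<std::vector<EhFrameHdr> *>(data);
  for (unsigned i = 0; i < info->dlpi_phnum; ++i) {
    const ElfW(Phdr) &ph = info->dlpi_phdr[i];
    if (ph.p_type != PT_GNU_EH_FRAME) continue;
    auto address = static_cast<std::uintptr_t>(info->dlpi_addr + ph.p_vaddr);
    unsigned version = *reinterpret_cast<const unsigned char *>(address);
    found->push_back({info->dlpi_name ? info->dlpi_name : "", address,
                      static_cast<std::size_t>(ph.p_memsz), version});
  }
  return 0;  // zero tells the loader to keep iterating
}

std::vector<EhFrameHdr> find_eh_frame_hdrs() {
  std::vector<EhFrameHdr> found;
  dl_iterate_phdr(collect_eh_frame_hdr, &found);
  return found;
}
```

The stack unwinder uses the same loader interface to find the frame information that covers a given instruction pointer, so listing these segments gives a rough view of the data it consults.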

Language-Specific Level

The language-specific data area (LSDA) contains pointers to related data, a list of call sites, and a list of action records. Each function has its own LSDA, which is provided as the augmentation data of the Frame Descriptor Entry (FDE). Information from the LSDA is essential to recovering C++ exception information, and in translating it to LLVM semantics.

The LSDA header describes how exception information applies to language-specific procedure fragments. Figure 2 shows the LSDA in more detail. There are two fields defined in the LSDA header that McSema needs to recover exception information:

The landing pad start pointer: A relative offset to the start of the landing pad code.

The types table pointer: A relative offset to the types table, which describes exception types handled by the catch clauses for this procedure fragment.

Following the LSDA header, the call site table lists all call sites that may throw an exception. Each entry in the call site table indicates the position of the call site, the position of the landing pad, and the first action record for that call site. A missing entry from the call site table indicates that a call should not throw an exception. Information from this table will be used by McSema during the translation stage to emit proper LLVM semantics for call sites that may throw exceptions.
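The layout described above can be sketched as a small parser. This is a simplified illustration, not McSema's implementation: it assumes the common case where the landing pad start is omitted (encoding byte 0xFF, so landing pads are relative to the function start) and call site fields are ULEB128 encoded (encoding byte 0x01). The CallSite and Lsda types, the function names, and the sample bytes are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

constexpr std::uint8_t kOmit = 0xFF;     // DW_EH_PE_omit
constexpr std::uint8_t kUleb128 = 0x01;  // DW_EH_PE_uleb128

struct CallSite {
  std::uint64_t start;        // offset of the call site region
  std::uint64_t length;       // length of the region
  std::uint64_t landing_pad;  // 0 means no landing pad
  std::uint64_t action;       // 0 means no action, else 1-based table offset
};

struct Lsda {
  bool has_types_table = false;
  std::uint64_t types_table_offset = 0;
  std::vector<CallSite> call_sites;
};

// Decodes one unsigned LEB128 value and advances pos past it.
std::uint64_t read_uleb128(const std::vector<std::uint8_t> &bytes,
                           std::size_t &pos) {
  std::uint64_t value = 0;
  unsigned shift = 0;
  while (true) {
    if (pos >= bytes.size() || shift >= 64)
      throw std::runtime_error("malformed ULEB128");
    std::uint8_t byte = bytes[pos++];
    value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0) return value;
    shift += 7;
  }
}

Lsda parse_lsda(const std::vector<std::uint8_t> &bytes) {
  std::size_t pos = 0;
  auto next_byte = [&]() -> std::uint8_t {
    if (pos >= bytes.size()) throw std::runtime_error("truncated LSDA");
    return bytes[pos++];
  };
  Lsda lsda;
  if (next_byte() != kOmit)
    throw std::runtime_error("explicit landing pad start not handled here");
  if (next_byte() != kOmit) {  // types table encoding
    lsda.has_types_table = true;
    lsda.types_table_offset = read_uleb128(bytes, pos);
  }
  if (next_byte() != kUleb128)
    throw std::runtime_error("call site encoding not handled here");
  std::uint64_t table_length = read_uleb128(bytes, pos);
  if (table_length > bytes.size() - pos)
    throw std::runtime_error("call site table exceeds LSDA");
  std::size_t table_end = pos + static_cast<std::size_t>(table_length);
  while (pos < table_end) {
    CallSite site{};
    site.start = read_uleb128(bytes, pos);
    site.length = read_uleb128(bytes, pos);
    site.landing_pad = read_uleb128(bytes, pos);
    site.action = read_uleb128(bytes, pos);
    lsda.call_sites.push_back(site);
  }
  return lsda;
}

// Hypothetical hand-made LSDA: header plus two call site records.
std::vector<std::uint8_t> sample_lsda() {
  return {0xFF, 0x03, 0x15, 0x01, 0x09,   // header, 9 bytes of call sites
          0x10, 0x05, 0x40, 0x01,         // may throw, landing pad at 0x40
          0x80, 0x01, 0x08, 0x00, 0x00};  // start 128, no landing pad
}
```

Parsing sample_lsda() yields two call sites. The first has a landing pad and an action record, so a lifter would emit an invoke for calls in that range. The second has neither, so an exception raised there simply propagates to the caller.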

The action table follows the call site table in the LSDA and specifies both catch clauses and exception specifications. By exception specifications we mean the much-maligned C++ feature of the same name, which enumerates the exceptions a function may throw. The two record types have the same format and are distinguished solely by the first field of each entry. Positive values for this field specify types used in catch clauses. Negative values specify exception specifications. Figure 3 shows an action table with catch clauses (red), a catch-all clause (orange), and an exception specification (blue). Dynamic exception specifications were deprecated in C++11 and removed in C++17. Because the feature is rarely used, McSema does not currently handle exception specifications.

.gcc_except_table:4022CF db 7Fh ; ar_filter[1]: -1 (exception spec index = 4022EC)
.gcc_except_table:4022D0 db 0   ; ar_next[1]: 0 (end)
.gcc_except_table:4022D1 db 0   ; ar_filter[2]: 0 (cleanup)
.gcc_except_table:4022D2 db 7Dh ; ar_next[2]: -3 (next: 1 => 004022CF)
.gcc_except_table:4022D3 db 4   ; ar_filter[3]: 4 (catch typeinfo = 000000)
.gcc_except_table:4022D4 db 0   ; ar_next[3]: 0 (end)
.gcc_except_table:4022D5 db 1   ; ar_filter[4]: 1 (catch typeinfo = 00603280)
.gcc_except_table:4022D6 db 7Dh ; ar_next[4]: -3 (next: 3 => 004022D3)
.gcc_except_table:4022D7 db 3   ; ar_filter[5]: 3 (catch typeinfo = 603230)
.gcc_except_table:4022D8 db 7Dh ; ar_next[5]: -3 (next: 4 => 004022D5)
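The bytes in Figure 3 can be decoded with a few lines of code. Each action record is a pair of signed LEB128 values: a filter and a displacement to the next record, measured from the position of the displacement field itself. The sketch below walks such a chain. The function names are hypothetical, and the sketch assumes that the action value stored in a call site record is the byte offset of the first record plus one, with zero meaning no action.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Decodes one signed LEB128 value and advances pos past it.
std::int64_t read_sleb128(const std::vector<std::uint8_t> &bytes,
                          std::size_t &pos) {
  std::uint64_t value = 0;
  unsigned shift = 0;
  std::uint8_t byte = 0;
  do {
    if (pos >= bytes.size() || shift >= 64)
      throw std::runtime_error("malformed SLEB128");
    byte = bytes[pos++];
    value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
    shift += 7;
  } while (byte & 0x80);
  // Sign-extend when the sign bit of the final byte is set.
  if (shift < 64 && (byte & 0x40)) value |= ~std::uint64_t{0} << shift;
  return static_cast<std::int64_t>(value);
}

// Returns the filters reachable from a call site's first action record.
// Positive: catch clause type index. Zero: cleanup. Negative: exception spec.
std::vector<std::int64_t> filters_for_call_site(
    const std::vector<std::uint8_t> &table, std::uint64_t first_action) {
  std::vector<std::int64_t> filters;
  if (first_action == 0) return filters;  // call site has no actions
  std::size_t pos = static_cast<std::size_t>(first_action - 1);
  while (true) {
    if (filters.size() > table.size())
      throw std::runtime_error("cycle in action table");
    filters.push_back(read_sleb128(table, pos));
    std::size_t next_field = pos;  // displacement is relative to this byte
    std::int64_t displacement = read_sleb128(table, pos);
    if (displacement == 0) break;  // end of chain
    std::int64_t target = static_cast<std::int64_t>(next_field) + displacement;
    if (target < 0) throw std::runtime_error("action record out of range");
    pos = static_cast<std::size_t>(target);
  }
  return filters;
}

// The ten bytes shown in Figure 3, starting at address 4022CF.
std::vector<std::uint8_t> figure3_action_table() {
  return {0x7F, 0x00, 0x00, 0x7D, 0x04, 0x00, 0x01, 0x7D, 0x03, 0x7D};
}
```

Starting from record 5 (table offset 8, so an action value of 9), the walk visits filters 3, 1, and 4, matching the chain 5 => 4 => 3 annotated in the figure. Starting from record 2 (action value 3) it yields 0 and -1: a cleanup followed by an exception specification.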

Lifting Exception Information

So far we have looked at how exceptions in C++ work at a low level, how exception information is stored, and how McSema recovers exception based control flow. Now we will look at how McSema lifts this control flow to LLVM.

To lift exception information, the exception and language semantics described in the last section have to be recovered from the binary and translated into LLVM. The recovery and translation is a three-phase process that required updating control flow graph (CFG) recovery, lifting, and runtime components of McSema.

McSema’s translation stage uses the information gleaned from CFG recovery to generate LLVM IR that handles exception semantics. To ensure the final binary will execute like the original, the following steps must happen:

McSema must associate exception handlers and cleanup methods with blocks that raise exceptions. Functions that throw exceptions must be called via LLVM’s invoke instruction rather than the call instruction.

Stack unwinding has to be enabled for function fragments that raise exceptions. This is complicated by the fact that translated code may have two stacks: a native stack (used for calling external APIs) and a lifted stack.

McSema must ensure there is a smooth transition between lifted code and the language runtime. Handlers called directly by the language runtime must serialize processor state into a structure expected by lifted code.

Associating Blocks and Handlers

The initial association between blocks that may throw exceptions and the handlers for those exceptions is performed during CFG recovery, via information extracted from the exception table. This association is required because the translator must ensure functions that may throw exceptions are called via LLVM’s invoke semantics and not the typical call instruction. The invoke instruction has two continuation points: normal flow when the call succeeds, and exception flow (i.e., the exception handler) if the function raises an exception (Figure 4). The replacement of call with invoke must cover every invocation of that function. Even one remaining plain call to the function convinces the optimizer that the function does not throw and does not need an exception table.

%1403 = call i64 @__mcsema_get_stack_pointer()
store i64 %1403, i64* %stack_ptr_var
%1404 = call i64 @__mcsema_get_frame_pointer()
store i64 %1404, i64* %frame_ptr_var
%1405 = load %struct.Memory*, %struct.Memory** %MEMORY
%1406 = load i64, i64* %PC
%1407 = invoke %struct.Memory* @ext_6032a0__Znam(%struct.State* %0, i64 %1406, %struct.Memory* %1405)
        to label %block_40119f unwind label %landingpad_4012615

Unwinding of the Stack

When an exception occurs, control transfers from the throw statement to the first catch statement that can handle the exception. Before the transfer, variables defined in function scope must be properly destroyed. This is called stack unwinding.

McSema uses two different stacks: one for lifted code, and one for native code (i.e. external functions). The split stack puts limitations on stack unwinding, since the native execution (i.e. libstdc++ API) doesn’t have a full view of the stack. To support stack unwinding, we added a new flag, --abi-libraries, which enables the usage of the same stack for lifted and native code execution.

The --abi-libraries flag enables usage of the same stack for native and lifted code by removing the need for lifted-to-native transitions. Without the flag, McSema needs to transition stacks so that an external function that does not know about McSema can see CPU state as it was in the original program. Application binary interface (ABI) libraries, which provide external function signatures, including the return value, argument type, and argument count, allow lifted code to directly call native functions on the same stack. Figure 5 shows a snapshot of function signatures defined via ABI libraries.

declare i8* @__cxa_allocate_exception(i64) #0
declare void @__cxa_free_exception(i8*) #0
declare i8* @__cxa_allocate_dependent_exception() #0
declare void @__cxa_free_dependent_exception(i8*) #0
declare void @__cxa_throw(i8*, %"class.std::type_info"*, void (i8*)*) #0
declare i8* @__cxa_get_exception_ptr(i8*) #0
declare i8* @__cxa_begin_catch(i8*) #0
declare void @__cxa_end_catch() #0

Exception Handling at Runtime

Exception handlers and cleanup methods are called by the language runtime, and are expected to follow a strict calling convention. Lifted code does not follow standard calling convention semantics, because it expresses the original instructions as operations on CPU state. To support these callbacks, we implemented a special adaptor that converts a native state into a machine context usable by lifted code. Special care has been taken to preserve the RDX register, which stores the type index of the exception.

There is one more trick to emitting functional exception handlers: proper ordering of type indices. Recall that our motivating example (Figure 1) has three exception handlers: std::out_of_range, std::bad_alloc, and the catch-all handler. Each of these handlers is assigned a type index, say 1, 2, and 3 respectively (Figure 6a), meaning that the original program expects type index 1 to correspond to std::out_of_range.

.gcc_except_table:402254 db 3   ; ar_filter[1]: 3 (catch typeinfo = 000000)
.gcc_except_table:402255 db 0   ; ar_next[1]: 0 (end)
.gcc_except_table:402256 db 2   ; ar_filter[2]: 2 (catch typeinfo = 603280)
.gcc_except_table:402257 db 7Dh ; ar_next[2]: -3 (next: 1 => 402254)
.gcc_except_table:402258 db 1   ; ar_filter[3]: 1 (catch typeinfo = 603230)
.gcc_except_table:402259 db 7Dh ; ar_next[3]: -3 (next: 2 => 402256)
.gcc_except_table:40225A db 0
.gcc_except_table:40225B db 0
.gcc_except_table:40225C dd 0       ; Type index 3
.gcc_except_table:402260 dd 603280h ; Type index 2
.gcc_except_table:402264 dd 603230h ; Type index 1

.gcc_except_table:41A78E db 1   ; ar_filter[1]: 1 (catch typeinfo = 000000)
.gcc_except_table:41A78F db 0   ; ar_next[1]: 0 (end)
.gcc_except_table:41A790 db 2   ; ar_filter[2]: 2 (catch typeinfo = 61B450)
.gcc_except_table:41A791 db 7Dh ; ar_next[2]: -3 (next: 1 => 0041A78E)
.gcc_except_table:41A792 db 3   ; ar_filter[3]: 3 (catch typeinfo = 61B4A0)
.gcc_except_table:41A793 db 7Dh ; ar_next[3]: -3 (next: 2 => 0041A790)
.gcc_except_table:41A794 dd 61B4A0h ; Type index 3
.gcc_except_table:41A798 dd 61B450h ; Type index 2
.gcc_except_table:41A79C dd 0       ; Type index 1

During the lifting process McSema recreates exception handlers used in the program. The type index assigned to each handler is generated at compile time. When lifted bitcode is compiled into a new binary, the type indices could be, and often are, reassigned. For example, std::out_of_range could get type index 3 in a new binary (Figure 6b). This would cause the lifted binary to run the catch-all handler when std::out_of_range is thrown!

To ensure the right exception handler is called, McSema generates a static map (see gvar_landingpad_401133 in Figure 7) of original type indices to new type indices, and fixes the type index during landingpad passthrough. The landingpad passthrough is a function that is automatically generated by McSema. Not only does it ensure the type index is correct, it also transitions between lifted and native state.

Upon being called, the passthrough saves native execution state, loads lifted state, and calls any exception handlers (that have been lifted, and expect lifted state). When the passthrough returns (in case the exception wasn’t handled), it must do the reverse, and transition from lifted to native state to return into runtime library code. Figure 7 shows the landingpad passthrough generated for our motivating example. The generated passthrough code gets the type index from the RDX register using the function __mcsema_get_type_index. It fixes and restores the machine context of the lifted execution using the function __mcsema_exception_ret. The wrapper instructions around the invoke statement save the stack and frame pointers in the function context.

%landingpad_4011336 = landingpad { i8*, i32 }
        catch i8* @"_ZTISt13runtime_error@@GLIBCXX_3.4"
        catch i8* @"_ZTISt12out_of_range@@GLIBCXX_3.4"
        catch i8* null
%4021 = call i64 @__mcsema_get_type_index()
%4022 = getelementptr [4 x i32], [4 x i32]* @gvar_landingpad_401133, i32 0, i64 %4021
%4023 = load i64, i64* %stack_ptr_var
%4024 = load i64, i64* %frame_ptr_var
%4025 = load i32, i32* %4022
call void @__mcsema_exception_ret(i64 %4023, i64 %4024, i32 %4025)
br label %block_401133
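The idea behind the static map can be sketched in a few lines of C++. In Figure 7 the table is indexed by the type index that the runtime reports in the rebuilt binary, and the loaded value is handed to lifted code, so the sketch builds a table from new indices back to original ones. That reading of the figure is an inference, and the TypeTable alias, the function names, and the use of type names instead of typeinfo addresses are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Element k names the type caught under type index k + 1. An empty string
// stands for the catch-all entry (a null typeinfo pointer).
using TypeTable = std::vector<std::string>;

// Builds a table indexed by the rebuilt binary's type index that yields the
// type index the original (and therefore the lifted) code expects.
std::vector<int> build_type_index_map(const TypeTable &original,
                                      const TypeTable &rebuilt) {
  std::vector<int> map(rebuilt.size() + 1, 0);  // slot 0 is unused
  for (std::size_t r = 0; r < rebuilt.size(); ++r) {
    bool found = false;
    for (std::size_t o = 0; o < original.size(); ++o) {
      if (original[o] == rebuilt[r]) {
        map[r + 1] = static_cast<int>(o + 1);
        found = true;
        break;
      }
    }
    if (!found)
      throw std::runtime_error("type missing from original table: " + rebuilt[r]);
  }
  return map;
}

// Translates the index reported at runtime into the original index.
int fix_type_index(const std::vector<int> &map, int runtime_index) {
  assert(runtime_index > 0 &&
         static_cast<std::size_t>(runtime_index) < map.size());
  return map[static_cast<std::size_t>(runtime_index)];
}
```

With the orderings from Figures 6a and 6b, the original table is {std::out_of_range, std::bad_alloc, catch-all} and the rebuilt table is {catch-all, std::bad_alloc, std::out_of_range}. The resulting map is {0, 3, 2, 1}, so a runtime index of 3 is translated back to 1 and the lifted std::out_of_range handler runs instead of the catch-all handler.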

With all of these pieces in place, McSema can finally translate C++ programs that use exceptions into LLVM.

Conclusion

To our knowledge, McSema is the only binary lifter to handle C++ exceptions, which are common throughout C++ software of any complexity. As we have shown, recovering and accurately translating exception-based control flow is an extremely complex topic and difficult to implement correctly. Implementing exception handling touched all parts of McSema, and brought new challenges for both control flow recovery and translation. The fine details, such as type index re-ordering and ensuring every call is replaced with an invoke, all had to be discovered the hard way by debugging subtle and frustrating failures.

We are continuing to develop and enhance McSema, and have more to share about exciting new features. If you are interested in McSema, try it out, contribute (we love open source contributions!), and talk to us in the binary-lifting channel on the Empire Hacking Slack.


