This post presents ways to abuse forced inlining, which is supported by
both GCC and the Microsoft Visual C/C++ compiler (MSVC). Over the next
paragraphs we introduce inlining and its purpose, show several ways to
abuse forced inlining, and close with some additional notes. Finally, we
provide the source and binaries for the methods described.

Before we get any further, here is a quick introduction to inlining.

From Wikipedia[1]:



In various versions of the C and C++ programming languages, an

inline function is a function upon which the compiler has been

requested to perform inline expansion. In other words, the programmer

has requested that the compiler insert the complete body of the

function in every place that the function is called, rather than

generating code to call the function in the one place it is defined.



For those of us who are still unsure about inlining, here is a simple
example.

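    #include <stdio.h>

    /* `static' gives the function internal linkage, so the program
       still links when the compiler decides not to inline the call. */
    static inline int MAX(int a, int b)
    {
        return a < b ? b : a;
    }

    int main(void)
    {
        printf("max(2, 3): %d\n", MAX(2, 3));
    }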

This may not be the best example, because the compiler might have inlined
the function anyway (even without the inline keyword), but it is an easy
one. Assuming the compiler inlines the MAX function, the result is roughly
the following.

    #include <stdio.h>

    int main(void)
    {
        printf("max(2, 3): %d\n", 2 < 3 ? 3 : 2);
    }

As you can see, instead of calling the MAX function, the compiler has
rewritten the main function so that the body of MAX is expanded in place.
Compilers nowadays are smart enough to see that the expression evaluates
to three, so if you analyze the binary you will probably find the constant
three instead of the expression, but the point should be clear by now.

Now that we know what inlining does, forced inlining is quite
self-explanatory: it forces the compiler to inline a certain function.
Normally one uses inlining to gain performance. In performance-critical
code sections it is useful to inline a function such as max (which returns
the larger of two given numbers), because the CPU then does not need to
perform all the work involved in calling a function (storing and loading
the return address and possibly the base pointer), nor does it have to
fetch the code of the called function, which may not be in the cache yet.

Besides these savings, the compiler might also be able to optimize further
when inlining. This happens, for example, when one or more of the
arguments to the function are known at compile time.
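To make the last point concrete, here is a minimal sketch (not taken from the original source) of a small function whose loop bound becomes a compile-time constant once it is inlined. The names int_pow and kilo are made up for illustration, and whether the compiler really folds the call into a constant depends on the compiler and the optimization level.

```c
/* Hypothetical helper: raises base to a non-negative power. */
static inline int int_pow(int base, unsigned exp)
{
    int result = 1;
    while (exp-- > 0)
        result *= base;
    return result;
}

/* Both arguments are constants here, so after inlining an optimizing
   compiler is free to replace the whole loop with the value 1024. */
int kilo(void)
{
    return int_pow(2, 10);
}
```

Without inlining, int_pow has to stay generic, because it cannot know which arguments its callers will pass.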

However, it should be noted that inlining relatively big functions (those
with more than a few lines of code) can be fairly expensive. A function
that is called from many places exists only once in the binary, so the CPU
can keep its code (or, more precisely, the cache lines its code occupies)
in the instruction cache. If such a function is inlined multiple times,
every call site gets its own copy of the code. The CPU cannot tell that
these copies are identical, so each copy takes up cache space of its own
(e.g. if max is inlined several times, the CPU cannot tell one copy from
another).

So, as usual, the techniques discussed here will drain some performance.

As mentioned earlier, both GCC and MSVC support forced inlining. In GCC

we give a function the __attribute__((always_inline)) attribute,

whereas MSVC requires us to specify __forceinline. In order to

solve this without worrying too much about the compiler, we create a

simple macro which expands to the correct attribute depending on the

compiler. This macro looks like the following.

    #ifdef _MSC_VER   /* predefined by MSVC */
    #define FORCEDINLINE __forceinline
    #else
    /* GCC expects `inline' alongside the always_inline attribute */
    #define FORCEDINLINE inline __attribute__((always_inline))
    #endif

Now that we’ve seen what inlining does and how we force the compiler to
inline a function, we’re ready for a few more tricks that we use to apply
our obfuscations.

Although we will discuss methods to obfuscate binaries, we do not want to
obfuscate the source, as that would make it unreadable. So what we do is
overload functions, and we do this through macros: a macro with the name
of the original function redirects every call to an obfuscated wrapper.
By overloading functions we can create transparent obfuscations, that is,
obfuscations that barely require adjusting the original source.

So, instead of asking someone to rename all occurrences of MessageBoxA()
to MessageBoxA_Obfuscated(), or similar, we do this transparently. This
also means that the obfuscations can be disabled simply by not including a
header file (assuming the obfuscations are defined in a header file).

Function overloading does bring some small problems, but they are easily
overcome and, fortunately for us, when something goes wrong it is easy to
spot: you get errors at compile time.
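As a rough sketch of this redirect (the add function, the wrapper_calls counter and the NO_OBFUSCATION switch are assumptions made up for this illustration, not part of the original source), the header part and an unmodified caller could look like this:

```c
#if defined(_MSC_VER)
#define FORCEDINLINE __forceinline
#else
#define FORCEDINLINE inline __attribute__((always_inline))
#endif

/* Original function, as the rest of the program knows it. */
static int add(int a, int b)
{
    return a + b;
}

/* --- what would live in the (hypothetical) obfuscation header --- */
static int wrapper_calls = 0;   /* only here to make the redirect visible */

static FORCEDINLINE int add_obfuscated(int a, int b)
{
    wrapper_calls++;
    return add(a, b);           /* macro not defined yet: the real add */
}

#ifndef NO_OBFUSCATION          /* hypothetical opt-out switch */
#define add add_obfuscated
#endif
/* --- end of header --- */

/* Unmodified caller: both calls are rewritten by the preprocessor. */
int sum3(int a, int b, int c)
{
    return add(add(a, b), c);
}
```

The caller still reads as plain add(...) calls. Building with NO_OBFUSCATION defined, or leaving the header out, gives back the original behaviour.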

What we do is the following. We make a macro for each function that we
wish to obfuscate. This is the tricky part, because the macro must be
defined after the function has been declared (or implemented, for that
matter). There are two things we can do. We can either make a header file
containing each obfuscation (which is then included into our C file of
choice), or define the obfuscations after the function has been
declared/implemented (e.g. put the function at the top of a source file,
followed by the obfuscation code).

The latter method is a bit ugly, as you essentially mix the real code with
obfuscation code. The first method is reasonable and doesn’t make you puke
while developing further (because all obfuscations can be developed in a
separate file). It does, however, have one requirement: if you have all
obfuscations in a single header file, then you have to put an
#undef funcname before the definition of each function that has an
obfuscation. Otherwise the macro also rewrites the name in the definition
itself, and you get one of those funky compiler errors.

Note that, after undefining a macro (using #undef), the obfuscation is no
longer applied to calls to this particular function that are made after
the function definition in the same source file.
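One possible way around this limitation, sketched here as an assumption rather than something the original source does, is to define the macro again right after the function definition. The wrapped_calls counter and the two call_* functions exist only to show which calls go through the wrapper.

```c
#if defined(_MSC_VER)
#define FORCEDINLINE __forceinline
#else
#define FORCEDINLINE inline __attribute__((always_inline))
#endif

static int wrapped_calls = 0;   /* illustration only */

/* Prototype, so the wrapper can call the real foo. */
int foo(int a, int b);

static FORCEDINLINE int foo_obfuscated(int a, int b)
{
    wrapped_calls++;
    return foo(a, b);
}
#define foo foo_obfuscated

/* The macro has to go before the real definition... */
#undef foo
int foo(int a, int b)
{
    return a + b;
}

/* ...so this call is NOT redirected. */
int call_plain(void)
{
    return foo(2, 3);
}

/* Defining the macro again restores the redirect from here on. */
#define foo foo_obfuscated
int call_wrapped(void)
{
    return foo(2, 3);
}
```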

So, let’s get to some example code to illustrate this method (we use the

second method; defining obfuscation after the declaration/implementation

of the function.)

    int foo(int a, int b)
    {
        return a + b;
    }

    FORCEDINLINE int foo_obfuscated(int a, int b)
    {
        Sleep(1);
        int ret = foo(a, b);
        Sleep(1);
        return ret;
    }

    #define foo foo_obfuscated

    int main(void)
    {
        printf("foo(2, 3): %d\n", foo(2, 3));
    }

As you can see, we have redirected execution flow from the foo

function to foo_obfuscated. From here the foo_obfuscated

function, which has the forced inline attribute, can do anything before

and after calling the original function. In the example it sleeps for a

millisecond before and after calling the original function.

So, because we’ve specified the forced inline attribute on the

foo_obfuscated function, the compiler will interpret this code as

something like the following.

    int foo(int a, int b)
    {
        return a + b;
    }

    int main(void)
    {
        Sleep(1);
        int ret = foo(2, 3);
        Sleep(1);
        printf("foo(2, 3): %d\n", ret);
    }

As you can see, the foo_obfuscated code is entirely inlined into

the main function. Now let’s see an example in which the

obfuscation is defined, but not applied (see the note about

#undef a few lines up.)

    // prototype, so the obfuscation below can call the real `foo'
    int foo(int a, int b);

    // implementation of the obfuscation for `foo'
    FORCEDINLINE int foo_obfuscated(int a, int b)
    {
        Sleep(1);
        int ret = foo(a, b);
        Sleep(1);
        return ret;
    }

    #define foo foo_obfuscated

    ... snip ...

    // implementation of the `foo' function
    // we have to undefine the foo macro,
    // otherwise we'd get compiler errors
    #undef foo
    int foo(int a, int b)
    {
        return a + b;
    }

    ... snip ...

    int main(void)
    {
        // this function call to `foo' is NOT obfuscated
        printf("foo(2, 3): %d\n", foo(2, 3));
    }

This is really all there is to it. Now that we’re done with the
introduction, it’s time to abuse forced inlining.

In the following paragraphs we will see how. These methods are mainly
simple obfuscation techniques, but when combined they can be very, very
annoying to reverse engineers and static code analysis tools.

If you’re unsure about a certain method, or if you’d like to see what kind

of horror the compiler makes using it, then load the binary (which

can be found in the Source and Binaries

section) in a tool such as IDA Pro [2].

Note that in the examples below, we will be using MSVC. This means that,

when porting the examples to GCC, any inline assembly will have to be

rewritten.

The first method which we will investigate is the following. Instead of

calling the original function directly, we insert some garbage code around

the function call. This is a well-known technique used in so-called

polymorphic viruses. Let’s take a look at the following example.

    int foo(int a, int b)
    {
        return a + b;
    }

    FORCEDINLINE int foo_obfuscated(int a, int b)
    {
        __asm {
            jmp short lbl
            _emit 0xb8
        lbl:
        }
        int ret = foo(a, b);
        __asm {
            jmp short lbl2
            _emit 0xb8
        lbl2:
        }
        return ret;
    }

    #define foo foo_obfuscated

    int main(void)
    {
        printf("foo(2, 3): %d\n", foo(2, 3));
    }

In this example we surrounded the call to foo with some garbage code. The
short jump tells the CPU to step over the emitted byte (an emitted byte is
placed directly into the binary). The emitted byte, 0xb8, is the opcode of
the "mov eax, imm32" instruction. In other words, a disassembler that
decodes the 0xb8 byte as an instruction will consume the following four
bytes as well (as the immediate operand).

From there on the rest of the disassembly is corrupted, because the
disassembler consumed five bytes for the mov instruction while we inserted
only one. The first four bytes of the following instruction(s) are
swallowed as the operand of the mov instruction, instead of being decoded
as the instructions they were meant to be.
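The effect can be illustrated with a toy model. The following sketch is an assumption-laden simplification, not a real disassembler: insn_len knows only the opcodes used in the sample, and the sample bytes stand in for the code around the junk byte. It checks at which offsets a linear sweep (decoding one instruction after another, ignoring jumps) starts an instruction.

```c
#include <stddef.h>

/* Toy length table: every opcode other than the two listed is treated
   as one byte long, which is only true for this sample. */
static size_t insn_len(unsigned char opcode)
{
    switch (opcode) {
    case 0xEB: return 2;    /* jmp short rel8 */
    case 0xB8: return 5;    /* mov eax, imm32 */
    default:   return 1;    /* nop (0x90), ret (0xC3) */
    }
}

static const unsigned char sample[] = {
    0xEB, 0x01,             /* offset 0: jmp short to offset 3      */
    0xB8,                   /* offset 2: junk byte, never executed  */
    0x90, 0x90, 0x90, 0x90, /* offsets 3..6: the "real" code (nops) */
    0xC3                    /* offset 7: ret                        */
};

/* Returns 1 if a linear sweep starts an instruction at `target'. */
static int linear_sweep_sees(const unsigned char *code, size_t size,
                             size_t target)
{
    size_t off = 0;
    while (off < size) {
        if (off == target)
            return 1;
        off += insn_len(code[off]);
    }
    return 0;
}
```

In this model the sweep decodes instructions at offsets 0, 2 and 7. Offset 3, where execution actually continues after the jump, is swallowed as part of the operand of the fake mov.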

This is an easy trick, and therefore also easy to undo when disassembling.
In combination with other tricks, however, it can be quite annoying.
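Since the example above uses MSVC inline assembly, here is a rough sketch of how the same trick might look with GCC-style inline assembly on x86. The JUNK_BYTE macro and the twice function are made up for this sketch, and on any other compiler or architecture the macro simply expands to nothing.

```c
#if defined(_MSC_VER)
#define FORCEDINLINE __forceinline
#else
#define FORCEDINLINE inline __attribute__((always_inline))
#endif

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
/* Numeric local labels (1: / 1f) may be repeated, so the macro can be
   expanded, and the function inlined, any number of times. */
#define JUNK_BYTE() __asm__ __volatile__("jmp 1f\n\t.byte 0xb8\n1:")
#else
#define JUNK_BYTE() ((void)0)   /* other targets: no junk emitted */
#endif

static int foo(int a, int b)
{
    return a + b;
}

static FORCEDINLINE int foo_obfuscated(int a, int b)
{
    int ret;
    JUNK_BYTE();
    ret = foo(a, b);
    JUNK_BYTE();
    return ret;
}
#define foo foo_obfuscated

/* Two inlined copies in one function, each with its own junk bytes. */
int twice(void)
{
    return foo(2, 3) + foo(4, 5);
}
```

The assembler is left to choose the jump encoding here. For such a small distance it normally picks the two-byte short form, but that is worth verifying in the resulting binary.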

In this technique we redirect execution to an obfuscation helper function,
which we give a few extra parameters (these are generated in the inlined
function). The inlined function calls the obfuscation helper function,
which is not inlined. Together, the obfuscation function and its helper do
some work that is not even remotely useful, such as allocating and freeing
heap objects. The helper function then calls the original function.
Without further ado, let’s dive into some example code.

    int foo(int a, int b)
    {
        return a + b;
    }

    int foo_obfuscated_helper(void *mem, int b, int size, int a)
    {
        free(mem);
        return foo(a, b);
    }

    FORCEDINLINE int foo_obfuscated(int a, int b)
    {
        return foo_obfuscated_helper(malloc(42), b, 42, a);
    }

    #define foo foo_obfuscated

    int main(void)
    {
        printf("foo(2, 3): %d\n", foo(2, 3));
    }

The example code above does nothing more (useful) than the original
function. However, we’ve managed to add useless heap routines and an
unused parameter, and we’ve shuffled the a and b parameters, which is
really annoying.
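Nothing in the example stops an optimizing compiler from inlining the helper as well, which would undo part of the effect. A minimal sketch of one way to prevent that follows. The NOINLINE macro and the demo function are assumptions of this sketch, and foo subtracts here only so that the shuffled parameter order is visible in the result.

```c
#include <stdlib.h>

#if defined(_MSC_VER)
#define FORCEDINLINE __forceinline
#define NOINLINE     __declspec(noinline)
#else
#define FORCEDINLINE inline __attribute__((always_inline))
#define NOINLINE     __attribute__((noinline))
#endif

static int foo(int a, int b)
{
    return a - b;
}

/* Kept as a real function, so every call site has to pass the
   shuffled and unused arguments. */
static NOINLINE int foo_obfuscated_helper(void *mem, int b, int size, int a)
{
    (void) size;            /* deliberately unused */
    free(mem);
    return foo(a, b);
}

static FORCEDINLINE int foo_obfuscated(int a, int b)
{
    return foo_obfuscated_helper(malloc(42), b, 42, a);
}
#define foo foo_obfuscated

int demo(void)
{
    return foo(5, 3);
}
```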

It is not uncommon to implement a simple stand-alone library in C which is
only accessible through a simple API. Take for example a simple memory
manager, a wrapper around reading and/or writing of files, etc. In the
following example we’ll demonstrate encryption of the context variable of
a memory manager, that is, the context is decrypted and encrypted again
around every function call of this particular API. Although it’s far from
“real” encryption, it’s still useful because the state of the memory
manager context is semi-encrypted whenever it’s not in use. For the entire
code, please refer to the source code. Here we’ll only see the useful
snippets.

    //
    // memory manager api header include
    //

    typedef struct _memory_t {
        // memory size used
        int used;
        // memory size left
        int left;
    } memory_t;

    // yes, there is no `free' memory API..
    void memory_init(memory_t *mem, int maxsize);
    void *memory_get(memory_t *mem, int size);
    void memory_status(memory_t *mem, int *used, int *left);
    void *memory_destroy(memory_t *mem);

    ... snip ...

    //
    // obfuscation header include
    //

    // simple encryption/decryption function which
    // xor's the input with 0x42.
    FORCEDINLINE void _object_crypt(void *obj, int size)
    {
        for (int i = 0; i < size; i++) {
            ((unsigned char *) obj)[i] ^= 0x42;
        }
    }

    FORCEDINLINE void memory_init_obf(memory_t *mem, int maxsize)
    {
        memory_init(mem, maxsize);
        // encrypt memory context
        _object_crypt(mem, sizeof(memory_t));
    }

    FORCEDINLINE void *memory_get_obf(memory_t *mem, int size)
    {
        // decrypt memory context
        _object_crypt(mem, sizeof(memory_t));
        // call original function
        void *ret = memory_get(mem, size);
        // encrypt memory context
        _object_crypt(mem, sizeof(memory_t));
        // return the.. return value..
        return ret;
    }

    FORCEDINLINE void memory_status_obf(memory_t *mem, int *used, int *left)
    {
        // decrypt memory context
        _object_crypt(mem, sizeof(memory_t));
        // call original function
        memory_status(mem, used, left);
        // encrypt memory context
        _object_crypt(mem, sizeof(memory_t));
    }

    FORCEDINLINE void memory_destroy_obf(memory_t *mem)
    {
        // decrypt memory context
        _object_crypt(mem, sizeof(memory_t));
        // call original function
        memory_destroy(mem);
        // no need to encrypt, context is no longer valid anyway
    }

    #define memory_init memory_init_obf
    #define memory_get memory_get_obf
    #define memory_status memory_status_obf
    #define memory_destroy memory_destroy_obf

    ... snip ...

    //
    // real code here
    //

    int main(void)
    {
        memory_t mem;
        memory_init(&mem, 1000);
        void *a = memory_get(&mem, 10);
        void *b = memory_get(&mem, 20);

        // do something with `a' and `b'

        // get the status (and compare it with
        // the "encrypted" status)
        int used, left;
        memory_status(&mem, &used, &left);
        printf("Real status: %d %d\n", used, left);
        printf("Encrypted status: %d, %d\n", mem.used, mem.left);
        memory_destroy(&mem);
    }

Executing this will result in the following.

    $ ./encrypted_context
    Real status: 30 970
    Encrypted status: 1111638620, 1111638408
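The "encrypted" numbers are simply the real values with every byte XORed with 0x42. For a 32-bit int that equals value ^ 0x42424242. The following sketch reproduces them. It assumes a 4-byte int, and crypt_int is a helper made up for this illustration.

```c
#include <stddef.h>

/* Same idea as the _object_crypt function above: XOR every byte with 0x42. */
static void object_crypt(void *obj, size_t size)
{
    unsigned char *bytes = obj;
    size_t i;
    for (i = 0; i < size; i++)
        bytes[i] ^= 0x42;
}

/* "Encrypts" (or decrypts) a single int the same way the context is. */
static int crypt_int(int value)
{
    object_crypt(&value, sizeof value);
    return value;
}
```

With a 4-byte int, crypt_int(30) gives 1111638620 and crypt_int(970) gives 1111638408, matching the output above. Applying it twice returns the original value.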

It should be clear by now that this is a pretty interesting technique. You

can “encrypt” virtually anything; sockets, handles, or even

strings (although you won’t be able to do stuff like ptr[0] in that case.)
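For strings, a wrapper might look like the following sketch. The strlen_obf and demo_strlen names are made up here. The buffer size is passed explicitly because, in its encrypted form, the terminating NUL is stored as 0x42 and every 'B' (0x42 in ASCII) is stored as 0, so the end of the string cannot be found by scanning.

```c
#include <stddef.h>
#include <string.h>

#if defined(_MSC_VER)
#define FORCEDINLINE __forceinline
#else
#define FORCEDINLINE inline __attribute__((always_inline))
#endif

static void object_crypt(void *obj, size_t size)
{
    unsigned char *bytes = obj;
    size_t i;
    for (i = 0; i < size; i++)
        bytes[i] ^= 0x42;
}

/* Hypothetical wrapper around strlen for an "encrypted" buffer. */
static FORCEDINLINE size_t strlen_obf(char *buf, size_t size)
{
    size_t len;
    object_crypt(buf, size);    /* decrypt, including the terminator */
    len = strlen(buf);          /* call the original function */
    object_crypt(buf, size);    /* encrypt again */
    return len;
}

size_t demo_strlen(void)
{
    char secret[] = "ABBA";
    object_crypt(secret, sizeof secret);    /* kept encrypted from here on */
    return strlen_obf(secret, sizeof secret);
}
```

Because of the extra size parameter, this wrapper cannot be dropped in with a plain #define strlen strlen_obf. The call sites have to change, which is the price for handling strings this way.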

As an additional note, here is the output difference between the original

Graph Overview (from IDA), and the one after applying the obfuscations.

The methods proposed here are interesting and perhaps useful, but please
make sure you apply them correctly and, in the case of an API library, to
the entire library. Leaving out context encryption/decryption for some
function of the library results in undefined behaviour for the functions
that have not been obfuscated, because they operate on a context that is
still encrypted. The same goes for raw access to an encrypted data
structure (as can be seen in the example above, where the Encrypted status
returns garbage).

Also keep in mind that undefined behaviour will occur when an encrypted
object is used from multiple threads (e.g. two threads might decrypt the
object at the same time, in which case the two XOR passes cancel each
other out).

Even though the examples provided are really, really basic, they show what
one could do using forced inlining and, when the methods are combined,
they can become quite a pain in the arse.

Source and Binaries for all Forced Inline posts can be found

here.