Code Virtualizer is a software protection solution using heavy obfuscation. Citing the author’s website [5]:

Code Virtualizer is a powerful code-obfuscation system that helps developers protect their sensitive code areas against Reverse Engineering while requiring minimum system resources.

Code Virtualizer can generate multiple types of virtual machines with a different instruction set for each one. This means that a specific block of Intel x86 instructions can be converted into different instruction set for each machine, preventing an attacker from recognizing any generated virtual opcode after the transformation from x86 instructions.

This post describes DeCV — a decompiler for Code Virtualizer.

DeCV

This is a high-level description. For a detailed discussion of CV internals see “Inside Code Virtualizer” by scherzo [6].

CV obfuscates the original x86 code by translating it to a custom stack oriented language [1]. CVL (CV’s language) has around 150 instructions but a lot of them are byte/word/dword variants of the same operation.

For example, consider this simple x86 code:

xor eax, 1111h

Its equivalent in CVL is:

load ptr eax

store addr

load dword [addr]

load dword 0x1111L

xor dword

store dword eflags

load ptr eax

store addr

store dword [addr]

To emulate x86 registers, CVL uses a table. Each dword in this table corresponds to one x86 register, so ‘load ptr <reg>‘ places a pointer to a variable holding <reg’s> value on the stack.

‘store addr‘ pops a value from the stack and places in a ‘built in’ variable called ‘addr’.

‘load dword [addr]‘ pushes a dword pointed by ‘addr’ on the stack.

The cumulative effect of the three instructions above is to push the value of eax on the stack.

Let’s emulate the rest of the code:

Instruction | Stack ---------------------------- load dword 0x1111 | eax xor dword | eax, 0x1111 store dword eflags | eax ^ 0x1111, new_flags load ptr eax | eax ^ 0x1111 store addr | eax ^ 0x1111, ptr eax store dword [addr] | eax ^ 0x1111 -- | -

Notice that ‘xor’ pushes two values on the stack: the xor’s result and the new value of EFLAGS register.

In order to recover the original x86 code, it’s sufficient to emulate CVL’s execution using symbolic registers instead of concrete values and emit x86 assembly on instructions like ‘store dword [addr]’. For example, if ‘addr’ equals ‘eax’ and there is ‘eax XOR ebx’ on top of the stack, the instruction to emit is ‘xor eax, ebx’. This is tree munching is disguise btw 🙂 [2].

DeCV does not implement full x86 recovery. I included an example of how this should be implemented, if anyone is interested — recover_x86.py reads a short CVL snippet and translates it back to assembly using the method described above.

CVL -> x86 translation is the easiest part of the CV puzzle, the real problem is to extract the CVL given just an obfuscated binary.

Deobfuscation

Each protected binary contains a virtual machine capable of executing a CVL program ‘compiled’ to bytecode. The interpreter doesn’t differ from other virtual machines:

dispatch: ;(deobfuscated and simplified version)

lodsb

movzx eax, al

jmp dword ptr [edi+eax*4]

‘esi’ points to the bytecode, ‘edi’ points to a table of handlers for different virtual instructions. For example, the ‘load ptr <reg>’ instruction is implemented as:

lodsb

movzx eax, al

lea eax, [edi+eax*4]

push eax

Byte pointed by ‘esi’ is the register’s offset in the register structure pointed by ‘edi’.

The decompilation process seems obvious — just take the bytecode, analyze which handlers are executed, get the CVL and then translate it back to x86. This indeed works, assuming we can tell which handlers implement which CVL instructions. There are two problems:

– all handlers are obfuscated to the point where it’s not possible to identify their semantics,

– handlers’ parameters and the bytecode itself are encrypted with a dynamic key (ebx), so even after all handlers are identified, their parameters are still unknown.

In order to deobfuscate handlers, DeCV performs a set of compiler optimizations on them, like constant propagation and folding, dead code elimination, peephole optimizations, etc. When the handler’s implementation stops changing (a fixpoint is reached), we can compare it to a set of original handlers extracted from the CV protector binary. Finding a match is equivalent to identifying the CVL instruction implemented by a handler. This is the same method as described in [3]. In [4] Rolf applies compiler opts during a different stage (CVL->x86).

There’s one problem with this approach that isn’t mentioned anywhere — CV can obfuscate the handlers’ control flow, by inserting random conditional jumps in its body. Fortunately, this obfuscation isn’t very complex — there’s only one correct path through the handler and it doesn’t change between executions. This means that the conditions tested during branching do not depend on variables, but on constants (in other words, they evaluate to constants). It should be possible to simply emulate all code up to the branch point to decide which path should be taken. It turns out it’s not that simple. Consider this code:

xor eax, ebx

xor ebx, eax

xor eax, ebx

or the xor-swap trick. The above sequence is equivalent to ‘xchg eax, ebx’.

My first attempt at emulation was to have three types of values: symbolic registers, concrete constants and unknowns. The above sequence is problematic, because if eax=const, ebx=unknown, then eax turns into an unknown after the first xor and there’s no way to recover it. Notice that we can’t just set all registers to concrete values at the beginning, because then we would miss real conditional jumps (there are handlers that have ‘real’ branches in them) and we need to identify branches that really do depend conditionally on input.

DeCV solves this problem by applying optimizations to code before a branch and then emulating the result. This way all nasty corner cases like the swap trick are removed and it’s possible to compute the conditions (by simple emulation) and decide jumps. After all fake branches are removed, the deobfuscation process can proceed like in the branchless case. Note that only a single path in the control flow graph should be followed — it’s not necessary to deobfuscate all of the CFG nodes.

With all handlers deobfuscated it’s easy to extract their parameter decryption procedures. Every handler that requires a parameter decrypts it using a unique function consisting of a combination of add/sub/xor instructions. After collecting all of these decryption functions it’s possible to decrypt the whole CVL program and all parameters used by handlers. Having all of this information is equal to decompiling the obfuscated CVL program.

Usage notes

DeCV will automatically perform all tasks including:

– locating the handlers table,

– locating the main dispatcher,

– collecting the handlers’ bodies split with unconditional jumps.

The decompilation takes few seconds max., depending on how complicated the VM is. The output is printed in IDA’s output window. DeCV was tested with IDA 6.2 and IDAPython 1.5.3. Example output:

vms found: 1

0x40740e

vm: 0

0x00000000 000[07] load ptr eflags

0x00000002 001 store addr

0x00000003 018 store dword [addr]

(…)

0x40740e is the VM’s entry point. First column is the instructions offset (in bytecode), second is the handler’s id (see clean_handlers dir), values in brackets (like [07]) are parameters.

DeCV was well tested on small, synthetic binaries protected with Code Virtualizer 1.3.8.

There are some bugs in IDA that might require manual intervention during the deobfuscation process — see the README.

Sources on github. Typical decompilation output on pastebin.

References

1. http://en.wikipedia.org/wiki/Stack-oriented_programming_language

2. http://www.eng.utah.edu/~cs5470/schedule/Lec21.pdf

3. http://www.woodmann.com/forum/entry.php?115-Fighting-Oreans-VM-(code-virtualizer-flavour)

4. http://static.usenix.org/event/woot09/tech/full_papers/rolles.pdf

5. http://oreans.com/codevirtualizer.php

6. http://tuts4you.com/download.php?view.2640