This pass is explained here: https://github.com/obfuscator-llvm/obfuscator/wiki/Bogus-Control-Flow

We applied this pass using the following command line, on our test case application:

This command enables the Bogus Control Flow protection on all the functions of our test binary. We set the "-boguscf-loop" parameter to a single pass: additional passes do not change the problem, they only make the generation and the recovery much slower and require more RAM.

Protected Function

Here is the control flow graph IDA Pro gives us when we load our target binary:

The poor resolution does not matter here: the picture is enough to show that the function is hard to understand. For each basic block to obfuscate, this pass creates a new block containing an opaque predicate that drives a conditional jump: the jump leads either to the real basic block or to another block containing junk code.

We could use the symbolic execution method we saw in the previous part. By applying this method we would find all the useful basic blocks and rebuild the flow. But there is a problem: the opaque predicate. The basic block containing junk code returns to its parent, so if we follow this path during symbolic execution, we get stuck in an infinite loop. We therefore need to solve the opaque predicate in order to directly find the right path and avoid the useless block.
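To make the loop problem concrete, here is a minimal sketch in plain Python (no Miasm involved). The block names, the `explore` helper and the step budget are hypothetical and only model the shape of the obfuscated graph: a walker that takes the wrong edge keeps bouncing between the junk block and its parent, while a walker that knows the predicate is always true reaches the return immediately.

```python
# Toy CFG mirroring the shape produced by the pass (block names are hypothetical).
# Each block maps to (successor_if_true, successor_if_false); None means "return".
CFG = {
    "condition": ("original", "altered"),
    "original": (None, "altered"),        # opaque predicate: true -> return
    "altered": ("original", "original"),  # junk code jumps back to its parent
}


def explore(cfg, entry, resolve_predicate, max_steps=20):
    """Walk the CFG; return the visited blocks and whether we reached the return."""
    path, block = [], entry
    for _ in range(max_steps):
        if block is None:
            return path, True   # reached the return
        path.append(block)
        taken = resolve_predicate(block)
        true_succ, false_succ = cfg[block]
        block = true_succ if taken else false_succ
    return path, False          # step budget exhausted: stuck in the loop


# Without knowledge of the predicate, assume the walker follows the false edge.
naive_path, naive_done = explore(CFG, "condition", lambda block: False)

# With the predicate solved (always true), only the useful blocks are visited.
solved_path, solved_done = explore(CFG, "condition", lambda block: True)
```

In this model `solved_path` contains only the condition block and the original block, whereas the naive walk exhausts its step budget without ever returning.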

Here is a graphical explanation of the problem, which is available in the source code of OLLVM:

    // Before :
    //                   entry
    //                     |
    //               ______v______
    //              |   Original  |
    //              |_____________|
    //                     |
    //                     v
    //                   return
    //
    // After :
    //                   entry
    //                     |
    //                 ____v_____
    //                |condition*| (false)
    //                |__________|----+
    //               (true)|          |
    //                     |          |
    //               ______v______    |
    //          +-->|   Original* |   |
    //          |   |_____________| (true)
    //          |   (false)|    !-----------> return
    //          |    ______v______    |
    //          |   |   Altered   |<--!
    //          |   |_____________|
    //          |__________|
    //
    //  * The results of these terminator's branch's conditions are always true, but these predicates are
    //    opacificated. For this, we declare two global values: x and y, and replace the FCMP_TRUE
    //    predicate with (y < 10 || x * (x + 1) % 2 == 0) (this could be improved, as the global
    //    values give a hint on where are the opaque predicates)

During our symbolic execution, we need to simplify the following opaque predicate: (y < 10 || x * (x + 1) % 2 == 0). Miasm can still help us here because it contains an expression simplification engine which operates on its own IR. We have to teach it about the opaque predicate. Since the predicate is an "or" (||) of two conditions and the result has to be True, it is enough to simplify only one of them.

The goal here is to use pattern matching in Miasm to replace the expression x * (x + 1) % 2 with zero. The right-hand term of the opaque predicate then becomes True and the predicate is solved easily.
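The idea can be sketched without Miasm on a toy expression tree. Everything below is an assumption made for illustration: expressions are nested tuples, and `is_opaque_product` and `simplify` are hypothetical helpers, not Miasm APIs. The sketch rewrites the product pattern to zero, folds the resulting `0 == 0` comparison, and lets the "or" collapse to True without ever looking at `y < 10`.

```python
# Expressions are nested tuples: ("op", left, right); leaves are names or ints.

def is_opaque_product(expr):
    """True if expr has the shape (a * (a + 1)) % 2."""
    if not (isinstance(expr, tuple) and expr[0] == "%" and expr[2] == 2):
        return False
    product = expr[1]
    if not (isinstance(product, tuple) and product[0] == "*"):
        return False
    left, right = product[1], product[2]
    return right == ("+", left, 1)


def simplify(expr):
    """Bottom-up rewrite: opaque product -> 0, then fold '==' and '||'."""
    if not isinstance(expr, tuple):
        return expr
    op, left, right = expr[0], simplify(expr[1]), simplify(expr[2])
    expr = (op, left, right)
    if is_opaque_product(expr):
        return 0
    if op == "==" and isinstance(left, int) and isinstance(right, int):
        return left == right
    if op == "||" and (left is True or right is True):
        return True   # one true operand is enough to solve the "or"
    return expr


# The predicate as announced: (y < 10 || x * (x + 1) % 2 == 0)
predicate = ("||", ("<", "y", 10),
             ("==", ("%", ("*", "x", ("+", "x", 1)), 2), 0))
result = simplify(predicate)
```

Note that the match is purely structural: an expression with a different shape, even an equivalent one, is left untouched by this rewriter.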

It seems the OLLVM developers made a small mistake in the code comment above: the announced opaque predicate does not match what the pass generates. At first, we didn't manage to match it using the Miasm simplification engine. By looking at the expressions given by Miasm, we saw that the opaque predicate was actually (x * (x - 1) % 2 == 0) (minus one instead of plus one).

This problem can be verified by looking at the OLLVM source code:

BogusControlFlow.cpp:620

    //if y < 10 || x*(x+1) % 2 == 0
    opX = new LoadInst((Value *)x, "", (*i));
    opY = new LoadInst((Value *)y, "", (*i));
    op = BinaryOperator::Create(Instruction::Sub, (Value *)opX,
        ConstantInt::get(Type::getInt32Ty(M.getContext()), 1, false), "", (*i));

This piece of code shows the problem. The comment says (x+1), but the code uses Instruction::Sub, which means (x-1). As we work modulo 2, this is not a real problem because the result is the same: the product of two consecutive integers is always even. For pattern matching, however, the exact form matters.
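As a quick sanity check of that equivalence, the following sketch evaluates both forms with emulated 32-bit wrap-around arithmetic. The helper names and the sample values are hypothetical; the 32-bit width is taken from the Int32 type used in the OLLVM snippet.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit wrap-around arithmetic


def predicate_plus(x):
    """(x * (x + 1)) % 2 == 0, as written in the comment."""
    return ((x * ((x + 1) & MASK32)) & MASK32) % 2 == 0


def predicate_minus(x):
    """(x * (x - 1)) % 2 == 0, as emitted by the code."""
    return ((x * ((x - 1) & MASK32)) & MASK32) % 2 == 0


# Boundary values plus a small range of ordinary ones.
samples = [0, 1, 2, 3, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFE, 0xFFFFFFFF]
samples += list(range(1000))
both_hold = all(predicate_plus(x) and predicate_minus(x) for x in samples)
```

Both forms hold for every sample, including the values where x + 1 or x - 1 wraps around, because wrapping modulo 2^32 does not change the parity of either factor.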

Since we know precisely what we have to match, here is a code example to do it using Miasm:

    # Imports from Miasm framework
    from miasm2.expression.expression import *
    from miasm2.expression.modint import uint32  # needed for uint32(-1) below
    from miasm2.expression.simplifications import expr_simp

    # We define our jokers to match expressions
    jok1 = ExprId("jok1")
    jok2 = ExprId("jok2")

    # Our custom expression simplification callback
    # We are searching: (x * (x - 1) % 2)
    def simp_opaque_bcf(e_s, e):
        # Trying to match (a * b) % 2
        to_match = ((jok1 * jok2)[0:32] & ExprInt32(1))
        # Only the two jokers defined above are used (the original listing
        # also passed an undefined "jok3")
        result = MatchExpr(e, to_match, [jok1, jok2])
        if (result is False) or (result == {}):
            return e  # Doesn't match. Return unmodified expression

        # Interesting candidate, try to be more precise
        # Verifies that b == (a - 1)
        mult_term1 = expr_simp(result[jok1][0:32])
        mult_term2 = expr_simp(result[jok2][0:32])
        if mult_term2 != (mult_term1 + ExprInt(uint32(-1))):
            return e  # Doesn't match. Return unmodified expression

        # Matched the opaque predicate, return 0
        return ExprInt32(0)

    # We add our custom callback to Miasm default simplification engine
    # The expr_simp object is an instance of ExpressionSimplifier class
    simplifications = {ExprOp: [simp_opaque_bcf]}
    expr_simp.enable_passes(simplifications)

Then, every time we call expr_simp(e) (with "e" any Miasm IR expression), the opaque predicate is simplified if e contains it. Since Miasm IR classes sometimes call expr_simp() themselves, the callback may also be executed during IR manipulations.