Now that we can find null literals, we need to be able to replace them. We need:

If you run the pass on our new test file you'll notice that the pass finds 3 integers to register corresponding to %2, %3 and %4 in the following bytecode:

and replace your test code by this updated version:


To be sure to have a pool of reachable variables during our obfuscation, we are going to register all the variables with integral type we come across while iterating through the block instructions.

I will make this entire pig disappear!

OK, we're almost done: the only thing left is to generate the new instructions and insert them into the code. For those of you who forgot (or skipped the intro), we are going to replace the null integer literals by the result of an equality test that can never succeed, because the following inequality always holds:

\(p_1 * (x \lor a_1)^2 \neq p_2 * (y \lor a_2)^2\)

Given that:

\((p_1)\) and \((p_2)\) are distinct prime numbers

\((a_1)\) and \((a_2)\) are distinct strictly positive random numbers

\((x)\) and \((y)\) are two variables picked from the program (they have to be reachable from the obfuscation instructions)
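Why can the two sides never be equal? In the factorization of p1 * a^2 the prime p1 appears an odd number of times, whereas in p2 * b^2 it appears an even number of times, as long as a and b are nonzero and nothing overflows. If you prefer to see it rather than believe it, here is a small brute-force check. It is only a sketch: the subset of primes, the bound 15 and the name countCollisions are arbitrary choices made for the illustration.

```cpp
#include <cstdint>

// A few primes standing in for p1 and p2 (illustrative subset).
static const uint32_t Primes[] = {2, 3, 5, 7, 11, 13, 719, 997};

// Counts the (p1, p2, a, b) combinations for which p1 * a^2 == p2 * b^2.
// a and b play the role of (x | a1) and (y | a2), so they are never zero.
int countCollisions() {
  int Collisions = 0;
  for (uint32_t P1 : Primes)
    for (uint32_t P2 : Primes) {
      if (P1 == P2)
        continue; // the primes must be distinct
      for (uint32_t A = 1; A <= 15; ++A)
        for (uint32_t B = 1; B <= 15; ++B)
          if (P1 * A * A == P2 * B * B)
            ++Collisions;
    }
  return Collisions;
}
```

countCollisions() returns 0: the largest product computed here is 997 * 15 * 15, which fits comfortably in 32 bits, so no wrap-around can fake an equality.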

We will write a new method replaceZero that will do all the funny stuff. However given the size of the function we will detail it step by step:

First please add the following to your source file.

    // Insert with the other #include
    #include "llvm/IR/IRBuilder.h"
    #include <random>
    #include <type_traits> // for std::extent, used by getPrime

    // Insert just before the MyPass declaration
    using prime_type = uint32_t;

Our replaceZero method will replace the null operand(s) of an instruction and return a pointer to the new operand(s) (or nullptr if a problem occurs). This gives us the following signature:

    Value *replaceZero(Instruction &Inst, Value *VReplace) {
      // Replacing 0 by the result of the always-false comparison:
      //   prime1 * ((x | any1)**2) == prime2 * ((y | any2)**2)
      // with prime1 != prime2 and any1 != 0 and any2 != 0

To generate our new formula we need 2 distinct prime numbers:

      prime_type p1 = getPrime(), p2 = getPrime(p1);
      if (p2 == 0 || p1 == 0)
        return nullptr;
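getPrime only appears in the full listing further down. As a preview, here is a minimal standalone sketch of the same idea: draw from a table and retry a bounded number of times when the draw equals the prime we want to avoid. The names pickPrime and SmallPrimes, the shortened table and the explicit generator parameter are assumptions made for the illustration, not the code of the pass.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <type_traits>

using prime_type = uint32_t;

// A short table standing in for the full list of primes below 1000.
static const prime_type SmallPrimes[] = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29};

// Returns a random prime from SmallPrimes that differs from DifferentFrom,
// or 0 if every one of the bounded number of draws hit DifferentFrom.
prime_type pickPrime(std::default_random_engine &Gen,
                     prime_type DifferentFrom = 0) {
  std::uniform_int_distribution<std::size_t> Rand(
      0, std::extent<decltype(SmallPrimes)>::value - 1);
  for (int Tries = 0; Tries < 10; ++Tries) {
    prime_type Prime = SmallPrimes[Rand(Gen)];
    if (Prime != DifferentFrom)
      return Prime;
  }
  return 0;
}
```

Returning 0 on failure works because 0 is not a prime, which is why the caller can simply test p1 == 0 || p2 == 0 to detect the error.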

The LLVM bytecode is strongly typed so we will need to play a little with the types. The important types are the type of the operand we are going to replace and the type in which we will do the operations of the obfuscation expression. For the intermediary operations we will use the prime_type we've just declared (in this case uint32_t ). However we need to be careful about type conversions and the type overflows (we will see later why and how).

      Type *ReplacedType = VReplace->getType(),
           *IntermediaryType = IntegerType::get(Inst.getParent()->getContext(),
                                                sizeof(prime_type) * 8);

Next we need to choose randomly two reachable variables (possibly twice the same) and two random strictly positive integers. For the variables we are going to randomly pick values in IntegerVect .

      // Abort the obfuscation if we have encountered no integers so far
      if (IntegerVect.empty()) {
        return nullptr;
      }

      // Random distribution to pick variables from IntegerVect
      std::uniform_int_distribution<size_t> Rand(0, IntegerVect.size() - 1);

      // Random distribution for Any1 and Any2: draws from [1, 10]
      // (the constants are built as 1 + draw, so they end up in [2, 11])
      std::uniform_int_distribution<size_t> RandAny(1, 10);

      // Indexes chosen for x and y
      size_t Index1 = Rand(Generator), Index2 = Rand(Generator);

If we overflow our intermediary type in one of the new instructions, we could lose the property that the obfuscating comparison is always false: a zero could be replaced by... something else. That would change the results produced by the code, and we want to avoid that at all costs. To limit the risk, Any1 and Any2 are kept small (at most 11, since the constants are built as 1 + RandAny(Generator) with RandAny drawing from [1, 10]), but this is not enough. We also need to make sure that x and y are not too big, and the trick is that we have no information on their values at compile time. The solution we chose is to apply a bitmask to x and y, which gives us values whose maximum is known.
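To convince ourselves that the bound is large enough, we can compute the worst case by hand. The sketch below assumes the 3-bit mask 0x7, constants built as 1 + RandAny(Generator) and 997 as the largest prime of the table, which are the values used in the code of this post; the function names are made up for the illustration.

```cpp
#include <cstdint>
#include <limits>

using prime_type = uint32_t;

// Largest value the obfuscating expression can reach on either side,
// computed in 64 bits so that the check itself cannot overflow.
uint64_t worstCase() {
  const uint64_t PrimeMax = 997; // largest prime of the table
  uint64_t Worst = 0;
  for (uint64_t X = 0; X <= 0x7; ++X)          // every value of (x & 0x7)
    for (uint64_t Any = 2; Any <= 11; ++Any) { // every value of 1 + RandAny
      uint64_t Or = X | Any;
      uint64_t Total = PrimeMax * Or * Or;
      if (Total > Worst)
        Worst = Total;
    }
  return Worst;
}

// True when the worst case fits in prime_type without wrapping.
bool fitsInPrimeType() {
  return worstCase() <= std::numeric_limits<prime_type>::max();
}
```

worstCase() returns 224325, that is 997 * 15 * 15, which is far below the 32-bit limit. The mask is what makes this bound possible: without it, x | any could be as large as x itself.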

The careful reader may have noticed that uniformly picking from IntegerVect is not truly uniform as we did not check for uniqueness of its elements ;-)
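If you want the pick to be uniform over distinct values, one option is to filter duplicates at registration time. The sketch below shows the idea outside of LLVM: FakeValue is a hypothetical stand-in for llvm::Value and registerUnique a hypothetical variant of registerInteger. It is not part of the pass presented here.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for llvm::Value; only pointer identity matters here.
struct FakeValue {};

// Registers V unless it is already in the pool, so that a uniform draw over
// the pool is also uniform over the distinct values.
void registerUnique(std::vector<FakeValue *> &Pool, FakeValue *V) {
  if (std::find(Pool.begin(), Pool.end(), V) == Pool.end())
    Pool.push_back(V);
}
```

The linear search is fine for the handful of values found in a basic block; a set would be the natural choice if the pool grew large.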

      // Creating the LLVM objects representing literals
      Constant *any1 = ConstantInt::get(IntermediaryType, 1 + RandAny(Generator)),
               *any2 = ConstantInt::get(IntermediaryType, 1 + RandAny(Generator)),
               *prime1 = ConstantInt::get(IntermediaryType, p1),
               *prime2 = ConstantInt::get(IntermediaryType, p2),
               // Bitmask to prevent overflow
               *OverflowMask = ConstantInt::get(IntermediaryType, 0x00000007);

Now that we have everything we need we will create our new instructions. To insert new instructions before a specific instruction we use an IRBuilder . This object will create instructions and insert them before the instruction given to its constructor. And we need to insert our new instructions before the instruction we are working on. That's why replaceZero takes an Instruction as parameter. We will forward it to the builder.

      IRBuilder<> Builder(&Inst);

      // lhs
      // Casting x to our intermediary type
      Value *LhsCast =
          Builder.CreateZExtOrTrunc(IntegerVect.at(Index1), IntermediaryType);
      // Registering the new integers for a future obfuscation
      registerInteger(*LhsCast);
      // To avoid overflow and truncate x
      Value *LhsAnd = Builder.CreateAnd(LhsCast, OverflowMask);
      registerInteger(*LhsAnd);
      // Creating LhsOr = (x | any1)
      Value *LhsOr = Builder.CreateOr(LhsAnd, any1);
      registerInteger(*LhsOr);
      // LhsOr * LhsOr
      Value *LhsSquare = Builder.CreateMul(LhsOr, LhsOr);
      registerInteger(*LhsSquare);
      // prime1 * LhsOr^2
      Value *LhsTot = Builder.CreateMul(LhsSquare, prime1);
      registerInteger(*LhsTot);

      // rhs
      // The same as lhs with prime2, any2 and y
      Value *RhsCast =
          Builder.CreateZExtOrTrunc(IntegerVect.at(Index2), IntermediaryType);
      registerInteger(*RhsCast);
      Value *RhsAnd = Builder.CreateAnd(RhsCast, OverflowMask);
      registerInteger(*RhsAnd);
      Value *RhsOr = Builder.CreateOr(RhsAnd, any2);
      registerInteger(*RhsOr);
      Value *RhsSquare = Builder.CreateMul(RhsOr, RhsOr);
      registerInteger(*RhsSquare);
      Value *RhsTot = Builder.CreateMul(RhsSquare, prime2);
      registerInteger(*RhsTot);

      // The final comparison always returning false
      Value *comp =
          Builder.CreateICmp(CmpInst::Predicate::ICMP_EQ, LhsTot, RhsTot);
      registerInteger(*comp);
      // Casting the boolean '0' back to the type of the replaced operand
      Value *castComp = Builder.CreateZExt(comp, ReplacedType);
      registerInteger(*castComp);

      return castComp;
    }

OK! Almost there... we need to call our new function in the main loop and explicitly replace the operand:

    bool runOnBasicBlock(BasicBlock &BB) override {
      IntegerVect.clear();
      bool modified = false;

      for (BasicBlock::iterator I = BB.getFirstInsertionPt(), end = BB.end();
           I != end; ++I) {
        Instruction &Inst = *I;
        for (size_t i = 0; i < Inst.getNumOperands(); ++i) {
          if (Constant *C = isValidCandidateOperand(Inst.getOperand(i))) {
            if (Value *New_val = replaceZero(Inst, C)) {
              Inst.setOperand(i, New_val);
              modified = true;
            } else {
              // If something wrong happens during the replacement,
              // almost certainly because IntegerVect is empty
              errs() << "MyPass: could not rand pick a variable for replacement\n";
            }
          }
        }
        registerInteger(Inst);
      }
      return modified;
    }

and here is the full code (with the tabulated prime numbers):

    namespace {
    using prime_type = uint32_t;

    static const prime_type Prime_array[] = {
        2,   3,   5,   7,   11,  13,  17,  19,  23,  29,  31,  37,  41,  43,
        47,  53,  59,  61,  67,  71,  73,  79,  83,  89,  97,  101, 103, 107,
        109, 113, 127, 131, 137, 139, 149, 151, 157, 163, 167, 173, 179, 181,
        191, 193, 197, 199, 211, 223, 227, 229, 233, 239, 241, 251, 257, 263,
        269, 271, 277, 281, 283, 293, 307, 311, 313, 317, 331, 337, 347, 349,
        353, 359, 367, 373, 379, 383, 389, 397, 401, 409, 419, 421, 431, 433,
        439, 443, 449, 457, 461, 463, 467, 479, 487, 491, 499, 503, 509, 521,
        523, 541, 547, 557, 563, 569, 571, 577, 587, 593, 599, 601, 607, 613,
        617, 619, 631, 641, 643, 647, 653, 659, 661, 673, 677, 683, 691, 701,
        709, 719, 727, 733, 739, 743, 751, 757, 761, 769, 773, 787, 797, 809,
        811, 821, 823, 827, 829, 839, 853, 857, 859, 863, 877, 881, 883, 887,
        907, 911, 919, 929, 937, 941, 947, 953, 967, 971, 977, 983, 991, 997};

    class MyPass : public BasicBlockPass {
      std::vector<Value *> IntegerVect;
      std::default_random_engine Generator;

    public:
      static char ID;

      MyPass() : BasicBlockPass(ID) {}

      bool runOnBasicBlock(BasicBlock &BB) override {
        IntegerVect.clear();
        bool modified = false;

        // Not iterating from the beginning to avoid obfuscation of Phi
        // instructions parameters
        for (BasicBlock::iterator I = BB.getFirstInsertionPt(), end = BB.end();
             I != end; ++I) {
          Instruction &Inst = *I;
          for (size_t i = 0; i < Inst.getNumOperands(); ++i) {
            if (Constant *C = isValidCandidateOperand(Inst.getOperand(i))) {
              if (Value *New_val = replaceZero(Inst, C)) {
                Inst.setOperand(i, New_val);
                modified = true;
              } else {
                errs() << "ObfuscateZero: could not rand pick a variable for replacement\n";
              }
            }
          }
          registerInteger(Inst);
        }
        return modified;
      }

    private:
      Constant *isValidCandidateOperand(Value *V) {
        Constant *C;
        if (!(C = dyn_cast<Constant>(V)))
          return nullptr;
        if (!C->isNullValue())
          return nullptr;
        // We found a NULL constant, lets validate it
        if (!C->getType()->isIntegerTy()) {
          // dbgs() << "Ignoring non integer value\n";
          return nullptr;
        }
        return C;
      }

      void registerInteger(Value &V) {
        if (V.getType()->isIntegerTy())
          IntegerVect.push_back(&V);
      }

      // Return a random prime number not equal to DifferentFrom
      // If an error occurs returns 0
      prime_type getPrime(prime_type DifferentFrom = 0) {
        static std::uniform_int_distribution<prime_type> Rand(
            0, std::extent<decltype(Prime_array)>::value - 1);
        size_t MaxLoop = 10;
        prime_type Prime;

        do {
          Prime = Prime_array[Rand(Generator)];
        } while (Prime == DifferentFrom && --MaxLoop);

        if (!MaxLoop) {
          return 0;
        }
        return Prime;
      }

      Value *replaceZero(Instruction &Inst, Value *VReplace) {
        // Replacing 0 by the result of the always-false comparison:
        //   prime1 * ((x | any1)**2) == prime2 * ((y | any2)**2)
        // with prime1 != prime2 and any1 != 0 and any2 != 0
        prime_type p1 = getPrime(), p2 = getPrime(p1);
        if (p2 == 0 || p1 == 0)
          return nullptr;

        Type *ReplacedType = VReplace->getType(),
             *IntermediaryType = IntegerType::get(
                 Inst.getParent()->getContext(), sizeof(prime_type) * 8);

        if (IntegerVect.empty()) {
          return nullptr;
        }

        std::uniform_int_distribution<size_t> Rand(0, IntegerVect.size() - 1);
        std::uniform_int_distribution<size_t> RandAny(1, 10);

        size_t Index1 = Rand(Generator), Index2 = Rand(Generator);

        // Keeping Any1 and Any2 small to avoid overflow in the obfuscation
        Constant *any1 = ConstantInt::get(IntermediaryType, 1 + RandAny(Generator)),
                 *any2 = ConstantInt::get(IntermediaryType, 1 + RandAny(Generator)),
                 *prime1 = ConstantInt::get(IntermediaryType, p1),
                 *prime2 = ConstantInt::get(IntermediaryType, p2),
                 // Bitmask applied to x and y to prevent overflow
                 *OverflowMask = ConstantInt::get(IntermediaryType, 0x00000007);

        IRBuilder<> Builder(&Inst);

        // lhs
        // To avoid overflow
        Value *LhsCast =
            Builder.CreateZExtOrTrunc(IntegerVect.at(Index1), IntermediaryType);
        registerInteger(*LhsCast);
        Value *LhsAnd = Builder.CreateAnd(LhsCast, OverflowMask);
        registerInteger(*LhsAnd);
        Value *LhsOr = Builder.CreateOr(LhsAnd, any1);
        registerInteger(*LhsOr);
        Value *LhsSquare = Builder.CreateMul(LhsOr, LhsOr);
        registerInteger(*LhsSquare);
        Value *LhsTot = Builder.CreateMul(LhsSquare, prime1);
        registerInteger(*LhsTot);

        // rhs
        Value *RhsCast =
            Builder.CreateZExtOrTrunc(IntegerVect.at(Index2), IntermediaryType);
        registerInteger(*RhsCast);
        Value *RhsAnd = Builder.CreateAnd(RhsCast, OverflowMask);
        registerInteger(*RhsAnd);
        Value *RhsOr = Builder.CreateOr(RhsAnd, any2);
        registerInteger(*RhsOr);
        Value *RhsSquare = Builder.CreateMul(RhsOr, RhsOr);
        registerInteger(*RhsSquare);
        Value *RhsTot = Builder.CreateMul(RhsSquare, prime2);
        registerInteger(*RhsTot);

        // comp
        Value *comp =
            Builder.CreateICmp(CmpInst::Predicate::ICMP_EQ, LhsTot, RhsTot);
        registerInteger(*comp);
        Value *castComp = Builder.CreateZExt(comp, ReplacedType);
        registerInteger(*castComp);

        return castComp;
      }
    };
    } // namespace

DOOOOOOOOOOOOOOOOOOOOONE!

Let's try this awesome pass! If we use it on the last version of our test code we get:

    ; Function Attrs: nounwind uwtable
    define i32 @main() #0 {
      %1 = alloca i32, align 4
      %a = alloca i32, align 4
      store i32 0, i32* %1
      store i32 2, i32* %a, align 4
      %2 = call i32 @puts(i8* getelementptr inbounds ([13 x i8]* @.str, i32 0, i32 0))
      %3 = load i32* %a, align 4
      %4 = mul nsw i32 %3, 3
      store i32 %4, i32* %a, align 4
      %5 = and i32 %3, 7
      %6 = or i32 %5, 2
      %7 = mul i32 %6, %6
      %8 = mul i32 %7, 719
      %9 = and i32 %2, 7
      %10 = or i32 %9, 8
      %11 = mul i32 %10, %10
      %12 = mul i32 %11, 397
      %13 = icmp eq i32 %8, %12
      %14 = zext i1 %13 to i32
      ret i32 %14
    }

Look at the assignments %5 to %14: look familiar? We have successfully obfuscated the return 0 instruction with the expression we gave at the beginning.
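If you do not trust the bytecode, you can replay it by hand. The function below is a plain C++ transcription of %5 to %14, written for this illustration only: X stands for the value loaded into %3 and PutsResult for the value returned by puts in %2, and the name obfuscatedZero is made up.

```cpp
#include <cstdint>

// Mirrors %5..%14 of the listing above.
uint32_t obfuscatedZero(uint32_t X, uint32_t PutsResult) {
  uint32_t Lhs = (X & 7) | 2;          // %5, %6
  Lhs = Lhs * Lhs * 719;               // %7, %8
  uint32_t Rhs = (PutsResult & 7) | 8; // %9, %10
  Rhs = Rhs * Rhs * 397;               // %11, %12
  return Lhs == Rhs ? 1u : 0u;         // %13, %14
}
```

With a equal to 2 as in the test program, the left side is 719 * 4 = 2876, while the right side is at least 397 * 64 = 25408 whatever puts returns, so the function returns 0.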

But there are a few important things left to read, so stay tuned!