Motivation for writing



As professional developers, we create products. We implement ideas, usually driven by a business that wants acceptance from its target group in the global market. We try to deliver elegant, fast and reliable solutions and, quite honestly, we hate it when someone uses our work without at least saying "thanks, you've really made a great thing". That is why we need to protect our work. And in order to do that, we should be aware of the common vectors used by crackers to hack our software.

In this article, I'm gonna show you how to disassemble and decompile a native executable written in C++, among other interesting things related to managed and unmanaged environments.

First, we'll need a little bit of theory so we can really understand what we are doing and why.



Difference between static and dynamic libraries

Historically, static libraries were the first type of library to appear. On Windows you can tell the two kinds apart by their extensions: .lib for static libraries and .dll for dynamic ones. The main difference is that a static library is embedded directly in the executable by the linker, thus increasing its size. A dynamic library, on the other hand, is a separate file that is mapped into the address space of every process that loads it. The dll file is one, but each process gets its own copy of the library's writable data, so processes using the same dll don't interfere with each other. This also enables more manageable updates, but implies a slight performance degradation, which is usually not considered a big issue.

In general, dynamic libraries are the preferred approach for building applications. Static libraries have not gone away, though: you can still build them, either from a Visual Studio project or with the command-line tools.

The CPU registers



The CPU registers are the fastest memory there is, located in the CPU itself. They are used for virtually every low-level operation; they are the super-fast data storage of the processor. On the 32-bit x86 architecture there are 8 general-purpose registers, each 32 bits long. Two of them, ESP and EBP, hold the stack pointer and the base pointer, which are used to navigate the stack, while a separate instruction pointer (EIP) tracks the instruction being executed. The registers are even faster than static RAM (SRAM, used for the CPU cache) and, of course, dynamic RAM.

Quick overview of the Assembly language



For this article we need to know a few basic things about the assembly language so we can actually understand what we are doing. Assembly language is unstructured and is based on very primitive instructions, which fall into the following general types (I'll describe only the basic operations):

Data movement instructions



mov – copies data between registers, or between a register and a cell in memory; it can also load a constant into either (x86 has no memory-to-memory form of mov)

push/pop – push a value onto, or pop a value from, the stack kept in memory
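To make push and pop a little more concrete, here is a toy model of the 32-bit x86 stack in C++. It is only a sketch: the `ToyStack` type and its member names are made up for this illustration, and memory is modelled as a small array of 32-bit words. What it does mirror is the real behaviour that the stack grows towards lower addresses: push first decreases the stack pointer by 4 and then stores the value, while pop reads the value and then increases the stack pointer by 4.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Toy model of a 32-bit stack (hypothetical type, for illustration only).
struct ToyStack {
    std::vector<std::uint32_t> memory; // stack memory, one element per 32-bit word
    std::uint32_t esp;                 // stack pointer as a byte offset into memory

    explicit ToyStack(std::size_t words)
        : memory(words, 0),
          esp(static_cast<std::uint32_t>(words * 4)) {} // empty stack: esp at the top

    // push: move the stack pointer down by 4 bytes, then store the value there.
    void push(std::uint32_t value) {
        if (esp < 4) throw std::overflow_error("stack overflow");
        esp -= 4;
        memory[esp / 4] = value;
    }

    // pop: read the value at the stack pointer, then move it back up by 4 bytes.
    std::uint32_t pop() {
        if (esp >= memory.size() * 4) throw std::underflow_error("stack underflow");
        const std::uint32_t value = memory[esp / 4];
        esp += 4;
        return value;
    }
};
```

Pushing 1 and then 2 onto a `ToyStack` and popping twice gives back 2 first and 1 second, and the stack pointer ends up where it started.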

Arithmetic instructions



add/sub/inc – arithmetic operations on registers or memory cells; add and sub also accept a constant as the second operand

Control flow instructions

jmp – unconditional jump to a label or a memory address

jb – jump when below (unsigned less than)

je – jump when equal

jne – jump when not equal

jz – jump when the last result was zero (the same instruction as je)

jg – jump when greater than (signed)

jge – jump when greater than or equal to (signed)

jl – jump when less than (signed)

jle – jump when less than or equal to (signed)

cmp – compares its two operands and sets the CPU flags that the conditional jumps above test

call/ret – these two implement the routine call and return
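Since cmp and the conditional jumps always work as a pair, it may help to see the mechanism spelled out. The following is a simplified C++ model, not real CPU code: `Flags`, `cmp` and the jump helpers are names I chose to mirror the mnemonics. It shows how cmp performs a subtraction only to set the status flags, and how each jump in the list above is just a test on those flags.

```cpp
#include <cstdint>

// The four status flags that the conditional jumps test.
struct Flags {
    bool zf; // zero flag: the result was zero
    bool cf; // carry flag: an unsigned borrow occurred
    bool sf; // sign flag: top bit of the result
    bool of; // overflow flag: a signed overflow occurred
};

// cmp a, b: compute a - b, throw the result away, keep only the flags.
inline Flags cmp(std::uint32_t a, std::uint32_t b) {
    const std::uint32_t r = a - b; // wraps around like the hardware does
    Flags f;
    f.zf = (r == 0);
    f.cf = (a < b);
    f.sf = (r >> 31) != 0;
    f.of = (((a ^ b) & (a ^ r)) >> 31) != 0;
    return f;
}

// Would each jump be taken? (Function names mirror the mnemonics.)
inline bool je (const Flags& f) { return f.zf; }                 // also jz
inline bool jne(const Flags& f) { return !f.zf; }
inline bool jb (const Flags& f) { return f.cf; }                 // unsigned <
inline bool jl (const Flags& f) { return f.sf != f.of; }         // signed <
inline bool jle(const Flags& f) { return f.zf || f.sf != f.of; } // signed <=
inline bool jg (const Flags& f) { return !f.zf && f.sf == f.of; }// signed >
inline bool jge(const Flags& f) { return f.sf == f.of; }         // signed >=
```

With `cmp(0xFFFFFFFF, 1)`, `jb` is not taken because 0xFFFFFFFF is a huge unsigned value, but `jl` is taken because the same bits mean -1 when read as a signed number. That is the whole difference between the "below" and the "less than" families of jumps.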

The Control flow instructions are what we are most interested in here. For a complete tutorial on the x86 assembly language, check this article.

Disassembling and modifying a C++ executable



For our example I’ve created a simple C++ application with basic I/O.

```cpp
#include "stdafx.h"   // Visual Studio precompiled header
#include <iostream>
#include <sstream>
#include <string>     // std::string and getline

using namespace std;

void execute()
{
    string numbers;
    int hold;

    // Keep asking until the expected code is entered.
    for (;;)
    {
        cout << "Please enter the code:\n";
        getline(cin, numbers);

        if (numbers != "82634")
        {
            cout << "\nTry again.\n";
        }
        else
        {
            cout << "Code accepted";
            break;
        }
    }

    // Wait for more input so the console window stays open.
    cin >> hold;
}

int _tmain(int argc, _TCHAR* argv[])
{
    execute();
    return 0;
}
```

We'll need to disassemble, debug and optionally decompile our example. The tools used in the rest of this article are OllyDbg and IDA.

I've compiled this example, which you can download from here. When we start it, we see a simple console application.

It asks for some predefined input. If the wrong code is entered, the following output is presented:

“Try again”

Let's pretend that we don't have the source code and we don't know the code. So what can we do? Obviously, we have a loop here with some check inside which determines whether the program should break out of the loop or not.

We also have a few strings:

“Please enter the code:”

“Try again”

Debug the executable



Start the OllyDbg debugger (with administrator privileges) and open the exe.

What we see in the upper-left window is the disassembled machine code. In other words, you see the instructions written in assembly language. Below that is the window with the raw bytes presented as hexadecimal values, and on the right is the window with the CPU registers.
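The hexadecimal view is nothing more than each byte printed as two hex digits. Here is a minimal sketch of that formatting in C++. The function name `to_hex` is my own, and OllyDbg's actual dump also shows addresses and an ASCII column, which are left out here.

```cpp
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Format raw bytes the way a hex dump shows them:
// two uppercase hex digits per byte, separated by spaces.
std::string to_hex(const std::vector<unsigned char>& bytes) {
    std::ostringstream out;
    out << std::hex << std::uppercase << std::setfill('0');
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (i != 0) out << ' ';
        // Cast to an integer type so the byte is printed as a number, not a character.
        out << std::setw(2) << static_cast<unsigned int>(bytes[i]);
    }
    return out.str();
}
```

For example, the three bytes 0x55, 0x8B, 0xEC come out as "55 8B EC", which is the common function prologue `push ebp` followed by `mov ebp, esp`.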

Locate the loop conditions



So now that our exe is loaded and started, and the debugger is attached, we have to find the exact place in the assembly code where the check is made. To do that we can use the strings that the UI shows us. Right-click on the assembly code view > Search For > All Referenced Strings. Find the “Try again” string and double-click it. The assembly view will jump to the exact instruction that prints that string on the console. We can also see the instructions related to “Code accepted” a few rows below. It is clear where the loop resides.
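The reason this trick works is that string literals are stored in the executable as plain, readable bytes. The sketch below illustrates that idea only: it scans a byte buffer for runs of printable ASCII characters, much like the classic `strings` utility does. It does not follow references from code the way the debugger's search does, and `extract_strings` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Collect every run of printable ASCII characters that is at least min_len long.
std::vector<std::string> extract_strings(const std::vector<unsigned char>& bytes,
                                         std::size_t min_len = 4) {
    std::vector<std::string> found;
    std::string current;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        const unsigned char c = bytes[i];
        if (c >= 0x20 && c <= 0x7E) {
            current.push_back(static_cast<char>(c)); // still inside a printable run
        } else {
            if (current.size() >= min_len) found.push_back(current);
            current.clear();                         // run ended, start over
        }
    }
    if (current.size() >= min_len) found.push_back(current); // run at end of buffer
    return found;
}
```

Run over the bytes of our example program, a scan like this would turn up not only the prompts but also the expected code itself, since it is compiled in as an ordinary string literal.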

Modify the assembly instructions



The next step is to modify some assembly instructions. We see a lot of instructions, but we are most interested in the jump instructions, since they decide which instruction is executed next. If we scroll up a little, we can see the instruction that references the “Please enter the code:” string. In order to escape the loop, we need to change the target address of one of the jumps that we run through.

Let's take the jb at “00D613A4”, double-click it and change the target address to “00D613C7”, the instruction just before the “Code accepted” ASCII text, where the output of that text begins.

In order to save the change, right-click on the assembly window and choose “Copy to executable” -> “Selection” while the modified row is selected.
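When you type a new target address, the debugger has to re-encode the jump, because a relative jump stores a displacement counted from the address of the next instruction rather than the target itself. The sketch below computes that displacement for the jb we just edited. It assumes the jb uses the two-byte short form (opcode 0x72 followed by a signed 8-bit displacement); that encoding is an assumption, since the instruction bytes aren't shown here, so treat the numbers as illustrative. `short_jump_displacement` is a name I made up.

```cpp
#include <cstdint>

// Signed 8-bit displacement for a short jump located at `jump_addr`.
// Assumes the 2-byte short encoding: one opcode byte plus one displacement byte.
// Returns false when the target is too far away for that encoding.
bool short_jump_displacement(std::uint32_t jump_addr, std::uint32_t target_addr,
                             std::int8_t& displacement) {
    // The CPU adds the displacement to the address of the *next* instruction.
    const std::int64_t next_instruction = static_cast<std::int64_t>(jump_addr) + 2;
    const std::int64_t delta = static_cast<std::int64_t>(target_addr) - next_instruction;
    if (delta < -128 || delta > 127) return false;
    displacement = static_cast<std::int8_t>(delta);
    return true;
}
```

Under that assumption, jumping from 00D613A4 to 00D613C7 gives a displacement of 0x21: the next instruction sits at 00D613A6, and 00D613C7 - 00D613A6 = 0x21.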

An alternative to OllyDbg. What is IDA ?



IDA is a debugger and a disassembler like OllyDbg, but it provides a more user-friendly view of the assembly code, and it can also act as a decompiler. Its assembly view is more structured: the various jumps are visualized as graph nodes, which makes navigation easier.


Decompiling a C++ executable using IDA



Which brings us to the question “Is it possible to decompile a native image in such a way that understandable source code can be generated?”. The short answer is no.

What IDA generates is pseudo-C code. Let me show you its output for our small example program:

```c
{
  std__operator___std__char_traits_char_(std__cout, "Please enter the code:\n");
  v1 = std__basic_ios_char_std__char_traits_char____widen((_DWORD) std__cin + (_DWORD)*(&std__cin + 1), 10);
  std__getline_char_std__char_traits_char__std__allocator_char_(v1);
  v0 = (int *)&v13;
  if ( v15 >= 0x10 )
    v0 = v13;
  v2 = 5;
  if ( v14 < 5 )
    v2 = v14;
  if ( !v2 )
    goto LABEL_21;
  v3 = (int)"82634";
  v5 = (unsigned int)v2 < 4;
  v4 = v2 - 4;
  if ( v5 )
  {
LABEL_10:
    if ( v4 == -4 )
      goto LABEL_19;
  }
  else
  {
    while ( *v0 == *(_DWORD *)v3 )
    {
      ++v0;
      v3 += 4;
      v6 = (unsigned int)v4 < 4;
      v4 -= 4;
      if ( v6 )
        goto LABEL_10;
    }
  }
  v7 = *(_BYTE *)v0 < *(_BYTE *)v3;
  if ( *(_BYTE *)v0 != *(_BYTE *)v3
    || v4 != -3
    && ((v8 = *((_BYTE *)v0 + 1), v7 = v8 < *(_BYTE *)(v3 + 1), v8 != *(_BYTE *)(v3 + 1))
     || v4 != -2
     && ((v9 = *((_BYTE *)v0 + 2), v7 = v9 < *(_BYTE *)(v3 + 2), v9 != *(_BYTE *)(v3 + 2))
      || v4 != -1
      && (v10 = *((_BYTE *)v0 + 3), v7 = v10 < *(_BYTE *)(v3 + 3), v10 != *(_BYTE *)(v3 + 3)))) )
  {
    v11 = -v7 | 1;
    goto LABEL_20;
  }
LABEL_19:
  v11 = 0;
LABEL_20:
  if ( v11 )
    goto LABEL_23;
LABEL_21:
  if ( v14 >= 5 && v14 == 5 )
    break;
LABEL_23:
  std__operator___std__char_traits_char_(std__cout, "\nTry again.\n");
}
std__operator___std__char_traits_char_(std__cout, "Code accepted");
result = std__basic_istream_char_std__char_traits_char____operator__(std__cin, &v16);
if ( v15 >= 0x10 )
  result = operator delete(v13);
return result;
}
```

So, can we decompile a native image into understandable source code? It depends on your idea of "understandable". You have to devote a lot of time, and you need to possess serious knowledge of the APIs your operating system uses, along with an understanding of C and assembly syntax.
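To give an idea of how much noise the decompiler adds, here is my own hand-simplified reading of the comparison buried in the pseudo-C output. It rests on assumptions: that v13, v14 and v15 are the buffer, length and capacity of the std::string, and that the unrolled 4-byte and 1-byte comparisons are an inlined memory comparison. `code_matches` is a made-up name, and this is a sketch of the logic, not something IDA produced.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// Hand-simplified reading of the decompiled comparison (not actual IDA output).
bool code_matches(const std::string& input) {
    static const char expected[] = "82634";
    const std::size_t expected_len = 5;

    // v2 in the listing: compare at most min(input length, 5) bytes.
    const std::size_t n = std::min(input.size(), expected_len);

    // The unrolled 4-byte and 1-byte comparisons amount to a memcmp.
    const int diff = std::memcmp(input.data(), expected, n);

    // LABEL_21 in the listing: the lengths must match as well.
    return diff == 0 && input.size() == expected_len;
}
```

Roughly sixty lines of labels and gotos collapse into the single `numbers != "82634"` check from the source, which is exactly the kind of reconstruction work the decompiler leaves to you.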

Decompiling applications written in managed environments



Decompiling .Net apps is also done with debuggers and decompilers, such as Reflector (which has been a paid product for some time now).

But the exe or dll you see on your desktop contains intermediate language (MSIL), not native machine code (assuming you do not use NGen). Decompiling C++ apps is hard because the compiler first produces assembly code targeted at a specific processor architecture, and then the assembler turns that code into the actual native image. And as we saw, decompiling assembly code is hard.

The MSIL, on the other hand, is very close to the actual source code of your app, e.g. written in C#. You can use programs like Reflector to decompile it, along with some plugins to actually modify it.

So it is actually not so hard to crack an application

Indeed, it's not. The only difference is that, for a real application, the process will be more time-consuming. Do you know a single popular stand-alone application that has not been cracked? That is why you need to think of better ways of protecting your software. Understand one simple thing:

Every application can be cracked, if you have access to its native image, just like every computer password can be broken, if you have physical access to the machine.

Of course, there are techniques that allow us to slow an attacker down, which might or might not be enough. But "slowing" doesn't mean "preventing", and that's a topic for another article.



That's all from me on the topic of decompilation. I hope you learned something new today and, hopefully, this knowledge will help you to better protect your software. Know your enemy before going into battle. Because it's the battle for your own time.

Hi there! My name is Kosta Hristov and I currently live in London, England. I've been working as a software engineer for the past 6 years on different mobile, desktop and web IT projects. I started this blog almost one year ago with the idea of helping developers from all around the world in their day-to-day programming tasks, sharing knowledge on various topics. If you find my articles interesting and you want to know more about me, feel free to contact me.
