
How to Reverse Engineer Software (Windows) the Right Way?

Have you ever felt a desire to take some mechanism apart to find out how it works? Well, who hasn’t. That desire is the driving force behind reverse engineering. This skill is useful for analyzing product security, finding out the purpose of a suspicious .exe file without running it, recovering lost documentation, developing a new solution based on legacy software, etc.

In this article, we discuss the knowledge base needed to perform reverse engineering, basic principles of reverse engineering a piece of Windows software, disassemblers, and tools. We also provide a step-by-step example of reverse engineering an application.

Written by Sergii Bratus, Development Coordinator, Network Security Team, and Anton Kukoba, Security Research Leader

Contents

What is software reversing?

What do we need for reverse engineering?

Theoretical Knowledge. Software Reverse Engineering Process

Useful Tools for Reverse Engineering Windows Software

Disassemblers

Windows Sysinternals

Network Monitoring Tools

Debuggers

Real-life software reverse engineering example

How to reverse engineer a driver

Conclusion

What is software reversing?

Reverse engineering is the process of uncovering principles behind a piece of hardware or software, such as its architecture and internal structure. The question that drives reverse engineering is: How does it work?

Obviously, if you have documentation, the whole process becomes much simpler. But it often happens that there’s no documentation and you need to find another way to learn how a piece of software works.

When might you need to reverse engineer a piece of software and how might doing so help you?

There are many uses of reverse engineering in the field of computer science, including:

Researching network communication protocols

Finding algorithms used in malware such as computer viruses, trojans, ransomware, etc.

Researching the file formats used to store any kind of information, for example, email databases and disk images

Checking the ability of your own software to resist reverse engineering

Improving software compatibility with platforms and third-party software

Using undocumented platform features


The legality of reverse engineering depends on its purpose and how the software will be used. All the purposes mentioned above are completely legitimate, assuming you’ve obtained a copy of the software legally. But if you intend to, for example, reverse engineer a certain feature of a closed application and then implement it in another application, you’ll probably get into trouble.

Regarding legal documentation, reverse engineering is often prohibited by end-user license agreements (EULAs). But the US Digital Millennium Copyright Act specifies that reversing a piece of software is legal if it’s done to improve compatibility with other products.

Legal requirements vary from country to country, so take your time to research them before you start.

Now let’s see how to reverse engineer software.

What do we need for reverse engineering?

To start reverse engineering software, you need:

knowledge in the field where you want to apply reverse engineering

tools that will allow you to apply your knowledge while trying to disassemble software

Let’s consider a generic example that isn’t connected to software. Let’s say you have a watch and you want to find out if it’s mechanical, quartz, or automatic.

Having knowledge of the field means you should know that there are three types of watches. Additionally, you should know that if there’s a battery, it’s located inside the watch, and you can see it if you open it up. You should also have basic knowledge of a watch’s internal structure, what the battery looks like, and what tools you need to open a watch case. Having the tools to apply your knowledge means that you need to have a screwdriver or other dedicated tool that will give you the chance to open the watch.

Just like reverse engineering a watch requires a specific skill set and tools, reverse engineering software requires its own field-specific knowledge and tools.

Theoretical Knowledge. Software Reverse Engineering Process

For different software reverse engineering tasks, you need different types of knowledge. Of course, there’s common knowledge that will help you in most reverse engineering tasks: knowledge of common application structures, programming languages, compilers, and so on. However, without special theoretical knowledge, you can’t solve specific reverse engineering tasks.

If you reverse engineer network applications, you need knowledge of the principles of inter-process communication, the structure of networks, connections, network packets, etc.

If you reverse cryptographic algorithms, you need knowledge of cryptography and the most popular algorithms used in the field.

If you research file structures, you need knowledge of basic file concepts and how different systems or components work with files.

Special techniques can save a lot of time while reversing special types of software. In the case of file interactions, it may help to write a test that stores unique values of each type in a file while you log the offsets and data sizes written to the actual storage file. Common patterns in those offsets will give you a hint about the internal structures of these files.
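The marker technique above can be sketched as follows. This is a minimal illustration, not a complete tool: it assumes the storage file has already been read into a byte buffer and that the application stores 32-bit values in little-endian order. The function name findMarkerOffsets is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Return every offset at which the 32-bit marker value appears in the
// buffer, assuming (hypothetically) that it is stored in little-endian order.
std::vector<std::size_t> findMarkerOffsets(const std::vector<std::uint8_t>& data,
                                           std::uint32_t marker) {
    // Split the marker into the byte sequence we expect to see on disk.
    const std::uint8_t pattern[4] = {
        static_cast<std::uint8_t>(marker & 0xFF),
        static_cast<std::uint8_t>((marker >> 8) & 0xFF),
        static_cast<std::uint8_t>((marker >> 16) & 0xFF),
        static_cast<std::uint8_t>((marker >> 24) & 0xFF)};

    std::vector<std::size_t> offsets;
    // Slide a 4-byte window over the buffer and record each exact match.
    for (std::size_t i = 0; i + 4 <= data.size(); ++i) {
        if (data[i] == pattern[0] && data[i + 1] == pattern[1] &&
            data[i + 2] == pattern[2] && data[i + 3] == pattern[3]) {
            offsets.push_back(i);
        }
    }
    return offsets;
}
```

If you store a value such as 0xDEADBEEF through the application several times, the distances between the reported offsets hint at the size of the record that surrounds each value.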

When starting a reverse engineering process, software developers generally use a disassembler in order to find algorithms and program logic in place. There are many different executable file formats, compilers (which give different outputs), and operating systems. This diversity of technologies precludes the use of one single technology for reversing all types of software.

To understand the disassembled code, you need some knowledge of assembly language, function calling conventions, stack structure, the concept of stack frames, etc.

Knowing the assembly output for different code samples may help you in uncovering the original functionality. Let’s consider some examples for the Windows x86 platform.

Let’s say we have the following code:

    int count = 0;
    for (int i = 0; i < 10; ++i)
    {
        count++;
    }
    std::cout << count;

If we compile this code to an executable file, we’ll see this in the disassembler:

    004113DE loc_4113DE:
    004113DE     mov eax, [ebp-14h]
    004113E1     add eax, 1
    004113E4     mov [ebp-14h], eax
    004113E7 loc_4113E7:
    004113E7     cmp [ebp-14h], 0Ah
    004113EB     jge short loc_4113F8
    004113ED     mov eax, [ebp-8]
    004113F0     add eax, 1
    004113F3     mov [ebp-8], eax
    004113F6     jmp short loc_4113DE
    004113F8 loc_4113F8:
    004113F8     mov ecx, ds:[email protected]
    004113FE     push eax
    00411400     call ds:[email protected]<<(int)
    00411404     xor eax, eax
    00411406     retn

As we can see, the ordinary for loop turned into assembly code with comparisons and jumps. Notice that the assembly code doesn’t use the x86 loop instruction, which keeps its counter in the ecx register. In addition, the local variables i and count are referred to here as [ebp-14h] and [ebp-8], respectively.
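To make the mapping between the listing and the source easier to follow, here is a sketch that rewrites the loop with explicit jumps so that each statement mirrors one group of instructions. It is our reading of the disassembly, not compiler output, and the function name countWithJumps is hypothetical.

```cpp
// The same loop written with explicit jumps, mirroring the debug-build listing.
int countWithJumps() {
    int count = 0;            // stored at [ebp-8]
    int i = 0;                // stored at [ebp-14h]
    goto check;               // the first pass skips the increment of i
increment:                    // loc_4113DE
    i = i + 1;                // mov eax, [ebp-14h] / add eax, 1 / mov [ebp-14h], eax
check:                        // loc_4113E7
    if (i >= 10) goto done;   // cmp [ebp-14h], 0Ah / jge short loc_4113F8
    count = count + 1;        // mov eax, [ebp-8] / add eax, 1 / mov [ebp-8], eax
    goto increment;           // jmp short loc_4113DE
done:                         // loc_4113F8
    return count;
}
```

Reading a listing this way, label by label, is usually enough to recognize a for loop even when the compiler emits no dedicated loop instruction.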

Let’s see what will happen if we compile this code using the release build:

    00401000 main proc near
    00401000     mov ecx, ds:[email protected]
    00401006     push 0Ah
    00401008     call ds:[email protected]<<(int)
    0040100E     xor eax, eax
    00401010     retn
    00401010 main endp

This piece of code doesn’t look anything like the previous one. This is because the code was optimized. Technically, the loop was removed, since it doesn’t do anything other than increment the count variable to 10. So the optimizer kept only the final value of the count variable (0Ah, or 10) and passed it directly as the argument to the cout output operator.
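If you want to compare optimized and unoptimized output yourself, one option is to compile two variants of the loop. The sketch below is an illustration under the assumption that you build it with optimizations enabled and inspect the result in a disassembler. The function names are hypothetical. Because every access to a volatile object counts as observable behavior, the compiler has to keep the loads and stores in the second function.

```cpp
// An optimizing compiler is free to replace this whole body with "return 10",
// because the loop has no effect other than computing a constant.
int countFoldable() {
    int count = 0;
    for (int i = 0; i < 10; ++i) {
        count++;
    }
    return count;
}

// Each read and write of a volatile variable must actually be performed,
// so the increments cannot be folded into a constant.
int countObservable() {
    volatile int count = 0;
    for (int i = 0; i < 10; ++i) {
        count = count + 1;
    }
    return count;
}
```

Both functions return the same value. Only the generated code differs, which is exactly the gap between source and binary that a reverse engineer has to bridge.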

The compilers that we use nowadays are very good at optimizing code. That’s why when reverse engineering, it’s better to understand the idea behind the code (the principles of the code) rather than to try getting the original code itself. If you understand the idea behind the code, you can just write your own prototype that fits the original task.

It will be very useful to know what assembly code you’ll get if you compile different operators, structures, and other language constructions. Understanding resultant assembly code is a good way to start the C++ reverse engineering process, but we won’t get into technical details of it here.

Useful tools for reverse engineering Windows software

We’ve already described several reverse engineering tools, including ProcessMonitor and ProcessExplorer, in our application architecture research. These tools are absolutely indispensable for reverse engineering.

In this section, we’ll review the most popular disassemblers and a few more tools that we use for our reverse engineering projects.

You can get more details and usage examples in our article on best software reverse engineering tools.

Disassemblers

A disassembler is a program that translates an executable file to assembly language. The most popular one is IDA Pro.

IDA Pro


IDA Pro is a convenient and powerful tool for disassembly. It has a large set of tools that allow you to quickly disassemble a piece of software. It can show the function call tree, parse the imports and exports of an executable, and show information about them. It can even show the code as C-like pseudocode. Also, it supports multiple CPU architectures, so it’s possible to use IDA Pro to reverse engineer code for ARM, AVR, M68k, and many other architectures.

Radare


The Radare disassembler is an alternative to IDA. It basically has all the IDA features without being as robust and stable. But it’s free and open source. Radare itself is a console tool, but it has a Cutter frontend, which makes it a true alternative to IDA.

Windows Sysinternals

Windows Sysinternals utilities are generally used for management, diagnostics, troubleshooting, and monitoring of the Microsoft Windows environment. But they’re also suitable for reverse engineering Windows software.

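TCPView lists all TCP and UDP endpoints on the system, including the owning process, the local and remote addresses, and the connection state. This tool is useful for finding out which processes communicate over the network, and with which hosts, when reversing network protocols.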

PortMon is a physical system port monitor. It monitors serial and parallel ports and all traffic that goes through them.

WinObj shows all global objects in the system in a hierarchical structure. This tool can be useful when reversing an application that works with synchronization primitives such as mutexes and semaphores and also when reverse engineering kernel mode drivers.

Network monitoring tools

Wireshark


Wireshark is one of the most powerful network sniffers. It not only allows you to capture network traffic but also contains parsers for various network protocols, starting from really low-level like Ethernet, TCP, and IP to application-specific protocols like WebSockets and XMPP.

Fiddler


Fiddler is a web proxy that records traffic from browsers and allows you to analyze HTTP/HTTPS requests. Unlike Wireshark, it shows HTTP sessions instead of separate network packets. Fiddler also allows you to analyze compressed data sent over HTTP and analyze JSON and XML data when monitoring SOAP, REST, and AJAX requests.

API Monitor


API Monitor is a useful tool for discovering which APIs are called by an application and what behavior the application expects from those APIs. This tool has a powerful database and lets you see calls to a huge number of API functions of not only kernel32 and ntdll but also COM, managed environment, and others. Also, API Monitor provides convenient filtering mechanisms.

Debuggers

A debugger is invaluable for any developer to see what a program is doing right now. You get the same benefit from debugging when reversing applications as you get from debugging live applications.

The most popular debuggers are OllyDbg, WinDbg, and WinDbg Preview.

OllyDbg


OllyDbg (and its successor x64dbg) is probably the best debugger when it comes to software reverse engineering. It was specifically developed for the needs of reversing, and has all the tools needed for that purpose:

a built-in disassembler with the ability to analyze and identify key data structures

an import and export analysis feature

a built-in assembling and patching engine

The ability to parse API functions and their parameters makes it easy to reverse interactions with a system. The stack view provides a lot of information about the call stack. Another important advantage is that you can use OllyDbg with applications that are protected against debugging, where ordinary debuggers just can’t do anything.

WinDbg


Despite its simple interface, WinDbg has powerful tools for debugging. It has a built-in disassembler, various commands that allow you to know almost everything about the process/system you’re debugging, and the ability to do kernel-mode debugging, which is probably the most valuable feature. It’s a big advantage for reversing drivers, kernel-mode drivers in particular.

WinDbg Preview

WinDbg Preview is a new version of WinDbg developed by Microsoft. It’s distributed via the Windows Store only. It has all the features of the classic WinDbg coupled with a new UI and several new features. One of these new features is Time Travel Debugging, which allows you to record some period of program execution and then replay it as many times as you need. This way, you can step through the interesting parts of the code without being afraid of accidentally running too far and losing the context or the data.


Real-life software reverse engineering example

Now we’ll see an example of how to reverse engineer a piece of software. Let’s imagine you have a suspicious executable file. You need to find out what this program does and if it’s safe for users.

Considering the scenario, it’s a good idea not to run this executable on your work computer but to use a virtual machine instead. Let’s start the application in our virtual machine.

Process creates a service

As we can see, this file creates a Windows service named TestDriver. It has the type kernel, so we know it’s a driver. But where does it take the driver file from in order to run? We can use ProcessMonitor from Sysinternals Suite to find out. When we open ProcessMonitor, we can set up filters to show us only the file activity from the process we’re interested in. Its activity log looks like this:

FileMon information

The driver file is created by the process that we’re reversing, and this process puts this file in the user’s temp directory. There’s no need to look for the file in the temp folder since we see that the process deletes it right after use. So what does the process do with this file? If it unpacks the file, we may try to find it in the process’s resource section, since this is a common place to store such data. Let’s look there. We’ll use another tool – Resource Hacker – to examine the resources. Let’s run it:

Examine resources with Resource Hacker

Bingo! As we can see from the found resource content, this is probably the Windows executable file, since it starts with an MZ signature and has the string “This program cannot be run in DOS mode.” Let’s check if it’s our driver file. For that purpose, we extract the resource using Resource Hacker and open it in the disassembler.
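The signature check we just did by eye can also be automated. Below is a minimal sketch, assuming the extracted resource has been loaded into a byte buffer. The function names looksLikePe and readLe32 are hypothetical. The offsets are those of the PE file format: the MZ signature at offset 0 and, at offset 0x3C, a 32-bit little-endian offset that points to the PE\0\0 signature.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Read a 32-bit little-endian value; the caller guarantees off + 4 <= b.size().
static std::uint32_t readLe32(const std::vector<std::uint8_t>& b, std::size_t off) {
    return static_cast<std::uint32_t>(b[off]) |
           (static_cast<std::uint32_t>(b[off + 1]) << 8) |
           (static_cast<std::uint32_t>(b[off + 2]) << 16) |
           (static_cast<std::uint32_t>(b[off + 3]) << 24);
}

// Return true if the buffer starts with a DOS header whose e_lfanew field
// (offset 0x3C) points to a "PE\0\0" signature inside the buffer.
bool looksLikePe(const std::vector<std::uint8_t>& b) {
    if (b.size() < 0x40) return false;                 // too small for a DOS header
    if (b[0] != 'M' || b[1] != 'Z') return false;      // DOS signature
    const std::uint32_t peOffset = readLe32(b, 0x3C);  // e_lfanew
    if (peOffset > b.size() - 4) return false;         // signature would be out of bounds
    return b[peOffset] == 'P' && b[peOffset + 1] == 'E' &&
           b[peOffset + 2] == 0 && b[peOffset + 3] == 0;
}
```

A positive result only says that the data has the layout of a Windows executable. Whether it is the driver we are looking for still has to be confirmed in the disassembler.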

Disassembler screen

As we know, DriverEntry is the entry point for kernel-mode drivers in Windows systems. We can continue our research, as it looks like we’ve found the right driver.

How to reverse engineer a driver

To begin reverse engineering the driver, we examine functions that are called from DriverEntry one by one. If we go to sub_14005, we find nothing interesting, so we continue with sub_110F0 and find this code:

Code piece 1

Code piece 2

Code piece 3

Code piece 4

Some lines are omitted here for the sake of simplicity.

In the first listing, a unicode string is created, and this string points to the path C:\hello.txt. After that, the structure OBJECT_ATTRIBUTES is filled with regular values; we know that this structure is often needed when calling functions like ZwCreateFile.

In the second listing, we see that ZwCreateFile is indeed called, which makes us pretty sure that the driver creates the file – and we know where this file is located after it’s created.

From the third and fourth listings, we can see that the driver takes the unicode string and writes it to the buffer (this happens in the sub_11150 function), and the buffer will be written to the file using the ZwWriteFile function. At the end, the driver closes the file using the ZwClose API.

Let’s summarize. We found out that the original program extracts the driver file from its resources, puts it in the temp folder of the current user, creates the Windows service for this driver, and runs it. After that, the program stops and deletes the service and the original driver file from the temp directory. From this behavior and from analyzing the disassembly, it appears that the driver doesn’t do anything except create a file on the C drive named hello.txt and write the string “Hello from driver”.
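Once the idea behind the code is clear, it can be restated as a small prototype. The sketch below is a user-mode approximation of the recovered logic, not the driver’s actual code: it uses the C++ standard library instead of ZwCreateFile, ZwWriteFile, and ZwClose, takes the path as a parameter, and assumes a plain narrow-character string and overwrite semantics, neither of which we verified. The function name writeHelloFile is hypothetical.

```cpp
#include <fstream>
#include <string>

// Create (or overwrite) the file, write the message, and close the file.
// These are the same three steps we identified in the driver's disassembly.
bool writeHelloFile(const std::string& path) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    if (!out) {
        return false;            // the file could not be created
    }
    out << "Hello from driver";  // the string recovered from the driver
    out.close();
    return !out.fail();          // report write or close errors
}
```

Calling this function with the path found in the first listing would reproduce the effect we expect to see from the driver.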

Now we need to check if we’re correct. Let’s run the program and check the C drive:

Application screen

Wonderful! We’ve reverse engineered this simple program and now we know that it’s safe to use.

We could have achieved this result in many different ways: using a debugger or API Monitor, writing tests, etc. You can find your own ways to reverse engineer software that work for you.

Conclusion

Windows software reverse engineering requires a solid educational background and programming experience. In order to perform reverse engineering, you need to combine skills in disassembling, network monitoring, debugging, and API integration with knowledge of several programming languages, compilers, etc. You also have to be very careful when reversing software in order not to break copyright laws or harm your system.

At Apriorit, we have an experienced team of reverse engineers. If you want to apply reverse engineering skills to your project, feel free to contact us!