If you have used IDA Pro for a while, you might have noticed that it contents itself with simple things. It neatly displays the disassembly listing. It allows you to improve the listing by adding names and comments. You can manually define your symbols, types, and functions. IDA itself can add some types and discover some program properties, but overall the analyses it performs appear to be rather modest.



On the other hand, things seem to be simple. After spending some time contemplating these always-the-same code patterns, one gets used to them. Many binary analysis problems start to look elementary, if not boring. One could even think: it is so obvious, why not automate it? Let the computer find things for us. Questions like "where will this register be used?" or "where does that value come from?" could be answered by the computer! Sure, there will be situations where it fails, but at least in the other cases the answers could be calculated without any human intervention!…

When we implement the first naive approach, it turns out to work with our test program.

Then we switch to another program and get our first surprise: it does not work at all!

Switching to another file, compiled by another compiler or even by the same compiler with a different set of options, spoils everything. The patterns which worked for the first compiler no longer match and the analysis fails. The next step could be to add patterns for the second compiler. Then for the third. If we take this approach, our analysis tool will never be complete.

Imagine we want to find where a function expects its input arguments. This question arises

very often during binary analysis and answering it correctly is very important.

Without knowing where to look for the input we can never find out what the input values are.

For example, this function

    push  ebp
    mov   ebp, esp
    movsx eax, [ebp+8]
    ...

apparently expects one of its arguments on the stack because it copies

something above the return address to the eax register.

Another example. This function

    push ebp
    mov  ebp, esp
    call [ecx+10h]
    ...

uses ecx without initializing it. Thus ecx must already

have an initialized value,

otherwise the application would crash. So the rule might be

INPUT ARGUMENTS RULE
If the function uses a register or stack location without initializing it, then it expects an input argument at that location.
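To make the rule concrete, here is a minimal sketch in Python. It is purely illustrative, not IDA's implementation: each instruction is modeled as a pair of register sets (reads, writes), and a register read before any write is flagged as an input.

```python
# Illustrative sketch of the input-arguments rule, not IDA's implementation.
# Each instruction is modeled as (registers read, registers written).
def find_input_regs(insns):
    defined = set()   # registers written so far
    inputs = set()    # registers read before being written
    for reads, writes in insns:
        for r in reads:
            if r not in defined:
                inputs.add(r)  # used without prior initialization
        defined.update(writes)
    return inputs

# mov eax, ecx / add eax, edx: ecx and edx are read before any write,
# so the rule reports them as input arguments.
args = find_input_regs([({"ecx"}, {"eax"}), ({"eax", "edx"}, {"eax"})])
```

On straight-line code this already matches the intuition; the trouble starts with real functions.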



We have not defined what the word "uses" means. Let's say that a direct read access constitutes a use. Usually arguments are accessed directly, so the requirement of a direct access does not limit our analysis.

We ‘just’ need to perform the data analysis and find all these registers.

Unfortunately, this does not work at all. Here is a counterexample

    push ebp
    mov  ebp, esp
    push esi
    ...
    pop  esi
    pop  ebp
    ret

This push instruction reads the esi register.

But we all know that a push/pop

pattern like this represents the ‘save-restore’ idiom. The register is not really used.

We do not care about its value, we just save it on the stack and restore it at the exit.

For any experienced IDA user all this is pretty obvious.

We have to amend our rule about the function

inputs. It could look like this:

INPUT ARGUMENTS RULE, VERSION 2
If the function uses a register or stack location without initializing it, then it is an input argument. Watch out for preserved registers: the save-restore instructions must be ignored during this analysis!



My modifications to the rule are very vague. I did so for a reason: the problem of finding

preserved registers is not simple at all!

We will build an algorithm to find preserved registers.

In our first try, we look for push/pop pairs:

PRESERVED REGISTERS RULE, VERSION 1
A register is preserved by the function if there is a pop at the end of the function (before the return instruction) and a push instruction at the beginning of the function which corresponds to it. In this case we can remove the push/pop pair and repeat.



This rule works perfectly well with this function:

    push edi
    push esi
    ...
    pop  esi
    pop  edi
    ret
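Version 1 of the rule can be sketched as simple bracket matching over a linear instruction list. This is an illustrative toy, not IDA's implementation, and it assumes a single return at the end:

```python
# Illustrative sketch of rule version 1: pair the leading pushes with the
# trailing pops before the final ret, in LIFO order.
def preserved_by_push_pop(insns):
    pushes = []
    for ins in insns:                      # collect the prolog pushes
        if ins.startswith("push "):
            pushes.append(ins.split()[1])
        else:
            break
    if not insns or insns[-1] != "ret":    # single-return functions only
        return set()
    pops = []
    i = len(insns) - 2
    while i >= 0 and insns[i].startswith("pop "):
        pops.append(insns[i].split()[1])   # epilog pops, collected backwards
        i -= 1
    preserved = set()
    for a, b in zip(pushes, pops):         # push order must mirror pop order
        if a != b:
            break
        preserved.add(a)
    return preserved

# For the function above, both edi and esi are reported as preserved.
regs = preserved_by_push_pop(
    ["push edi", "push esi", "...", "pop esi", "pop edi", "ret"])
```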

Alas, a counterexample is obvious:

    push edi
    push esi
    ...
ret1:
    pop  esi
    pop  edi
    ret
    ...
ret2:
    pop  esi
    pop  edi
    ret

Strictly speaking we do not have pairs here. There are several pops for each push.

The second version of the rule:

PRESERVED REGISTERS RULE, VERSION 2
A register is preserved by the function if for each return instruction there is a pop instruction, and there is a push instruction at the beginning of the function which corresponds to them. In this case we can remove all the pop instructions and the push instruction. Repeat the process while possible.



I’m sure you realize that this rule is too naive… Let’s check this code (generated by GNU C++):

    push ebp
    mov  ebp, esp
    sub  esp, 18h
    mov  [ebp-8], ebx
    mov  [ebp-4], esi
    ...
    mov  ebx, [ebp-8]
    mov  esi, [ebp-4]
    mov  esp, ebp
    pop  ebp
    retn

So the push instruction is not the only means of preserving registers. This example destroys our rule beyond any hope: the rule was built around the idea that a function pushes registers onto the stack, does its job, then pops them off the stack. Since the push/pop instructions can be replaced by movs, we have to abandon this approach. Here are two more examples.

The first one:

    push ebp
    mov  ebp, esp
    push esi
    mov  [ebp-4], edi
    ...
    mov  edi, [ebp-4]
    mov  esp, ebp
    pop  ebp
    retn

Please note that esi is not preserved by this function: it is pushed but never restored!

Here is the second one:

    push ebp
    mov  ebp, esp
    sub  esp, 20h
    ...
    push esi
    ...
    pop  esi
    ...
    mov  esp, ebp
    pop  ebp
    retn

As you see, a register is saved in the middle of the function.

To summarize:

- it doesn’t make sense to look only for push/pop instructions: other instructions can be used to preserve registers
- it doesn’t make sense to check only the function prolog/epilog: a register can be saved/restored anywhere in the function

I used microcode to overcome the instruction variety. All x86 instructions, be it push, mov, or pusha, are converted to the microcode data transfer instruction. Data flow analysis helps to find these instructions anywhere in the function body.
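As a toy illustration of this normalization (not IDA's actual microcode, which is far richer), several x86 instruction forms can be folded into one generic transfer shape:

```python
# Toy normalization of x86 save-style instructions into a generic
# ('mov', dst, src) transfer; real microcode is far richer than this.
def to_microcode(ins):
    op, _, rest = ins.partition(" ")
    if op == "push":
        return ("mov", "stack", rest)      # push reg  ->  stack slot := reg
    if op == "pop":
        return ("mov", rest, "stack")      # pop reg   ->  reg := stack slot
    if op == "mov":
        dst, src = (x.strip() for x in rest.split(","))
        return ("mov", dst, src)           # already a plain transfer
    return (op, rest)                      # leave everything else as-is

# push, pop and mov all become the same transfer shape:
forms = [to_microcode(i) for i in ("push esi", "mov [ebp-8], ebx", "pop esi")]
```

Once everything is a transfer, the analysis no longer cares which concrete instruction the compiler chose.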

Here is the outline of the data flow analysis:

create the function flow chart

add empty entry and exit blocks to it

the entry block will pretend to define all registers used in the function

the exit block will pretend to use all registers defined by the function

for each block compute the use/def lists

use the iterative algorithm to compute du-ud chains.
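The iterative step of the outline above can be sketched as the classic fixed-point liveness computation. This is the generic textbook version with illustrative block names, not IDA's internals:

```python
# Generic iterative liveness computation (textbook version); block names
# and structure are illustrative, not IDA's internals.
def liveness(blocks, succ):
    """blocks: {name: (use_set, def_set)}; succ: {name: [successor names]}."""
    live_in = {b: set() for b in blocks}
    live_out = {b: set() for b in blocks}
    changed = True
    while changed:                      # iterate to a fixed point
        changed = False
        for b, (use, defs) in blocks.items():
            out = set()
            for s in succ.get(b, []):   # live-out = union of successors' live-in
                out |= live_in[s]
            new_in = use | (out - defs)
            if new_in != live_in[b] or out != live_out[b]:
                live_in[b], live_out[b] = new_in, out
                changed = True
    return live_in, live_out

# The entry block "defines" x and the body uses it, so x is live out of entry.
li, lo = liveness({"entry": (set(), {"x"}), "body": ({"x"}, set())},
                  {"entry": ["body"]})
```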

There were many minor problems, but they are too boring to delve into. To make the story short, I’ll just show the final version of the rule:

PRESERVED REGISTERS RULE, FINAL VERSION
A register is preserved by the function iff:

1. its definition by the entry block is used somewhere in the function;
2. the instruction which uses that definition copies the register to a stack location or a register. We will call this instruction the save instruction. There can be several save instructions for a register;
3. the instruction which defines the value used by the exit block copies the register from a stack location or a register. We will call this instruction the restore instruction. There can be several restore instructions for a register;
4. the locations at steps 2 and 3 are the same. We will call this location the save location;
5. there is at least one path from any save instruction to a restore instruction;
6. the stack location used to save the register is not accessed (read or write) on any path connecting the save and restore instructions.
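A much-simplified, straight-line sketch of checking these conditions follows. The real analysis works on du-ud chains over the whole flow graph; all names here are illustrative:

```python
# Much-simplified linear check of the final rule; illustrative only.
# Instructions are generic transfers ('mov', dst, src) produced by an
# earlier normalization pass; other tuples are ignored for save/restore
# detection but still checked for accesses to the save location.
def is_preserved(insns, reg):
    save = None     # (index, save location)
    restore = None  # index of the matching restore
    for i, ins in enumerate(insns):
        if ins[0] != "mov":
            continue
        _, dst, src = ins
        if src == reg and save is None:
            save = (i, dst)                       # save instruction
        elif save is not None and dst == reg and src == save[1]:
            restore = i                           # restore instruction
    if save is None or restore is None:
        return False
    loc = save[1]
    # the save location must not be touched between save and restore
    return all(loc not in ins[1:] for ins in insns[save[0] + 1 : restore])

# ebx is saved to [ebp-8], the slot is untouched, and ebx is restored:
ok = is_preserved([("mov", "[ebp-8]", "ebx"),
                   ("mov", "eax", "ecx"),
                   ("mov", "ebx", "[ebp-8]")], "ebx")
```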

This rule is still not perfect (indirect accesses are not taken into account), but overall it

works well.

Please note that we considered only compiler-generated code. Hand-crafted code is almost always impervious to automated analysis. Fortunately, there is less and less assembler code.

I hope this small example illustrates my point: binary analysis problems are much more difficult

than they look. Simple approaches simply do not work.

Even if we strictly limit ourselves to compiler-generated code, they still fail. They work only if we limit ourselves to a specific version of a specific compiler, or even to a specific binary.

So crafting a naive script to solve

today’s problem might be the right thing to do. If not, expect monstrous rules like the above

to haunt your logic…

P.S. The gcc compiler source code is abundant with lengthy and complex rules; does it

mean that we have to live with that? 😉