


In today’s post we discuss a tool that automatically detects reads of uninitialized memory, on both the stack and the heap. The tool is similar to a (really simple) variant of the Valgrind [1] project, but made for Windows (adding Linux support to our tool would be really easy, though). Besides that, it can also be used to learn more about the Pin framework.

We will show the reader simple examples of accessing uninitialized memory, how to prevent such accesses, how to detect them, and finally a Pintool (a tool that uses Pin) which detects uninitialized memory read access bugs.

First of all, the inspiration for this blog entry came from reading this blog post.

Uninitialized memory is, as the name suggests, data which has not been initialized yet, and therefore it must be assumed to be garbage data. Stack variables are uninitialized by default, until they have been assigned. The same goes for allocated memory on the heap; it is uninitialized until it has been assigned.

Uninitialized variables may lead to crashes or other undefined behavior, because the variable contains garbage data. For example, when using an uninitialized pointer, chances are that the pointer points to non-existent memory, and therefore the application will crash.

That being said, it’s time to show some examples of accessing

uninitialized memory, and how to prevent them.

The first Proof of Concept application looks like the following. We

allocate a variable on the stack, and read from it before writing to it.

    #include <stdio.h>

    int main()
    {
        int a;
        printf("a: %d\n", a);
        return 0;
    }

The contents of the variable a must be considered undefined.

Therefore the printf() statement will print out some garbage number (do

note that, when running the application several times, the number might

remain constant.)

By assigning a value to the a variable we would, obviously, get rid

of the bug. Take for example the following code, which does not contain an

uninitialized memory access bug.

    #include <stdio.h>

    int main()
    {
        int a = 42;
        printf("a: %d\n", a);
        return 0;
    }

A somewhat more advanced Proof of Concept application can be found

here.

    // http://reversingonwindows.blogspot.com/2012/07/detecting-read-access-to-uninitialized.html
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define MAX 16

    typedef struct _CONTEXT {
        int arr[MAX];
        int a;
        int b;
        int c;
    } CONTEXT;

    void init(CONTEXT *ctx)
    {
        memset(ctx->arr, 0, sizeof(ctx->arr[0]) * (MAX-1));
        ctx->a = 1;
    }

    void process(CONTEXT *ctx)
    {
        int trash;
        for (int i = 0; i < MAX; i++) {
            trash = ctx->arr[i];
        }
    }

    void process2(CONTEXT *ctx)
    {
        ctx->b = ctx->c;
    }

    void process3(int num)
    {
        int trash;
        if(num != 0) {
            trash = num;
        }
    }

    int main(int argc, char *argv[])
    {
        CONTEXT ctx;

        // Erroneously initializes context. The last element of arr
        // member remains uninitialized. b and c members remain
        // uninitialized, too.
        init(&ctx);

        // Accesses to each element of the array. Read-before-write
        // error should be reported in this function.
        process(&ctx);

        // Copies c to b but c is uninitialized. Read-before-write
        // error should be reported in this function.
        process2(&ctx);

        // This contains no read-before-write bug.
        process3(ctx.a);
    }

This Proof of Concept contains two uninitialized memory access bugs (both

on the stack, again.) The first happens in the process function,

because the last element of the ctx->arr array has not been set by

the init function. The second bug occurs in the process2

function, because, as you may notice, the c member of the

ctx object has not been set yet.

A simple fix might look like the following (by altering only the

init function.)

    void init(CONTEXT* ctx)
    {
        memset(ctx, 0, sizeof(*ctx));
        ctx->a = 1;
    }

This fix initializes the entire ctx object to zeros, and sets the a member to the value one afterwards, ensuring that the object does not contain garbage data, but instead zeros (which we consider initialized data here).
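If the code that defines the object can be changed as well, the same effect can be had without memset. The following is a minimal sketch, not taken from the original Proof of Concept: make_context is a hypothetical helper, and it relies on the C rule that members not named in an initializer are set to zero.

```c
#include <assert.h>

#define MAX 16

typedef struct _CONTEXT {
    int arr[MAX];
    int a;
    int b;
    int c;
} CONTEXT;

/* Hypothetical helper (not part of the original PoC): builds a fully
   initialized CONTEXT. The "{0}" initializer sets arr[0] to zero
   explicitly and every remaining member, including arr[MAX-1], b and c,
   to zero implicitly. */
static CONTEXT make_context(void)
{
    CONTEXT ctx = {0};
    ctx.a = 1;
    assert(ctx.arr[MAX - 1] == 0);  /* the element the buggy init() missed */
    return ctx;
}
```

With this helper, main would start with "CONTEXT ctx = make_context();" instead of declaring ctx and calling init.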

The third Proof of Concept application is based on heap memory. By intercepting calls to the malloc function (actually, we intercept calls to the RtlAllocateHeap [2] function, although that one is Windows-specific) we can determine how many bytes have been allocated at which address. For example, when an application allocates 32 bytes, we mark the address of each of these 32 bytes as uninitialized. The following Proof of Concept application shows uninitialized memory access from the heap.

    #include <stdio.h>
    #include <stdlib.h>

    int main()
    {
        int *a = (int *) malloc(sizeof(int) * 3);
        a[1] = 0;
        printf("a0: %d\n", *a);
        printf("a1: %d\n", a[1]);
        printf("a2: %d\n", a[2]);
        return 0;
    }

In this application we allocate memory for three integers, set only one (the second), and print all three of them. This results in two uninitialized memory access bugs (reading the first integer and the third integer from the array).

A simple fix is, for example, to replace the malloc call with a calloc [3] call, which initializes the memory to zero.

    #include <stdio.h>
    #include <stdlib.h>

    int main()
    {
        int *a = (int *) calloc(3, sizeof(int));
        a[1] = 0;
        printf("a0: %d\n", *a);
        printf("a1: %d\n", a[1]);
        printf("a2: %d\n", a[2]);
        return 0;
    }

So we detect two types of uninitialized memory reads: on the stack and on the heap.

Detection on the stack works as follows. The prolog of a function (usually) starts by saving a copy of the stack pointer, followed by subtracting an immediate from the stack pointer. The amount that is subtracted denotes the amount of memory needed for stack variables. In our Pintool we detect such subtract instructions, and when they execute, we mark the memory that has just been allocated (by subtracting from the stack pointer) as uninitialized.
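The stack handling can be sketched in plain C as follows. This is not the Pintool's source: the flat shadow array, the small address window, and the names shadow_mark_uninit and on_stack_sub are assumptions made for illustration. A real Pintool would obtain the stack pointer and the immediate from Pin's instrumentation callbacks.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy shadow memory: one byte of state per byte of a small, hypothetical
   address window. 1 = initialized, 0 = uninitialized. */
#define WINDOW_BASE 0x00100000u
#define WINDOW_SIZE 0x1000u

static uint8_t shadow[WINDOW_SIZE];

/* Mark the range [addr, addr + size) as uninitialized. */
static void shadow_mark_uninit(uint32_t addr, uint32_t size)
{
    assert(addr >= WINDOW_BASE && size <= WINDOW_SIZE &&
           addr - WINDOW_BASE <= WINDOW_SIZE - size);
    memset(&shadow[addr - WINDOW_BASE], 0, size);
}

/* Hypothetical callback for a "sub esp, imm" instruction. The stack grows
   downwards, so the freshly reserved locals occupy
   [esp_before - imm, esp_before). Returns the new stack pointer. */
static uint32_t on_stack_sub(uint32_t esp_before, uint32_t imm)
{
    uint32_t esp_after = esp_before - imm;
    shadow_mark_uninit(esp_after, imm);
    return esp_after;
}
```

For example, a prolog that subtracts 0x20 from a stack pointer of WINDOW_BASE + 0x100 marks the 32 bytes starting at WINDOW_BASE + 0xE0 as uninitialized and leaves the caller's frame above it untouched.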

For the heap we deploy similar functionality. When a chunk of memory has been allocated by the application, we mark it as uninitialized (unless the zero-memory flag has been set for the RtlAllocateHeap [2] function on Windows, which initializes the memory to zeros).
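The decision made at allocation time might look roughly like the sketch below. It is a simplification under stated assumptions: it tracks whole blocks in a small fixed table instead of marking each byte, and on_heap_alloc, heap_byte_uninitialized, and alloc_record are hypothetical names. The flag value 0x00000008 is that of HEAP_ZERO_MEMORY from the Windows headers, repeated here so the sketch compiles anywhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same value as the Windows HEAP_ZERO_MEMORY flag. */
#define SKETCH_HEAP_ZERO_MEMORY 0x00000008u

#define MAX_RECORDS 64

/* Hypothetical record of one heap allocation. */
typedef struct {
    uintptr_t base;
    size_t    size;
    bool      initialized;  /* true when the allocator zeroed the block */
} alloc_record;

static alloc_record records[MAX_RECORDS];
static size_t record_count;

/* Hypothetical hook, called after the allocator has returned 'base' for a
   request of 'size' bytes made with 'flags'. */
static void on_heap_alloc(uintptr_t base, size_t size, uint32_t flags)
{
    if (base == 0)
        return;  /* the allocation failed, nothing to track */
    assert(record_count < MAX_RECORDS);  /* toy table, fixed capacity */
    records[record_count].base = base;
    records[record_count].size = size;
    records[record_count].initialized =
        (flags & SKETCH_HEAP_ZERO_MEMORY) != 0;
    record_count++;
}

/* Returns true when 'addr' lies inside a tracked block that was not
   zeroed by the allocator. Untracked addresses return false. */
static bool heap_byte_uninitialized(uintptr_t addr)
{
    for (size_t i = 0; i < record_count; i++) {
        const alloc_record *r = &records[i];
        if (addr >= r->base && addr - r->base < r->size)
            return !r->initialized;
    }
    return false;
}
```

A per-byte tracker, as the post describes, would additionally flip individual bytes to initialized as they are written; this sketch only captures the initial state of a fresh block.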

From here on, all read and write instructions are traced, and we simply keep track of which memory has been written to and which has not (uninitialized data). When the application reads from a memory address which is uninitialized, we print that address and the address of the instruction performing the read, so somebody can investigate the problem further and attempt to fix it using one of the (simple) techniques listed earlier.
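A minimal sketch of this read/write bookkeeping is shown below. The handler names on_write and on_read, the flat array, and the address window are assumptions for illustration; in the actual tool these handlers would be driven by Pin for every memory operand. The message format mirrors the tool's output shown at the end of this post.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define WINDOW_BASE 0x00200000u
#define WINDOW_SIZE 0x1000u

/* 1 = written at least once, 0 = never written (uninitialized). */
static uint8_t written[WINDOW_SIZE];

/* True when [addr, addr + size) lies inside the toy window. */
static int in_window(uint32_t addr, uint32_t size)
{
    return addr >= WINDOW_BASE && size <= WINDOW_SIZE &&
           addr - WINDOW_BASE <= WINDOW_SIZE - size;
}

/* Hypothetical handler for every memory write of 'size' bytes. */
static void on_write(uint32_t addr, uint32_t size)
{
    assert(size > 0);
    if (!in_window(addr, size))
        return;
    for (uint32_t i = 0; i < size; i++)
        written[addr - WINDOW_BASE + i] = 1;
}

/* Hypothetical handler for every memory read; 'ip' is the address of the
   reading instruction. Reports the first unwritten byte, if any, and
   returns 1 when a report was printed, 0 otherwise. */
static int on_read(uint32_t ip, uint32_t addr, uint32_t size)
{
    assert(size > 0);
    if (!in_window(addr, size))
        return 0;
    for (uint32_t i = 0; i < size; i++) {
        if (!written[addr - WINDOW_BASE + i]) {
            printf("untainted address 0x%08x is being read @ 0x%08x..\n",
                   (unsigned)(addr + i), (unsigned)ip);
            return 1;
        }
    }
    return 0;
}
```

Reading four bytes before any write is reported; after a four-byte write to the same address the read is silent, while a read that overlaps the written range only partially is still reported.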

Other than that, we store our taint data (data which keeps track of which memory is initialized and which is not) in 128 KB chunks. That is, we have a list in which every entry points to the taint data for 128 KB of memory (we store this in 16 KB of memory by using one taint bit per byte). These entries are allocated on demand in order to reduce the memory footprint, but the overhead is still fairly big (as always with taint data).
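The chunked layout can be illustrated with the following sketch, which assumes a 32-bit address space (so 2^32 / 128 KB = 32768 list entries). The names chunk_for, taint_set, and taint_get are hypothetical, and treating a region without a chunk as uninitialized is an assumption of this sketch rather than something the post specifies.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CHUNK_SPAN  (128u * 1024u)     /* program bytes covered per chunk  */
#define CHUNK_BYTES (CHUNK_SPAN / 8u)  /* 16 KB bitmap: one bit per byte   */
#define CHUNK_COUNT (1u << 15)         /* 2^32 / 128 KB (32-bit addresses) */

/* One pointer per 128 KB region; NULL until the region is first marked. */
static uint8_t *chunks[CHUNK_COUNT];

/* Return the bitmap covering 'addr'. When 'create' is nonzero, a missing
   bitmap is allocated (zero-filled) on demand. */
static uint8_t *chunk_for(uint32_t addr, int create)
{
    uint32_t idx = addr / CHUNK_SPAN;
    assert(idx < CHUNK_COUNT);
    if (chunks[idx] == NULL && create)
        chunks[idx] = calloc(CHUNK_BYTES, 1);
    return chunks[idx];
}

/* Set (initialized != 0) or clear the taint bit of a single byte. */
static void taint_set(uint32_t addr, int initialized)
{
    uint8_t *bits = chunk_for(addr, 1);
    uint32_t off = addr % CHUNK_SPAN;
    if (bits == NULL)
        return;  /* out of memory: this sketch silently skips the update */
    if (initialized)
        bits[off / 8] |= (uint8_t)(1u << (off % 8));
    else
        bits[off / 8] &= (uint8_t)~(1u << (off % 8));
}

/* Return 1 when the byte at 'addr' is marked initialized, else 0. */
static int taint_get(uint32_t addr)
{
    const uint8_t *bits = chunk_for(addr, 0);
    uint32_t off = addr % CHUNK_SPAN;
    return bits != NULL && ((bits[off / 8] >> (off % 8)) & 1u);
}
```

Only regions that are actually marked cost 16 KB each; the pointer list itself is a fixed cost, which is part of the overhead mentioned above.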

Binaries of the Pintool and the Proofs of Concept presented earlier can be found here; up-to-date source can be found here.

An example run of the Pintool against the Proof of Concept binaries looks

like the following.

    gcc -std=c99 -O0 -o poc1.exe poc1.c
    ../../../ia32/bin/pin -t obj-ia32/readb4write.dll -- poc1.exe
    untainted address 0x0027ff1c is being read @ 0x004013c5..
    a: 2130567168

    gcc -std=c99 -O0 -o poc2.exe poc2.c
    ../../../ia32/bin/pin -t obj-ia32/readb4write.dll -- poc2.exe
    untainted address 0x0027ff10 is being read @ 0x004013c6..
    untainted address 0x0027ff1c is being read @ 0x004013dd..

    gcc -std=c99 -O0 -o poc3.exe poc3.c
    ../../../ia32/bin/pin -t obj-ia32/readb4write.dll -- poc3.exe
    untainted address 0x02792bd0 is being read @ 0x004013e6..
    a0: 41490040
    a1: 0
    untainted address 0x02792bd8 is being read @ 0x00401418..
    a2: 0