Understanding Buffer Overflows

A buffer is a region of memory set aside to hold data. A buffer overflow occurs when the data written to a buffer exceeds the buffer’s allotted size and spills into adjacent memory within the program. Think of two 5 L buckets standing next to one another as the buffers and water as the data. When more than 5 L of water is poured into one bucket, the excess overflows into the other bucket. This is an oversimplified analogy of a buffer overflow; however, it illustrates the basic concept. Where the buffer is located determines the type of buffer overflow attack: either a stack buffer overflow or a heap buffer overflow.

Stack Buffer Overflows

Memory Architecture

A stack buffer overflow attack is defined as one that occurs “when the targeted buffer is located on the stack, usually as a local variable in a function’s stack frame”.[1] To understand a stack buffer overflow, the stack itself must first be examined and understood. When a program is started, the operating system allocates virtual memory to the program’s process.

[Figure 1: Memory Structure of a Program]

The allocated memory is organized into the following sections: text, data (initialized data and uninitialized data), stack and heap. The operating system, supported by the CPU’s memory management hardware, provides each process with its own virtual address space. The structure of each of these regions is shown in the figure provided. Depending on the CPU architecture, the stack is placed at either the bottom or the top of the address space and grows either upward or downward. The stack and the heap grow in opposite directions, toward each other, into the free memory between them.

The stack area is a last in, first out (LIFO) data structure: objects are pushed onto the stack, and the last object pushed is the first object to be popped off. PUSH and POP are the operations used to add objects to and remove objects from the stack. Imagine 5 boxes stacked on top of each other, and suppose the 3rd box from the bottom is needed. Pulling that box out directly would cause the 2 boxes above it to fall. To prevent that from happening, the top 2 boxes are popped off the stack first.

The stack is used to keep track of the functions and procedures that the program is running, as well as any parameters or local variables that those functions need. When a function is called, a new structure called a stack frame is created on the stack. The stack frame supports the execution of the function being called and contains a return address, local variables and any arguments passed to the function. The return address is used to return control to the caller when the function finishes. The location of this information is tracked in CPU registers, which are small, fast storage locations built into the processor. Registers hold instructions, data or memory addresses that the processor can access quickly.

Microprocessors use general-purpose registers for storing both data and addresses. On 16-bit x86 processors, some important general-purpose registers are sp (stack pointer), bp (base pointer), and ax (the accumulator). On 32-bit processors the corresponding registers are esp (extended stack pointer), ebp (extended base pointer), and eax (extended accumulator); the “e” prefix indicates that the register was extended to 32 bits in x86 assembly language. The first register, esp, is the stack pointer, and it stores the address of the top of the stack. As objects are pushed onto and popped off the stack, the address stored in esp changes. Next is the ebp register, or base pointer, which holds a fixed address at the base of the stack frame of the function that is currently running.

[Figure 2: Stack Frame and CPU Registers]

The ebp register serves as a reference point for accessing arguments and local variables, and it changes only when another function is called or the current function returns. Lastly, the eax register, or accumulator, is used to store the return value of a function and is also used in arithmetic instructions. Figure 2 displays a single stack frame and its relation to these CPU registers.

To successfully exploit a stack buffer overflow, the return address on the stack must be overwritten. This is achieved by overflowing the local buffer shown in the figure above: writing past the end of the buffer overwrites the saved return address with a new one. Problems arise when the return address is only partially overwritten or the new address is misaligned with the return address slot, because the function then returns to an unintended location and the program most likely crashes. Determining the spacing, or offset, between the local buffer and the return address keeps the overwrite aligned.

With the buffer offset determined, a malicious payload can be crafted. The payload fills the local buffer up to the return address, replaces the return address with a different address and injects shellcode, successfully exploiting the program.

In the next post we will go over a simple C program in which we will trigger a stack buffer overflow and exploit it to return to libc.