New systems make attackers life hard and common exploitation techniques get harder to reproduce. The purpose of this article is to be very general on mitigation techniques and to cover attacks on x32 as a reference to x64 architectures to stick with the new constraints of today.

Here, you will find the first step which is an ELF format file analysis. After that we will speak about the protections and ways to bypass them. To finish, we will introduce the x86_64 that makes things more difficult for nowadays exploitations.

Pre-requisites:

Basics in Linux, asm x86,



a good understanding of buffer overflows, format string exploitations, heap overflows,

0x00900dc0ff33,

Ubuntu 11.04 on x86_64.

a default song…: Zeads dead – Paradise Circus (Massive attack Remix)

Here is the contents:

The ELF format

A standard



Where is it used?



ELF Layout

Dissecting the ELF

The “magic” field



Reversing ELF’s header



Sections



Relocations



Program Headers

Exploitations

Old is always better (for attackers)



Nonexecutable stack



Address Space Layout Randomization



Brute-force



Return-to-registers



Stack Canary



RELRO

The x86_64 fact and current systems hardening

References & Acknowledgements

The ELF format

A standard

Replacing the COFF and “a.out” formats that Linux previously used, ELF (Executable and Linking Format) increased flexibility. Indeed, when shared libraries are difficult to create and dynamically loading a shared library is cumbersome with “a.out” format, the ELF format has come with these two benefits[1]:

It is much simpler to make shared libraries,

It make dynamic loading and has comes with other suggestions for dynamic loading have included super-fast MUDs (Multi-User Domains also known as “Multi-User Dungeon”), where extra code could be compiled and linked into the running executable without having to stop and restart the program.

This format has been selected by the Tool Interface Standards committee (TIS) as a standard for portable object files for a variety of (Unix-Like) operating systems.

Where is it used?

Actually ELFs cover object files (.o), shared libraries (.so) and is also used for loadable kernel modules. As follows in listing 1, you can see also which systems[4] have adopted the ELF format:

Listing 1. Applications of ELF format

ELF Layout

An ELF as at least two headers: the ELF header (Elf32_Ehdr/Elf64_Ehdr struct) and the program header (Elf32_Phdr/struct Elf64_Phdr struct)[5]. But there is also a header which is called the “section header” (Elf32_Shdr/struct Elf64_Shdr struct) and which describes section like: .text, .data, .bss and so on (we will describe them later).

Figure 1. ELF Layout – execution view linking view

(source: ELF Format specifications[2])

As you can see in figure 1, there is two views. Indeed, the linking view is partitioned by sections and is used when program or library is linked. The sections contain some object files informations like: datas, instructions, relocation informations, symbols, debugging informations, and so on.

From the other part, the execution view, which is partitioned by segments, is used during a program execution. The program header as shown in the left, contains informations for the kernel on how to start the program, will walk through segments and load them into memory (mmap).

Dissecting the ELF

The “magic” field

In Linux forensic, it is common to use the “file” command to the type of a particular file, as follows:

fluxiux@nyannyan:~$ file /bin/ls

/bin/ls: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.15, stripped

Now lets focus on the “ELF” string. As you had probably noticed using “hexdump” on any ELF file (like /bin/ls for example), the file starts with 0x7f then there are three next bytes for the encoded string “ELF”:

fluxiux@nyannyan:~$ hd -n 16 /bin/ls

00000000 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 |.ELF............|

The first 16 bytes represent the elf “magic” field, which is a way to identify an ELF file. But if bytes 1, 2 and 3 represent the encoded string “ELF”, what represent bytes 4, 5, 6, 7, 8, 9?

Just have a look at “elf.h” source code:

#define EI_CLASS 4 /* File class byte index */

#define ELFCLASSNONE 0 /* Invalid class */

#define ELFCLASS32 1 /* 32-bit objects */

#define ELFCLASS64 2 /* 64-bit objects */

#define ELFCLASSNUM 3



#define EI_DATA 5 /* Data encoding byte index */

#define ELFDATANONE 0 /* Invalid data encoding */

#define ELFDATA2LSB 1 /* 2's complement, little endian */

#define ELFDATA2MSB 2 /* 2's complement, big endian */

#define ELFDATANUM 3



#define EI_VERSION 6 /* File version byte index */

/* Value must be EV_CURRENT */



#define EI_OSABI 7 /* OS ABI identification */

#define ELFOSABI_NONE 0 /* UNIX System V ABI */

#define ELFOSABI_SYSV 0 /* Alias. */

#define ELFOSABI_HPUX 1 /* HP-UX */

#define ELFOSABI_NETBSD 2 /* NetBSD. */

#define ELFOSABI_LINUX 3 /* Linux. */

#define ELFOSABI_SOLARIS 6 /* Sun Solaris. */

#define ELFOSABI_AIX 7 /* IBM AIX. */

#define ELFOSABI_IRIX 8 /* SGI Irix. */

#define ELFOSABI_FREEBSD 9 /* FreeBSD. */

#define ELFOSABI_TRU64 10 /* Compaq TRU64 UNIX. */

#define ELFOSABI_MODESTO 11 /* Novell Modesto. */

#define ELFOSABI_OPENBSD 12 /* OpenBSD. */

#define ELFOSABI_ARM_AEABI 64 /* ARM EABI */

#define ELFOSABI_ARM 97 /* ARM */

#define ELFOSABI_STANDALONE 255 /* Standalone (embedded) application */



#define EI_ABIVERSION 8 /* ABI version */



#define EI_PAD 9 /* Byte index of padding bytes */

We can affirmatively say, that our file is an ELF of class 64, encoded in little endian with a UNIX System V ABI standard and has 0 padding bytes. By the way, if you did not expected yet, we have compared to the structure we have observed here the “e_ident” of “Elf64_Ehdr” structure.

Reversing ELF’s header

To begin the complete dissection, let’s just start making a simple binary file as follows:

#include <stdio.h>

main ( )

{

printf ( "huhu la charrue" ) ;

} main

And produce an ELF before linking it:

gcc toto.c -c

We will use now one of the most used tool as “objdump” to analysis ELF files which is readelf from binutils to display every fields. That will simplify our analysis but if you are interested for dissecting ELF files yourself, you can look for libelf and we will also talk about some interesting libraries in Python to do it much more quickly.

Now, we observe the ELF header:

fluxiux@nyannyan:~$ readelf -h toto

ELF Header:

Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00

Class: ELF64

Data: 2's complement, little endian

Version: 1 (current)

OS/ABI: UNIX - System V

ABI Version: 0

Type: REL (Relocatable file)

Machine: Advanced Micro Devices X86-64

Version: 0x1

Entry point address: 0x0

Start of program headers: 0 (bytes into file)

Start of section headers: 312 (bytes into file)

Flags: 0x0

Size of this header: 64 (bytes)

Size of program headers: 0 (bytes)

Number of program headers: 0

Size of section headers: 64 (bytes)

Number of section headers: 13

Section header string table index: 10

The result seems to be very implicit, but now just let’s try to identify these field using our lovely hexdump tool (in “warrior forensic style!” or not):

fluxiux@nyannyan:~$ hd -n 64 toto

00000000 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 |.ELF............|

00000010 01 00 3e 00 01 00 00 00 00 00 00 00 00 00 00 00 |..>.............|

00000020 00 00 00 00 00 00 00 00 38 01 00 00 00 00 00 00 |........8.......|

00000030 00 00 00 00 40 00 00 00 00 00 40 00 0d 00 0a 00 |....@.....@.....|

00000040

We already know the first line, but what can say about the three others? As you can see, in the second line, the first two bytes represent the “e_type”. Indeed, if you look at “elf.h” file, you could observe that “01 00” in little-Indian, means: “Relocatable file”.

Now look at the two next bytes. We have “3e 00” that is equivalent to 62 in decimal (3*16¹ + c = 62), which defines the AMD x86-64 architecture:

#define EM_X86_64 62 /* AMD x86-64 architecture */

After we have the “e_version” field with “01 00” as a value for “Current version”:

/* Legal values for e_version (version). */



#define EV_NONE 0 /* Invalid ELF version */

#define EV_CURRENT 1 /* Current version */

#define EV_NUM 2

Bytes 24 to 26 indicate the entry point address (which is 0x0 while it is not linked) . And we finish with two more most important think that we will talk about in this article :

Program Headers with 6 headers, starting at byte 64 (byte 32 and 33 in hexdump),



section headers with 29 headers, starting at byte (byte 40 – 43 in hexdump).

For the rest, we will use readelf and I will let you finish the header part by yourself.

Sections

Let’s just see “toto.o” sections with the following command:

fluxiux@nyanyan:~$ readelf -S toto.o

There are 13 section headers, starting at offset 0x138:



Section Headers:

[Nr] Name Type Address Offset

Size EntSize Flags Link Info Align

[ 0] NULL 0000000000000000 00000000

0000000000000000 0000000000000000 0 0 0

[ 1] .text PROGBITS 0000000000000000 00000040

0000000000000018 0000000000000000 AX 0 0 4

[ 2] .rela.text RELA 0000000000000000 00000598

0000000000000030 0000000000000018 11 1 8

[ 3] .data PROGBITS 0000000000000000 00000058

0000000000000000 0000000000000000 WA 0 0 4

[ 4] .bss NOBITS 0000000000000000 00000058

0000000000000000 0000000000000000 WA 0 0 4

[ 5] .rodata PROGBITS 0000000000000000 00000058

0000000000000010 0000000000000000 A 0 0 1

[ 6] .comment PROGBITS 0000000000000000 00000068

000000000000002b 0000000000000001 MS 0 0 1

[ 7] .note.GNU-stack PROGBITS 0000000000000000 00000093

0000000000000000 0000000000000000 0 0 1

[ 8] .eh_frame PROGBITS 0000000000000000 00000098

0000000000000038 0000000000000000 A 0 0 8

[ 9] .rela.eh_frame RELA 0000000000000000 000005c8

0000000000000018 0000000000000018 11 8 8

[10] .shstrtab STRTAB 0000000000000000 000000d0

0000000000000061 0000000000000000 0 0 1

[11] .symtab SYMTAB 0000000000000000 00000478

0000000000000108 0000000000000018 12 9 8

[12] .strtab STRTAB 0000000000000000 00000580

0000000000000014 0000000000000000 0 0 1

Key to Flags:

W (write), A (alloc), X (execute), M (merge), S (strings), l (large)

I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)

O (extra OS processing required) o (OS specific), p (processor specific)

As you can see, there is a lot of sections which are part of the ELF64_Shdr:

Code sections (.text),

data section (.data, .bss, .rodata),

the .comment which is used to store extra informations,

relocation tables (.rela.*),

symbol tables (.symtab),

section String Tables (.shstrtab) which stores the name of each section,

string tables (.strtab).

The address column normally shows where sections should be loaded into virtual memory, but this was not filled in for each section. The reason is that we did not linked it yet, so we will do that:

fluxiux@nyannyan:~$ gcc toto.o -o toto

fluxiux@nyannyan:~$ readelf -S toto

There are 30 section headers, starting at offset 0x1178:

Section Headers:

[Nr] Name Type Address Offset

Size EntSize Flags Link Info Align

[ 0] NULL 0000000000000000 00000000

0000000000000000 0000000000000000 0 0 0

[ 1] .interp PROGBITS 0000000000400238 00000238

000000000000001c 0000000000000000 A 0 0 1

[ 2] .note.ABI-tag NOTE 0000000000400254 00000254

0000000000000020 0000000000000000 A 0 0 4

[ 3] .note.gnu.build-i NOTE 0000000000400274 00000274

0000000000000024 0000000000000000 A 0 0 4

[ 4] .gnu.hash GNU_HASH 0000000000400298 00000298

000000000000001c 0000000000000000 A 5 0 8

[ 5] .dynsym DYNSYM 00000000004002b8 000002b8

0000000000000060 0000000000000018 A 6 1 8

[ 6] .dynstr STRTAB 0000000000400318 00000318

000000000000003f 0000000000000000 A 0 0 1

[ 7] .gnu.version VERSYM 0000000000400358 00000358

0000000000000008 0000000000000002 A 5 0 2

[ 8] .gnu.version_r VERNEED 0000000000400360 00000360

0000000000000020 0000000000000000 A 6 1 8

[ 9] .rela.dyn RELA 0000000000400380 00000380

0000000000000018 0000000000000018 A 5 0 8

[10] .rela.plt RELA 0000000000400398 00000398

0000000000000030 0000000000000018 A 5 12 8

[11] .init PROGBITS 00000000004003c8 000003c8

0000000000000018 0000000000000000 AX 0 0 4

[12] .plt PROGBITS 00000000004003e0 000003e0

0000000000000030 0000000000000010 AX 0 0 4

[13] .text PROGBITS 0000000000400410 00000410

00000000000001d8 0000000000000000 AX 0 0 16

[14] .fini PROGBITS 00000000004005e8 000005e8

000000000000000e 0000000000000000 AX 0 0 4

[15] .rodata PROGBITS 00000000004005f8 000005f8

0000000000000014 0000000000000000 A 0 0 4

[16] .eh_frame_hdr PROGBITS 000000000040060c 0000060c

0000000000000024 0000000000000000 A 0 0 4

[17] .eh_frame PROGBITS 0000000000400630 00000630

000000000000007c 0000000000000000 A 0 0 8

[18] .ctors PROGBITS 0000000000600e28 00000e28

0000000000000010 0000000000000000 WA 0 0 8

[19] .dtors PROGBITS 0000000000600e38 00000e38

0000000000000010 0000000000000000 WA 0 0 8

[20] .jcr PROGBITS 0000000000600e48 00000e48

0000000000000008 0000000000000000 WA 0 0 8

[21] .dynamic DYNAMIC 0000000000600e50 00000e50

0000000000000190 0000000000000010 WA 6 0 8

[22] .got PROGBITS 0000000000600fe0 00000fe0

0000000000000008 0000000000000008 WA 0 0 8

[23] .got.plt PROGBITS 0000000000600fe8 00000fe8

0000000000000028 0000000000000008 WA 0 0 8

[24] .data PROGBITS 0000000000601010 00001010

0000000000000010 0000000000000000 WA 0 0 8

[25] .bss NOBITS 0000000000601020 00001020

0000000000000010 0000000000000000 WA 0 0 8

[26] .comment PROGBITS 0000000000000000 00001020

0000000000000054 0000000000000001 MS 0 0 1

[27] .shstrtab STRTAB 0000000000000000 00001074

00000000000000fe 0000000000000000 0 0 1

[28] .symtab SYMTAB 0000000000000000 000018f8

0000000000000600 0000000000000018 29 46 8

[29] .strtab STRTAB 0000000000000000 00001ef8

00000000000001f2 0000000000000000 0 0 1

Wow! Some new sections appeared:

.interp which holds pathname of the program interpreter,



code sections (.plt, .init, .fini),

table of imported/exported symbols (.dynsym),

dynamic names table (.dynstr),

dynamic hash table (.hash),

new relocation tables (.rela.*),

constructor and Destructor tables (.ctors, .dtors),

section reserved for dynamic binaries (.got, .dynamic, .plt).

After the address column, you have the offset within the file of the section, then you have the size in byte of each section, the section header size in byte, the required alignment, the Flags (Read, Write, Execute), and so on.

In this article, we will discover some important sections to target for any attack.

Relocations

The relocation is made to modify the memory image of mapped segments to make them executable. As you saw before, there are some “.rela.*” sections which are used to show where to patch the memory and how. Let’s look the different relocations using our favorite tool “readelf”:

fluxiux@nyannyan:~$ readelf -r toto



Relocation section '.rela.dyn' at offset 0x380 contains 1 entries:

Offset Info Type Sym. Value Sym. Name + Addend

000000600fe0 000200000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0



Relocation section '.rela.plt' at offset 0x398 contains 2 entries:

Offset Info Type Sym. Value Sym. Name + Addend

000000601000 000100000007 R_X86_64_JUMP_SLO 0000000000000000 printf + 0

000000601008 000300000007 R_X86_64_JUMP_SLO 0000000000000000 __libc_start_main + 0

For example, it means for “printf” we need to patch the offset 0x000000600fe0 from the beginning of the .plt section.

For more informations, you have also a description of relocation types in “elf.h”:

/* x86-64 relocation types, taken from asm-x86_64/elf.h */

#define R_X86_64_NONE 0 /* No reloc */

[...]

#define R_X86_64_GLOB_DAT 6 /* Create GOT entry */

#define R_X86_64_JUMP_SLOT 7 /* Create PLT entry */

[...]

Program Headers

The section header table is not loaded into memory, because the kernel nor the dynamic loader will be able to use that table. To load a file into memory, program headers are used to provide informatios that are required:

fluxiux@nyanyan:~$ readelf -W -l toto



Elf file type is EXEC (Executable file)

Entry point 0x400410

There are 9 program headers, starting at offset 64



Program Headers:

Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align

PHDR 0x000040 0x0000000000400040 0x0000000000400040 0x0001f8 0x0001f8 R E 0x8

INTERP 0x000238 0x0000000000400238 0x0000000000400238 0x00001c 0x00001c R 0x1

[Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]

LOAD 0x000000 0x0000000000400000 0x0000000000400000 0x0006ac 0x0006ac R E 0x200000

LOAD 0x000e28 0x0000000000600e28 0x0000000000600e28 0x0001f8 0x000208 RW 0x200000

DYNAMIC 0x000e50 0x0000000000600e50 0x0000000000600e50 0x000190 0x000190 RW 0x8

NOTE 0x000254 0x0000000000400254 0x0000000000400254 0x000044 0x000044 R 0x4

GNU_EH_FRAME 0x00060c 0x000000000040060c 0x000000000040060c 0x000024 0x000024 R 0x4

GNU_STACK 0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RW 0x8

GNU_RELRO 0x000e28 0x0000000000600e28 0x0000000000600e28 0x0001d8 0x0001d8 R 0x1



Section to Segment mapping:

Segment Sections...

00

01 .interp

02 .interp .note.ABI-tag .note.gnu.build-id .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame

03 .ctors .dtors .jcr .dynamic .got .got.plt .data .bss

04 .dynamic

05 .note.ABI-tag .note.gnu.build-id

06 .eh_frame_hdr

07

08 .ctors .dtors .jcr .dynamic .got

As you can see, each program header corresponds to one segment where you can find sections into it. But how does it work?

In the beginning, when the kernel sees the INTERP segment, it loads first the LOAD segments to the specified virtual addresses, then load segments from program interpreter (/lib64/ld-linux-x86-64.so.2) and jumps to interpreter’s entry point. After that, the loader gets the control and loads libraries specified in LD_PRELOAD and also DYNAMIC segments of the executable that are needed:

fluxiux@nyannyan:~$ readelf -d toto



Dynamic section at offset 0xe50 contains 20 entries:

Tag Type Name/Value

0x0000000000000001 (NEEDED) Shared library: [libc.so.6]

After relocations, the loader invokes all libraries INIT function and then jumps to executable’s entry point.

In static, there is less thinks to say because the kernel only loads LOAD segments to the virtual addresses and then jumps to the entry points (easy eh?).

For some more details (I think), you can see an old but very good article published in Linux Journal #13 about ELF dissection by Eric Youngdale[6].

Exploitation

Old is always better (for attackers)

Once upon a the time, you where at home and waiting for the rain to stop. As always you “googled” for some interesting informations (of course!) and you found a kind of bible: Smashing the stack for fun and Profit[7].

Identifying the stack address, putting your shellcode at the beginning, adding some padding and rewriting the EIP, you could see that we can execute anything we want while exploiting a stack overflow. But times have changed, and you’re now confronted to canaris, ASLR (Address Space Layout Randomization), no executable stack, RELRO (read-only relocations), PIE support, binary-to-text encoding, and so on.

Nonexecutable stack

To make the stack nonexecutable, we use the bit NX (No eXecute for AMD) or bit XD (eXecute Disable for Intel). In figure 2, you could see that it matches with the most significant bit of a 64-bit Page Table Entry:

Figure 2 – 64-bit Page Table Entry

(Source : A hardware-enforced BOF protection )

So trying to exploit a stack based overflow, you should be surprised by the fact your shellcode doesn’t produce what you expected, and that’s the power of the bit NX (NX = 0 → Execute, NX = 1 → No eXecute).

Using “readlef -l ” you can see if the stack is executable or not :

GNU_STACK 0x0000000000000000 0x0000000000000000 0x0000000000000000

0x0000000000000000 0x0000000000000000 RW 8

As you can see, the only flags we got is the Read and Write ones. You can disable the eXecute flag using “execstack -s [binaryfile]” and see the difference (RWE).

To bypass it, we can use a method called “Return-into-libc”. Endeed, we know that any program that includes libc will have access to its shared functions (such as printf, exit, and so on), and we can execute “system(“/bin/sh”)” to get a shell.

First, we fill the vulnerable buffer with some junk data up to EIP (“AAAAAAAAAHH…”! is great). After that, we have to find “system()” function, but if we want to exit the program properly, the “exit()” will be also needed (using gdb):

(gdb) r main

Starting program: /home/fluxius/toto main

huhu la charrue

Program exited with code 017.

(gdb) p system

$1 = {<text variable, no debug info>} 0x7ffff6b8a134 <system>

(gdb) p exit

$2 = {<text variable, no debug info>} 0x7ffff6b81890 <exit>

Then, we overwrite the return address with system() function’s address and follow it with the exit() function’s address. To finish, we put the address of “/bin/sh” (that you can retrieve from a memcmp() or an environment variable).

Inject = [junk][system()][exit()][”/bin/sh”]

Note : NX bit is only available in Physical Address Extension (PAE), but can be emulated by PaX or ExecShield.

Moreover, we will see after on x86_64 platforms that “return-into-libc” doesn’t work because of the ABI specifications[8], and that’s probably a problem you’ve already encountered.

Address Space Layout Randomization3>

To avoid attackers to execute a dangerous shellcode, people has created a concept named “ASLR” (Address Space Layout Randomization). Indeed, it is a technique to arrange the position of the stack, heap, text, vdso, shared libraries and the base address of the executable (when builded with Position-independent executable support). So if you try to execute any shellcode at a saved position, you’ll observe a little fail, because the shellcode isn’t executed (or you are very lucky) and you get the classic error for segmentation faults as we did not ended properly.

When performing a stack overflow for example, you could disable ASLR changing the current level to “0”:

# echo 0 > /proc/sys/kernel/randomize_va_space

Or there is another trick (proposed by “perror), that does not require “root” privileges, using “setarch” to change reported architecture in a new program environnement and setting personality flags:

$ setarch `uname -m` -R /bin/bash

But it’s not quite fun, is it? So, attackers have found some ways to bypass this kind of technique. Indeed, in older kernels, they saw the ESP points to the stack, and of course, the buffer is on the stack too. A technique using linux-gate’s instructions, that were static before the kernel 2.6.18, was used to retrieve the address of any interesting pattern “\xff\xe4” (“jump esp” on x86) in memory. Other techniques to bypass ASLR exist like Brute-force.

Brute-force

Thinking about exec() family functions, we can use “execl” to replace the current process image with a new process image. Let’s make a simple code to observe the randomization:

( )

{

char buffer [ 100 ] ;

printf ( "Buffer address: %p

" , & buffer ) ;

} mainbufferbuffer

If ASLR is enabled, you should see something like this:

fluxiux@handgrep : ~ / aslr$ . / buffer_addr

Buffer address : 0x7fff5e149710

fluxiux@handgrep : ~ / aslr$ . / buffer_addr

Buffer address : 0x7fff71f6f0b0

fluxiux@handgrep : ~ / aslr$ . / buffer_addr

Buffer address : 0x7fff763299c0

We see that 4 bytes change for each execution, and we have to be very lucky to point in our shellcode, if we try the brute-force way. So we will use “execl” now to see any weakness when the memory layout is randomized for the process:

( )

{

int stack ;

printf ( "Stack address: %p

" , & stack ) ;

execl ( "./buffer_addr" , "buffer_addr" , NULL ) ;

} mainstackstackexeclNULL

Compare the memory layouts with different runs of “buffer_addr”:

fluxiux@handgrep:~/aslr$ ./buffer_addr

Buffer address: 0x7fffc5cfa180

fluxiux@handgrep:~/aslr$ ./buffer_addr

Buffer address: 0x7fff1964d1f0

fluxiux@handgrep:~/aslr$ ./buffer_addr

Buffer address: 0x7fffba20bd30

fluxiux@handgrep:~/aslr$ ./buffer_addr

Buffer address: 0x7fffc8505ed0

fluxiux@handgrep:~/aslr$ ./buffer_addr

Buffer address: 0x7ffff39cbc10

fluxiux@handgrep:~/aslr$ ./buffer_addr

Buffer address: 0x7fff6eb3aa90

fluxiux@handgrep:~$ gdb -q --batch -ex "p 0x7fffc5cfa180 - 0x7fff1964d1f0"

$1 = 2892681104

fluxiux@handgrep:~$ gdb -q --batch -ex "p 0x7fffc8505ed0 - 0x7fffba20bd30"

$1 = 238002592

fluxiux@handgrep:~$ gdb -q --batch -ex "p 0x7ffff39cbc10 - 0x7fff6eb3aa90"

$1 = 2229866880

And now with “execl” function:

fluxiux@handgrep:~/aslr$ ./weakaslr

Stack address: 0x7fff526d959c

Buffer address: 0x7fff2e95efd0

fluxiux@handgrep:~/aslr$ gdb -q --batch -ex "p 0x7fffaffcde50 - 0x7fff54800abc"

$1 = 1534907284

fluxiux@handgrep:~/aslr$ ./weakaslr

Stack address: 0x7fffed12acfc

Buffer address: 0x7fffa3a4f8f0

fluxiux@handgrep:~$ gdb -q --batch -ex "p 0x7fffdaf7d5fc - 0x7fff08361da0"

$1 = 3535911004 If we dig a little bit more, we can reduce the domain of probabilistic addresses using “/proc/self/maps” files (local bypass), as shown below:

fluxiux@handgrep:~/aslr$ ./weakaslr

Stack address: 0x7ffffbe8326c

Buffer address: 0x7fff792120c0

fluxiux@handgrep:~$ gdb -q --batch -ex "p 0x7ffffbe8326c - 0x7fff792120c0"

$1 = 2194084268

fluxiux@handgrep:~/aslr$ ./weakaslr

Stack address: 0x7fffed12acfc

Buffer address: 0x7fffa3a4f8f0

fluxiux@handgrep:~$ gdb -q --batch -ex "p 0x7fffed12acfc - 0x7fffa3a4f8f0"

$1 = 1231926284

Using this method, we could fill the buffer with return address, add a large NOP sled after the return address + the shellcode and guess any correct offset, to point to it. As you can see, the degree of randomization is not the same, but you can play with that. Of course, this attack is more effective on 32-bits and on older kernel versions[9].

If we dig a little bit more, we can reduce the domain of probabilistic addresses using “/proc/self/maps” files (local bypass), as shown below:

...

00fa8000-00fc9000 rw-p 00000000 00:00 0 [heap]

7ffd77890000-7ffd77a1a000 r-xp 00000000 08:05 396967 /lib/x86_64-linux-gnu/libc-2.13.so

7ffd77a1a000-7ffd77c19000 ---p 0018a000 08:05 396967 /lib/x86_64-linux-gnu/libc-2.13.so

7ffd77c19000-7ffd77c1d000 r--p 00189000 08:05 396967 /lib/x86_64-linux-gnu/libc-2.13.so

7ffd77c1d000-7ffd77c1e000 rw-p 0018d000 08:05 396967 /lib/x86_64-linux-gnu/libc-2.13.so

7ffd77c1e000-7ffd77c24000 rw-p 00000000 00:00 0

7ffd77c24000-7ffd77c26000 r-xp 00000000 08:05 397045 /lib/x86_64-linux-gnu/libutil-2.13.so

7ffd77c26000-7ffd77e25000 ---p 00002000 08:05 397045 /lib/x86_64-linux-gnu/libutil-2.13.so

7ffd77e25000-7ffd77e26000 r--p 00001000 08:05 397045 /lib/x86_64-linux-gnu/libutil-2.13.so

7ffd77e26000-7ffd77e27000 rw-p 00002000 08:05 397045 /lib/x86_64-linux-gnu/libutil-2.13.so

7ffd77e27000-7ffd77e48000 r-xp 00000000 08:05 396954 /lib/x86_64-linux-gnu/ld-2.13.so

7ffd7801d000-7ffd78020000 rw-p 00000000 00:00 0

7ffd78043000-7ffd78044000 rw-p 00000000 00:00 0

7ffd78045000-7ffd78047000 rw-p 00000000 00:00 0

7ffd78047000-7ffd78048000 r--p 00020000 08:05 396954 /lib/x86_64-linux-gnu/ld-2.13.so

7ffd78048000-7ffd7804a000 rw-p 00021000 08:05 396954 /lib/x86_64-linux-gnu/ld-2.13.so

7fff7d479000-7fff7d49a000 rw-p 00000000 00:00 0 [stack]

7fff7d589000-7fff7d58a000 r-xp 00000000 00:00 0 [vdso]

ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]

Unfortunately, this leak is partially patched since 2.6.27 according to Julien Tinnes and Tavis Ormandy[10], and these files seem to be protected if you cannot ptrace a pid. Anyway, there was any other way using “/proc/self/stat” and “/proc/self/wchan” that leak informations such as stack pointer and instruction pointer (=>ps -eo pid,eip,esp,wchan), and by sampling “kstkeip”, we could reconstruct the maps (see fuzzyaslr by Tavis Ormandy[11]).

Brute-forcing is always a very offensive way to get what you want, it takes time, and you should know that every tries recorded in logs. The solution is maybe in registers.

Return-to-registers

Using a debugger like GDB, can help you to find other ways to bypass some protections like DEP as shown previously and ASLR of course. To study this case, we will work with a better example:

#include <stdio.h>

#include <string.h>

vuln ( char * string )

{

char buffer [ 50 ] ;

strcpy ( buffer , string ) ; // Guys! It's vulnerable!

}



main ( int argc , char ** argv )

{

if ( argc > 1 )

vuln ( argv [ 1 ] ) ;

} vulnstringbufferbufferstringmainargcargvargcvulnargv

By the way, don’t forget to disable the stack protector (compile as follows: gcc -fno-stack-protector -z execstack -mpreferred-stack-boundary=4 vuln2.c -o vuln2). Will see after what a canary is, but now, just let’s focus on ASLR for the moment.

With few tries, we see that we can rewrite the instruction pointer:

(gdb) run `python -c 'print "A"*78'`

The program being debugged has been started already.

Start it from the beginning? (y or n) y

Starting program: /home/fluxiux/aslr/vuln2 `python -c 'print "A"*78'`

Program received signal SIGSEGV, Segmentation fault.

0x0000414141414141 in ?? ()

Put now a break to the “vuln()” function’s call and the return address:

(gdb) disas main

Dump of assembler code for function main:

0x0000000000400515 <+0>: push %rbp

0x0000000000400516 <+1>: mov %rsp,%rbp

0x0000000000400519 <+4>: sub $0x10,%rsp

0x000000000040051d <+8>: mov %edi,-0x4(%rbp)

0x0000000000400520 <+11>: mov %rsi,-0x10(%rbp)

0x0000000000400524 <+15>: cmpl $0x1,-0x4(%rbp)

0x0000000000400528 <+19>: jle 0x40053d <main+40>

0x000000000040052a <+21>: mov -0x10(%rbp),%rax

0x000000000040052e <+25>: add $0x8,%rax

0x0000000000400532 <+29>: mov (%rax),%rax

0x0000000000400535 <+32>: mov %rax,%rdi

0x0000000000400538 <+35>: callq 0x4004f4 <vuln>

0x000000000040053d <+40>: leaveq

0x000000000040053e <+41>: retq (gdb) break *0x000000000040053e

Breakpoint 2 at 0x40053e

End of assembler dump.

(gdb) break *0x400538

Breakpoint 1 at 0x400538

(gdb) break *0x000000000040053e

Breakpoint 2 at 0x40053e

After that, put a break on the return address of the “vuln()” function:

(gdb) disas vuln

Dump of assembler code for function vuln:

0x00000000004004f4 <+0>: push %rbp

0x00000000004004f5 <+1>: mov %rsp,%rbp

0x00000000004004f8 <+4>: sub $0x50,%rsp

0x00000000004004fc <+8>: mov %rdi,-0x48(%rbp)

0x0000000000400500 <+12>: mov -0x48(%rbp),%rdx

0x0000000000400504 <+16>: lea -0x40(%rbp),%rax

0x0000000000400508 <+20>: mov %rdx,%rsi

0x000000000040050b <+23>: mov %rax,%rdi

0x000000000040050e <+26>: callq 0x400400 <strcpy@plt>

0x0000000000400513 <+31>: leaveq

0x0000000000400514 <+32>: retq

End of assembler dump.

(gdb) break *0x0000000000400514

Breakpoint 3 at 0x400514

As we can see, the RSP contains the return address:

(gdb) info reg rsp

rsp 0x7fffffffe148 0x7fffffffe148

(gdb) x/20x $rsp - 40

[...]

0x7fffffffe140: 0x00000000 0x00000000 0x0040053d 0x00000000

[…]

The return address as been overwritten (we also noticed that previously):

(gdb) info reg rsp

rsp 0x7fffffffe148 0x7fffffffe148

(gdb) x/20x $rsp - 40

0x7fffffffe120: 0x41414141 0x41414141 0x41414141 0x41414141

0x7fffffffe130: 0x41414141 0x41414141 0x41414141 0x41414141

0x7fffffffe140: 0x41414141 0x41414141 0x41414141 0x00004141

0x7fffffffe150: 0xffffe248 0x00007fff 0x00000000 0x00000002

0x7fffffffe160: 0x00000000 0x00000000 0xf7a66eff 0x00007fff

And running at the last breakpoint, we can observe that register RAX points to the beginning of our buffer:

(gdb) stepi

0x00000000004004f8 in vuln ()

(gdb) info reg rax

rax 0x7fffffffe520 140737488348448

(gdb) x/20x $rax - 40

0x7fffffffe4f8: 0x36387816 0x0034365f 0x00000000 0x2f000000

0x7fffffffe508: 0x656d6f68 0x756c662f 0x78756978 0x6c73612f

0x7fffffffe518: 0x75762f72 0x00326e6c 0x41414141 0x41414141

0x7fffffffe528: 0x41414141 0x41414141 0x41414141 0x41414141

0x7fffffffe538: 0x41414141 0x41414141 0x41414141 0x41414141

(Note that if you’re not sure, try with this payload: `python -c ‘print “A”*70+”B”*8’`).

After that, we look for a valid “jmp/callq rax”:

fluxiux@handgrep:~/aslr$ objdump -d ./vuln2 | grep "callq"

4003cc: e8 6b 00 00 00 callq 40043c <call_gmon_start>

[...]

400604: ff d0 callq *%rax

..

At “0x400604” could be great, we just have to replace the junk data (“A”) by NOP sled and a precious shellcode that fits on the buffer and we replace the instruction pointer by the address “0x400604”. On 32-bits, “Sickness” has written a good article about that if you are interested[12].

But as you know, by default on Linux (especially the user friendly one: Ubuntu), programs are compiled with the bit NX support, so be lucky to use this technique on nowadays systems. Indeed, we use also an option to disable the stack protector, but what is it exactly?

Stack Canary

Named for their analogy to a canary in a coal mine, stack canary are used to protect against stack overflow attacks. Compiling with the stack protector option (which is used by default), each dangerous function is changed in his prologue and epilogue.

If we compile the previous code letting stack protector to be used, we get something like that:

fluxiux@handgrep:~/ssp$ gcc -z execstack -mpreferred-stack-boundary=4 vuln2.c -o vuln3

fluxiux@handgrep:~/spp$ ./vuln3

fluxiux@handgrep:~/spp$ ./vuln3 `python -c 'print "A"*76'`

*** stack smashing detected ***: ./vuln3 terminated

Disassembling the “vuln()” function, we can see in the epilogue that a comparison is done:

(gdb) disas vuln

Dump of assembler code for function vuln:

[...]

0x000000000040058d <+41>: callq 0x400470 <strcpy@plt>

0x0000000000400592 <+46>: mov -0x8(%rbp),%rdx

0x0000000000400596 <+50>: xor %fs:0x28,%rdx

0x000000000040059f <+59>: je 0x4005a6 <vuln+66>

0x00000000004005a1 <+61>: callq 0x400460 <__stack_chk_fail@plt>

0x00000000004005a6 <+66>: leaveq

[...]

If the value in “fs:0x28” is the same as in ”%rdx”, the “vuln()” function will end properly. In other case, the function “__stack_chk_fail()” will be called and an error message shows up (“*** stack smashing detected ***: ./vuln3 terminated ”).

Putting a break on “__stack_chk_fail()” function, we can observe the values on $RSP:

(gdb) run `python -c 'print "A"*57'`

Starting program: /home/fluxiux/aslr/vuln3 `python -c 'print "A"*57'`



Breakpoint 1, 0x00000000004005cb in main ()

(gdb) c

Continuing.

Breakpoint 2, 0x00000000004005a1 in vuln ()

(gdb) x/30x $rsp

0x7fffffffe100: 0x00000000 0x00000000 0xffffe535 0x00007fff

0x7fffffffe110: 0x41414141 0x41414141 0x41414141 0x41414141

0x7fffffffe120: 0x41414141 0x41414141 0x41414141 0x41414141

0x7fffffffe130: 0x41414141 0x41414141 0x41414141 0x41414141

0x7fffffffe140: 0x41414141 0x41414141 0xbf630041 0xe3b6079a

0x7fffffffe150: 0xffffe170 0x00007fff 0x004005d0 0x00000000

0x7fffffffe160: 0xffffe258 0x00007fff 0x00000000 0x00000002

0x7fffffffe170: 0x00000000 0x00000000

At “0x7fffffffe148”, we have rewrote 1 byte of the stack cookie value saved on RSP (that’s why the breakpoint 2 stopped __stack_chk_fail()). At “0x7fffffffe158” , we see the return address of main. So the structure of this canary should be like in figure 3:

There are 3 kinds of canaries:

Null (0x0),

terminator (letting the first bytes to be “\a0\xff”),

random.

The first 2 kinds are easy to bypass[14], because you just have to fill the buffer with your shellcode, giving a desired value to be at the right position and rewrite the instruction pointer. But for the random one, it is a little more fun, because you have to guess its value at each execution (Ow! A kind like ASLR?).

For random canaries, the “__gard__setup()” fills a global variable with random bytes generated by “/dev/urandom”, if possible. Latter in the program, only 4|8 bytes are used to be the cookie. But, if we cannot use the entropy of “/dev/urandom”, by default we will get a terminator or a null cookie.

Brute-force is a way, but you will use to much time. By overwriting further than the return address, we can hook the execution flow using GOT entries. The canary will of course detect the compromising, but too late. A very good article covering the StackGuard and StackShield explain four ways to bypass these protections[15].

However, on new kernels you also have to noticed that the random cookie is set with a null-byte at the end, and trying to recover the value from forking or brute-forcing will not work with functions like “strcpy”. So the better way to do that, is to have the control of the initialized cookie.

Format string vulnerabilities or heap overflow for example, are more easy to exploit with this protection, but this article is not finished yet and we will see another memory corruption mitigation technique.

RELRO

In recent Linux distributions, a memory corruption mitigation technique has been introduced to harden the data sections for binaries/processes. This protection can be viewable reading the program headers (with readelf for example):

fluxiux@handgrep:~/ssp$ readelf -l vuln3

[...]

Program Headers:

[...]

GNU_RELRO 0x0000000000000e28 0x0000000000600e28 0x0000000000600e28

0x00000000000001d8 0x00000000000001d8 R 1

On current Linux, your binaries are often compiled with RELRO. So that mean that following sections are mapped as read-only:

08 .ctors .dtors .jcr .dynamic .got

Optionally, you can compare dissecting a non-RELRO binary, as follows:

fluxiux@handgrep:~/ssp$ gcc -Wl,-z,norelro vuln2.c -o vuln4

[…]

LOAD 0x0000000000000768 0x0000000000600768 0x0000000000600768

0x0000000000000200 0x0000000000000210 RW 200000

[…]

03 .ctors .dtors .jcr .dynamic .got .got.plt .data .bss

[...]

The exploitation of a format string bug for example, using the format parameter “%n” to write to any arbitrary address like GOTs is suppose to fail. But as we noticed previously, PLT GOTs have “write” permissions and then we are face to a partial-RELRO only.

With the example in trapkit’s article about RELRO[16], we could see that it is very easy to rewrite a PLT entry. But in some cases (mostly in dist-packages), binaries are compiled with a full-RELRO:

fluxiux@handgrep:~/relro$ gcc -Wl,-z,relro,-z,now -o fullrelro fmstr.c

[..]

fluxiux@handgrep:~/relro$ readelf -l ./fullrelro | grep "RELRO"

GNU_RELRO 0x0000000000000df8 0x0000000000600df8 0x0000000000600df8

fluxiux@handgrep:~/relro$ readelf -d ./fullrelro | grep "BIND"

0x0000000000000018 (BIND_NOW)

Note: BIND_NOW indicates that the binary is using full-RELRO.

The entire GOT is remapped as read-only, but there are other sections to write on. GOTs are use mostly for flexibility. Detour with “.dtors” can be perform as Sebastian Krahmer described in his article about RELRO[17].

We have seen common Linux protection used by default, but the evolution of kernels and architectures have made things more difficult.

The x86_64 fact and current systems hardening

With time, the new versions of Linux distribution become well hardened by default. In my studies, the Ubuntu one surprised me a lot, because in addition to these protections implanted by default, this system turns to take some openBSD solutions to be as user friendly and secure as possible. Moreover, we have seen few protections and ways to bypass it, but the 64-bits give us other difficulties.

As you notices, addresses have changed and it more difficult to exploit some memory corruption because of the byte “\x00”, considered as a EOF for some functions like “strcpy()”. We saw that NX is enabled and the compilation in gcc with its support are made by default. But the worst is coming. Indeed, we now that the randomization space is larger but what interest us, is the System V ABI for x86_64[8].

Things have changed for parameters in functions. Indeed, instead of copying parameters in the stack, the first 6 integer and 8 float/double/vector arguments are passed in registers, rest on stack. See an example:

// Passing arguments example

typedef struct {

int a , b ;

double d ;

} structparm ;

structparm s ;

int e , f , g , h , i , j , k ;

long double ld ;

double m , n ;

__m256 y ;

extern void func ( int e , int f ,

structparm s , int g , int h ,

long double ld , double m ,

__m256 y ,

double n , int i , int j , int k ) ;

func ( e , f , s , g , h , ld , m , y , n , i , j , k ) ;

The given register allocation looks like this (in figure 4):



Figure 4 – Register allocation example

(source: System V ABI x86_64)

I suggest you to read the slides Jon Larimer about “Intro to x64 Reversing”[18].

We could use the knowledge of borrowed code chunks’ article[19] that can help us to understand problems of NX, System V ABI x86_64 differences with x32, and ways to bypass them using instructions to write a value on one register, and call the function “system()”, for example, that will use this register as a parameter.

Other sophisticated attacks like Return-oriented Programing are use to bypass these protection that make life difficult in an exploit process.

As you could see, protections didn’t make things impossible, but just harder and harder. So be aware of new applied protections and conventions to not waste too much time.

References & Acknowledgements

[1] ELF HOWTO –

http://cs.mipt.ru/docs/comp/eng/os/linux/howto/howto_english/elf/elf-howto-1.html

[2] Tool Interface Standard (TIS) Executable and

Linking Format (ELF) Specification



[3] Working with the ELF Program

Format – http://www.ouah.org/RevEng/x430.htm

[4] Executable_and_Linkable_Format#Applications

-http://www.linuxjournal.com/article/1060

http://en.wikipedia.org/wiki/Executable_and_Linkable_Format#Applications

[5] elf.h: ELF types, structures, and macros –

http://sourceware.org/git/?p=glibc.git;a=blob_plain;f=elf/elf.h

[6] The ELF Object File Format by Dissection –

http://www.linuxjournal.com/article/1060

[7] Smashing the stack for fun and Profit –

http://www.phrack.org/issues.html?issue=49&id=14#article

[8] System V Application Binary Interface on x86-64 –

http://www.x86-64.org/documentation/abi.pdf

[9] Hacking – The art of exploitation (by Jon

Erickson)

[10] Local bypass of Linux ASLR through /proc

information leaks –

http://blog.cr0.org/2009/04/local-bypass-of-linux-aslr-through-proc.html

[11] Fuzzy ASLR – http://code.google.com/p/fuzzyaslr/

[12] ASLR bypass using ret2reg –

http://www.exploit-db.com/download_pdf/17049

[13] /dev/urandom –

http://en.wikipedia.org/wiki//dev/urandom

[14] Stack Smashing Protector (FreeBSD) –

http://www.hackitoergosum.org/2010/HES2010-prascagneres-Stack-Smashing-Protector-in-FreeBSD.pdf

[15] Four different tricks to bypass StackShield and

StackGuard protection –

http://www.coresecurity.com/files/attachments/StackguardPaper.pdf

http://tk-blog.blogspot.com/2009/02/relro-not-so-well-known-memory.http://tk-blog.blogspot.com/2009/02/relro-not-so-well-known-memory.htmlhtml

[17] RELRO by Sebastian Krahmer –

http://www.suse.de/%7Ekrahmer/relro.txt

[18] Intro to x64 reversing –

http://lolcathost.org/b/introx86.pdf

[19] x86-64 buffer overflow exploits and the borrowed

code chunks – http://www.suse.de/~krahmer/no-nx.pdf