(Part I) : ELF Header https://blog.k3170makan.com/2018/09/introduction-to-elf-format-elf-header.html
(Part II) : Program Headers https://blog.k3170makan.com/2018/09/introduction-to-elf-format-part-ii.html
(Part III) : Section Header Table https://blog.k3170makan.com/2018/09/introduction-to-elf-file-format-part.html
(Part IV) : Section Types and Special Sections https://blog.k3170makan.com/2018/10/introduction-to-elf-format-part-iv.html

C Start Up

    LIBC_START_MAIN (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL),
                     int argc, char **argv,
    #ifdef LIBC_START_MAIN_AUXVEC_ARG
                     ElfW(auxv_t) *auxvec,
    #endif
                     __typeof (main) init,
                     void (*fini) (void),
                     void (*rtld_fini) (void), void *stack_end)

int (*main) - no guessing here, this is a pointer to the main function in the binary.

int argc - the number of arguments passed to the binary from the command line, including the binary's name (we will show this later).

char **argv - the array holding the actual argument strings. It's important to remember that argv is handed to the _start function via the stack.

__typeof (main) init - a pointer to the function (__libc_csu_init) that handles calling the initializer or constructor functions. I'm going to call this a constructor function "call handler" [see footnote 1].

void (*fini) (void) - the analogous function pointer, to the function that handles calling destructor functions.

void (*rtld_fini) (void) - the destructor function call handler for the dynamic linker. This value is passed to _start via rdx (edx on 32-bit x86) from the loader (we will see this being used soon). I won't get into how this destructor call handler works too much, it's really a little off track for this discussion, but when I cover dynamic linking I'll expand on it more ;)

void *stack_end - end of stack marker.

Reverse Engineering glibc _start

We are clearly using the System V ABI calling convention for x86_64 here. Instead of pushing all parameters onto the stack in a given order, we do the following:

The first six integer or pointer parameters are passed in the registers rdi, rsi, rdx, rcx, r8 and r9, and any further values are passed on the stack in reverse order.









And as we see, the registers contain the following:

rdi - pointer to the first instruction in the int (*main) function

rsi - argc value

rdx - argv pointer

rcx - pointer to the first instruction in __libc_csu_init, the program's constructor call handler again

r8 - pointer to __libc_csu_fini

r9 - pointer to rtld_fini, the mysterious dynamic linker destructor call handler

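And in case you don't believe me, check out this dank documentation in the glibc library confirming that we reverse engineered this correctly (or that it actually works as the code intends) - coming through strong with the documentation once again [see footnote 1]. The extract in question comes from the glibc source.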





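There are some other interesting details to what __libc_start_main does after this, some of which involve deep Elf sorcery like reading past the end of argv to find envp. There are wonderful articles on this on the internet, and the code for __libc_start_main is also available. I take it you folks would enjoy the exercise of confirming it works as described.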



To summarize __libc_start_main and bring .init_array and .fini_array into context, essentially what __libc_start_main does is stuff like:



Set up the stack guard:
https://github.com/lattera/glibc/blob/master/csu/libc-start.c#L205
https://github.com/lattera/glibc/blob/master/sysdeps/unix/sysv/linux/dl-osinfo.h#L51

Register destructors (including the one for the rtld) so they are called when the program exits:
https://github.com/lattera/glibc/blob/master/csu/libc-start.c#L238
https://github.com/lattera/glibc/blob/master/csu/libc-start.c#L248

Check that the file descriptors STDIN, STDOUT and STDERR are set up properly:
https://github.com/lattera/glibc/blob/master/csu/check_fds.c#L87

Some other cool stuff, and of course it eventually makes the call to (*init), which in the context of __libc_start_main means __libc_csu_init. This is the function that, as we see in the footnotes, actually makes the call to the init functions we define. Here's confirmation of that call chain from gdb:





foo_constructor is obviously our constructor, and we can see it does indeed get called first from __libc_csu_init. These constructors are saved in the section marked .init_array, and the analogous array for destructors is called .fini_array. The next section covers how they work.

.init_array and .fini_array Sections and hex sorcery

I'd like to get straight into deconstructing how the .init_array and .fini_array sections work. Let's see what they look like in the section header table and annotate all their fields in an honest hexdump:









What we can see here is that the .init_array section points into the ELF file at 0x0e00, which holds two addresses:

0x0e00 (.init_array)

0x400540 (frame_dummy) - not going to dig into this too much, but what I glean about this for now is that it sets things up for exception handling and for reconstructing stack frames, to aid debugging and stack forensics. More on this here and here.

0x400440 (foo_constructor) - our constructor!

We also have the .fini_array section at 0x0e10, which holds these entries:

0x0e10 (.fini_array)

0x400520 (__do_global_dtors_aux) - handles destructors when .fini_array is not defined, according to this.

0x400430 (foo_destructor) - our destructor!

References and Recommended Reading

How C Programs get run https://lwn.net/Articles/631631/
System V intel https://software.intel.com/sites/default/files/article/402129/mpx-linux64-abi.pdf
System V ABI https://wiki.osdev.org/System_V_ABI
Examining Memory with GDB https://sourceware.org/gdb/onlinedocs/gdb/Memory.html
https://stackoverflow.com/questions/34966097/what-functions-does-gcc-add-to-the-linux-elf
http://dbp-consulting.com/tutorials/debugging/linuxProgramStartup.html

Footnotes

1 - Why _dl_fini should be referred to as the destructor "function call handler", in my opinion





This is because, though some folks refer to this as THE [de/con]-structor function, in reality it is only the standardized function that finds the pointers TO the user-defined [de/con]-structor functions. Here's why I say so; this extract is from https://github.com/lattera/glibc/blob/master/elf/dl-fini.c#L137









What can I say folks that glibc comment game is solid though. Code speaks for its Elf around here ;)

So I take it this makes it obvious that the pointer to the dl_fini function can actually be referred to as more of a destructor "call handler", no? To close my point, let's look at dl-init.c for the definition of _dl_init as well:







Pretty much the same thing: it uses a link map type object ( ElfW(Dyn) *preinit_array = main_map->l_info[DT_PREINIT_ARRAY]; ) loaded with the offsets and all the ELF format goodies, uses this to calculate the address of the init function array, and then just runs through the entries, calling each one with argc, argv and envp.



Anyway, while making that heavily egotistical point we actually traversed some pretty important code in the Elf world: this is the very definition of the _dl_fini function that handles your binary. If you wanna unlock the s e c r e t s you should spend some time digging through that /elf/ directory.









This post is part of a series on the ELF format; if you haven't checked out the other parts of the series, they are linked at the top of this post. In this post I'm going to cover some aspects of C start up and mess around with the .init_array and .fini_array sections to show how they work.

So something must happen to get your code in the main function running. This process is called the C start up, and it essentially involves running all the initialization code, setting up pointers to some important arrays and then branching over to main.

What the _start function needs to do, essentially, is perform a function call to __libc_start_main, which is the function that will actually call main(). Now if you haven't guessed, this means we need a pointer to the main function as an argument to __libc_start_main. It has a couple of other parameters too; they are the ones in the LIBC_START_MAIN function header shown earlier in this post.

Update: I realized that the original version of the post had the wrong function header for __libc_start_main, so I grabbed this one straight from glibc (https://github.com/lattera/glibc/blob/master/csu/libc-start.c#L129). For an alternative explanation of this, check out http://dbp-consulting.com/tutorials/debugging/linuxProgramStartup.html (sorry, no https :( ).

Just to reiterate, all of the parameters described earlier must be prepared by _start for the call to __libc_start_main, and we also know that rtld_fini is passed to _start via rdx (edx on 32-bit x86). Beyond that, _start is entered with a very helpful stack layout that makes argv and argc easy to locate. Let's see how this is done in a real world example.

Here's what _start looks like for one of my binaries during execution:

To clarify what is happening in the figure above: I am setting a breakpoint on the _start function, and I'm highlighting the instruction that was just executed (0x400455 <+5>).

Digging into the assembly here, the first instruction essentially clears out ebp. After this it copies the pointer to rtld_fini from rdx to r9; this is already prepping it for its cozy position in the important __libc_start_main call. It also saves the value from being destroyed when rdx is used later on.

What the screenshot above also confirms is that the rdx register does indeed contain a pointer to the dl_fini function; this is shown by the x/64ib $rdx command (x/ is gdb's "examine memory" command, and the i format disassembles what it finds as instructions). You can of course do this equivalently on r9; at this point in execution it will no doubt show the same value. Before we dig into this dl_fini function [see footnote 1], let's look at the rest of the instructions in the _start code.

The next instruction, at 0x400455 <+5>, is a pop into rsi, which loads the value of argc. How do we know this? Well, we know that the top of the stack holds argc because when the program is entered for the first time and _start gets called, the stack essentially contains argc, argv and envp. We can see this in the following screen dump:

So from this figure we can see that the arguments being passed to the binary are "1 2 3 4 5". We can also see that the first entry in argv is the name of the binary itself, which means argc, the length of argv, should be 6, as is shown at the first address on the stack, 0x7fffffffddb0. Next on the stack is the start of the actual argv array, and after that we have a null terminator and the start of the envp array.

Back to _start. After the first pop off the stack, the stack pointer points at the start of argv, and at instruction <_start+6> we save that to rdx. After this, at the <_start+9> instruction, we use a bit mask to clear the low bits of the stack pointer to ensure it's aligned properly, and then proceed to prep for the call to __libc_start_main.

Once the stack is aligned it pushes rax onto the stack. According to some stuff I've read, this is purely to preserve memory alignment boundaries as well; the value in rax isn't used and doesn't mean anything.

I've dumped the register values when the call to __libc_start_main happens, just to check out what is actually being passed to it:

So we know where to find the pointers to our destructor and constructor functions and we know when they will be called. Let's see if we can force the binary to call another function instead.

So if I were to make the .init_array point to the never_call function, which as in the previous example is never called under normal execution, here's what the hexdump would look like:

Win! We can control the flow of execution by redirecting the entries in the .init_array section! This works the same way for .fini_array, of course; I'm going to leave that for you folks to figure out if you'd like to.

Thanks for reading this one, more posts on deep Elf sorcery and other wonderful linuxy things coming soon!