In this post I'm going to go over how some of the sections in the format work in a bit more detail. Previous posts didn't really expand on all the weirdness that each individual section type and format can harbor, especially in how it can break interpretation of the file under normal debugging and reverse engineering efforts. We're going to run through a couple of sections here, talk about different section types and see what ELFs can make some of the binutils do if we mess around with the bytes. Hope you folks enjoy!

Section Types

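Here's a list of what some of the section types are meant to be used for (.shstrtab and .strtab, for instance, are both SHT_STRTAB):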

SHT_NULL - marks a section header as unused. The documentation describes such a header as inactive, with no section attached to it, so it will most probably be skipped over by most semantically driven ELF utilities. Zeroed-out entries like this can also stop a reader from running over into other sections: one can imagine many C programmers enjoy scanning until the cows come home OR they hit a null byte, which is the odd reason why such fields are sometimes necessary.

SHT_PROGBITS - a marking for a section that could contain anything; the format is dictated by the program itself. PROGBITS is pretty much for program-specific behavior, which could be anything, literally anything, even Turing complete anything! This type is typically used for the sections that contain the actual code for execution, the data section, and initialization / finalization procedures (or perhaps even wilder concepts specific to the ABI or compiler producing the executable code sections and accompaniments; again, this section type doesn't impose much format control).

SHT_SYMTAB - marks a section that should have the format of a symbol table. I will of course flesh out how this works later on, because it needs its own space, so in a literal way I'm going to use this keyword to mark a section further down :)

SHT_STRTAB - a section that holds a list of null-terminated strings.

SHT_HASH - a section that holds a hash table, usually to speed up looking for symbols. In fact the documentation says that if an executable participates in dynamic linking it MUST have one of these sections. I will put that bold brave beautiful claim to the test later on in the post (if not in its own post, depending on how exciting this potential lie becomes).
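The types listed above are small integers stored in the sh_type field of each section header. As a minimal sketch, assuming a 64-bit little-endian ELF, this is roughly how one raw header could be decoded. The `example` header is made up for illustration, and only the types named in this post are in the lookup table.

```python
import struct

# Section type constants from the ELF specification (only the ones named above).
SHT_NAMES = {
    0: "SHT_NULL",
    1: "SHT_PROGBITS",
    2: "SHT_SYMTAB",
    3: "SHT_STRTAB",
    5: "SHT_HASH",
}

# Elf64_Shdr, little-endian: sh_name, sh_type, sh_flags, sh_addr, sh_offset,
# sh_size, sh_link, sh_info, sh_addralign, sh_entsize (64 bytes in total).
SHDR64 = struct.Struct("<IIQQQQIIQQ")


def section_type_name(shdr_bytes):
    """Return the symbolic name of sh_type for one raw 64-bit section header."""
    sh_type = SHDR64.unpack(shdr_bytes)[1]
    return SHT_NAMES.get(sh_type, hex(sh_type))


# Hypothetical header for a string table: only sh_type (3) matters here.
example = SHDR64.pack(17, 3, 0, 0, 0x18F4, 0x100, 0, 0, 1, 0)
print(section_type_name(example))  # SHT_STRTAB
```

Types that are not in the table come back as a hex number, which is a cheap way to spot the ones worth looking up in the documentation.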

There are tons more section types; I think it's best to refer to the documentation for the full list instead of re-creating it here. Let's take a closer look at how some of these work though.



SHT_STRTAB section types (.shstrtab and friends)

Looking at what a typical SHT_STRTAB is like in a hexdump:

[Figure: hexdump of an SHT_STRTAB section]

[Figure: raw hexdump of the start of shstrtab, moved from 0x18F4 to 0x18FC, with readelf's resulting section listing]
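The way a section name is looked up in a string table, and what happens when the table's start moves by 8 bytes, can be sketched as below. The table contents are an assumption (a made-up but typical-looking start of a .shstrtab), not the bytes of the ELF used in this post, and `read_name` is a hypothetical helper.

```python
def read_name(strtab, offset):
    """Read bytes from offset up to (not including) the next null byte."""
    end = strtab.index(b"\x00", offset)
    return strtab[offset:end].decode("ascii")


# Hypothetical start of a .shstrtab: a leading null, then null-terminated names.
strtab = b"\x00.symtab\x00.strtab\x00.shstrtab\x00.interp\x00.note.ABI-tag\x00"

interp = strtab.index(b".interp")         # sh_name of .interp in this table
abi_tag = strtab.index(b".note.ABI-tag")  # sh_name of .note.ABI-tag

SHIFT = 8  # moving the table's start down 8 bytes moves every name by 8

print(read_name(strtab, interp), "->", read_name(strtab, interp + SHIFT))
print(read_name(strtab, abi_tag), "->", read_name(strtab, abi_tag + SHIFT))
```

With this made-up table, .interp comes out as .note.ABI-tag and .note.ABI-tag comes out as I-tag: every lookup lands 8 bytes further along and simply reads until the next null byte.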





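[Figure: hexdump of the section header table, which starts at 0x1a00, with the sh_type column nulled out, next to readelf output showing every section type as NULL]

[Figure: gdb output when loading the modified ELF]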
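NULLing out the section types can be done by hand in a hex editor, but it is also easy to script. Here is a minimal sketch, assuming a 64-bit little-endian ELF already loaded into a bytearray. The function names and the tiny fake image are hypothetical. On a real file, the table offset, entry count and entry size would come from e_shoff, e_shnum and e_shentsize in the ELF header.

```python
import struct

SHDR64 = struct.Struct("<IIQQQQIIQQ")  # Elf64_Shdr, little-endian, 64 bytes
SH_TYPE_OFFSET = 4                      # sh_type follows the 4-byte sh_name


def null_section_types(image, shoff, shnum, shentsize=SHDR64.size):
    """Overwrite sh_type with SHT_NULL (0) in every section header, in place."""
    for i in range(shnum):
        struct.pack_into("<I", image, shoff + i * shentsize + SH_TYPE_OFFSET, 0)


def section_types(image, shoff, shnum, shentsize=SHDR64.size):
    """Return the sh_type of every section header."""
    return [SHDR64.unpack_from(image, shoff + i * shentsize)[1] for i in range(shnum)]


# Hypothetical image: 16 bytes of padding, then three section headers
# of type SHT_NULL (0), SHT_PROGBITS (1) and SHT_STRTAB (3).
image = bytearray(16)
for sh_type in (0, 1, 3):
    image += SHDR64.pack(0, sh_type, 0, 0, 0, 0, 0, 0, 0, 0)

before = section_types(image, 16, 3)
null_section_types(image, 16, 3)
after = section_types(image, 16, 3)
print(before, "->", after)  # [0, 1, 3] -> [0, 0, 0]
```

Work on a copy of the binary, since the edit happens in place and nothing here keeps the original types around.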

Moving on!

SHT_NOTE sections (.note.ABI-tag and friends)

Layout of an SHT_NOTE entry:

0x00 (4 bytes) namesz - size of the name field in bytes

0x04 (4 bytes) descsz - size of the desc field in bytes

0x08 (4 bytes) type - the type of the note (here, the OS ABI tag)

0x0C (namesz bytes, padded to a 4-byte boundary) name - the name field, containing a null-terminated list of characters

0x10 (descsz bytes) desc - the description field, holding some numbers that indicate the OS version (this offset assumes a 4-byte name)
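The layout above translates fairly directly into a parser. This is a sketch under the assumption of little-endian 32-bit words and 4-byte padding after the name. `parse_note` and `align4` are hypothetical helpers, and the note bytes are made up to match the shape described here.

```python
import struct


def align4(n):
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3


def parse_note(data, offset=0):
    """Parse one note entry; return (name, type, desc bytes)."""
    namesz, descsz, ntype = struct.unpack_from("<III", data, offset)
    name_off = offset + 12
    desc_off = name_off + align4(namesz)
    name = data[name_off:name_off + namesz].rstrip(b"\x00").decode("ascii")
    desc = data[desc_off:desc_off + descsz]
    return name, ntype, desc


# Hypothetical note: name "GNU\0", type 1, and a 16-byte desc of four words.
note = struct.pack("<III", 4, 16, 1) + b"GNU\x00" + struct.pack("<4I", 0, 3, 2, 0)

name, ntype, desc = parse_note(note)
print(name, ntype, struct.unpack("<4I", desc))  # GNU 1 (0, 3, 2, 0)
```

A note with no descriptor just has descsz set to 0, in which case `desc` comes back as an empty byte string.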

[Figure: hexdump of a note section]





namesz is set to 0x04 00 00 00, which means the name field is 4 bytes in size

descsz is set to 0x10 00 00 00, which means the description field is 16 bytes in size

type is set to 0x01 00 00 00, which marks this as a GNU ABI tag note; GNU/Linux itself is recorded in the first word of desc ( because my machines are FREE machines! )

name field reads 0x47 0x4e 0x55 0x00, which we can clearly see reads 'G' 'N' 'U'

desc field holds an array of values starting at 0x268 -> 0x27C
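One possible reading of the desc words, going by the OS constants in glibc's elf.h, is an OS identifier followed by three version numbers. The sketch below assumes that reading and little-endian words. `decode_abi_tag` is a hypothetical helper and the desc bytes are made up, not taken from the hexdump in this post.

```python
import struct

# OS identifiers used in word 0 of a GNU ABI tag descriptor (from glibc's elf.h).
ABI_TAG_OS = {0: "Linux", 1: "GNU", 2: "Solaris2", 3: "FreeBSD"}


def decode_abi_tag(desc):
    """Split a 16-byte ABI tag desc into (OS name, 'major.minor.patch')."""
    os_id, major, minor, patch = struct.unpack("<4I", desc)
    return ABI_TAG_OS.get(os_id, f"unknown({os_id})"), f"{major}.{minor}.{patch}"


# Hypothetical desc bytes: OS 0, then version words 3, 2, 0.
desc = bytes.fromhex("00000000" "03000000" "02000000" "00000000")
print(decode_abi_tag(desc))  # ('Linux', '3.2.0')
```

Treat the version interpretation as an assumption to check against the loader source rather than as settled fact.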

The desc field needs a little explaining and the documentation on it is slim, but there are a couple of places that may expand on it better than I do (I've included them in the reading and references section). To see how it's handled, check out glibc-2.28/elf/dl-load.c.

Conclusion

References and Recommended Reading:

*<side-rant>

</side-rant>

In other posts I've already expanded on the section header table, and in each header we have a field called sh_type, which indicates the section type. Each section type is like a model or layout for a given kind of section, and imposes certain attributes on how the bits and bytes in that section are grouped together to mean things. For instance, they might be simple lists or complex nested hash lookup tables.

To make this clearer, let's imagine how this aids problem solving in the ELF format. Let's say a compiler, malware or exploit developer needs a section to host a simple list of strings; in this case a section type of SHT_STRTAB would be appropriate. And as we see, .shstrtab and .strtab are exactly that type. The list of section types above covers what some of the others are meant to be used for.

In a hexdump of a string table, the strings are nice and neatly delimited by null bytes, super easy to not mess this up when reading in strings in C :))).

In previous posts I mentioned that .shstrtab holds section names, which means it provides a good starting point for mangling the section attributes in a way that skews their interpretation by debug tools or other ELF interpreters - a key skill in understanding how they work!*

So in this same method, for the first experiment I decided to point the start of shstrtab down 8 bytes to see what happens to readelf's output about the sections. To make the resulting diagram clearer: the top frame is the raw hexdump of the start of shstrtab. It originally started at 0x18F4 and we shifted it down to start at 0x18FC.

What you should see is that by moving the start of the shstrtab section, the strings jump 8 bytes down for each entry. More accurately, we can say they all start 8 bytes down, but because they are strings readelf will read bytes in until it hits a null byte. For instance, the first section name, instead of .interp, which is at 0x1910 originally, now points to 0x1917. The .interp section, usually the first valid section, is now called .note.ABI-tag. The following section name, .note.ABI-tag, now starts at 0x191F, reads as I-tag, and runs until it hits the null byte at 0x1924. The rest of the sections follow the same pattern; a good exercise would be to confirm this on your own.

Okay, so what happens when we mangle the section types? Let's say we NULL them out or swap section types on some of them, and see if the program still runs - and if it doesn't, why, and how close it gets to running.

Here are the results from NULLing out the section types. The large white column marks the column in this ELF that contains the sh_type bytes; I'm really just being lazy with labeling here and leaving identification of the individual section types up to the reader if need be. But once you get in the swing of identifying the section table layout by hand, you'll quickly realize that if this column is null, a whole bunch of section types are nulled out. The smaller boxes next to this column show some virtual addresses for some of the sections; I highlight them so you can quickly see that we have indeed written over the records for the sections shown on the right. We can also see in the hexdump that the section header table starts at 0x1a00. To confirm another way, in the readelf output on the right all the section types are indicated by NULL.

We can also see this does strange things to gdb when it's trying to load some information from those sections, and can even break its ability to interpret the file as an executable. Some rudimentary anti-debugging right there. Of course, the immediate complement of this as a reverse engineering effort would be to reconstitute the section headers (the sh_* fields) of a stripped binary. It might be worth exploring what happens when you mangle other section attributes and pass the file to other utilities like strace and ltrace.

The SHT_NOTE type sections are simple lists of integers that provide versioning and typing for vendors. The GNU folks tend to mark ELFs liberally with these sections on GNU/Linux systems. In fact, these sections are meant to indicate that the file was built by tools from these systems, and to carry versioning information about them. So a note potentially lists your kernel version or GNU tool version, let's say.

The .note.ABI-tag section holds some semantic versioning information about the ABI being used and the operating system this file is for. The format is basically a list of 32-bit words, that is, groups of 4 bytes; the field layout is the one listed under the SHT_NOTE heading above.

The documentation describes that you can potentially have a note section that has no descriptor; in that case we just set descsz to 0 and leave out the desc field at 0x10.

In a hexdump of a note section we can see the field values listed above. Essentially desc indicates the OS version, and this is clearly compared to a standardized value in the library when dl-load handles it. How exactly this OS version field works is going to take a little more research on my part before I get much more mouthy about it.

That's going to be it for this post. I don't like to bloat posts with too much text because, as we know, things are easier to understand when they are broken into smaller parts and carefully studied*. In further posts in the series I will expand on the rest of the sections. For now I hope that by cracking open these few I've started you on your way to detailing how the others work too; understanding their types, and therefore their layout, gives us the power to control how they are interpreted. There are a lot more tricks that can be pulled off by messing with these fields. So happy hacking! And stay tuned for the follow-up posts on GNU_HASH and other weird archaic section types.

* Why is this? Why do we need to break things to learn them? Especially in computers? As we know, in many sciences we learn how things are built by breaking them down, tearing them apart, boiling away their non-essential parts and deciding what they mean from the perspective of their super-structures - we study how the "super" works by breaking open its "minor" parts.

More directly perhaps, in the science of computer hacking we often work in realms governed by the capability of computer languages, and some have realized that our greatest pains and harshest challenges often come straight from underestimating the way languages work when they are allowed to be spoken with their broken, inconsistent and superstructure-referencing parts.

Just to cleanly connect my points here: one language is the "bigger" one, surrounding or hosting another language, by the size of its power and because of the references possible from the computationally smaller languages it hosts, i.e. what it can possibly compute under certain proofs when using those small languages in these contexts. Sometimes they lend "subsets" of this power to isolated subsets of their literal symbols.

So through these languages we can directly speak, we make reference to outer, more powerful structures that appear within the languages themselves, and that also impose or allow power over their ordering, labeling and effective interpretation. We say that these spirits called "weird machines" arise from learning what we can summon in apparent or seeming "non-weird machines" by giving execution and interpretation to the aspects of a language that are built in the "intersections" between other languages.

A quick example relevant here: if you can make string input to a program also impose meaning on the stack layout, so that the string is both character data and stack address data, it exposes an intersection of two languages which gives life to the string data in an unusual but powerful way - it is not just displayable but also executable! Anyway, sorry for the philosophical rant - on with the section meta-data mangling!