Introduction

chunk

whether they are available for use,

how big they are,

and which chunks are around them in the list and other wonderful things.

Heap speak

chunk

INTERNAL_SIZE_T - is something I should probably explain, this is a size type, for the fields that define "bookeeping" functions in the heap management - stuff like pointers (addresses) and bit fields. This size definition that is left to be implementation defined. We can imagine that glibc would want to be portable and flexible across different hardware's and runtime implementations - so address sizes mapped to INTERNAL_SIZE_T can (but I don't think often do) vary. Anyway, the INTERNAL_SIZE_T is defined to be size_t - which falls back onto how ever your C runtime originally solved the problem.

- is something I should probably explain, this is a size type, for the fields that define "bookeeping" functions in the heap management - stuff like pointers (addresses) and bit fields. This size definition that is left to be implementation defined. We can imagine that glibc would want to be portable and flexible across different hardware's and runtime implementations - so address sizes mapped to can (but I don't think often do) vary. Anyway, the is defined to be - which falls back onto how ever your C runtime originally solved the problem. mchunk_prev_size - is the very first part of a chunk format and this is used whether its a free or used chunk. This field indicates the size of the chunk just before this one, and its least significant bit is set to 0x1 if the chunk referred to, is free. So if you are looking at a chunk, and its prev_size has a least sig bit of 0x1 , just before this is a chunk that is still "alive".

- is the very first part of a chunk format and this is used whether its a free or used chunk. This field indicates the size of the chunk just before this one, and its least significant bit is set to if the chunk referred to, is free. So if you are looking at a chunk, and its prev_size has a least sig bit of , just before this is a chunk that is still "alive". mchunk_size - pretty standard, actually just holds the current size in bytes lol.

- pretty standard, actually just holds the current size in bytes lol. struct malloc_chunk* fd - so this is a field in the chunk struct defining a space for an address to another chunk. This is because it forms a linked list. This linked list being defined here is the "free list", which snaps together all the chunks that are free on the heap. Here we are defining the "forward pointer" in the linked list.

- so this is a field in the chunk struct defining a space for an address to another chunk. This is because it forms a linked list. This linked list being defined here is the "free list", which snaps together all the chunks that are free on the heap. Here we are defining the "forward pointer" in the linked list. struct malloc_chunk *bk - you guess it, this the same type as the previously mentioned field, we are here just talking about the "backward pointer".

- you guess it, this the same type as the previously mentioned field, we are here just talking about the "backward pointer". struct malloc_chunk *fd_nextsize - so this field is from another layer of free listing tech in the heap. This pointer is added to a free chunk if its above a certain size threshold (we will cover this later on) - so that the heap manager can track huge chunks should they appear. Its kinda like being a high roller in a casino, when you come out they track your movements and wants even more intensely because you affect the profitability of the evening more.

Growing heap the natural way

$rax

$rax

free_string

0x602a00

Free Chunk Coalescence

0x602580

0x6024a0

0x6024a0

0x211

0xe1 + 0x130

References and Reading

Check out the following to see some inspiration for this post and more awesome things to find out about the heap works.

https://sourceware.org/glibc/wiki/MallocInternals http://phrack.org/issues/66/10.html http://www.phrack.org/issues/68/13.html https://github.com/shellphish/how2heap https://www.blackhat.com/presentations/bh-usa-07/Ferguson/Whitepaper/bh-usa-07-ferguson-WP.pdf 2007 BlackHat Vegas V82 Ferguson Understanding the Heap 00 - https://www.youtube.com/watch?v=VLnhV1T5Ng4 The Heap: what does malloc() do? - (bin 0x14) - https://www.youtube.com/watch?v=ZHghwsTRyzQ ftp://ftp.cs.utexas.edu/pub/garbage/allocsrv.ps https://www.cs.tufts.edu/~nr/cs257/archive/paul-wilson/fragmentation.pdf https://www.blackhat.com/docs/eu-17/materials/eu-17-Heelan-Heap-Layout-Optimisation-For-Exploitation-wp.pdf http://g.oswego.edu/dl/html/malloc.html https://fossies.org/linux/glibc/malloc/malloc.c









In this post and the others in this series, I will unpack some of the internals to glibc's dynamic heap data structures and associated beasts. This post specifically will start you off with no background insight on the heap (perhaps a little on ELF internals and debugging), and detail some experiments you can perform to learn how the heap works.The Heap is essentially a list of memory regions an executing program uses to store data. The data stored in heap regions are requested during runtime. It allows runtime environments like glibc to offer programs dynamic memory for allocating data. So because this offer's memory regions as kind of a "service" (), it must mean some where in this whole mess, there needs to be some accounting information about the memory regions. To this aid; the heap describes or decorates user data regions with an internal structure called a. Chunks are in turn classified and grouped according to their properties - basically properties like:The big TL;DR for heap management is that its basic movement will essentially be to perform elaborate dances around the functions of searching through chunks either to free or allocate them.The heap allocator I will focus on here is glibc version ptmalloc as implemented in versions glibc 2.23-2.28. But this of course not to say that only glibc is important to understand; there exists multiple approaches to heap allocation. Each approach is unique down to how they achieve various operations; like coalescing free chunks, sorting and searching free chunks and grouping them rapidly, as well as amongst perhaps even more things - security improvements. So there's a number of places that complexity can breed and fester into security problems. But the root of these problems will often be in how users requesting data, and the allocator managing data respond to meta-data about memory regions.To close the introduction, Heap can often seem very intense and complex and have very gnarly internals, but most of which; aid memorization and other computer science that serves to speed up searching linked lists. Another way you can say this is that; they are nothing more than elaborate ways to store some "cheat" meta-data that doesn't require searching the whole heap memory area for stuff every time. But the meta-data is interesting to us because there are instances when we want to influence the way the list is searched and interpreted.The basic unit of currency for the heap as you would have guessed is a. We probably want to know what these look like in glibc code, so here ya go:I should give each field its fair explanation ():So we'd probably also want to get a look at what this looks like in execution, see what different kinds of chunks look like () . We want to be able to understand the base language of the heap internals, before taking part in the conversation. So lets run a simple C program through gdb, and unpack the heap to show how it responds internally. The program I'm going to be looking at is the following:In the code above I've added a simple make shift "wrapper" function () and injected a break point just before the last return. This so that I can isolate the free and malloc calls effects on the memory regions we are studying.And now lets see what happens to the heap as it allocates memory. We need to find a pointer to the heap first. This is pretty easy since malloc will save it in rax after returning back to the main function, I show this in the first few gdb commands:So as I have it set up the hook-stop will just spit out everything around $rax-0x10 which is the address where the chunk header information will be saved. I do this because; when we hit this break point malloc will have just returned at set the register to its return value - which will be the address of the memory region allocated. We can see directly how these macros operate on heap metadata data in glibc/malloc/malloc.c So as you can see its a simple addition or subtraction of 2 addresses to get the mem () or the chunk information starting two addresses before. There are a number of other operations that extract and set other meta-data.Okay so that's the basic format pretty much covered lets look at how this looks in action.After the first break point hits you should see gdb display the first heap chunk allocated, here's an annotated version of that dump showing the heap format:Now after this hits your screen, try executing the "c" gdb command to skip to the next break point. You will get a couple more examples of allocated chunks until you see the following on your screen:This is essentially showing that we can't use the value inas before in the hook-stop. As you would guess this is becausedoes not hold the memory pointer anymore, its now embroiled in a free call so it hold some other value. Anyway, we can dump the chunk using the address passed to thefunction since it is conveniently displayed here for us. This is what the chunk looks like after it was free'd:What is shown in the screen dump above in addition to the free chunk is where the first free chunks fd () and bk () pointers go. We can see here if we follow them using gdb's memory examiner functions they eventually end up atwhich is the top chunk's address; the pointer to the top of the currently allocated heap addresses.Okay so that's what a chunk looks like when its allocated and free'd, can we have a look at how chunks are coalesced into bigger free chunks? Yes that's what the next section is for!After allocating the chunks, our program will free up each one in the same order they were allocated in. Now what this means is that we can expect two chunks allocated next to each other; to be free'd right after each other as well - and as a result we will have two chunks that get melded into one.Here's what that looks like:What we can see on the left of the screen dump; is the two chunks () at addressesand. On the right we have the new coalesced chunk atof course, but this time we can see that the size field after coalescing is(which as indicated is simply).This is pretty much all there is to this coalescing action really. And thats pretty much all I have for this post, I'll continue this series by moving onto fast-bins, large chunk management and potentially some heap redirection tricks. Stay tuned for the next one folks!