project zeus "You will not be informed of the meaning of Project Zeus until the time is right for you to know the meaning of Project Zeus."

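Attacking Zone Page Metadata in iOS 7 and OS X Mavericks

Over the past several years, zone corruption vulnerabilities have been frequently leveraged by attackers in exploiting the iOS and OS X kernel. In response to their prevalence and gained popularity, iOS 6 and OS X Mountain Lion introduced numerous mitigations and hardening measures in order to increase the security and robustness of the zone allocator. In particular, these enhancements seek to prevent an attacker from leveraging well-known exploitation primitives such as the free list pointer overwrite. In iOS 7 and OS X Mavericks, further improvements have been made to the zone allocator, primarily aimed at improving its efficiency. Notably, these improvements have caused significant changes to zone page management and have introduced a new zone page metadata structure. In this blog post, we revisit the zone allocator and detail the updates made by Apple in iOS 7 and OS X Mavericks. We then show how these changes yet again may allow an attacker to generically exploit zone corruption vulnerabilities.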



Zone Allocator

The iOS/OS X kernel as well as IOKit drivers commonly request memory from the zone allocator. The zone allocator organizes memory by size, dividing it into regions called zones. This allows components to request memory for a specific use, keeps the memory footprint low, and improves performance by making the best possible use of CPU caches. In the following sections, we briefly cover the fundamental data structures and algorithms of the zone allocator. In particular, we show how the newly introduced zone page metadata structure now plays an important role in their operation.

Zone Initialization

When requesting memory from the zone allocator, kernel components typically call zalloc() [osfmk/kern/zalloc.c] or one of its wrapper functions (e.g. kalloc() or IOMalloc()). In the former case, zalloc() takes a pointer to the zone [osfmk/kern/zalloc.h] structure describing the zone from which the new allocation is made. This structure is initialized when the kernel creates a new zone in zinit() [osfmk/kern/zalloc.c] and is defined as follows.



struct zone {
    struct zone_free_element *free_elements;
    struct {
        queue_head_t any_free_foreign;
        queue_head_t all_free;
        queue_head_t intermediate;
        queue_head_t all_used;
    } pages;                    /* list of zone_page_metadata structs */
    int         count;          /* Number of elements used now */
    int         countfree;      /* Number of free elements */
    lck_attr_t  lock_attr;      /* zone lock attribute */
    decl_lck_mtx_data(,lock)    /* zone lock */
    lck_mtx_ext_t lock_ext;     /* placeholder for indirect mutex */
    vm_size_t   cur_size;       /* current memory utilization */
    vm_size_t   max_size;       /* how large can this zone grow */
    vm_size_t   elem_size;      /* size of an element */
    vm_size_t   alloc_size;     /* size used for more memory */
    uint64_t    page_count __attribute__((aligned(8))); /* number of pages used */
    uint64_t    sum_count;      /* count of allocs (life of zone) */
    uint32_t
    /* boolean_t */ exhaustible :1,         /* (F) merely return if empty? */
    /* boolean_t */ collectable :1,         /* (F) garbage collect empty pages */
    /* boolean_t */ expandable :1,          /* (T) expand zone (with message)? */
    /* boolean_t */ allows_foreign :1,      /* (F) allow non-zalloc space */
    /* boolean_t */ doing_alloc :1,         /* is zone expanding now? */
    /* boolean_t */ waiting :1,             /* is thread waiting for expansion? */
    /* boolean_t */ async_pending :1,       /* asynchronous allocation pending? */
    /* boolean_t */ zleak_on :1,            /* Are we collecting allocation information? */
    /* boolean_t */ caller_acct :1,         /* do we account allocation/free to the caller? */
    /* boolean_t */ doing_gc :1,            /* garbage collect in progress? */
    /* boolean_t */ noencrypt :1,
    /* boolean_t */ no_callout :1,
    /* boolean_t */ async_prio_refill :1,
    /* boolean_t */ gzalloc_exempt :1,
    /* boolean_t */ alignment_required :1,
    /* boolean_t */ use_page_list :1,
    /* future */    _reserved :16;
    int         index;          /* index into zone_info arrays for this zone */
    struct zone *next_zone;     /* Link for all-zones list */
    const char  *zone_name;     /* a name for the zone */
#if ZONE_DEBUG
    queue_head_t active_zones;  /* active elements */
#endif /* ZONE_DEBUG */
#if CONFIG_ZLEAKS
    uint32_t    zleak_capture;  /* per-zone counter for capturing every N allocations */
#endif /* CONFIG_ZLEAKS */
    uint32_t    zp_count;       /* counter for poisoning every N frees */
    vm_size_t   prio_refill_watermark;
    thread_t    zone_replenish_thread;
#if CONFIG_GZALLOC
    gzalloc_data_t gz;
#endif /* CONFIG_GZALLOC */
};

Specifically, zinit() sets the initial properties of a zone such as the size of the elements it should manage, the maximum amount of memory it should use, its allocation size (that is, the number of bytes to request when the zone is full), as well as its name. Additionally, it uses heuristics to determine the best allocation size for the given zone element size (minimizing fragmentation), and adjusts it accordingly.



Once a zone has been initialized by zinit(), kernel components can perform additional zone customization by leveraging the zone_change() [osfmk/kern/zalloc.c] API. For instance, zone_change() can enable a particular zone to accept foreign elements (i.e. zone elements with an address outside the zone region), restrict a zone from being garbage collected, or decide whether a zone should be exhaustible or expandable.

Zone Page Metadata

iOS 7 and OS X Mavericks make notable changes to the zone allocator by introducing zone page metadata. Essentially, zone page metadata is used to reduce the overhead in managing zone pages and their associated blocks of memory. This page-specific data structure is only used by zones whose allocation size is exactly one page (4K), and only if the space lost by placing it in the zone page is considered acceptable. The source listing from zinit() below shows these requirements.



/*
 * Opt into page list tracking if we can reliably map an allocation
 * to its page_metadata, and if the wastage in the tail of
 * the allocation is not too large
 */
if (alloc == PAGE_SIZE) {
    if ((PAGE_SIZE % size) >= sizeof(struct zone_page_metadata)) {
        use_page_list = TRUE;
    } else if ((PAGE_SIZE - sizeof(struct zone_page_metadata)) % size <= PAGE_SIZE / 100) {
        use_page_list = TRUE;
    }
}

Whether a zone uses page metadata therefore depends on its element size and on whether its allocation size is exactly one page. The following table summarizes the use of page metadata in the kalloc zones in iOS 7 on 32-bit ARM. Note that on this specific platform, the size of the zone page metadata structure is 20 bytes.





zone name      element size   allocation size   page metadata
kalloc.8       8              4096              Yes
kalloc.16      16             4096              Yes
kalloc.24      24             4096              Yes
kalloc.32      32             4096              Yes
kalloc.40      40             4096              Yes
kalloc.48      48             4096              No
kalloc.64      64             4096              No
kalloc.88      88             4096              Yes
kalloc.112     112            4096              Yes
kalloc.128     128            4096              No
kalloc.192     192            4096              Yes
kalloc.256     256            4096              No
kalloc.384     384            4096              Yes
kalloc.512     512            4096              No
kalloc.768     768            4096              Yes
kalloc.1024    1024           4096              No
kalloc.1536    1536           12288             No
kalloc.2048    2048           4096              No
kalloc.3072    3072           12288             No
kalloc.4096    4096           4096              No
kalloc.6144    6144           12288             No
kalloc.8192    8192           8192              No

In OS X Mavericks and iOS 7 on 64-bit ARM, the requirements are slightly different because of the larger zone_page_metadata structure (40 bytes). The use of page metadata in kalloc zones on this platform can be summarized with the following table.





zone name      element size   allocation size   page metadata
kalloc.16      16             4096              Yes
kalloc.32      32             4096              Yes
kalloc.64      64             4096              Yes
kalloc.128     128            4096              No
kalloc.256     256            4096              No
kalloc.512     512            4096              No
kalloc.1024    1024           4096              No
kalloc.2048    2048           4096              No
kalloc.4096    4096           4096              No
kalloc.8192    8192           8192              No
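The Yes/No column in both tables follows directly from the zinit() check shown earlier. As a rough user-space sketch (not kernel code), the function below restates that check so that individual rows can be verified by hand. The names sketch_uses_page_list and SKETCH_PAGE_SIZE are illustrative assumptions, and the metadata size is passed in as a parameter (20 bytes on 32-bit ARM, 40 bytes on the 64-bit platforms, as stated above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4K pages, as in the tables above */

/*
 * Restates the opt-in test from zinit(): page metadata is used when the
 * allocation size is exactly one page and either
 *   (a) the unused tail of the page is already large enough to hold the
 *       metadata structure, or
 *   (b) reserving room for the metadata leaves a tail of at most 1% of
 *       the page unused.
 */
static bool sketch_uses_page_list(size_t elem_size, size_t alloc_size,
                                  size_t meta_size)
{
    if (alloc_size != SKETCH_PAGE_SIZE)
        return false;
    if ((SKETCH_PAGE_SIZE % elem_size) >= meta_size)
        return true;
    return ((SKETCH_PAGE_SIZE - meta_size) % elem_size) <= SKETCH_PAGE_SIZE / 100;
}
```

For example, sketch_uses_page_list(48, 4096, 20) is false: 4096 % 48 = 16 is smaller than the 20-byte structure, and (4096 - 20) % 48 = 44 exceeds the 40-byte (1%) wastage limit, which matches the kalloc.48 row of the 32-bit table.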

If no page metadata is used (e.g. because the zone’s element size is very large), new allocations are made from the free_elements list managed by the zone structure. This is a singly linked list that holds all free elements currently managed by the zone, in no particular order. Because zones can grow very large, this list may introduce a performance hit in particularly fragmented zones, where chunks allocated one after another from the free elements list may belong to entirely different pages (possibly resulting in costly page faults). Although the zone garbage collector is designed to trim the free elements list, this trimming is only applicable to certain zones. Additionally, the garbage collection process itself is very expensive.



When page metadata is used, the zone allocator completely ignores the free_elements list of the zone structure, and resorts to using the newly introduced pages lists, visible in the zone data structure shown previously.



struct {
    queue_head_t any_free_foreign;
    queue_head_t all_free;
    queue_head_t intermediate;
    queue_head_t all_used;
} pages;

Specifically, four unique doubly linked lists are defined to hold all pages currently managed by a particular zone. The any_free_foreign list holds pages that do not belong to the zone memory map, but may still be allowed if the zone allows foreign elements (allows_foreign is set). If this is not the case, individual elements are instead checked against zone_map_min_address and zone_map_max_address in order to ensure that non-zone memory pages never end up in the zone lists. As its name indicates, the all_free list holds pages for which all elements have been freed. Pages where at least one element is used (but not all) are placed on the intermediate list, while pages for which no elements are free are placed on the all_used list.



In order to link zone pages to these lists, additional information on their use, and on the zone and list they belong to, needs to be stored. This is the purpose of the zone_page_metadata structure, stored at the end of each page used by a zone that leverages page metadata.



struct zone_page_metadata {
    queue_chain_t            pages;
    struct zone_free_element *elements;
    zone_t                   zone;
    uint16_t                 alloc_count;
    uint16_t                 free_count;
};

The doubly linked pages list in the above structure links directly into one of the lists defined by the zone structure, discussed previously. It is followed by the elements list, which keeps track of all free elements in the given page. The zone pointer indicates the zone to which the page belongs, and allows the zone allocator to validate whether an element is freed to its rightful owner. Finally, the alloc_count and free_count fields indicate the total number of elements held by a page and the number of free elements respectively.
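Because the structure sits at the end of the page, the metadata for any element can be found from the element address alone. The following is a minimal user-space sketch of that address computation, assuming 4K pages and assuming the metadata occupies exactly the last sizeof(structure) bytes of the page; the kernel’s own helper may differ in detail. All sketch_ names and the stand-in types are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4K pages */

/* Stand-ins for the kernel types, with the same field order as above. */
struct sketch_queue_entry {
    struct sketch_queue_entry *next;
    struct sketch_queue_entry *prev;
};

struct sketch_page_metadata {
    struct sketch_queue_entry pages;   /* link into one of the zone's page lists */
    void     *elements;                /* free elements within this page */
    void     *zone;                    /* owning zone */
    uint16_t  alloc_count;             /* total elements held by the page */
    uint16_t  free_count;              /* elements currently free */
};

/* Round the element address down to its page, then step to the page tail. */
static uintptr_t sketch_metadata_addr(uintptr_t element)
{
    uintptr_t page_base = element & ~(uintptr_t)(SKETCH_PAGE_SIZE - 1);
    return page_base + SKETCH_PAGE_SIZE - sizeof(struct sketch_page_metadata);
}
```

With this layout the stand-in structure is five pointer-widths in size on common ABIs, which agrees with the 20-byte and 40-byte figures quoted above. Every element in a page maps to the same metadata address, which is what allows zfree() to find the owning zone and page list without any per-element bookkeeping.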



The information held by the zone_page_metadata structure allows pages to be dynamically moved between lists depending on their use. Both zalloc() and zfree() inspect the alloc and free counts and move the pages to the appropriate list when necessary. This is to always make sure elements are allocated from pages that are used more. We discuss both the zone allocation and free algorithms, and how zone page metadata applies to each of them, in the following sections.

Allocation Algorithm

Upon allocating memory, zalloc() first calls try_alloc_from_zone() to attempt to retrieve an element from the specified zone. If the zone was initialized with page metadata, the function iterates over the individual page lists and attempts to retrieve a page from the list head. The order in which these lists are checked is important in understanding how the zone allocator retrieves new elements. In particular, the allocator first checks if foreign elements are allowed and inspects the any_free_foreign list if that is the case. If foreign elements are not allowed, or the any_free_foreign list is empty, the allocator proceeds to the intermediate list. Finally, if no pages in this list can be found, the allocator checks the all_free list.



if (zone->allows_foreign && !queue_empty(&zone->pages.any_free_foreign))
    page_meta = (struct zone_page_metadata *)queue_first(&zone->pages.any_free_foreign);
else if (!queue_empty(&zone->pages.intermediate))
    page_meta = (struct zone_page_metadata *)queue_first(&zone->pages.intermediate);
else if (!queue_empty(&zone->pages.all_free))
    page_meta = (struct zone_page_metadata *)queue_first(&zone->pages.all_free);
else {
    return 0;
}

If a page has been found in any of these lists, the allocator validates it by calling is_sane_zone_page_metadata(). This function essentially validates the pointer to the zone_page_metadata structure (pointed to by the page lists) and calls is_sane_zone_ptr() to ensure that the pointer is aligned and within kernel and possibly zone bounds. The source listing for this function is shown below.



/*
 * Zone checking helper function.
 * A pointer that satisfies these conditions is OK to be a freelist next pointer
 * A pointer that doesn't satisfy these conditions indicates corruption
 */
static inline boolean_t
is_sane_zone_ptr(zone_t zone, vm_offset_t addr, size_t obj_size)
{
    /* Must be aligned to pointer boundary */
    if (__improbable((addr & (sizeof(vm_offset_t) - 1)) != 0))
        return FALSE;

    /* Must be a kernel address */
    if (__improbable(!pmap_kernel_va(addr)))
        return FALSE;

    /* Must be from zone map if the zone only uses memory from the zone_map */
    if (zone->collectable && !zone->allows_foreign) {
        /* check if addr is from zone map */
        if (addr >= zone_map_min_address &&
            (addr + obj_size - 1) < zone_map_max_address)
            return TRUE;
        return FALSE;
    }

    return TRUE;
}

Note that the above function simply validates the page metadata pointer and not any of the values held by the zone_page_metadata structure. The lack of metadata validation may potentially allow the attacker to target this data structure when attempting to exploit a zone corruption vulnerability. We explore possibilities in this area in the last section of this blog post.



Once the zone_page_metadata pointer has been validated, the allocator retrieves an element (element) from the head of the metadata’s elements list (or the first element from the zone’s free_elements list if page metadata is not enabled). The next pointer held by the retrieved element is then validated in a call to is_sane_zone_element(), which operates much like is_sane_zone_ptr(), but also accepts null pointers (indicating the end of a list).



vm_offset_t *primary = (vm_offset_t *) element;
vm_offset_t *backup  = get_backup_ptr(zone->elem_size, primary);

vm_offset_t next_element        = *primary;
vm_offset_t next_element_backup = *backup;

/*
 * backup_ptr_mismatch_panic will determine what next_element
 * should have been, and print it appropriately
 */
if (__improbable(!is_sane_zone_element(zone, next_element)))
    backup_ptr_mismatch_panic(zone, next_element, next_element_backup);

Since iOS 6 and OS X Mountain Lion, the next pointer held by an element on a free list is protected by an additional measure that involves storing an encoded version of the pointer at the end of the element. This mitigation was introduced to address a popular exploitation technique where the attacker could target the free list pointer in order to coerce the allocator to return an arbitrary address on subsequent allocations. Essentially, the encoded value is computed by XOR encoding the next pointer with a pre-computed cookie, unknown to the attacker. For more information regarding this mitigation, we refer the reader to our



/* Check the backup pointer for the regular cookie */
if (__improbable(next_element != (next_element_backup ^ zp_nopoison_cookie))) {

    /* Check for the poisoned cookie instead */
    if (__improbable(next_element != (next_element_backup ^ zp_poisoned_cookie)))
        /* Neither cookie is valid, corruption has occurred */
        backup_ptr_mismatch_panic(zone, next_element, next_element_backup);

    /*
     * Element was marked as poisoned, so check its integrity,
     * skipping the primary and backup pointers at the beginning and end.
     */
    vm_offset_t *element_cursor = primary + 1;

    for ( ; element_cursor < backup ; element_cursor++)
        if (__improbable(*element_cursor != ZP_POISON))
            zone_element_was_modified_panic(zone,
                                            *element_cursor,
                                            ZP_POISON,
                                            ((vm_offset_t)element_cursor) - element);
}

Before the element is returned back to the caller, both its next pointer and encoded pointer are replaced by the sentinel value. This aims to prevent the attacker from potentially learning these values, e.g. by leveraging functions which do not properly initialize the contents of the buffer before it is returned back to the user. If page metadata is used, the allocator also inspects the current free count of the associated page and moves it to the appropriate list. If the free count is lowered to zero and the last free element is allocated, the page is unlinked from its current list and linked into the all_used list. If the page was previously on the all_free list and the free count is lowered (alloc_count == free_count + 1), then the page is linked into the intermediate list.



/*
 * Clear out the old next pointer and backup to avoid leaking the cookie
 * and so that only values on the freelist have a valid cookie
 */
*primary = ZP_POISON;
*backup  = ZP_POISON;

/* Remove this element from the free list */
if (zone->use_page_list) {
    page_meta->elements = (struct zone_free_element *)next_element;
    page_meta->free_count--;

    if (zone->allows_foreign && !from_zone_map(element, zone->elem_size)) {
        if (page_meta->free_count == 0) {
            /* move to all used */
            remqueue((queue_entry_t)page_meta);
            enqueue_tail(&zone->pages.all_used, (queue_entry_t)page_meta);
        } else {
            /* no other list transitions */
        }
    } else if (page_meta->free_count == 0) {
        /* remove from intermediate or free, move to all_used */
        remqueue((queue_entry_t)page_meta);
        enqueue_tail(&zone->pages.all_used, (queue_entry_t)page_meta);
    } else if (page_meta->alloc_count == page_meta->free_count + 1) {
        /* remove from free, move to intermediate */
        remqueue((queue_entry_t)page_meta);
        enqueue_tail(&zone->pages.intermediate, (queue_entry_t)page_meta);
    }
}

If no memory could be returned by try_alloc_from_zone(), the zone allocator attempts to call kernel_memory_allocate() [osfmk/vm/vm_kern.c] to request additional memory from the zone memory map. The number of bytes requested is determined by the zone’s allocation size, originally set upon first initializing the zone in zinit(). The retrieved memory is then passed to zcram() [osfmk/kern/zalloc.c], a function responsible for dividing the memory into equally sized blocks (of the zone’s element size), and returning them to the zone’s free list. Specifically, if page metadata is used, zcram() carves out a page sized region of the provided memory, initializes a page metadata structure, and places the page at the tail of the all_used list. It then calls free_to_zone() to free each element (from low address to high) to the page metadata’s free list (elements), eventually causing the page to be placed at the tail of the all_free list. If the zone doesn’t use page metadata, the allocated region is divided into element-sized blocks and put onto the zone’s free list (free_elements) in a similar fashion.

Free Algorithm

When a zone element is freed, zfree() calls free_to_zone() to perform the actual freeing operation. Initially, free_to_zone() locates the list head of the free elements list, either from the page metadata structure (elements) or the zone structure (free_elements) depending on the use of page metadata. Subsequently, the function calls is_sane_zone_element() to validate the element’s pointer alignment and zone locality. It then checks if the element size is below or equal to zp_tiny_zone_limit (cache line size of the current processor), in which case it applies block poisoning and fills the element buffer content with a sentinel value (0xdeadbeef). If the size is above the tiny zone limit, the free function checks if the zone allocator is configured to use the zone sample factor (zp_factor), in which case it increments the zone poison count for the target zone (zone->zp_count) and compares it to the sample factor value. If the zone poison count is above or equal to the zone sample factor, block poisoning is applied to the element.



boolean_t poison = FALSE;

/* Always poison tiny zones' elements (limit is 0 if -no-zp is set) */
if (zone->elem_size <= zp_tiny_zone_limit) {
    poison = TRUE;
} else if (zp_factor != 0 && ++zone->zp_count >= zp_factor) {
    /* Poison zone elements periodically */
    zone->zp_count = 0;
    poison = TRUE;
}

if (poison) {
    vm_offset_t *element_cursor = primary + 1;

    for ( ; element_cursor < backup; element_cursor++) {
        *element_cursor = ZP_POISON;
    }
}

As the element to be freed is placed at the head of the free list, its next pointer (located at the top of the element buffer) is updated to point to the current free list head. In order to protect this pointer against possible zone attacks, free_to_zone() creates an encoded copy using the generated zone cookies and places it at the end of the element’s buffer.



/*
 * Always write a redundant next pointer
 * So that it is more difficult to forge, xor it with a random cookie
 * A poisoned element is indicated by using zp_poisoned_cookie
 * instead of zp_nopoison_cookie
 */
*backup = old_head ^ (poison ? zp_poisoned_cookie : zp_nopoison_cookie);
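The pairing of a plain next pointer at the start of a free element with an XOR-encoded copy at its end can be modeled in a few lines. The sketch below is a simplified user-space illustration under stated assumptions: the backup slot is taken to be the last pointer-sized word of the element, a single cookie is used (the kernel distinguishes zp_nopoison_cookie and zp_poisoned_cookie), and all sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: the backup copy lives in the last pointer-sized word. */
static uintptr_t *sketch_backup_ptr(void *element, size_t elem_size)
{
    return (uintptr_t *)((unsigned char *)element + elem_size - sizeof(uintptr_t));
}

/* On free: write the next pointer and its cookie-encoded backup copy. */
static void sketch_store_next(void *element, size_t elem_size,
                              uintptr_t next, uintptr_t cookie)
{
    *(uintptr_t *)element = next;
    *sketch_backup_ptr(element, elem_size) = next ^ cookie;
}

/* On allocation: the primary pointer must decode from the backup copy. */
static bool sketch_next_is_intact(void *element, size_t elem_size,
                                  uintptr_t cookie)
{
    uintptr_t next   = *(uintptr_t *)element;
    uintptr_t backup = *sketch_backup_ptr(element, elem_size);
    return next == (backup ^ cookie);
}
```

An overflow that rewrites only the primary pointer fails the check, and an attacker who does not know the cookie cannot compute a matching backup value. Note that this check covers the element’s free list pointers only; it says nothing about the page metadata stored at the end of the page.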

Finally, if the zone doesn’t use page metadata, the element is placed at the head of the zone’s free_elements list. If the zone does use page metadata, the element is instead placed on the free list (elements) held by the page metadata structure. In that case, free_to_zone() also checks whether the zone page needs to be moved to a different zone page list given the updated free count. If the free count was previously 0, the page is moved to the tail of the intermediate list, or to the tail of the any_free_foreign list if foreign elements are allowed and the element is outside the zone memory map. If, on the other hand, the last used element of a page was freed, the page is moved to the tail of the all_free list.



if (zone->use_page_list) {
    page_meta->elements = (struct zone_free_element *)element;
    page_meta->free_count++;

    if (zone->allows_foreign && !from_zone_map(element, zone->elem_size)) {
        if (page_meta->free_count == 1) {
            /* first foreign element freed on page, move from all_used */
            remqueue((queue_entry_t)page_meta);
            enqueue_tail(&zone->pages.any_free_foreign, (queue_entry_t)page_meta);
        } else {
            /* no other list transitions */
        }
    } else if (page_meta->free_count == page_meta->alloc_count) {
        /* whether the page was on the intermediate or all_used, queue, move it to free */
        remqueue((queue_entry_t)page_meta);
        enqueue_tail(&zone->pages.all_free, (queue_entry_t)page_meta);
    } else if (page_meta->free_count == 1) {
        /* first free element on page, move from all_used */
        remqueue((queue_entry_t)page_meta);
        enqueue_tail(&zone->pages.intermediate, (queue_entry_t)page_meta);
    }
}

Attacking Zone Page Metadata

As page metadata falls into the same area as zone elements, it may potentially be targeted by a zone corruption vulnerability. Recall that the page metadata structure not only holds usage information on a particular page, but also includes a pointer to the parent zone structure, a free elements list pointer, as well as a doubly linked pages list entry. Both the usage count information and the page list entry are frequently used in moving a page between zone page lists. In particular, targeting this information may allow the attacker to control the pointers leveraged by linked list operations and therefore trigger an arbitrary write.



As mentioned previously, operations on page lists occur whenever the free count for a page reaches a certain value. The free count values and their associated lists are shown in the table below.





Free Count   All Free   Intermediate   All Used   Any Free Foreign
0                                      x
1+                      x                         x
All          x                                    x
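The relationship between the counts and the lists can also be expressed as a single function. This is only a sketch derived from the list transitions in the try_alloc_from_zone() and free_to_zone() listings shown earlier, not a function that exists in the kernel; the enum, the function name, and the foreign flag (standing in for allows_foreign && !from_zone_map()) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum sketch_page_list {
    SKETCH_ALL_FREE,
    SKETCH_INTERMEDIATE,
    SKETCH_ALL_USED,
    SKETCH_ANY_FREE_FOREIGN
};

/*
 * Returns the list a page is expected to be on for the given counts.
 * 'foreign' models a page outside the zone map in a zone that allows
 * foreign elements.
 */
static enum sketch_page_list
sketch_list_for_page(uint16_t alloc_count, uint16_t free_count, bool foreign)
{
    if (free_count == 0)
        return SKETCH_ALL_USED;          /* nothing left to hand out */
    if (foreign)
        return SKETCH_ANY_FREE_FOREIGN;  /* foreign pages never reach all_free */
    if (free_count == alloc_count)
        return SKETCH_ALL_FREE;          /* every element has been freed */
    return SKETCH_INTERMEDIATE;          /* partially used page */
}
```

The allocator never recomputes this mapping from scratch. It only unlinks and relinks a page when a count crosses one of these boundaries, which is why corrupted counts can be used to force an unlink at a moment of the attacker’s choosing.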





The lists themselves are represented as “queues”, a generic doubly linked list defined by the queue_entry / queue_t structure [osfmk/kern/queue.h].



/*
 * A generic doubly-linked list (queue).
 */
struct queue_entry {
    struct queue_entry *next;   /* next element */
    struct queue_entry *prev;   /* previous element */
};

typedef struct queue_entry *queue_t;
typedef struct queue_entry queue_head_t;
typedef struct queue_entry queue_chain_t;
typedef struct queue_entry *queue_entry_t;

When a zone page is removed from a list, the zone allocator calls remqueue() [osfmk/kern/queue.h], shown below.



#ifdef XNU_KERNEL_PRIVATE
#define __DEQUEUE_ELT_CLEANUP(elt) do { \
        (elt)->next = (queue_entry_t) 0; \
        (elt)->prev = (queue_entry_t) 0; \
    } while (0)
#else
#define __DEQUEUE_ELT_CLEANUP(elt) do { } while(0)
#endif /* !XNU_KERNEL_PRIVATE */

static __inline__ void
remqueue(
    queue_entry_t elt)
{
    elt->next->prev = elt->prev;
    elt->prev->next = elt->next;
    __DEQUEUE_ELT_CLEANUP(elt);
}

Notably, the lack of linked list pointer validation in remqueue() may allow the attacker to target the next and previous pointers of the queue_entry structure in order to create a classic case of “write-4” (or “write-8” on 64-bit). This may enable the attacker to write a pointer-wide value to an arbitrary address in a subsequent unlink operation. Moreover, if the free count in the page metadata structure is also overwritten, the attacker can trigger an unlink immediately on the next freed element, for instance by setting the free count to zero or one value less than the allocation count. Although reaching the count values also requires the (free) elements pointer of the page metadata structure to be overwritten, this value can be set to null.
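The effect of an unchecked unlink is easy to reproduce in a harmless user-space model. The sketch below copies the two pointer updates from remqueue() into a hypothetical sketch_remqueue() and applies them to an entry whose links were never part of a consistent list; the structure and function names are stand-ins, and no kernel memory is involved.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_queue_entry {
    struct sketch_queue_entry *next;
    struct sketch_queue_entry *prev;
};

/* Same two stores as remqueue(), followed by the cleanup step. */
static void sketch_remqueue(struct sketch_queue_entry *elt)
{
    elt->next->prev = elt->prev;   /* write #1: lands wherever 'next' points */
    elt->prev->next = elt->next;   /* write #2: lands wherever 'prev' points */
    elt->next = NULL;
    elt->prev = NULL;
}
```

If an entry is set up with next pointing at an unrelated object A and prev pointing at an unrelated object B, calling sketch_remqueue() on it stores B into A.prev and A into B.next, even though neither A nor B ever referenced the entry. Both stores must hit writable memory, which is why the panic shown in the following section occurs on the first store when r0 holds an unmapped address.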



Targeting the linked list pointers of zone page metadata

The output below shows the panic dump for a fault caused by overwriting the linked list pointers in the metadata structure of a zone page, after which an element was freed. In the example, the kernel attempts to unlink a page from a doubly linked page list while both the next (r0) and previous (r1) pointers are controlled by the attacker.



Incident Identifier: C7E5ECB7-DC8D-4985-9353-D11A604B9033
CrashReporter Key:   a4f686722fa9266b993d445e3ab46603c25b7902
Hardware Model:      iPhone5,4
Date/Time:           2013-12-16 21:28:43.980 +0100
OS Version:          iOS 7.0.4 (11B554a)

panic(cpu 1 caller 0x82621abd): kernel abort type 4: fault_type=0x3, fault_addr=0x41414145
r0:   0x41414141  r1:  0x42424242  r2:  0xdeadbeef  r3:  0x3f7aa000
r4:   0xc2ae3600  r5:  0x9fd55a00  r6:  0x9fd55fec  r7:  0xde3dbe34
r8:   0x9fd55a24  r9:  0x82600000  r10: 0x43434343  r11: 0x8297c3a0
r12:  0xc4246110  sp:  0xde3dbe1c  lr:  0x826d38c1  pc:  0x826d3578
cpsr: 0x60000033  fsr: 0x00000805  far: 0x41414145

Moreover, the following disassembly shows that the next and previous pointers were retrieved from the zone page metadata (r6) when attempting to move the page into the all_free list. Notably, see that the faulting instruction at 0x826d3578 attempts to write the value held by r1 to the address pointed to by r0 + 4.



__text:842D356C loc_842D356C                        ; CODE XREF: free_to_zone+12Cj
__text:842D356C     LDRH    R1, [R6,#0x10]          ; alloc count (offset 0x10 in the metadata)
__text:842D356E     UXTH    R0, R0
__text:842D3570     CMP     R0, R1
__text:842D3572     BNE     loc_842D35BC            ; free_count != alloc_count
__text:842D3574     LDRD.W  R0, R1, [R6]            ; unlink from current list
__text:842D3578     STR     R1, [R0,#4]             ; replace 'prev' in next entry
__text:842D357A     LDRD.W  R0, R1, [R6]
__text:842D357E     STR     R0, [R1]                ; replace 'next' in prev entry
__text:842D3580     MOVS    R1, #0
__text:842D3582     ADD.W   R0, R4, #0xC
__text:842D3586     STRD.W  R0, R1, [R6]
__text:842D358A     LDR     R0, [R4,#0x10]
__text:842D358C     STR     R0, [R6,#4]
__text:842D358E     STR     R6, [R0]                ; enqueue to tail of all_free list
__text:842D3590     STR     R6, [R4,#0x10]
__text:842D3592     B       loc_842D35DE

Note that before remqueue() returns, __DEQUEUE_ELT_CLEANUP() is called to clear the next and prev pointer values. These values are later initialized when the allocator calls enqueue_tail() [osfmk/kern/queue.h] to link the zone page into the new list.



Abusing linked list operations in the way just presented is trivial due to the absence of safe unlinking. As none of the queue functions verify linked list pointers, this problem also persists across all kernel components that implement linked list functionality using these functions. Prior to removing an element from a doubly linked list, safe unlinking verifies that both its next and previous links point to elements which also point back to the element being unlinked. This effectively mitigates the attacker’s ability to corrupt arbitrary memory, as the memory at the chosen location must hold a valid list pointer.
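A safe unlinking check of the kind described above can be sketched as follows. This is a generic illustration of the technique in user-space C and not code taken from XNU; the structure and the function sketch_safe_remqueue() are hypothetical, and a kernel implementation would presumably panic where this sketch returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_queue_entry {
    struct sketch_queue_entry *next;
    struct sketch_queue_entry *prev;
};

/*
 * Unlink 'elt' only if both neighbours point back at it.
 * Returns false (and writes nothing) when the links are inconsistent.
 */
static bool sketch_safe_remqueue(struct sketch_queue_entry *elt)
{
    if (elt->next == NULL || elt->prev == NULL)
        return false;
    if (elt->next->prev != elt || elt->prev->next != elt)
        return false;              /* corruption detected */

    elt->next->prev = elt->prev;
    elt->prev->next = elt->next;
    elt->next = NULL;
    elt->prev = NULL;
    return true;
}
```

With this check, an entry whose next and prev were redirected at unrelated memory is rejected before either store is performed, because the attacker would need the target locations to already contain a pointer back to the entry.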
This allows components to request memory for a specific use, and ensures low-memory footprint as well as performance benefits in leveraging CPU caching facilities as much as possible. In the following sections, we briefly cover the fundamental data structures and algorithms of the zone allocator. In particular, we show how the newly introduced zone page metadata structure now plays an important role in their operation.When requesting memory from the zone allocator, kernel components typically call[osfmk/kern/zalloc.c] or one of its wrapper functions (e.g.or). In the former case,takes the pointer to the[osfmk/kern/zalloc.h] structure describing the zone from which the new allocation is made. This structure is initialized when the kernel creates a new zone in[osfmk/kern/zalloc.c] and is defined as follows.Specifically,sets the initial properties of a zone such as the size of the elements it should manage, the maximum amount of memory it should use, its allocation size (that is, the number of bytes to request when the zone is full), as well as its name. Additionally, it uses heuristics to determine the best allocation size for the given zone element size (minimizing fragmentation), and adjusts it accordingly.Once a zone has been initialized by, kernel components can perform additional zone customization by leveraging the[osfmk/kern/zalloc.c] API. For instance,can enable a particular zone to accept foreign elements (i.e. zone elements with an address outside the zone region), restrict a zone from being garbage collected, or decide whether a zone should be exhaustible or expandable.iOS 7 and OS X Mavericks make notable changes to the zone allocator by introducing zone page metadata. Essentially, zone page metadata is used to reduce the overhead in managing zone pages and their associated blocks of memory. 
This page specific data structure is only used for page sized zone allocation sizes (4K) and only if the penalty for introducing it to the zone page is considered acceptable. The source listing frombelow shows these requirements.Zones that use page metadata essentially depend on the zone’s element size and whether its allocation size is page sized. The following table summarizes the use of page metadata in thezones in iOS 7 on 32-bit ARM. Note that on this specific platform, the size of the zone page metadata structure is 20 bytes.In OS X Mavericks and iOS 7 on 64-bit ARM, the requirements are slightly different because of the largerstructure (40 bytes). The use of page metadata inzones on this platform can be summarized with the following table.If no page metadata is used (e.g. the zone size is very large), new allocations are made from thelist managed by thestructure. This is a singly linked list that holds all free elements currently managed by the zone, in no particular order. Because zones can grow very large, it may introduce a performance hit in particularly fragmented zones, where chunks subsequently allocated from the free elements lists may belong to entirely different pages (possibly resulting in costly page faults). Although the zone garbage collector is designed to trim the free elements list, this trimming is only applicable to certain zones. Additionally, the garbage collection process itself is very expensive.When page metadata is used, the zone allocator completely ignores thelist of the zone structure, and resorts to using the newly introducedlists, visible in thedata structure shown previously.Specifically, four unique doubly linked lists are defined to hold all pages currently managed by a particular zone. Thelist holds pages that do not belong to the zone memory map, but may still be allowed if the zone allows foreign elements (is set). 
If this is not the case, individual elements are instead checked against zone_map_min_address and zone_map_max_address in order to ensure that non-zone memory pages never end up in the zone lists. As its name indicates, the all_free list holds pages for which all elements have been freed. Pages where at least one element is used (but not all) are placed on the intermediate list, while pages for which no elements are free are placed on the all_used list.

In order to link zone pages to these lists, additional information on both their use and the zone and list they belong to needs to be stored. This is the purpose of the zone_page_metadata structure, stored at the end of each page used by a zone that leverages page metadata.

The doubly linked pages list in the above structure links directly into one of the lists defined by the zone structure, discussed previously. It is followed by the elements list, which keeps track of all free elements in the given page. The zone pointer indicates the zone to which the page belongs, and allows the zone allocator to validate whether an element is freed to its rightful owner. Finally, the alloc_count and free_count fields indicate the total number of elements held by a page and the number of free elements respectively.

The information held by the zone_page_metadata structure allows pages to be dynamically moved between lists depending on their use. Both try_alloc_from_zone() and free_to_zone() inspect the alloc and free counts and move the pages to the appropriate list when necessary. This ensures that elements are always allocated from pages that are already in heavier use. We discuss both the zone allocation and free algorithms, and how zone page metadata applies to each of them, in the following sections.

Upon allocating memory, zalloc() first calls try_alloc_from_zone() to attempt to retrieve an element from the specified zone. If the zone was initialized with page metadata, the function iterates over the individual page lists and attempts to retrieve a page from the list head. The order in which these lists are checked is important in understanding how the zone allocator retrieves new elements.
In particular, the allocator first checks if foreign elements are allowed and inspects the any_free_foreign list if that is the case. If foreign elements are not allowed, or the any_free_foreign list is empty, the allocator proceeds to the intermediate list. Finally, if no pages in this list can be found, the allocator checks the all_free list.

If a page has been found in any of these lists, the allocator validates it by calling is_sane_zone_page_metadata(). This function essentially validates the pointer to the zone_page_metadata structure (pointed to by the page lists) and calls is_sane_zone_ptr() to ensure that the pointer is aligned and within kernel and possibly zone bounds. The source listing for this function is shown below.

Note that the above function simply validates the page metadata pointer and not any of the values held by the zone_page_metadata structure. The lack of metadata validation may potentially allow the attacker to target this data structure when attempting to exploit a zone corruption vulnerability. We explore possibilities in this area in the last section of this blog post.

Once the page metadata pointer has been validated, the allocator retrieves an element from the head of the metadata’s elements list (or the first element from the zone’s free_elements list if page metadata is not enabled). The next pointer held by the retrieved element is then validated in a call to is_sane_zone_element(), which operates much like is_sane_zone_ptr(), but also accepts null pointers (indicating the end of a list).

Since iOS 6 and OS X Mountain Lion, the next pointer held by an element on a free list is protected by an additional measure that involves storing an encoded version of the pointer at the end of the specific element. This mitigation was introduced to address a popular exploitation technique where the attacker could target the free list pointer in order to coerce the allocator to return an arbitrary address on subsequent allocations. Essentially, the encoded value is computed by XOR encoding the next pointer with a pre-computed cookie, unknown to the attacker.
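The encoding just described can be sketched in a few lines of C. This is only an illustration under stated assumptions: the cookie value and the helper names (demo_cookie, demo_set_next, demo_get_next) are made up for the example and are not taken from the kernel, which generates its cookies at boot and would presumably treat a mismatch as fatal rather than return an error.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical cookie; a real allocator would use a secret random value. */
static uintptr_t demo_cookie = (uintptr_t)0x5a5aa5a5u;

/* Store the next pointer at the start of a free element and an
 * XOR-encoded backup copy in the last pointer-sized slot of the element. */
static void demo_set_next(void *elem, size_t elem_size, uintptr_t next)
{
    uintptr_t backup = next ^ demo_cookie;

    assert(elem_size >= 2 * sizeof(uintptr_t));
    memcpy(elem, &next, sizeof(next));
    memcpy((char *)elem + elem_size - sizeof(backup), &backup, sizeof(backup));
}

/* Read the next pointer back. Returns 1 and fills *next_out when the plain
 * and encoded copies agree, or 0 when either copy has been corrupted. */
static int demo_get_next(const void *elem, size_t elem_size, uintptr_t *next_out)
{
    uintptr_t next, backup;

    memcpy(&next, elem, sizeof(next));
    memcpy(&backup, (const char *)elem + elem_size - sizeof(backup), sizeof(backup));
    if ((next ^ demo_cookie) != backup)
        return 0;
    *next_out = next;
    return 1;
}
```

An attacker who overwrites only the plain next pointer, without knowing the cookie, cannot produce a matching encoded copy, so the check in demo_get_next() fails.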
For more information regarding this mitigation, we refer the reader to our iOS 6 Kernel Security presentation from last year.

Before the element is returned to the caller, both its next pointer and encoded pointer are replaced by the sentinel value. This aims to prevent the attacker from potentially learning these values, e.g. by leveraging functions which do not properly initialize the contents of the buffer before it is returned back to the user. If page metadata is used, the allocator also inspects the current free count of the associated page and moves it to the appropriate list. If the free count is lowered to zero, that is, the last free element is allocated, the page is unlinked from its current list and linked into the all_used list. If the page was previously on the all_free list and the free count is lowered (alloc_count == free_count + 1), then the page is linked into the intermediate list.

If no memory could be returned by try_alloc_from_zone(), the zone allocator attempts to call kernel_memory_allocate() [osfmk/vm/vm_kern.c] to request additional memory from the zone memory map. The number of bytes requested is determined by the zone’s allocation size, originally set upon first initializing the zone in zinit(). The retrieved memory is then passed to zcram() [osfmk/kern/zalloc.c], a function responsible for dividing the memory into equally sized blocks (of the zone’s element size), and returning them to the zone’s free list. Specifically, if page metadata is used, zcram() carves out a page sized region of the provided memory, initializes a page metadata structure, and places the page at the tail of the all_used list. It then calls free_to_zone() to free each element (from low address to high) to the page metadata’s free list (elements), eventually causing the page to be placed at the tail of the all_free list. If the zone doesn’t use page metadata, the allocated region is divided into element-sized blocks and put onto the zone’s free list (free_elements) in a similar fashion.

When a zone element is freed, zfree() calls free_to_zone() to perform the actual freeing operation.
Initially, free_to_zone() locates the list head of the free elements list, either from the page metadata structure (elements) or the zone structure (free_elements) depending on the use of page metadata. Subsequently, the function validates the element’s pointer alignment and zone locality. It then checks if the element size is below or equal to zp_tiny_zone_limit (cache line size of the current processor), in which case it applies block poisoning and fills the element buffer content with a sentinel value. If the size is above the tiny zone limit, the free function checks if the zone allocator is configured to use the zone sample factor (zp_factor), in which case it increments the zone poison count for the target zone (zp_count) and compares it to the sample factor value. If the zone poison count is above or equal to the zone sample factor, block poisoning is applied to the element.

As the element to be freed is placed at the head of the free list, its next pointer (located at the top of the element buffer) is updated to point to the current free list head. In order to protect this pointer against possible zone attacks, the allocator creates an encoded copy using the generated zone cookies and places it at the end of the element’s buffer.

Finally, if the zone doesn’t use page metadata, the element is placed at the head of the zone’s free_elements list. If the zone does use page metadata, on the other hand, the element is placed on the free list (elements) held by the page metadata structure. However, before this takes place, free_to_zone() checks whether the zone page needs to be moved to a different zone page list given the updated free count. If the free count was previously 0, the page is moved to the tail of the intermediate list, or the any_free_foreign list if foreign elements are allowed and the element is outside the zone memory map. If, on the other hand, the last element of a page was freed, the page is moved to the tail of the all_free list.

As page metadata falls into the same area as zone elements, it may potentially be targeted by a zone corruption vulnerability.
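To make this adjacency concrete, the following sketch works out where the metadata sits relative to the elements of a 4K page. It is a simplified model, not kernel code: it assumes elements are packed from the start of the page and that the metadata occupies the last bytes of the page, and all names (DEMO_PAGE_SIZE, demo_meta_offset, demo_elems_per_page, demo_gap_before_meta) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096u

/* Offset of the metadata inside the page, assuming it is stored in the
 * last meta_size bytes. */
static size_t demo_meta_offset(size_t meta_size)
{
    assert(meta_size < DEMO_PAGE_SIZE);
    return DEMO_PAGE_SIZE - meta_size;
}

/* Number of whole elements that fit in front of the metadata. */
static size_t demo_elems_per_page(size_t elem_size, size_t meta_size)
{
    assert(elem_size > 0);
    return demo_meta_offset(meta_size) / elem_size;
}

/* Unused bytes between the end of the last element and the metadata. */
static size_t demo_gap_before_meta(size_t elem_size, size_t meta_size)
{
    return demo_meta_offset(meta_size)
        - demo_elems_per_page(elem_size, meta_size) * elem_size;
}
```

Under these assumptions, with the 20-byte metadata structure of 32-bit ARM and 64-byte elements, 63 elements fit in the page and 44 bytes separate the end of the last element from the start of the metadata.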
Recall that the page metadata structure not only holds usage information on a particular page, but also includes a pointer to the parent zone structure, a free elements list pointer, as well as a doubly linked pages list entry. Both the usage count information and the page list entry are frequently used in moving a page between zone page lists. In particular, targeting this information may allow the attacker to control the pointers leveraged by linked list operations and therefore trigger an arbitrary write.

As mentioned previously, operations on page lists occur whenever the free count for a page reaches a certain value. The free count values and their associated lists are shown in the table below.

The lists themselves are represented as “queues”, a generic doubly linked list defined by the queue_entry structure [osfmk/kern/queue.h].

When a zone page is removed from a list, the zone allocator calls remqueue() [osfmk/kern/queue.h], shown below.

Notably, the lack of linked list pointer validation in remqueue() may allow the attacker to target the next and previous pointers of the page list entry in order to create a classic case of “write-4” (or “write-8” on 64-bit). This may enable the attacker to write a pointer-wide value to an arbitrary address in a subsequent unlink operation. Moreover, if the free count in the page metadata structure is also overwritten, the attacker can trigger an unlink immediately on the next freed element, for instance by setting the free count to zero or to one less than the allocation count. Although reaching the count values also requires the (free) elements pointer of the page metadata structure to be overwritten, this value can be set to null.

The output below shows the panic dump for a fault caused by overwriting the linked list pointers in the metadata structure of a zone page, after which an element was freed.
In the example, the kernel attempts to unlink a page from a doubly linked page list while both the next and previous pointers are controlled by the attacker.

Moreover, the following disassembly shows that the next and previous pointers were retrieved from the zone page metadata when attempting to move the page into a different list. Notably, see that the faulting instruction at 0x826d3578 attempts to write one of these attacker-controlled values to the address formed by adding 4 to the other.

Note that before remqueue() returns, the next and previous pointer values of the unlinked entry are cleared. These values are later initialized when the allocator calls enqueue_tail() [osfmk/kern/queue.h] to link the zone page into the new list.

Abusing linked list operations in the way just presented is trivial due to the absence of safe unlinking. As none of the queue functions verify linked list pointers, this problem also persists across all kernel components that implement linked list functionality using these functions. Prior to removing an element from a doubly linked list, safe unlinking verifies that both its next and previous links point to elements which also point back to the element being unlinked. This effectively mitigates the attacker’s ability to corrupt arbitrary memory, as the memory at the chosen location must hold a valid list pointer.
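The difference between an unchecked unlink and the safe unlinking check described above can be sketched as follows. This is an illustrative user-space model with hypothetical names (demo_entry, demo_unlink, demo_safe_unlink), not the kernel's queue implementation; a kernel would presumably panic where this sketch simply returns 0.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a doubly linked queue entry. */
struct demo_entry {
    struct demo_entry *next;
    struct demo_entry *prev;
};

/* An empty circular list: the head points to itself. */
static void demo_init(struct demo_entry *head)
{
    head->next = head;
    head->prev = head;
}

static void demo_enqueue_tail(struct demo_entry *head, struct demo_entry *elt)
{
    elt->next = head;
    elt->prev = head->prev;
    head->prev->next = elt;
    head->prev = elt;
}

/* Unchecked unlink: if next and prev are attacker controlled, the two
 * stores below write a pointer-wide value to a chosen address. */
static void demo_unlink(struct demo_entry *elt)
{
    elt->next->prev = elt->prev;
    elt->prev->next = elt->next;
    elt->next = NULL;
    elt->prev = NULL;
}

/* Safe unlink: only proceed if both neighbours point back at elt.
 * Returns 1 on success, 0 if the links are inconsistent. */
static int demo_safe_unlink(struct demo_entry *elt)
{
    if (elt->next == NULL || elt->prev == NULL)
        return 0;
    if (elt->next->prev != elt || elt->prev->next != elt)
        return 0;
    demo_unlink(elt);
    return 1;
}
```

With the check in place, redirecting an entry's prev pointer to memory that does not already contain a pointer back to that entry causes the unlink to be refused, so no write through the corrupted pointer takes place.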