Broadcom is one of the major vendors of wireless devices worldwide. Since these chips are so widespread they constitute a high value target to attackers and any vulnerability found in them should be considered to pose high risk. In this blog post I provide an account of my internship at Quarkslab which included obtaining, reversing and fuzzing the firmware, and finding a few new vulnerabilities.

Introduction

Broadcom is one of the major vendors of wireless devices worldwide. They sell wireless chips labelled under the 43 series. You can find these chips almost everywhere, from smartphones to laptops, smart TVs and IoT devices. You probably use one without knowing it: for example, if you have a Dell laptop, you may be using a bcm43224 or a bcm4352 card. You are also likely to use a Broadcom WiFi chip if you have an iPhone, a MacBook, a Samsung phone or a Huawei phone. Since these chips are so widespread, they constitute a high-value target for attackers, and any vulnerability found in them should be considered to pose high risk.

In 2018 I did a six-month internship at Quarkslab with the purpose of reproducing and porting publicly known vulnerabilities to other vulnerable devices, learning and improving several common infosec practices, and contributing to Quarkslab's knowledge of these devices. In this blog post I provide an account of my journey, which included obtaining, reversing and fuzzing the firmware, and finding a few new vulnerabilities. But first, let's briefly talk about the 802.11 standard and its implementation on Linux to support the family of chips I studied.

I. A few words about WLAN and Linux

Before diving in, let us have a look at the 802.11 wireless standard. The first IEEE 802.11 standard, created in 1997, standardized the PHY and MAC layers, the two lowest OSI layers. For the PHY layer, two frequency bands were chosen: the infrared (IR) band and the microwave band (2.4GHz). Later standards, such as 802.11a, added another frequency range (5GHz).

The MAC layer uses three types of frames: management, data and control. The Frame Control field of the 802.11 frame header identifies the type of any given frame. Management frames are handled by an entity called the MLME (MAC subLayer Management Entity). Depending on where the MLME runs, we get two major types of wireless chip implementations: SoftMAC, where the MLME runs in the kernel driver, and HardMAC (also called FullMAC), where the MLME is in the firmware embedded in the chip. But life is not so simple, and some hybrid implementations also exist where, for example, probe requests and responses are managed by the driver, but association requests and authentication are dealt with by the chip's firmware.

FullMAC devices offer better performance in terms of power consumption and speed, which is why they are heavily used in smartphones and tend to be the most common kind of chip on the market. Their main disadvantage is that they limit the user's ability to send specific frames or to put the chip in monitor mode. Doing so requires directly modifying the firmware running on the chip.

From the Linux perspective, this gives us two major layouts of components in the wireless stack. When the wireless device is a SoftMAC device, the kernel uses a specific Linux Kernel Module (LKM) called 'mac80211', which exposes the MLME API used to handle Management frames. Otherwise, the kernel talks directly to a hardware driver and MLME processing is offloaded to the chip's firmware.

II. Introducing Broadcom's bcm43xxx chips

The Broadcom bcm43xxx series includes both HardMAC and SoftMAC cards. Unfortunately, we could not find datasheets for all the chips we analyzed. The few datasheets available were released by Cypress after their acquisition of the "IoT business" branch of Broadcom. It's worth mentioning that some chips, like the bcm4339 or the bcm4330, integrate both WLAN and Bluetooth capabilities.

All the chips analysed use an ARM Cortex-M3 or an ARM Cortex-R4 as the main MCU for non-time-critical operations, so we deal with two similar instruction sets: ARMv7-M and ARMv7-R. These MCUs have one ROM and one RAM, whose sizes vary depending on the chipset version. All time-critical operations are carried out by a Broadcom proprietary processor called the D11 core, which is mostly responsible for the PHY layer.

The firmware used by these chips is split in two parts: one part is written into the ROM and cannot be modified, while the other part is uploaded by the driver into the chip's RAM. This lets the vendor add new features or ship updates for their chips just by changing the RAM portion of the firmware.

FullMAC chips are very interesting. First, as stated before, the MLME layer is implemented in the firmware code. They also offer offloading features such as ARP cache, mDNS, EAPOL, etc., and include hardware cryptographic modules to encrypt and decrypt traffic, manage keys, and so on. All of these offloading features increase the attack surface, giving us a nice playground.

To communicate with the host (Application Processor), several bus interfaces are used across the bcm43xxx family: USB, SDIO and PCIe.

On the driver side, we can split the set of bcm43xxx drivers into two categories: open source and proprietary.

Open source:

b43 (reversed from proprietary wl / old SoftMAC / Linux)

brcmsmac (SoftMAC / Linux)

brcmfmac (FullMAC / Linux)

bcmdhd (FullMAC / Android)

Proprietary:

broadcom-sta aka 'wl' (SoftMAC and FullMAC / Linux)

The 'wl' driver is the one most used on embedded systems such as routers. It is also commonly used on laptops whose chip is not supported by the brcmfmac/brcmsmac drivers, like the bcm4352 chip on the Dell XPS. In addition, the wl driver uses its own MLME and doesn't need the 'mac80211' LKM to process Management frames, which expands the attack surface exposed to an attacker.

The version distributed by Broadcom is generally called a 'hybrid' driver because the main part of the code comes as two precompiled ELF objects that are linked in at build time. Why two? Because one is for the x86_64 architecture and the other for i386. These objects hold the main code of the driver and therefore expose a lot of Broadcom API functions.

It is important to mention that the chip's firmware and the wl driver share a lot of code, so vulnerabilities found in one may also be present in the other.

III. Getting the firmware

1) Getting the first part: RAM firmware part

As explained, the firmware is split in two parts. The easiest part to grab is the RAM part, which is loaded into the RAM by the driver. This part contains the code and data used by the main MCU, but also the microcode used by the D11 core.

This part of the firmware is not signed, and its integrity is 'verified' using a CRC32 checksum. This has led to several firmware modifications that add functionality such as monitor mode. For example, SEEMOO Lab has released the NEXMON project, which is an amazing framework for modifying these firmwares by writing patches for them in C.

During our study we encountered two possible formats for the RAM firmware image. The first and most common was a simple binary blob with no particular structure. The second, which we encountered when working on the bcm43236 chip, was the TRX format, which is easy to parse.

When working on the .bin RAM firmware, we generally find a string at the end of the file exposing:

The chip's version

The bus used by the chip for the host to dongle communication

The features offered by the firmware; p2p, TDLS, etc.

The firmware's version

The CRC checksum

The date on which it was created.

When the driver used is brcmfmac or bcmdhd, we can get the RAM firmware directly from the host filesystem. On Linux we can find it in /lib/firmware/brcm, and on Android in /system/vendor/firmware. In other cases the location will vary depending on the system we use.

If the driver used is the proprietary wl, we may find the firmware's RAM part in the .data section of the LKM. It can be easily extracted with LIEF.

    >>> import lief
    >>> wl = lief.parse("wl.ko")
    >>> data = wl.get_section(".data")
    >>> for symbol in wl.symbols:
    ...     if "dlarray_" in symbol.name:
    ...         print(symbol.name)
    ...
    dlarray_4352pci
    dlarray_4350pci
    >>> b4352 = wl.get_symbol("dlarray_4352pci")
    >>> bcm4352_fw = data.content[b4352.value:b4352.value + b4352.size]
    >>> with open("/tmp/bcm4352_ramfw.bin", 'wb') as f:
    ...     f.write(bytes(bcm4352_fw))
    ...
    442233

    $ strings /tmp/bcm4352_ramfw.bin | tail -n 1
    4352pci-bmac/debug-ag-nodis-aoe-ndoe Version: 6.30.223.0 CRC: ff98ca92 Date: Sun 2013-12-15 19:30:36 PST FWID 01-9413fb21

It is interesting to notice that the firmware released for the bcm4352, used in the latest wl driver on Linux, dates from 2013...

2) Recovering the second part: introduction to the ROM part

The ROM part of the firmware is the most important one to grab in order to understand the internals of these chips.

To grab the ROM part, we need to know where it is mapped. The best way to find the base address is to read the driver's headers, for example the bcmdhd header file /include/hndsoc.h. An alternative is to read the Nexmon project README, which gives other base addresses depending on the MCU model.

The astute reader may notice that these addresses differ. The Nexmon project specifies that the ROM for chips with a Cortex-M3 is loaded at 0x800000, while the bcmdhd header says 0x1e000000. Both are correct. It seems that the ROM and the RAM are mapped twice.
Furthermore, knowing the base address gives us a clue about the MCU used. For example, if we dump the ROM at 0x000f0000, we know that the chip is using an ARM Cortex-R4.

3) Getting the ROM part on an Android system

On Android, we can use the dhdutil tool, which is an improved open source Android fork of the old wlctl utility. By using the 'membytes' function of this tool we can dump the RAM of the chipset, and in some cases the ROM as well.

    adb shell /data/local/tmp/dhdutil -i wlan0 membytes -r 0x0 0xa0000 > rom.bin

For example, on the bcm4339 chip used in the Nexus 5, which relies on a Cortex-R4, the ROM is dumped directly. Unfortunately, on the older bcm4330 (Cortex-M3) this doesn't work. But as long as you can interact with the RAM, it is possible to hook a function with a little stub that copies the ROM, slice by slice, into an empty area of the RAM. After that we can dump all the ROM slices.

4) Recovering the ROM part on a Linux system

On Linux with the brcmfmac driver, we cannot directly access the ROM. Therefore, we need to find a way to interact with the chip's memory, either the ROM or the RAM.

Luckily, when the chip uses an SDIO bus to communicate with the host, the open source brcmfmac driver exposes the function brcmf_sdiod_ramrw. This function allows us to read and write the chipset's RAM from the host. If we modify the driver to add an ioctl wrapper around this function, we can read and write the chipset's RAM from a tiny userspace utility.

Prior to calling brcmf_sdiod_ramrw, we must call sdio_claim_host to claim the SDIO bus. Note that if the device is not connected to any Access Point, it may be in a low-power mode with the bus idle, so we need to ensure that the device's bus is up by calling brcmf_sdio_bus_sleep and brcmf_sdio_clkctl.

    int brcmf_ioctl_entry(struct net_device *ndev, struct ifreq *ifr, int cmd)
    {
        ...
        sdiobk->alp_only = true;
        /* Claim the SDIO host, then wake the bus and make sure its clock is available. */
        sdio_claim_host(sdiobk->sdiodev->func[1]);
        brcmf_sdio_bus_sleep(sdiobk, false, false);
        brcmf_sdio_clkctl(sdiobk, CLK_AVAIL, false);
        /* Read or write (depending on margs->op) margs->len bytes at margs->addr. */
        res = brcmf_sdiod_ramrw(sdiobk->sdiodev, margs->op, margs->addr, buff, margs->len);
        if (res) {
            printk(KERN_DEFAULT "[!] Dumpmem failed for addr %08x.\n", margs->addr);
            sdio_release_host(sdiobk->sdiodev->func[1]);
            kfree(buff);
            return (-1);
        }
        if (copy_to_user(margs->buffer, buff, margs->len) != 0)
            printk(KERN_DEFAULT "[!] Can't copy buffer to userland.\n");
        ...
    }

We need to write a small program to interact with our ioctl from userland. With it we should be able to read and write the device RAM:

    ...
    memset(&margs, 0, sizeof(t_broadmem));
    errno = 0;  /* strtol only sets errno on failure, so clear it first */
    margs.addr = strtol(ar[1], NULL, 16);
    margs.op = 1;
    if (errno == ERANGE)
        prt_badarg(ar[1]);
    len = strtol(ar[2], NULL, 10);
    if (errno == ERANGE)
        prt_badarg(ar[2]);
    margs.buffer = hex2byte((unsigned char *)ar[3], len);
    if ((s = socket(AF_INET, SOCK_DGRAM, 0)) < 0)
        return (-1);
    strncpy(ifr.ifr_name, ar[0], IFNAMSIZ);
    margs.len = len;
    ifr.ifr_data = (char *)&margs;
    if (!(ret = ioctl(s, SIOCDEVPRIVATE, &ifr)))
        printf("[+] Write successful!\n");
    else
        printf("[!] Failed to write.\n");
    close(s);
    free(buf);
    return (ret);
    ...

Now that we can read and write the RAM of the chip, we can dump the ROM by:

Hooking a function located in the RAM and called by an action X

Copying the ROM, slice by slice, into an empty area in the RAM (in our hook's stub)

Dumping all the freshly copied ROM slices and concatenating them.

This procedure is the same as the one we used on Android when the chip's MCU is a Cortex-M3. However, this time we had to modify the driver and build our own tools to use our new ioctl. We chose this method when working on the Raspberry Pi 3's chip (bcm43430).

5) Getting the ROM part in specific cases

There are still a lot of other possible scenarios:

What if your chip is using the brcmfmac driver with a PCIe bus?

What if your chip is on an embedded system using the proprietary driver 'wl'?

What if you don't have a shell on the host OS? Or if you lack permissions?

And so on. In all of these other cases, you are left with several possibilities: if you have access to the hardware, you can look for UART access, or you may hook the wl driver.

We chose UART access when working on the 'SFR minidecoder TV' (bcm43236).

    RTE (usbrdl) v5.90 (TOB) running on BCM43235 r3 @ 20/96/96 MHz.
    rdl0: Broadcom USB Remote Download Adapter
    ei 1, ebi 2, ebo 1
    RTE (USB-CDC) 6.37.14.105 (r) on BCM43235 r3 @ 20.0/96.0/96.0MHz
    000000.007 ei 1, ebi 2, ebo 1
    000000.054 wl0: Broadcom BCM43235 802.11 Wireless Controller 6.37.14.105 (r)
    000000.060 no disconnect
    000000.064 reclaim section 1: Returned 91828 bytes to the heap
    000001.048 bcm_rpc_buf_recv_mgn_low: Host Version: 0x6250e69
    000001.054 Connected Session:69!
    000001.057 revinfo
    000063.051 rpc uptime 1 minutes
    > ?
    000072.558 reboot
    000072.559 rmwk
    000072.561 dpcdump
    000072.563 wlhist
    000072.564 rpcdump
    000072.566 md
    000072.567 mw
    000072.569 mu
    000072.570 ?
    >

The baud rate was 115200 b/s. The md command dumps memory at a specific address: you specify the address and how many DWORDs you want to dump. With a tiny PySerial script we were able to dump the ROM and take a live RAM snapshot.

    #!/usr/bin/env python3
    import serial
    import binascii

    nb = 65535      # number of DWORDs to dump
    baseaddr = 0    # address of the first DWORD

    uart = serial.Serial('/dev/ttyUSB0', 115200)
    # Ask the console for nb items of 4 bytes starting at baseaddr.
    uart.write(b'md 0x%08x 4 %d\n' % (baseaddr, nb))

    i = 0
    dump = b""
    while i != nb:
        read = uart.readline().split(b' ')
        # Skip the prompt and unrelated console messages.
        if b">" in read[0]:
            continue
        if b"rpc" in read[2]:
            continue
        print("Dump %s %s \r" % (read[1][:-1], read[2]), end="")
        # Each DWORD is printed as hex; reverse it to restore the in-memory byte order.
        dump += binascii.unhexlify(read[2][:-2])[::-1]
        i += 1
    uart.close()

    with open("/tmp/bcm43236_rom.bin", 'wb') as f:
        f.write(dump)
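Whichever way a RAM firmware image is obtained, the version string found at the end of a .bin image (described at the start of this section) is a handy sanity check. The following is a minimal sketch of how that string could be split into its fields, assuming the layout of the bcm4352 example shown earlier. The regular expression, the `parse_trailer` name and the field names are illustrative assumptions, and other firmware builds may format the string differently.

```python
import re

# Assumed layout, taken from the single bcm4352 example in this post:
#   <target>/<features> Version: <v> CRC: <hex> Date: <date> [FWID <id>]
TRAILER_RE = re.compile(
    r"(?P<target>\S+)\s+Version:\s*(?P<version>\S+)"
    r"\s+CRC:\s*(?P<crc>[0-9a-fA-F]+)"
    r"\s+Date:\s*(?P<date>.+?)"
    r"(?:\s+FWID\s+(?P<fwid>\S+))?\s*$"
)


def parse_trailer(line):
    """Split a firmware version string into a dict, or return None if it does not match."""
    match = TRAILER_RE.search(line.strip())
    if match is None:
        return None
    info = match.groupdict()
    # The part before '/' names the chip and bus, the part after lists the features.
    target, _, features = info["target"].partition("/")
    info["target"] = target
    info["features"] = features.split("-") if features else []
    return info


example = ("4352pci-bmac/debug-ag-nodis-aoe-ndoe Version: 6.30.223.0 "
           "CRC: ff98ca92 Date: Sun 2013-12-15 19:30:36 PST FWID 01-9413fb21")
for key, value in parse_trailer(example).items():
    print(key, "=", value)
```

In practice the input line would come from the last string of the extracted image, as in the `strings ... | tail -n 1` command used above.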

IV. Reversing the firmware: the journey of a beacon

In the previous part we used the term 'RAM firmware' a lot. It must not be confused with a 'RAM snapshot', which is a dump of the entire RAM at runtime. As stated by Gal Beniamini, after firmware initialisation some code inside the RAM is reclaimed and used for the internal heap of the chipset. To analyze these firmwares properly, one needs both the genuine RAM firmware and a RAM snapshot.

1) Reversing notes

When everything is loaded in IDA, you will notice that nothing is recognised or defined. We need to select everything and force IDA to analyse it. Even if IDA recognizes and correctly defines much of the code and data, there will still be a lot of strings and unrecognized code, or data interpreted as code. This is where IDAPython comes in handy: using a tiny script, we were able to correctly define the code and the data.

When we feel that everything is correctly recognised by IDA, the fun part begins. Usually, if you have correctly set the base address, a lot of Xrefs should pop up and several thousand functions should be detected.

We don't have any symbols and all of the code is in Thumb mode, so the code itself looks very hard to understand. One of the first things to do is to identify the libc-like functions used, such as memcpy, memmove, etc. This can be done manually or using Sibyl, a function divination tool.

The firmware relies on its own internal 'console' to print information. This console is a simple buffer of 2048 bytes lying in RAM. The firmware therefore has its own home-brewed printf, which is easily recognised via the numerous format strings present. There are other string formatting functions, like sprintf/snprintf, which are easily identified once the internal formatting function is found and cross-referenced.

Functions related to heap memory management (malloc and free) can be identified in different ways: we can find malloc via debug strings, or by looking for the classic pattern:

    x = malloc(y);
    memset(x, 0, y);

Once malloc is found, we see that the allocator uses a singly linked list of free chunks. Cross-referencing the pointer to the linked list gives us the free function. The allocator is easy to understand: it is a best-fit allocator with coalescing. The allocator is usually in RAM, so it can be updated and may change from device to device, or from one version to another.

The firmware uses a lot of structures, notably one called wlc_info, which contains everything needed to control the chip. Matthias Schulz (SEEMOO Lab), who is behind the NEXMON project, released his thesis a few months ago. In it he gives lots of information about these structures and links the API's symbol names to the structures they take as arguments.

The firmware initialisation routine can be easily spotted by:

Following the reset address call (generally found at 0x0).

Searching for the function responsible for the CRC check. The CRC32 function can easily be found by searching for one of its table values (e.g. 0x77073096). Cross-referencing this function then leads to the firmware identity check.

Searching for the 'WFI' instruction and cross-referencing backward. After the initialisation, the chip just waits for any interrupt.

2) Packet flow

Now, let us see how frames are managed in a FullMAC device. When a frame is received, an interrupt is triggered, and frame processing starts in the FIQ interrupt handler.

Let's take a look at how frames are processed in the bcm4339 firmware. We start by analyzing the Fast Interrupt (FIQ) handler. We notice that this handler grabs a function pointer located at 0x181100, which points to a function at 0x181e48. This function contains two branches: one used to catch bugs, like bad memory accesses, and the other for the actual frame processing. If a memory violation occurs, the first branch prints a register dump and a stack trace on the internal console (very convenient when developing an exploit :)

If we follow the second branch, we end up in a function at 0x0181A88 which iterates through a linked list located at 0x00180E5C that contains pointers to functions.

If we follow all the nested calls, we end up in the wlc_dpc function. This function retrieves a variable called macintstatus (as it is called in an old version of brcmsmac) from the wlc_hw struct, and performs some checks on it. The one we are interested in relies on the bit mask defined in the macro MI_DMAINT (value 0x8000): if these bits are set, we jump into the function wlc_bmac_recv. This function removes a frame from a linked list (rx_fifo) located in the memory shared between the MCU and the D11 core, and constructs a custom sk_buff structure from it. Then the function wlc_recv is called with two arguments: a pointer to the wlc structure and the freshly initialised skb_buff. This function can be considered the entry point of frame handling. The skb_buff structure may depend on the device and version, but wlc_recv and wlc_bmac_recv can easily help to redefine it.
The wlc_recv function strips the custom header added to the frame by the D11 core and retrieves the MAC header of the frame. A check is done on the type subfield of the FC field in order to dispatch the frame to one of two handlers: one for Management and Control frames (wlc_recv_mgmt_ctl), the other for Data frames (wlc_recvdata). If we want to know how beacon frames are processed, we just have to look inside the wlc_recv_mgmt_ctl function, which extracts the subtype from the FC field of the frame and then dispatches it to the corresponding handler.
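To make the dispatch concrete, here is a small Python model of the decision described above. It is only a sketch: the handler names come from this post, but the `dispatch` function, the constants and the subtype table are illustrative assumptions based on the 802.11 Frame Control layout (2 bits of version, 2 bits of type, 4 bits of subtype), not a transcription of the firmware code.

```python
import struct

# 802.11 frame types as encoded in bits 2-3 of the Frame Control field.
TYPE_MGMT, TYPE_CTRL, TYPE_DATA = 0, 1, 2

# A few management subtypes (bits 4-7 of Frame Control); not exhaustive.
MGMT_SUBTYPES = {
    0: "association request",
    1: "association response",
    4: "probe request",
    5: "probe response",
    8: "beacon",
    10: "disassociation",
    11: "authentication",
    12: "deauthentication",
    13: "action",
}


def dispatch(frame):
    """Return (handler name, description) for a raw 802.11 frame.

    Models the type check attributed to wlc_recv in the text; the frame is
    assumed to start at the MAC header (custom D11 header already stripped).
    """
    (fc,) = struct.unpack_from("<H", frame, 0)  # Frame Control is little-endian
    ftype = (fc >> 2) & 0x3
    subtype = (fc >> 4) & 0xF
    if ftype == TYPE_DATA:
        return "wlc_recvdata", "data frame"
    if ftype == TYPE_MGMT:
        return "wlc_recv_mgmt_ctl", MGMT_SUBTYPES.get(subtype, "other management")
    if ftype == TYPE_CTRL:
        return "wlc_recv_mgmt_ctl", "control frame"
    return "unhandled", "reserved or extension type"


print(dispatch(bytes([0x80, 0x00])))  # beacon: type 0, subtype 8
print(dispatch(bytes([0x08, 0x00])))  # data frame: type 2, subtype 0
```

A model like this is mainly useful when building a fuzzing corpus, to check which handler a captured frame is expected to reach before emulating it.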

V. Emulation and fuzzing

Only one article mentions emulation of these firmwares. It was released by COMSECURIS along with their tools, a modified QEMU version which is scriptable in Lua. Since we did not want to emulate the whole firmware, we decided to follow our own path.

First, we tried to emulate some parts of the code (a simple call to printf in an arbitrary function) with the Unicorn framework. We designed a tiny wrapper class around the Unicorn emulation engine, allowing us to easily define all the emulation parameters and load them from a JSON configuration file.

These parameters include:

the ROM file and its base address

the RAM snapshot file and its base address

start emulation address

stop emulation address

CPU context at the start

We use our RAM snapshot and our previously gathered ROM. The RAM snapshot contains everything needed: code and initialised structures.

We then decided to start fuzzing at the wlc_recv function. For that we need to put the wlc struct pointer in r0, and craft a skb_buff structure with our frame data, then put its pointer in r1.

To get a sample corpus, we sniffed traffic sent to our device in various situations, then worked directly with the pcap file. The fuzzing strategy was naive, as we only used random bitflips, with a static seed for easy reproduction of the results.

In this scenario it is important to mention that the context in which the RAM snapshot was made influences the fuzzing and the code paths taken. For example, if we want to fuzz a frame used during the association with an AP, we need to dump the RAM when the chip is not connected to any AP.

So our procedure was the following: for each frame in our pcap file, randomly flip some bits, write the fuzzed frame's data with a crafted D11 header into our RAM snapshot, then craft a skb_buff for our data and also write it into the snapshot.

    {
        "rom": {"addr": "0x0", "file": "../../bcm4339/bcm4339_ROM.bin"},
        "ram": {"addr": "0x180000", "file": "../../tmp/unassoc_ram.bin"},
        "cpu_context": {"sp": "0x23d194", "r0": "0x001e8d8c", "r1": "0x23e6cf"},
        "start_at": "0x1aafdc",
        "stop_at": "0x1aafe0",
        "console_ptr": "0x1eb5d8",
        "zone0": {"addr": "0x18000000", "file": "old/conf/mem1"}
    }

We must ensure that:

Our frames are correctly parsed and processed.

During fuzzing we do not get stuck in the same code path, again and again.

To ensure we emulate the frame processing correctly, we produced a trace by printing each PC address, then verified that we visited the corresponding frame handler. In this way we can answer questions like: if we are fuzzing a beacon frame, are we correctly reaching the wlc_recv_bcn function? And how is our beacon being parsed?

In order to determine whether we are discovering new code paths with our fuzzing, we implemented a quick and dirty new-path metric. First, we do a blank emulation run without fuzzing the frame from our pcap file. During this blank run, we record all the PC addresses and store them as keys in a dictionary. When we start fuzzing, we keep recording all PC addresses. If an address from the fuzzing run is not in our dictionary, we conclude that we discovered a new path.

We also need to correctly detect bugs. Memory access violations are spotted by Unicorn if we try to read or write outside a valid mapping, but how can we detect heap overflows? COMSECURIS gives the solution: hook the allocator functions.

In order to follow the different actions performed, we implemented a trace format similar to drcov. This allows us to replay and carefully analyze a fuzzing session in IDA Pro.
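The two building blocks of this strategy, seeded random bitflips and the new-path metric, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used during the internship: the function names are hypothetical, and the execution traces are passed in as plain lists of PC values, whereas in the real setup they would be collected from the emulator.

```python
import random


def flip_bits(frame, rng, n_flips=4):
    """Return a copy of frame with n_flips randomly chosen bits inverted."""
    data = bytearray(frame)
    if not data:
        return bytes(data)
    for _ in range(n_flips):
        bit = rng.randrange(len(data) * 8)
        data[bit // 8] ^= 1 << (bit % 8)
    return bytes(data)


def record_baseline(trace):
    """Store every PC of the blank (unfuzzed) run as a dictionary key."""
    return {pc: True for pc in trace}


def new_addresses(baseline, trace):
    """Return the PCs of a fuzzing run that the blank run never visited."""
    seen = set()
    fresh = []
    for pc in trace:
        if pc not in baseline and pc not in seen:
            seen.add(pc)
            fresh.append(pc)
    return fresh


# A static seed makes every fuzzing session reproducible.
rng = random.Random(1234)
original = bytes(range(16))
mutated = flip_bits(original, rng)
print(original.hex())
print(mutated.hex())

# Made-up traces, standing in for the PC values recorded during emulation.
baseline = record_baseline([0x1AAFDC, 0x1AAFDE, 0x1AAFE0])
print([hex(pc) for pc in new_addresses(baseline, [0x1AAFDC, 0x1AB000, 0x1AB002])])
```

Because two flips may land on the same bit and cancel out, a mutated frame differs from the original by at most `n_flips` bits, not exactly `n_flips`.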

VI. Finding vulnerabilities

Several vulnerabilities were discovered and publicly disclosed in the past, like CVE-2017-9417 discovered by Nitay Artenstein. Gal Beniamini also discovered several vulnerabilities in the chip's firmware and in the Linux kernel driver. Chaining these vulnerabilities allows remote compromise of the host, as was shown with an iPhone 7.

So far the majority of vulnerabilities discovered in the chip's firmware are due to misuse of the length value of an Information Element. An Information Element, IE for short, is a Tag Length Value (TLV) data structure used in IEEE 802.11 Management/Data frames. These IEs are used to carry any information needed by either the supplicant or the access point. There are two kinds of IEs: normal and vendor specific. Vendor specific IEs have a tag with value 221 (0xdd), and their data field starts with four bytes: 3 bytes containing the vendor OUI and one byte indicating the IE type.

In the firmwares we analyzed, the function dedicated to parsing these IEs is named bcm_parse_tlvs. This function returns a pointer to the following structure:

    typedef struct bcm_tlv {
        uint8_t id;
        uint8_t len;
        uint8_t data[1];
    } bcm_tlv_t;

By cross-referencing it we find all the call sites where IEs are manipulated. Some of these functions are just wrappers that look for a vendor IE with a specific vendor OUI. Cross-referencing these wrappers leads us to yet more functions. There are more than one hundred call sites where these TLVs are manipulated.

By iterating through all these Xrefs we found previously discovered vulnerabilities such as CVE-2017-0561, a heap buffer overflow due to the direct use of the length value of a Fast Transition IE as the size parameter of a memcpy call. It is worth noting that in the different firmwares we analyzed, the function vulnerable to CVE-2017-0561 was located in ROM, so its code cannot be patched. In order to 'fix' the vulnerability the vendor had to deactivate the TDLS feature.
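As an illustration of the TLV layout described above, the following Python sketch walks a buffer of IEs and looks up a vendor specific IE by OUI and type. It is a host-side model for building and checking test frames, not a reimplementation of bcm_parse_tlvs: the function names are hypothetical, and the bounds checks shown here are exactly the kind of validation whose absence causes the bugs discussed in this section.

```python
VENDOR_SPECIFIC_TAG = 0xDD  # tag 221


def parse_tlvs(buf):
    """Return a list of (tag, data) pairs, stopping at the first truncated IE."""
    ies = []
    offset = 0
    while offset + 2 <= len(buf):
        tag, length = buf[offset], buf[offset + 1]
        end = offset + 2 + length
        if end > len(buf):
            break  # the length field points past the end of the buffer
        ies.append((tag, buf[offset + 2:end]))
        offset = end
    return ies


def find_vendor_ie(buf, oui, ie_type):
    """Return the payload following OUI + type of the first matching vendor IE, or None."""
    for tag, data in parse_tlvs(buf):
        if tag == VENDOR_SPECIFIC_TAG and len(data) >= 4:
            if data[:3] == oui and data[3] == ie_type:
                return data[4:]
    return None


def bounded_copy(dst, data):
    """Copy data into dst only if it fits, which is the check the vulnerable code lacks."""
    if len(data) > len(dst):
        raise ValueError("IE data (%d bytes) exceeds buffer (%d bytes)" % (len(data), len(dst)))
    dst[:len(data)] = data


# An SSID IE ("test") followed by a vendor specific IE with OUI 00:0F:AC, type 1.
frame_body = bytes([0x00, 0x04]) + b"test" + bytes([0xDD, 0x06, 0x00, 0x0F, 0xAC, 0x01, 0xAA, 0xBB])
print(parse_tlvs(frame_body))
print(find_vendor_ie(frame_body, bytes([0x00, 0x0F, 0xAC]), 1))
```

`bounded_copy` is included only to contrast with a copy that trusts the attacker-controlled length byte.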
CVE-2019-9501 and CVE-2019-9502: Two heap overflows discovered

We continued iterating over the bcm_parse_tlvs call sites in the bcm4339 firmware and found one wrapper function at 0x14310 that searches for a vendor IE with an OUI of 00:0F:AC, which is used in the 802.11i (Enhanced Security Mechanisms) protocol specification to select the cipher suite, the Authentication and Key Management (AKM) suite, and the EAPOL-Key Key Data Encapsulation to use. Cross-referencing this function led us to another wrapper at 0x14304, which we named wlc_find_gtk_encap, that is only called from one function located at 0x7B45C, named wlc_wpa_sup_eapol after the format strings referenced inside.

Let's look at what this function does with the returned bcm_tlv structure. The function calls wlc_find_gtk_encap and checks whether a pointer to a bcm_tlv structure is returned. If so, it puts the IE length value in register r2, the address of the IE data in r1 and a pointer to a buffer structure in r0, then calls memcpy() to copy the IE's data to the buffer pointed to by r0. Notice that there is no check that the destination buffer is large enough to hold as many bytes as indicated by r2.

We have a potential buffer overflow in a structure, but we don't yet know whether the destination buffer is big enough to hold the copied data, so let's keep following the execution flow. Next, the function wlc_wpa_plumb_gtk is called with the IE's length and the freshly copied buffer. The pseudocode of this function is:

    int wlc_wpa_plumb_gtk(..., uint8_t *ie_data, uint32_t len_ie, ...)
    {
        ...
        uint8_t *buffer;
        ...
        buffer = malloc(164);
        if (!buffer) {
            ...
        }
        memset(buffer, 0, 164);
        memcpy(buffer, ie_data, len_ie);  /* len_ie is never checked against 164 */
        ...
    }

Here we have an obvious heap buffer overflow: the IE data is copied to a fixed-size buffer using a length controlled by an untrusted source (a potentially malicious AP). Gal Beniamini had already found other issues in the same wlc_wpa_plumb_gtk function: CVE-2017-11121 and CVE-2017-7065.
So far we have a heap buffer overflow, and potentially another one. We need to understand how to reach this code path, and we need to check the size of the buffer used in the memcpy call right after the IE extraction. Upon further inspection of the wl driver we find out that the buffer size is fixed at 32 bytes.

In summary, we found two buffer overflows: the first allows us to overflow by at most 219 bytes, and the second by 87 bytes. The next question we want to answer is "How can we trigger these bugs?"

The WPA2 protocol uses EAPOL (EAP over LAN) and a temporary key (the GTK, which stands for Group Temporal Key) to encrypt multicast traffic in the WLAN. This key is sent to a station during an EAPOL 4-way handshake, encapsulated in a vendor IE in EAPOL-Key Message 3. The wlc_wpa_sup_eapol function is responsible for parsing the Access Point's messages during an EAPOL exchange. If we supply a GTK with a size of 255 in the EAPOL-M3 we will trigger these overflows. To accomplish this easily, we simply have to patch two lines of hostapd.

As the firmware code and the proprietary wl driver share a lot of code, we found the same issues in the driver. This means that on systems using FullMAC devices an attacker controlling a malicious Access Point can compromise the chip, whereas on systems with SoftMAC devices an attack would lead to direct compromise of kernel memory. To verify our findings we tried to connect a vulnerable SoftMAC bcm43263 chip, using the wl driver, to a rogue Access Point that delivered our PoC during the EAPOL exchange.

These issues were present in all the firmwares analyzed, and in all versions of the wl driver analyzed. However, although the code is present in all firmwares, it doesn't seem to be used in all versions. For example, it's not used in the bcm4339 firmware version that we analyzed, but it's used in all firmware versions of the bcm43430 device.
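The overflow sizes quoted above can be checked with a little arithmetic. The sketch below assumes that the IE length is a single byte (maximum 255) and that the 4 bytes of OUI and type are not part of the copied data, which is an assumption on our part but is consistent with the 219 and 87 byte figures. The function and constant names are illustrative.

```python
MAX_IE_LEN = 255        # the IE length field is one byte
VENDOR_HEADER_LEN = 4   # assumption: 3-byte OUI + 1-byte type are skipped before the copy


def max_overflow(dst_size, ie_len=MAX_IE_LEN, header_len=VENDOR_HEADER_LEN):
    """Number of bytes written past a dst_size buffer for a given IE length."""
    copied = ie_len - header_len
    return max(0, copied - dst_size)


print(max_overflow(32))   # 32-byte buffer in wlc_wpa_sup_eapol -> 219
print(max_overflow(164))  # 164-byte buffer in wlc_wpa_plumb_gtk -> 87
```

The same helper shows that any copied length up to the buffer size is harmless, which is what a bounds check before each memcpy would enforce.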
In order to successfully exploit these bugs it is necessary to manipulate the heap layout remotely to obtain overlapping chunks. Gal Beniamini has already covered all aspects of heap exploitation on chip firmware. Another researcher, Nitay Artenstein, talked about this too. In his case the overflow was more easily exploitable because he was able to directly smash a pointer in the adjacent chunk, which enabled a write-anything-anywhere primitive.

As stated above, one major problem with heap exploitation on these chips is heap layout manipulation. There is almost no primitive that allows controlled-size allocations with a controlled lifespan. We may find several controlled-size allocation primitives in several Management Action frame handlers, but the allocated chunks are freed each time the primitive is used. On the other hand, all the RAM on these chips is set with RWX permissions and there are no exploit mitigation mechanisms.

Vulnerabilities in the Linux brcmfmac driver

While researching the Broadcom firmware we also discovered two bugs in brcmfmac, the Linux kernel's open source wireless driver for FullMAC cards.

As we said earlier, these chips use one of the following three bus interfaces: USB, SDIO and PCIe. Built on top of the bus, two mechanisms are used for dongle-to-host and host-to-dongle communication.

The first communication method is mostly used for host-to-dongle communication and is based on custom ioctls. In the firmware code, the ioctl handler appears as an ugly big switch case.

The second communication mechanism is called firmware events. These firmware events are used by the chip to notify the host of different events: scan results, association/disassociation, authentication, etc. These events are encapsulated in regular Ethernet frames with an ethertype of 0x886c.
Gal Beniamini from Google Project Zero already found several issues related to these firmware events in bcmdhd, the Android Broadcom driver, which allowed an attacker to remotely compromise the host or to escalate from a compromised dongle to the host's kernel.

CVE-2019-9503: Remotely sending firmware events bypassing is_wlc_event_frame

Reading Gal Beniamini's articles, we learn that before April 2017 it was possible to remotely send crafted firmware events, using the chip as a proxy between the outside world and the kernel. Broadcom implemented a new mechanism to prevent frames coming from the outside from being interpreted as firmware events. To do that, they introduced in the firmware a new function called is_wlc_event_frame, which checks whether a frame is a firmware event. The same function is present in the bcmdhd driver used on Android, because the solution is only effective if the firmware and the driver perform the same check. We have the following logic:

On the firmware side, if a received data frame appears to be a firmware event, it is discarded outright.

In the driver, if the frame is an event, it is processed.

Let's look at how frames are managed in the open source Linux driver brcmfmac and how firmware events are processed. When the bus used is SDIO, two different channels are set up: one for event frames and one for all other frames. In the file sdio.c, in the function brcmf_sdio_readframes:

    ...
    if (brcmf_sdio_fromevntchan(&dptr[SDPCM_HWHDR_LEN]))
        brcmf_rx_event(bus->sdiodev->dev, pfirst);
    else
        brcmf_rx_frame(bus->sdiodev->dev, pfirst, false);
    ...

We clearly see that if the frame comes from the event channel then a dedicated function, brcmf_rx_event, is used; otherwise the function brcmf_rx_frame is called. The function brcmf_rx_frame is prototyped as follows in bus.h:

    void brcmf_rx_frame(struct device *dev, struct sk_buff *rxp, bool handle_event);

The last argument is a boolean that indicates whether frames containing a firmware event are processed. So we checked the driver's code to see whether this function is ever called with handle_event set to true. When a USB bus is used there is no dedicated channel to receive events, and all frames are processed, including firmware events. In usb.c, in the function brcmf_usb_rx_complete:

    ...
    if (devinfo->bus_pub.state == BRCMFMAC_USB_STATE_UP) {
        skb_put(skb, urb->actual_length);
        brcmf_rx_frame(devinfo->dev, skb, true);
        brcmf_usb_rx_refill(devinfo, req);
    } else {
        brcmu_pkt_buf_free_skb(skb);
        brcmf_usb_enq(devinfo, &devinfo->rx_freeq, req, NULL);
    }
    ...

So, if the bus is USB and we find a way to bypass the firmware function is_wlc_event_frame, we may be able to remotely send firmware events to the driver.
Let's take a look at how firmware events are processed, starting from the function brcmf_rx_frame:

    void brcmf_rx_frame(struct device *dev, struct sk_buff *skb, bool handle_event)
    {
        struct brcmf_if *ifp;
        struct brcmf_bus *bus_if = dev_get_drvdata(dev);
        struct brcmf_pub *drvr = bus_if->drvr;

        brcmf_dbg(DATA, "Enter: %s: rxp=%p\n", dev_name(dev), skb);

        if (brcmf_rx_hdrpull(drvr, skb, &ifp))
            return;

        if (brcmf_proto_is_reorder_skb(skb)) {
            brcmf_proto_rxreorder(ifp, skb);
        } else {
            /* Process special event packets */
            if (handle_event)
                brcmf_fweh_process_skb(ifp->drvr, skb);

            brcmf_netif_rx(ifp, skb);
        }
    }

If handle_event is set to true, the skb (socket buffer) is passed to the function brcmf_fweh_process_skb. This function is defined in fweh.h:

    static inline void brcmf_fweh_process_skb(struct brcmf_pub *drvr,
                                              struct sk_buff *skb)
    {
        struct brcmf_event *event_packet;
        u16 usr_stype;

        /* only process events when protocol matches */
        if (skb->protocol != cpu_to_be16(ETH_P_LINK_CTL))
            return;

        if ((skb->len + ETH_HLEN) < sizeof(*event_packet))
            return;

        /* check for BRCM oui match */
        event_packet = (struct brcmf_event *)skb_mac_header(skb);
        if (memcmp(BRCM_OUI, &event_packet->hdr.oui[0],
                   sizeof(event_packet->hdr.oui)))
            return;

        /* final match on usr_subtype */
        usr_stype = get_unaligned_be16(&event_packet->hdr.usr_subtype);
        if (usr_stype != BCMILCP_BCM_SUBTYPE_EVENT)
            return;

        brcmf_fweh_process_event(drvr, event_packet, skb->len + ETH_HLEN);
    }

This function is responsible for validating event frames. It checks that the protocol is 0x886c, then checks that the frame is large enough to contain a brcmf_event structure. This structure is defined as follows:

    /**
     * struct brcm_ethhdr - broadcom specific ether header.
     *
     * @subtype: subtype for this packet.
     * @length: TODO: length of appended data.
     * @version: version indication.
     * @oui: OUI of this packet.
     * @usr_subtype: subtype for this OUI.
     */
    struct brcm_ethhdr {
        __be16 subtype;
        __be16 length;
        u8 version;
        u8 oui[3];
        __be16 usr_subtype;
    } __packed;

    struct brcmf_event_msg_be {
        __be16 version;
        __be16 flags;
        __be32 event_type;
        __be32 status;
        __be32 reason;
        __be32 auth_type;
        __be32 datalen;
        u8 addr[ETH_ALEN];
        char ifname[IFNAMSIZ];
        u8 ifidx;
        u8 bsscfgidx;
    } __packed;

    /**
     * struct brcmf_event - contents of broadcom event packet.
     *
     * @eth: standard ether header.
     * @hdr: broadcom specific ether header.
     * @msg: common part of the actual event message.
     */
    struct brcmf_event {
        struct ethhdr eth;
        struct brcm_ethhdr hdr;
        struct brcmf_event_msg_be msg;
    } __packed;

Finally, the OUI and the usr_subtype are checked. If our frame is a correctly formatted firmware event, it is sent to the function brcmf_fweh_process_event, which queues the event for processing.

Now, let's look at how the function is_wlc_event_frame works inside the chip's firmware. We can also look at its definition in the bcmdhd source code, since the function used by the driver and the one used by the chipset normally need to be identical; otherwise the validation of event frames could be bypassed. To find the location of is_wlc_event_frame inside the chip's firmware and where it is called, we have several options: follow the execution flow of data frame processing, or simply search for code locations where the value 0x886c is used. If is_wlc_event_frame returns a result other than -30, the frame is discarded.

    int is_wlc_event_frame(bcm_event *pktdata, unsigned int pktlen,
                           int exp_usr_subtype, signed int a4)
    {
        ...
        if ((bcmeth_hdr_t *)((char *)pktdata + pktlen) > &pktdata->bcm_hdr
            && SLOBYTE(pktdata->bcm_hdr.subtype) >= 0)
            return -30;
        ...

If the low byte of the field bcm_hdr.subtype, read as a signed value, is greater than or equal to 0, then the function returns -30.
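Putting the firmware check next to the driver-side validation shown earlier makes a gap visible: the two sides do not test the same fields. The following is a simplified model, not the real code. The structure `bcm_hdr_model`, the helper names, and the constants for the Broadcom OUI (00:10:18) and for BCMILCP_BCM_SUBTYPE_EVENT (1) are assumptions made for illustration, and byte-order handling is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define FW_NOT_AN_EVENT (-30)  /* return value quoted in the text */
#define SUBTYPE_EVENT   1      /* assumed value of BCMILCP_BCM_SUBTYPE_EVENT */

/* Simplified Broadcom header: only the fields the two checks look at. */
struct bcm_hdr_model {
    uint16_t subtype;      /* as loaded by the firmware */
    uint8_t  oui[3];
    uint16_t usr_subtype;  /* host byte order in this model */
};

static const uint8_t MODEL_OUI[3] = { 0x00, 0x10, 0x18 }; /* assumed BRCM_OUI */

/* Firmware side: models only the early-return branch of the excerpt. */
int fw_check(const struct bcm_hdr_model *h)
{
    /* "signed low byte >= 0" means bit 7 of the low byte is clear */
    if ((h->subtype & 0x80) == 0)
        return FW_NOT_AN_EVENT;  /* frame is forwarded to the host */
    return 0;                    /* modelled as "event": frame is dropped */
}

/* Driver side: OUI and usr_subtype only; subtype is never inspected. */
bool drv_accepts(const struct bcm_hdr_model *h)
{
    if (memcmp(h->oui, MODEL_OUI, sizeof(h->oui)) != 0)
        return false;
    return h->usr_subtype == SUBTYPE_EVENT;
}

/* True when a frame slips past the firmware yet is handled as an event. */
bool bypasses_filter(const struct bcm_hdr_model *h)
{
    return fw_check(h) == FW_NOT_AN_EVENT && drv_accepts(h);
}
```

In this model any header with a valid OUI, a usr_subtype of 1 and a subtype whose low byte is below 0x80 satisfies `bypasses_filter`, which is the mismatch the rest of this section relies on.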
The field subtype isn't checked in brcmf_fweh_process_skb, so by supplying a subtype >= 0 we pass the firmware check: the frame is forwarded to the driver and then processed as a valid firmware event by its handler. When the bus used is PCIe, Broadcom uses its own protocol, called MSGBUF, which, unlike SDIO, has no dedicated channel for receiving firmware events. This vulnerability can therefore be used to remotely send firmware events to the host on chips using a USB or PCIe bus, bypassing the firmware's internal check done in is_wlc_event_frame.

CVE-2019-9500: Heap buffer overflow in brcmf_wowl_nd_results

Now that we're able to remotely send firmware events, let's look into how they are processed and dispatched. Firmware event processing starts in the function brcmf_fweh_event_worker, which calls the function brcmf_fweh_call_event_handler.

    static int brcmf_fweh_call_event_handler(struct brcmf_if *ifp,
                                             enum brcmf_fweh_event_code code,
                                             struct brcmf_event_msg *emsg,
                                             void *data)
    {
        struct brcmf_fweh_info *fweh;
        int err = -EINVAL;

        if (ifp) {
            fweh = &ifp->drvr->fweh;

            /* handle the event if valid interface and handler */
            if (fweh->evt_handler[code])
                err = fweh->evt_handler[code](ifp, emsg, data);
            else
                brcmf_err("unhandled event %d ignored\n", code);
        } else {
            brcmf_err("no interface object\n");
        }
        return err;
    }

The evt_handler is an array of function pointers. This array is populated by calling the function brcmf_fweh_register:

    /**
     * brcmf_fweh_register() - register handler for given event code.
     *
     * @drvr: driver information object.
     * @code: event code.
     * @handler: handler for the given event code.
     */
    int brcmf_fweh_register(struct brcmf_pub *drvr,
                            enum brcmf_fweh_event_code code,
                            brcmf_fweh_handler_t handler);

By searching for the places where this function is called we find all the event handler functions. When the WoWL (Wake on Wireless LAN) feature is activated, the handler for events of type BRCMF_E_PFN_NET_FOUND is unregistered and another handler is registered. This handler is the function brcmf_wowl_nd_results shown below:

    static s32
    brcmf_wowl_nd_results(struct brcmf_if *ifp, const struct brcmf_event_msg *e,
                          void *data)
    {
        struct brcmf_cfg80211_info *cfg = ifp->drvr->config;
        struct brcmf_pno_scanresults_le *pfn_result;
        struct brcmf_pno_net_info_le *netinfo;

        brcmf_dbg(SCAN, "Enter\n");

        if (e->datalen < (sizeof(*pfn_result) + sizeof(*netinfo))) {
            brcmf_dbg(SCAN, "Event data to small. Ignore\n");
            return 0;
        }

        pfn_result = (struct brcmf_pno_scanresults_le *)data;

        if (e->event_code == BRCMF_E_PFN_NET_LOST) {
            brcmf_dbg(SCAN, "PFN NET LOST event. Ignore\n");
            return 0;
        }

        if (le32_to_cpu(pfn_result->count) < 1) {
            brcmf_err("Invalid result count, expected 1 (%d)\n",
                      le32_to_cpu(pfn_result->count));
            return -EINVAL;
        }

        data += sizeof(struct brcmf_pno_scanresults_le);
        netinfo = (struct brcmf_pno_net_info_le *)data;
        memcpy(cfg->wowl.nd->ssid.ssid, netinfo->SSID, netinfo->SSID_len); // OVERFLOW YAY!
        cfg->wowl.nd->ssid.ssid_len = netinfo->SSID_len;
        cfg->wowl.nd->n_channels = 1;
        cfg->wowl.nd->channels[0] =
            ieee80211_channel_to_frequency(netinfo->channel,
                netinfo->channel <= CH_MAX_2G_CHANNEL ?
                NL80211_BAND_2GHZ : NL80211_BAND_5GHZ);
        cfg->wowl.nd_info->n_matches = 1;
        cfg->wowl.nd_info->matches[0] = cfg->wowl.nd;

        /* Inform (the resume task) that the net detect information was recvd */
        cfg->wowl.nd_data_completed = true;
        wake_up(&cfg->wowl.nd_data_wait);

        return 0;
    }

When memcpy is called to copy the SSID, the length used is the one provided in the event frame's data, and it is not checked. The 802.11 standard specifies that an SSID never exceeds 32 bytes, but an attacker may remotely send a firmware event with an SSID length greater than 32 bytes, triggering a heap buffer overflow. This issue has been silently patched (cf. disclosure timeline).

A similar issue was found in brcmf_notify_sched_scan_results, the handler for the same event (BRCMF_E_PFN_NET_FOUND) when WoWL is deactivated. That issue was silently patched by Broadcom in April 2017, but the handler used when WoWL is enabled was forgotten. As we were working on an outdated brcmfmac version at the time we found this issue, we built a PoC that triggers the overflow in brcmf_notify_sched_scan_results and panics the kernel by modifying airbase-ng, a tool from the aircrack-ng suite. An exploit, or just a PoC, can also be made using scapy or by modifying wpa_supplicant or hostapd.
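The unchecked copy in brcmf_wowl_nd_results can be neutralised by bounding the length before calling memcpy. The sketch below shows one possible way to do that, clamping to the 32-byte SSID limit mentioned above. It uses trimmed-down, hypothetical structures (`net_info_model`, `ssid_model`) rather than the kernel's own types, and it is not presented as the patch that was actually merged.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_SSID_LEN 32  /* 802.11 SSID limit mentioned in the text */

/* Hypothetical, trimmed-down versions of the structures involved. */
struct net_info_model {
    uint8_t SSID[255];  /* attacker-controlled payload; size is illustrative */
    uint8_t SSID_len;   /* attacker-controlled length */
};

struct ssid_model {
    uint8_t ssid[MAX_SSID_LEN];
    uint8_t ssid_len;
};

/*
 * Copy the SSID, clamping the length so that it can never exceed the
 * destination buffer. Returns the number of bytes actually copied.
 */
size_t copy_ssid_clamped(struct ssid_model *dst,
                         const struct net_info_model *src)
{
    size_t len = src->SSID_len;

    if (len > MAX_SSID_LEN)
        len = MAX_SSID_LEN;
    memcpy(dst->ssid, src->SSID, len);
    dst->ssid_len = (uint8_t)len;
    return len;
}
```

Clamping keeps the handler working for well-formed events, while rejecting the event outright would be an equally valid choice; either way the stored ssid_len ends up consistent with what was copied.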