Console Hacking 2016 PS4: PC Master Race @marcan42

In memory of Ben ‘bushing’ Byer

It's a bit different from previous consoles ✓ x86 ✓ FreeBSD ✓ WebKit ✗ Hypervisor But not completely different ✓ Security processor (that you can just ignore)

How to pwn a PS4 Step 1: Write a WebKit exploit Step 2: Write a FreeBSD exploit

How to pwn a PS4 Step 0. Dump the code Step 1. Write a WebKit exploit Step 2. Write a FreeBSD exploit Step 3. ? Step 4. PROFIT

Black-box code extraction The fun way

A long time ago in a hackerspace far,

far away.... (fail0verflow got together after 31c3)

PCIe: Fun for the whole family Bus mastering

Complicated

Robust

Delay tolerant

Drivers full of fail

PCIe 101 PCIe is a reliable switched packet network Transaction Layer Packets (TLPs): Memory reads/writes

IO reads/writes

Configuration reads/writes

Message signaled interrupts (MSI) (writes)

Legacy interrupts

Completions

Go wild with DMA! Except there's an IOMMU...

void load_some_stuff(void) { char buf[32]; plz2read_from_flash(SOME_ADDRESS, buf, 32); } void plz2read_from_flash(uint32_t addr, void *buf, size_t size) { iommu_map(buf, size); flash_send_read_command(addr, buf, size); iommu_unmap(buf, size); }

✓ Code execution ✓ FreeBSD kernel dump ✓ WebKit and OS libs dump

How to pwn a PS4 ✓ Step 0. Dump the code ✓ Step 1. Write a WebKit exploit ✓ Step 2. Write a FreeBSD exploit ✓ Step 3. ?ps4-kexec ✓ Step 4. PROFIT (Linux)

From FreeBSD to Linux PS4-kexec

jmp linux Not so fast... we need to: Load Linux into contiguous physical RAM

Set up Linux boot parameters

Shut down FreeBSD cleanly

Halt secondary CPUs

Make new pagetables and GDT

Disable the IOMMU

Relocate various things in memory

And more...

Okay, now jmp linux , right? Sure, Linux will technically run For a little bit anyway And then it stops No video, no serial output, nothing

Let's talk about hardware

What is x86 ? A mediocre instruction set architecture The PS4 is x86 (x86-64)

What is a PC? A horrible, horrible thing built upon piles and piles of legacy nonsense dating back to 1981 The PS4 is NOT a PC

PC 101 8259 Programmable Interrupt Controller (PIC)

8253 Programmable Interval Timer (PIT)

8250 UART at I/O 3f8h

8042 PS/2 Keyboard Controller

MC146818 RTC/CMOS

ISA bus

VGA The PS4 has none of these

AMD FCH Southbridge Implements Intel legacy (1981) Marvell Aeolia Southbridge Implements Intel legacy (2002) ????

That's No Southbridge That's a Marvell ARMADA SoC Descendant from Intel StrongARM/XScale

ARM SoC with a bunch of peripherals

They stuck a PCIe bridge on it

Exposes ARM peripherals to the x86 side

Some extra stuff (e.g. HPET, ACPI stuff)

256MB DDR RAM

Also runs FreeBSD in standby mode

runs FreeBSD in standby mode Batshit insane

00:01.2 00: bus number (8 bits) 01: device number (5 bits) 2: function number (3 bits)

Intel “Panther Point” PCH 00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller (rev 04) 00:16.0 Communication controller: Intel Corporation 7 Series/C216 Chipset Family MEI Controller #1 (rev 04) 00:1a.0 USB controller: Intel Corporation 7 Series/C216 Chipset Family USB Enhanced Host Controller #2 (rev 04) 00:1b.0 Audio device: Intel Corporation 7 Series/C216 Chipset Family High Definition Audio Controller (rev 04) 00:1c.0 PCI bridge: Intel Corporation 7 Series/C216 Chipset Family PCI Express Root Port 1 (rev c4) 00:1c.1 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 2 (rev c4) 00:1c.2 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 3 (rev c4) 00:1c.3 PCI bridge: Intel Corporation 7 Series/C216 Chipset Family PCI Express Root Port 4 (rev c4) 00:1d.0 USB controller: Intel Corporation 7 Series/C216 Chipset Family USB Enhanced Host Controller #1 (rev 04) 00:1f.0 ISA bridge: Intel Corporation HM77 Express Chipset LPC Controller (rev 04) 00:1f.2 SATA controller: Intel Corporation 7 Series Chipset Family 6-port SATA Controller [AHCI mode] (rev 04) 00:1f.3 SMBus: Intel Corporation 7 Series/C216 Chipset Family SMBus Controller (rev 04)

Marvell “Aeolia” [...] 00:12.0 System peripheral: Sony Corporation Aeolia ACPI [...] 00:13.0 System peripheral: Sony Corporation Aeolia ACPI [...] 00:14.0 System peripheral: Sony Corporation Aeolia ACPI 00:14.1 System peripheral: Sony Corporation Aeolia Ethernet Controller (Marvell Yukon 2 Family) 00:14.2 System peripheral: Sony Corporation Aeolia SATA AHCI Controller 00:14.3 System peripheral: Sony Corporation Aeolia SD/MMC Host Controller 00:14.4 System peripheral: Sony Corporation Aeolia PCI Express Glue and Miscellaneous Devices 00:14.5 System peripheral: Sony Corporation Aeolia DMA Controller 00:14.6 System peripheral: Sony Corporation Aeolia Memory (DDR3/SPM) 00:14.7 System peripheral: Sony Corporation Aeolia USB 3.0 xHCI Host Controller 00:15.0 System peripheral: Sony Corporation Aeolia ACPI [...] 00:16.0 System peripheral: Sony Corporation Aeolia ACPI [...] 00:17.0 System peripheral: Sony Corporation Aeolia ACPI [...] It clones itself across all PCI device numbers

8 functions ain't enough for everybody 00:14.4 “PCI Express Glue” PCIe bridge config

MSI interrupt controller

ICC

HPET

Flash controller

RTC

Timers

2 serial ports

I²C

Linux Minimum System Requirements A timer (PIT)

Interrupts (PIC)

Some kind of console PS4: no PIT, no PIC, no standard serial Board has testpoints for an 8250-derived serial port

dmesg plz Linux earlycon: early console for debugging No IRQs required console=uart8250,mmio32,0xd0340000,3200n8 Clock is different... 3200 means 115200 This gets us a boot log

Time Stamp Counter (tsc) Newfangled timer, in-CPU PS4 Liverpool APU supports proper TSC Linux tries to calibrate it... ... against PIC or PMTIMER Fail

Again, It Really Isn't A PC enum { X86_SUBARCH_PC = 0, X86_SUBARCH_LGUEST, X86_SUBARCH_XEN, X86_SUBARCH_INTEL_MID, X86_SUBARCH_CE4100, + X86_SUBARCH_PS4, X86_NR_SUBARCHS, }; Subarch specified by bootloader (ps4-kexec) Enables custom TSC calibration code Disables legacy PIC and RTC

ACPI Not just “power” Needed for proper PCI config, IOMMU, CPU frequency scaling... PS4 has broken ACPI tables... Fix them in ps4-kexec

PCI MSI 101 Message Signaled Interrupts Device configuration registers for address and value

To fire an interrupt, devices write a value to an address

CPU IRQ controller (LAPIC) receives and fires interrupt vector

The message value directly defines the CPU IRQ vector

Aeolia MSI 101 Device MSI configuration registers ignored

Function 4 ("glue") implements custom MSI controller

Each function gets shared addr and top 27 bits of data

Each "sub-function" only gets separate bottom 5 bits

All MSIs originate from Function 4 (ノಠ益ಠ)ノ彡┻━┻

Driver hell Sibling devices are inter-dependent

Linux IRQ vector allocation not sequential

Need to modify all drivers to use custom IRQ code

Aeolia on Linux Core driver implements IRQ controller interface

Linux probe-defer mechanism to fix ordering issue

Some drivers (SDHCI, GigE) modified to request Aeolia IRQs

Some drivers (serial, USB) instantiated from wrappers

Each function uses a single shared IRQ :(

IOMMU to the rescue Allows interrupt remapping

Consecutive message numbering

Can use unique IRQs per sub-function :)

Falls back to shared IRQs if IOMMU off

The ACPI table for the IOMMU is missing :(

Checklist ✓ IRQs (apcie) ✓ Timer (TSC) ✓ Early serial ✓ Late serial with IRQs (apcie-uart) ✓ Initramfs userspace ✗ Serial I/O hangs sometimes :(

More cleanup needed FreeBSD masks some IRQ vectors on CPU#0 with nonstandard AMD LAPIC features Clean them up in ps4-kexec ✓ Serial is stable This took *ages* to debug

Just add drivers ✓ USB xHCI (3 USB controllers in one function...) ✓ SDHCI (Nonstandard PCI config, needs quirks...) ✓ Ethernet (Driver needs hacks; still partially broken...) Worked fine on Linux 4.4 Failed on 4.9 - DMA broken? Aeolia Strikes Back

31-bit DMA ┻━┻ ︵ヽ(`Д´)ﾉ︵﻿ ┻━┻ More Linux driver patching...

And now for something completely different

PS4 Radeon Graphics “Starsha”

dce_ihdef_get_info_crtc_linea_ liverpool

" LVP A0" Starsha AsicStateRegInfo

ThJ Starsha AGESA ThebeJ BDK Nobody (not even Sony/AMD) agrees on the APU codename We're calling it Liverpool

Liverpool Graphics AMD GCN “Sea Islands” (CI) GPU

Similar to other chips in the generation

Some quirks, customizations, oddities

We used Bonaire as a base

Hacking on AMD drivers AMD publishes 3D shader and command queue documentation They do NOT publish register docs for recent GPUs That's what we need to hack on kernel drivers :( “The code is the documentation” - incomplete, magic numbers

Google Time

http://www.siliconkit.com/pragmatic/bonaire.xml XML dump of Bonaire register documentation? <field> <fname> <token>P_ALWAYS_USE_FAST_TXCLK</token> </fname> <frange> <token>13:13</token> </frange> <ftype> <token>ALPHA</token> <token>{</token> <fieldtexts> <fieldtext> <quoted> <token>"TXCLK will be either 250MHz, 500MHz, or 1GHz depends on port speeds "</token> </quoted> Broken, incomplete

Tell me more

What is RAI http://www.siliconkit.com/pragmatic/RAI/rai.grammar4.txt root ::= sections sections ::= section sections section ::= 'SECTION_START' 'CHIP_INFO' statements 'SECTION_END' section ::= 'SECTION_START' 'CHIP_SPACES' chipspaces 'SECTION_END' section ::= 'SECTION_START' 'CHIP_STREAMS' '[a-zA-Z0-9_]*' 'SECTION_END' section ::= 'SECTION_START' 'CHIP_MEMORIES' '[a-zA-Z0-9_]*' 'SECTION_END' section ::= 'SECTION_START' 'CHIP_PARAMETERS' '[a-zA-Z0-9_]*' 'SECTION_END' section ::= 'SECTION_START' 'BLOCK_INFO' statements 'SECTION_END' section ::= 'SECTION_START' 'BLOCK_REGISTERS' register 'SECTION_END' register ::= title spaces size rattribute '{' fields '}' ';' register ::= title spaces size '{' fields '}' ';' title ::= '[a-zA-Z0-9_]*' spaces ::= space spaces [...] AMD internal register description file?

Bingo //Version 1.0.1.0 //CL# 890079 //Version 1.0.0.0 //CL# 883050 SECTION_START CHIP_INFO CHIP_NAME = "bonaire"; DESCRIPTION = "R8xx GPU Chip"; RELEASE = "Chip Spec 0.28"; // Edit Vendor ID Here: Default(0xFFFF) means search for all ASIC_VENDOR_ID = 0x1002; [...]

So I wrote a *working* parser $ python showregname.py HDP_NONSURFACE_INFO HDP_NONSURFACE_INFO (GpuF0Reg:0x2c08,GpuF1Reg:0x2c08) 32bit: 0 NONSURF_ADDR_TYPE - 0: physical address with no translation. - 1: virtual address, requires page table translation. 4:1 NONSURF_ARRAY_MODE - 0: ARRAY_LINEAR_GENERAL: Unaligned linear array - 1: ARRAY_LINEAR_ALIGNED: Aligned linear array [...] Also does annotated register dumps, diffs, #define generation 4000+ registers documented in GpuF0Reg alone

Road to the framebuffer

HDMI is easy, right? GPU has HDMI, DisplayPort ports

HDMI not connected; DP connected ????

External HDMI encoder

We must go deeper Panasonic I²C DisplayPort → HDMI bridge Requires configuration to work Hooked up to the GPU I²C bus? You wish

ICC RPC protocol used to send commands to system MCU

Message box / doorbell protocol

Accessed via Aeolia

Used for things like power, buttons, LEDs...

And the HDMI encoder I²C

ICC I²C Let's build a simple I²C interface? Nah, let's make a bytecode scripting engine to issue I²C commands Why?!? Because ICC is too slow to issue requests one by one

More hacks... HDMI encoder requires all 4 DisplayPort lanes active

Scanout memory bandwidth calculation is broken

Mouse cursor size is from previous generation (wat?) ✓ Framebuffer console working ✗ X won't start with radeon driver

A tale of two memories PS4 uses a unified memory architecture Linux legacy driver expectes a usable amount of "video" memory PS4 configures emulated VRAM as 16MiB... Solution: reconfigure memory controller in ps4-kexec to assign 1GiB of RAM as VRAM ✓ X starts

It's 3D time

Radeon GPU 101 Commands are sent to the GPU by putting them in rings: Graphics ring

Compute rings

DMA rings Commands are processed by the GPU Command Processor It contains multiple sub-units (ME, PFP, CE), each of which is a custom ‘F32’ CPU running microcode firmware Rings can call out to Indirect Buffers (IBs) with more commands

radeon: ring 0 test failed The graphics ring isn't working WREG32(scratch, 0xCAFEDEAD ); radeon_ring_lock(rdev, ring, 3); radeon_ring_write(ring, PACKET3(PACKET3_SET_UCONFIG_REG, 1) ); radeon_ring_write(ring, ((scratch - PACKET3_SET_UCONFIG_REG_START) >> 2) ); radeon_ring_write(ring, 0xDEADBEEF ); radeon_ring_unlock_commit(rdev, ring, false); The ring test writes to a GPU register from the ring, then checks to see if the write happened Debug registers (thanks bonaire.rai!) show the CP is stuck... ... waiting for data in the ring... ... after a NOP command?

NOP is hard, let's go stalling Packet headers have a length field of size - 2 2-word packet: size = 0. They added a 1-word NOP: size = 0x3fff (-1) Old microcode... interprets it as a huge packet Hawaii has the same issue on old microcode: if (rdev->family == CHIP_HAWAII) { if (rdev->new_fw) nop = PACKET3(PACKET3_NOP, 0x3FFF) ; else nop = RADEON_CP_PACKET2; } else { nop = PACKET3(PACKET3_NOP, 0x3FFF); }

radeon: ring 3 test failed That's the SDMA ring radeon_ring_write(ring, SDMA_PACKET(SDMA_OPCODE_WRITE, SDMA_WRITE_SUB_OPCODE_LINEAR, 0) ); radeon_ring_write(ring, lower_32_bits(gpu_addr) ); radeon_ring_write(ring, upper_32_bits(gpu_addr) ); radeon_ring_write(ring, 1 ); /* number of DWs to follow */ radeon_ring_write(ring, 0xDEADBEEF ); Same idea: write a value to memory, check for it Debugging, the write happens... but it writes zero?

Double it up So I tried queuing two writes instead: radeon_ring_write(ring, SDMA_PACKET(SDMA_OPCODE_WRITE, SDMA_WRITE_SUB_OPCODE_LINEAR, 0) ); radeon_ring_write(ring, lower_32_bits(gpu_addr) ); radeon_ring_write(ring, upper_32_bits(gpu_addr) ); radeon_ring_write(ring, 1 ); /* number of DWs to follow */ radeon_ring_write(ring, 0xDEADBEEF ); <-- What it *should* radeon_ring_write(ring, SDMA_PACKET(SDMA_OPCODE_WRITE, write SDMA_WRITE_SUB_OPCODE_LINEAR, 0) ); radeon_ring_write(ring, lower_32_bits(gpu_addr2) ); radeon_ring_write(ring, upper_32_bits(gpu_addr2) ); radeon_ring_write(ring, 1 ); /* number of DWs to follow */ <-- What it writes radeon_ring_write(ring, 0x0BADF00D ); Now it writes... 1 to the first destination?

SDMA: Off-by-four Linear writes from the ring start 4 words too late in the ring IBs work fine, only the ring is broken Workaround: use FILL opcode instead: radeon_ring_write(ring, SDMA_PACKET(SDMA_OPCODE_CONSTANT_FILL, 0, SDMA_CONSTANT_FILL_EXTRA_SIZE(2))); radeon_ring_write(ring, lower_32_bits(gpu_addr2) ); radeon_ring_write(ring, upper_32_bits(gpu_addr2) ); radeon_ring_write(ring, 0xDEADBEEF ); /* Fill value */ radeon_ring_write(ring, 4 ); /* number of bytes */

Still no worky Can't write to pagetable config registers via GPU commands :( Linux uses this to configure pagetables Special register firewall in Liverpool? Security? Workaround by directly writing from CPU, but it sucks Maybe the register firewall is in the firmware?

Speaking of firmware The Command Processor blocks require “microcode” Thus far undocumented We pull the firmware blobs from FreeBSD in ps4-kexec and pass them in initramfs (avoids redistribution issues) Let's dig deeper

Reversing CPU architectures 101 Guess an instruction Try running it See what it did GOTO 1 We can upload custom F32 firmware easily and have it write to scratch regs, then read what it wrote The basic "write to GPU reg" instruction is easy to find from GPU register offsets, in the microcode blobs

f32dis Disassembler for the AMD proprietary ‘F32’ GPU microcode CLEAR_STATE: 5e cc800000 | stw r2, [r0, #0x0] 5f cc400000 | stw r1, [r0, #0x0] 60 cc000016 | stw r0, [r0, #0x16] 61 80000672 | b 0x672 INDEX_BUFFER_SIZE: 62 cc40002d | stw r1, [r0, #0x2d] 63 7c408001 | mov r2, r1 64 88000000 | btab Instruction syntax shamelessly stolen from ARM Not complete, but disassembles all instructions used in Liverpool and Bonaire firmwares for PFP, ME, CE, MEC, RLC

Alas Register blocking not in the firmware It seems it is blocked in hardware, when issued from GFX block (debug registers show an access violation) Haven't found how to turn it off yet 3D does work with the CPU write workaround, though!

Current Checklist ✓ IRQs / Timer

✓ Serial port

✓ Shutdown / reboot

✓ Power button

✓ USB

✓ HDD

✓ Blu-Ray

✓ WiFi

✓ Bluetooth

✓ Ethernet (mostly)

✓ Framebuffer / KMS

✓ HDMI (basic)

✓ 3D (with ugly hack)

✓ S/PDIF audio





✗ HDMI audio

✗ RTC



✓ Blinkenlights



Code github.com/fail0verflow/ps4-kexec kexec and hardware reconfiguration / "bootloader" code github.com/fail0verflow/ps4-linux Kernel tree github.com/fail0verflow/ps4-radeon-patches Userspace library patches github.com/fail0verflow/radeon-tools f32dis and RAI tools