22 Apr 2017

I started to reverse engineer APFS and want to share what I found out so far. You can send me feedback and ideas on this post via Twitter.

Notice: I created a test image with macOS Sierra 10.12.3 (16D32). All results are guesses and the reverse engineering is work in progress. Also newer versions of APFS might change structures. The information below is neither complete nor proven to be correct.

Update 2017-04-30: Added a section for the checksum

Update 2017-06-16: Added apfs.ksy repository

Overview

APFS is structured in a single container that can contain multiple APFS volumes. A container needs to be >512 MB to contain more than one volume, >1024 MB to contain more than two volumes and so on. The following image shows an overview of the APFS structure.

Each element of this structure (except for the allocation file) starts with a 32-byte block header, which contains some general information about the block. The body of the structure follows the header. The following types exist:

0x01 : Container Superblock

0x02 : Node

0x05 : Spacemanager

0x07 : Allocation Info File

0x11 : Unknown

0x0B : B-Tree

0x0C : Checkpoint

0x0D : Volume Superblock

Each of these structures is described in detail below. A more detailed version of the APFS structure is available as a Kaitai Struct file: apfs.ksy. You can use it to examine APFS dumps in the Kaitai IDE or create parsers for various languages. This .ksy file must be considered experimental.

General information:

The filesystem uses little-endian values for storing information.

Timestamps are 64-bit nanoseconds (1 / 1,000,000,000 seconds!) starting from 1.1.1970 UTC (Unix epoch). The current timestamp is around 0x14b11800f375e000.

Standard block size seems to be 4096 bytes per block.
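To illustrate the timestamp format, here is a minimal Go sketch that turns a raw 64-bit value into a calendar time. The helper name apfsTime is made up for this example, and it assumes the value fits into a signed 64-bit integer.

```go
package main

import (
	"fmt"
	"time"
)

// apfsTime (hypothetical helper) interprets a raw timestamp as
// nanoseconds since the Unix epoch and returns it as a UTC time.
func apfsTime(raw uint64) time.Time {
	// time.Unix accepts a nanosecond count outside [0, 999999999]
	// and normalises it. Assumption: raw fits into an int64.
	return time.Unix(0, int64(raw)).UTC()
}

func main() {
	// Example value quoted in the list above.
	fmt.Println(apfsTime(0x14b11800f375e000).Format(time.RFC3339))
}
```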

APFS is a copy-on-write filesystem: each block is copied before changes are applied, so a history of all files and filesystem structures that have not been overwritten yet exists. This might result in a huge amount of forensic artefacts.

Structures

Block header

Each filesystem structure in APFS starts with a block header. This header starts with a checksum for the whole block. Other information in the header includes the copy-on-write version of the block, the block ID and the block type.

pos  size  type    id
0    8     uint64  checksum
8    8     uint64  block_id
16   8     uint64  version
24   2     uint16  block_type
26   2     uint16  flags
28   4     uint32  padding
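The header layout above can be decoded with a few little-endian reads. The following Go sketch is only an illustration: the type BlockHeader and the helper parseBlockHeader are names invented here, and the sample block in main is synthetic.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// BlockHeader mirrors the 32-byte header from the table above.
type BlockHeader struct {
	Checksum  uint64
	BlockID   uint64
	Version   uint64
	BlockType uint16
	Flags     uint16
	Padding   uint32
}

// parseBlockHeader (hypothetical helper) decodes the first 32 bytes of a block.
func parseBlockHeader(block []byte) (BlockHeader, error) {
	if len(block) < 32 {
		return BlockHeader{}, errors.New("block shorter than the 32-byte header")
	}
	le := binary.LittleEndian
	return BlockHeader{
		Checksum:  le.Uint64(block[0:8]),
		BlockID:   le.Uint64(block[8:16]),
		Version:   le.Uint64(block[16:24]),
		BlockType: le.Uint16(block[24:26]),
		Flags:     le.Uint16(block[26:28]),
		Padding:   le.Uint32(block[28:32]),
	}, nil
}

func main() {
	// Synthetic block: only the header fields used below are filled in.
	block := make([]byte, 4096)
	binary.LittleEndian.PutUint64(block[8:], 0x402)
	binary.LittleEndian.PutUint64(block[16:], 7)
	binary.LittleEndian.PutUint16(block[24:], 0x0D)

	h, err := parseBlockHeader(block)
	if err != nil {
		panic(err)
	}
	fmt.Printf("id=%#x version=%d type=%#x\n", h.BlockID, h.Version, h.BlockType)
}
```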

Checksum

According to the Apple docs, Fletcher's checksum algorithm is used. Apple uses a variant of the algorithm described in a paper by John Kodis. The following function shows the procedure. The input is the block without its first 8 bytes.

    import "encoding/binary"

    // createChecksum computes the checksum over data, i.e. the block
    // without its first 8 bytes.
    func createChecksum(data []byte) uint64 {
        var sum1, sum2 uint64
        modValue := uint64(2<<31 - 1) // 2^32 - 1; in Go << binds tighter than -
        for i := 0; i < len(data)/4; i++ {
            d := binary.LittleEndian.Uint32(data[i*4 : (i+1)*4])
            sum1 = (sum1 + uint64(d)) % modValue
            sum2 = (sum2 + sum1) % modValue
        }
        check1 := modValue - ((sum1 + sum2) % modValue)
        check2 := modValue - ((sum1 + check1) % modValue)
        return (check2 << 32) | check1
    }

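A nice feature of the algorithm is that a block can be checked without comparing values: run the same sums over the block body, feed the stored checksum (the first 8 bytes) last, and you should get zero as a result. Note that the input in this case is the whole block, including the checksum. The order matters because the checksum sits at the start of the block rather than at the end; summing the block front to back only brings sum1 to zero.

    // checkChecksum takes a whole block, including its checksum,
    // and returns 0 if the block is valid.
    func checkChecksum(data []byte) uint64 {
        var sum1, sum2 uint64
        modValue := uint64(2<<31 - 1) // 2^32 - 1
        // Body first, stored checksum (first 8 bytes) last.
        ordered := append(append([]byte{}, data[8:]...), data[:8]...)
        for i := 0; i < len(ordered)/4; i++ {
            d := binary.LittleEndian.Uint32(ordered[i*4 : (i+1)*4])
            sum1 = (sum1 + uint64(d)) % modValue
            sum2 = (sum2 + sum1) % modValue
        }
        return (sum2 << 32) | sum1
    }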
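As a small round-trip illustration, a block can also be checked by recomputing the checksum over everything after the first 8 bytes and comparing it with the stored value. The sketch below assumes a synthetic 4096-byte block; the names fletcher64, stamp and verify are invented for this example.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const mod = uint64(1)<<32 - 1 // 2^32 - 1

// fletcher64 repeats the checksum routine from the text on an arbitrary buffer.
func fletcher64(data []byte) uint64 {
	var sum1, sum2 uint64
	for i := 0; i+4 <= len(data); i += 4 {
		sum1 = (sum1 + uint64(binary.LittleEndian.Uint32(data[i:i+4]))) % mod
		sum2 = (sum2 + sum1) % mod
	}
	check1 := mod - (sum1+sum2)%mod
	check2 := mod - (sum1+check1)%mod
	return check2<<32 | check1
}

// stamp writes the checksum of block[8:] into the first 8 bytes.
func stamp(block []byte) {
	binary.LittleEndian.PutUint64(block[:8], fletcher64(block[8:]))
}

// verify recomputes the checksum and compares it with the stored value.
func verify(block []byte) bool {
	return len(block) >= 8 &&
		binary.LittleEndian.Uint64(block[:8]) == fletcher64(block[8:])
}

func main() {
	block := make([]byte, 4096) // synthetic block, not a real dump
	copy(block[32:], "NXSB")
	stamp(block)
	fmt.Println("fresh block valid:", verify(block))

	block[100] ^= 0xff // simulate corruption
	fmt.Println("corrupted block valid:", verify(block))
}
```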

Container Superblock

The container superblock is the entry point to the filesystem. Because of the structure with containers and flexible volumes, allocation needs to be handled on a container level. The container superblock contains information on the block size, the number of blocks and pointers to the spacemanager for this task. Additionally the block IDs of all volumes are stored in the superblock. To map block IDs to block offsets, a pointer to a block map B-tree is stored. This B-tree contains entries for each volume with its ID and offset.

pos  size  type    id
0    4     byte    magic “NXSB”
4    4     uint32  blocksize
8    8     uint64  totalblocks
40   16    byte    guid
56   8     uint64  next_free_block_id
64   8     uint64  next_version
104  4     uint32  previous_containersuperblock_block
120  8     uint64  spaceman_id
128  8     uint64  block_map_block
136  8     uint64  unknown_id
144  4     uint32  padding2
148  4     uint32  apfs_count
152  8     uint64  offset_apfs (repeat apfs_count times)
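A rough Go sketch of reading some of these fields, including the repeated offset_apfs entries, could look as follows. It assumes that the positions in the table are relative to the body, which starts after the 32-byte block header. The type and function names are invented for this example, and the block in main is synthetic.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const headerSize = 32 // assumption: table positions start after the block header

// ContainerSuperblock holds a subset of the fields from the table above.
type ContainerSuperblock struct {
	BlockSize   uint32
	TotalBlocks uint64
	SpacemanID  uint64
	BlockMap    uint64
	VolumeIDs   []uint64 // offset_apfs, repeated apfs_count times
}

func parseContainerSuperblock(block []byte) (*ContainerSuperblock, error) {
	if len(block) < headerSize+152 {
		return nil, errors.New("block too short for a container superblock")
	}
	body := block[headerSize:]
	if string(body[0:4]) != "NXSB" {
		return nil, errors.New("missing NXSB magic")
	}
	le := binary.LittleEndian
	sb := &ContainerSuperblock{
		BlockSize:   le.Uint32(body[4:8]),
		TotalBlocks: le.Uint64(body[8:16]),
		SpacemanID:  le.Uint64(body[120:128]),
		BlockMap:    le.Uint64(body[128:136]),
	}
	// Guard against a count that would run past the end of the block.
	count := le.Uint32(body[148:152])
	if uint64(count) > uint64(len(body)-152)/8 {
		return nil, errors.New("apfs_count does not fit into the block")
	}
	for i := 0; i < int(count); i++ {
		off := 152 + i*8
		sb.VolumeIDs = append(sb.VolumeIDs, le.Uint64(body[off:off+8]))
	}
	return sb, nil
}

func main() {
	block := make([]byte, 4096) // synthetic block, not a real dump
	le := binary.LittleEndian
	copy(block[headerSize:], "NXSB")
	le.PutUint32(block[headerSize+4:], 4096)
	le.PutUint32(block[headerSize+148:], 1)
	le.PutUint64(block[headerSize+152:], 0x402)

	sb, err := parseContainerSuperblock(block)
	if err != nil {
		panic(err)
	}
	fmt.Printf("blocksize=%d volumes=%#x\n", sb.BlockSize, sb.VolumeIDs)
}
```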

Node

Nodes are flexible containers that are used for storing different kinds of entries. They can be part of a B-tree or exist on their own. Nodes can contain either flexible or fixed-size entries. A node starts with a list of pointers to the entry keys and entry records. This way, for each entry the node contains an entry header at the beginning of the node, an entry key in the middle of the node and an entry record at the end of the node.

pos  size  type    id
0    4     uint32  alignment
4    4     uint32  entry_count
10   2     uint16  head_size
16   8     entry   meta_entry
24   …     entry   entries (repeat entry_count times)

Spacemanager

The spacemanager (sometimes called spaceman) is used to manage allocated blocks in the APFS container. The number of free blocks and a pointer to the allocation info file(s?) are stored here.

pos  size  type    id
0    4     uint32  blocksize
16   8     uint64  totalblocks
40   8     uint64  freeblocks
144  8     uint64  prev_allocationinfofile_block
352  8     uint64  allocationinfofile_block

Allocation Info File

The allocation info file works as a missing header for the allocation file. The allocation file's length, version and offset are stored here.

pos  size  type    id
4    4     uint32  alloc_file_length
8    4     uint32  alloc_file_version
24   4     uint32  total_blocks
28   4     uint32  free_blocks
32   4     uint32  allocationfile_block

Unknown

The structure with type 0x11 is quite empty and seems to be related to the spacemanager as it occurs adjacent to it. Its purpose is unknown.

B-Tree

B-trees manage multiple nodes. They contain the offset of the root node.

pos  size  type    id
16   8     uint64  root

Checkpoint

A checkpoint structure exists for every container superblock. But I have no clue what it is good for.

Volume Superblock

A volume superblock exists for each volume in the filesystem. It contains the name of the volume, an ID and a timestamp. Similarly to the container superblock, it contains a pointer to a block map which maps block IDs to block offsets. Additionally, a pointer to the root directory, which is stored as a node, is stored in the volume superblock.

pos  size  type        id
0    4     byte        magic “APSB”
96   8     uint64      block_map
104  8     uint64      root_dir_id
112  8     uint64      pointer3
120  8     uint64      pointer4
208  16    byte        guid
224  8     uint64      time1
272  8     uint64      time2
672  8     str(ASCII)  name
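The following Go sketch pulls the name, the root directory ID and the first timestamp out of a volume superblock. It rests on several assumptions that are not confirmed above: positions are taken relative to the body after the 32-byte block header, the name is read up to the first NUL byte, and time1 uses the nanosecond format described under general information. VolumeInfo and parseVolumeSuperblock are invented names.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"time"
)

const headerSize = 32 // assumption: table positions start after the block header

// VolumeInfo holds a subset of the fields from the table above.
type VolumeInfo struct {
	Name      string
	RootDirID uint64
	Time1     time.Time
}

func parseVolumeSuperblock(block []byte) (*VolumeInfo, error) {
	if len(block) < headerSize+672 {
		return nil, errors.New("block too short for a volume superblock")
	}
	body := block[headerSize:]
	if string(body[0:4]) != "APSB" {
		return nil, errors.New("missing APSB magic")
	}
	le := binary.LittleEndian

	// Assumption: the name ends at the first NUL byte.
	name := body[672:]
	if i := bytes.IndexByte(name, 0); i >= 0 {
		name = name[:i]
	}
	return &VolumeInfo{
		Name:      string(name),
		RootDirID: le.Uint64(body[104:112]),
		Time1:     time.Unix(0, int64(le.Uint64(body[224:232]))).UTC(),
	}, nil
}

func main() {
	block := make([]byte, 4096) // synthetic block, not a real dump
	copy(block[headerSize:], "APSB")
	binary.LittleEndian.PutUint64(block[headerSize+104:], 2)
	binary.LittleEndian.PutUint64(block[headerSize+224:], 0x14b11800f375e000)
	copy(block[headerSize+672:], "Test")

	v, err := parseVolumeSuperblock(block)
	if err != nil {
		panic(err)
	}
	fmt.Println(v.Name, v.RootDirID, v.Time1.Format(time.RFC3339))
}
```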

Allocation File

Allocation files are simple bitmaps. They do not have a block header and therefore no type id.
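Looking up a block in such a bitmap is a single bit test. The Go sketch below is a guess at how that could work: it assumes that a set bit means "allocated" and that bits are counted from the least significant bit of each byte, neither of which is verified here. The helper names are invented.

```go
package main

import (
	"fmt"
	"math/bits"
)

// isAllocated (hypothetical helper) reports whether bit n is set.
// Assumptions: set bit = allocated block, least significant bit first.
func isAllocated(bitmap []byte, n uint64) (bool, error) {
	if n/8 >= uint64(len(bitmap)) {
		return false, fmt.Errorf("block %d is outside the bitmap", n)
	}
	return bitmap[n/8]&(1<<(n%8)) != 0, nil
}

// countAllocated counts all set bits in the bitmap.
func countAllocated(bitmap []byte) int {
	total := 0
	for _, b := range bitmap {
		total += bits.OnesCount8(b)
	}
	return total
}

func main() {
	bitmap := []byte{0x05, 0x80} // bits 0, 2 and 15 set
	for _, n := range []uint64{0, 1, 2, 15} {
		used, _ := isAllocated(bitmap, n)
		fmt.Printf("block %d allocated: %v\n", n, used)
	}
	fmt.Println("allocated blocks:", countAllocated(bitmap))
}
```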