This post is the third part of my series about tracking skips in the Spotify client. It is a direct continuation of my work on the macOS client, first detailed here: https://medium.com/@lerner98/skiptracing-reversing-spotify-app-3a6df367287d.

Hardcoding Addresses

In the previous article, I hooked the target functions using HookCase to track when the skip subprocedure was called. However, there was one big problem with this approach that I didn’t realize at the time.

One day, I decided to see how many skipped songs I had logged. The number seemed low. I then skipped a few songs and printed the count again. It didn’t change. Dammit! Something broke and I had no clue what.

Finding the Problem

Let’s crack open Spotify in IDA and go to our hook addresses as a sanity check. In the previous article, I made a big deal about finding sub_100CC2E20 so let’s go there and see what could’ve gone wrong:

This doesn’t look anything like our next procedure. In fact, 0x100CC2E20 isn’t even on an instruction boundary. This is a big problem.

Going back to the mediaKeyTap method, we find a familiar-looking CFG:

However, there is one big (highlighted) difference. The address of our called procedure has changed from 0x10006FE10 to 0x100069CF0, a difference of -24864 bytes.

Now going to our function with the big switch statement:

We see that it is now located at 0x100067E40, whereas it was previously located at 0x10006DE40: a difference of -24576 bytes.

The closeness of these offsets gives us a clue as to what is going on. My theory is that Spotify occasionally updates itself and on this particular update (or set of updates), around 24000 bytes were removed before our target procedures and a couple hundred more (288, the difference between the two shifts) were removed in between them.
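The arithmetic behind this theory is easy to check. The following is only a sketch using the four addresses quoted above; the helper names addr_shift and gap_change are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: signed distance a function moved between two builds. */
int64_t addr_shift(uint64_t old_addr, uint64_t new_addr)
{
    return (int64_t)(new_addr - old_addr);
}

/* How much the gap between the two functions changed across the update. */
int64_t gap_change(void)
{
    /* Wrapper called from mediaKeyTap: 0x10006FE10 -> 0x100069CF0 */
    int64_t wrapper = addr_shift(0x10006FE10ULL, 0x100069CF0ULL);
    /* Function with the big switch statement: 0x10006DE40 -> 0x100067E40 */
    int64_t handler = addr_shift(0x10006DE40ULL, 0x100067E40ULL);

    assert(wrapper < 0 && handler < 0); /* both moved to lower addresses */
    return wrapper - handler;
}
```

The two shifts come out as -24864 and -24576, so they differ by 288 bytes. Since the switch function sits below the wrapper in memory, the negative sign means the space between them shrank.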

This presents us with a conundrum: how do we find the correct addresses to hook when they could change between runs? The answer is Objective-C.

Objective-C

Objective-C is at heart a dynamic language. You can add methods to a class, change a method’s implementation, and do all sorts of fuckery at runtime. To support this behavior, the class information must be stored in the application binary somewhere. If you run objdump -h on the Spotify binary, you’ll see the following interesting sections:

namely the sections that begin with __objc. The first section we’ll want to take a look at is the __objc_classlist section. While undocumented, this section contains an array of pointers into the __objc_data section where each pointer points to an objc_class struct. We will discuss the layout of the struct later.

Our end goal will be to find the addresses of the unnamed next and previous subprocedures, but our bridge to these addresses will be the mediaKeyTap method because we can always find it with the help of the Objective-C class data.

Resolving Objective-C Methods

The class that responds to the mediaKeyTap:receivedMediaKeyEvent: selector is SPTBrowserClientMacObjCAnnex. Therefore, we can iterate over the objc_class structures pointed to by the __objc_classlist section until the name of the struct is equal to SPTBrowserClientMacObjCAnnex. Let’s get to it.

First, we have to iterate over the __objc_classlist section. But to do that, we need to know where the section is. This information is contained within the Mach-O header (which is why it was revealed with objdump -h).

Parsing the Header

There are plenty of existing articles about the Mach-O file format and the documentation is fairly lucid so I won’t go into too much detail here. All you really need to know is that there are several “segment load commands” contained within the header. A segment load command (LC) simply specifies a region of the file and where to map it into memory.

Directly after the segment LC, there will be a number of section structs. Each section’s extent, both within the file and within virtual memory, is contained within that of its corresponding segment, but the sections offer a more fine-grained mapping.

If you were paying attention before, the __objc_classlist section is contained within the __DATA segment. Therefore, we can find its region in the file like so:

#include <mach-o/loader.h>
#include <stdio.h>
#include <string.h>

FILE *fp;
size_t i, j, curr_off;
struct mach_header_64 header;
struct load_command load_cmd;        /* generic header shared by every load command */
struct segment_command_64 seg_cmd;
struct section_64 sect;
struct section_64 objc_classlist_sect;

/* "rb": read the Mach-O file as raw bytes */
fp = fopen("/Applications/Spotify.app/Contents/MacOS/Spotify", "rb");
fread(&header, sizeof(header), 1, fp);

Here, we are simply setting up some variables and reading in the Mach-O header.

Then, we can iterate over the load commands:

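for (i = 0; i < header.ncmds; i++)
{
    fread(&load_cmd, sizeof(load_cmd), 1, fp);

    /* Skip every load command that is not a 64-bit segment. */
    if (load_cmd.cmd != LC_SEGMENT_64) {
        fseek(fp, load_cmd.cmdsize - sizeof(load_cmd), SEEK_CUR);
        continue;
    }

    /* The generic header is already consumed; read the rest of the segment command. */
    fread((char *)&seg_cmd + sizeof(load_cmd),
          sizeof(seg_cmd) - sizeof(load_cmd),
          1,
          fp);

    /* Skip any other segment, including the section structs that follow it. */
    if (strcmp(seg_cmd.segname, "__DATA")) {
        fseek(fp, load_cmd.cmdsize - sizeof(seg_cmd), SEEK_CUR);
        continue;
    }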

Here, we ignore any LCs that are not segment LCs (there are many different types specified in the ABI). Then, we read in the LC as a segment LC and ignore it if it is not the __DATA segment.

Then, we will iterate over sections in the __DATA segment:

    for (j = 0; j < seg_cmd.nsects; j++)
    {
        fread(&sect, sizeof(sect), 1, fp);

        /* sectname is a fixed 16-byte field, so compare at most 16 bytes */
        if (!strncmp(sect.sectname, "__objc_classlist", 16)) {
            memcpy(&objc_classlist_sect, &sect, sizeof(sect));
            break;
        }
    }
    break;
}

Once we find the section with the correct name, we copy it into our target variable and exit the loop.

Now that we can iterate over the objc_class structs, we need to know how to get the class name and method names for each class. While the Objective-C runtime is open source, I couldn’t find the type declarations corresponding to the version of Objective-C used in the Spotify binary, so you can declare the types like so:

The fields are of type uint64_t instead of pointers because they are used as offsets into the file. The __DATA segment could be mmap'd and then the values treated as pointers, but this leads to complications when mmap is unable to allocate the segment at its original address.

Anyways, the data field of objc_class “points” to an objc_class_data structure. This structure contains both the name of the class and a base_methods “pointer” to the methods defined for this class. The method list consists of an objc_methodlist struct followed by objc_methodlist.count objc_method structures. Each objc_method struct will tell us the method name and its imp pointer (and its type signature, but we don’t really care about that).
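To make the layout concrete, here is a sketch of what such declarations could look like. The names data, name, base_methods, count and imp come from the description above; every other field name is my own guess, and the layout as a whole is an assumption modeled on the 64-bit class_ro_t, method_list_t and method_t structures in Apple's open-source objc4 runtime. Verify it against the binary you are actually parsing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All "pointers" are stored as plain 64-bit integers (see the text above). */
struct objc_class {
    uint64_t isa;
    uint64_t superclass;
    uint64_t cache;
    uint64_t vtable;
    uint64_t data;             /* "points" to an objc_class_data */
};

struct objc_class_data {
    uint32_t flags;
    uint32_t instance_start;
    uint32_t instance_size;
    uint32_t reserved;
    uint64_t ivar_layout;
    uint64_t name;             /* "points" to the class name (C string) */
    uint64_t base_methods;     /* "points" to an objc_methodlist */
    uint64_t base_protocols;
    uint64_t ivars;
    uint64_t weak_ivar_layout;
    uint64_t base_properties;
};

struct objc_methodlist {
    uint32_t entsize;          /* size of one objc_method entry */
    uint32_t count;            /* number of objc_method entries that follow */
};

struct objc_method {
    uint64_t name;             /* "points" to the selector string */
    uint64_t types;            /* "points" to the type signature */
    uint64_t imp;              /* address of the implementation */
};

/* Catch accidental padding: these sizes follow from the field widths above. */
static_assert(sizeof(struct objc_class) == 40, "unexpected objc_class size");
static_assert(sizeof(struct objc_class_data) == 72, "unexpected objc_class_data size");
static_assert(sizeof(struct objc_method) == 24, "unexpected objc_method size");
```

Two caveats apply, as far as I know: the runtime may keep flag bits in the low bits of the data field, so masking them off before using it as an address is prudent, and newer toolchains can emit a more compact method list encoding that this sketch does not handle.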

I’ll link to the code later but it’s a straightforward extension of the previous code listings to iterate through the classes to find our SPTBrowserClientMacObjCAnnex class and iterate through the class’s methods to find the mediaKeyTap:receivedMediaKeyEvent: selector.
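Until then, here is a rough sketch of the method-list half of that search. It is not the code from the repository: the helper find_imp, its parameters, and the idea of working on an in-memory copy of the file are all assumptions made for illustration, as are the two struct layouts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed on-disk layouts of the method list header and its entries. */
struct objc_methodlist { uint32_t entsize; uint32_t count; };
struct objc_method     { uint64_t name; uint64_t types; uint64_t imp; };

/*
 * Hypothetical helper: scan the method list at file offset list_off inside
 * an in-memory copy of the file and return the imp of the method named sel,
 * or 0 if there is no such method. Stored addresses are turned into file
 * offsets by subtracting vm_base.
 */
uint64_t find_imp(const uint8_t *file, size_t file_len, uint64_t list_off,
                  uint64_t vm_base, const char *sel)
{
    struct objc_methodlist list;
    struct objc_method meth;
    uint64_t off, name_off;
    uint32_t k;

    assert(file != NULL && sel != NULL);
    if (list_off > file_len || file_len - list_off < sizeof(list))
        return 0;
    memcpy(&list, file + list_off, sizeof(list));

    off = list_off + sizeof(list);
    for (k = 0; k < list.count; k++, off += sizeof(meth)) {
        if (off > file_len || file_len - off < sizeof(meth))
            return 0;                      /* list runs past the buffer */
        memcpy(&meth, file + off, sizeof(meth));

        name_off = meth.name - vm_base;
        if (name_off >= file_len)
            continue;                      /* name lies outside the buffer */
        /* Only compare if the name is NUL-terminated inside the buffer. */
        if (memchr(file + name_off, '\0', file_len - name_off) != NULL &&
            strcmp((const char *)file + name_off, sel) == 0)
            return meth.imp;
    }
    return 0;
}
```

The class half works the same way: walk the pointers in __objc_classlist, follow each class's data field, and compare the string at name against SPTBrowserClientMacObjCAnnex before handing base_methods to a helper like this one.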

Finding the Call

Assuming we have the imp pointer for our mediaKeyTap: method, we can then use the Capstone library to disassemble the function and find the call to the media key tap handler:

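#include <capstone/capstone.h>
#include <stdlib.h>
#include <string.h>

uint8_t code[500];
size_t start_addr, i, insn_count;
uint64_t target_addr;
csh handle;
cs_insn *insn;

/* Drop the high bits of the address to get a file offset
   (assumes __TEXT maps file offset 0 at the image base). */
fseek(fp, (meth->imp) & 0xffffff, SEEK_SET);
fread(code, 1, 500, fp);

cs_open(CS_ARCH_X86, CS_MODE_64, &handle);

start_addr = meth->imp;
/* Disassemble at the method's real address so that call targets come out
   absolute; a count of 0 asks for as many instructions as can be decoded. */
insn_count = cs_disasm(handle,
                       code,
                       500,
                       start_addr,
                       0,
                       &insn);

for (i = 0; i < insn_count; i++)
{
    /* Skip everything until the "next" case: mov esi, 3 */
    if (strcmp(insn[i].mnemonic, "mov") || strcmp(insn[i].op_str, "esi, 3"))
        continue;

    /* The call we want is two instructions later. */
    target_addr = strtoll(insn[i+2].op_str, NULL, 16);
    break;
}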

We look for the “next” case (that is, mov esi, 3) since if you look at the disassembly:

this case actually comes first in memory. We then take the instruction two after the mov esi, 3 instruction to find our target call.

Remember that this subprocedure is actually a wrapper for our final target, so we have to perform the same disassembly procedure on the following function:

Taking note of the highlight, after checking some conditions, this function jumps to our final destination five instructions after preparing register esi with the contents of register r14d. Therefore, we can do something like:

...
if (strcmp(insn[i].mnemonic, "mov") || strcmp(insn[i].op_str, "esi, r14d"))
    continue;

*reloc_addr = insn[i+5].address + 1;
*reloc_pc = insn[i+5].address + insn[i+5].size;
target_addr = strtoll(insn[i+5].op_str, NULL, 16);
...

Here, when we find our sentinel instruction, mov esi, r14d, in addition to setting our target address (the address of the function with the large switch statement), we set two additional variables: reloc_addr and reloc_pc.

To understand these two variables, we first need to cover how we will automate the hooking process.

Automatic Hooking

Normally, the control flow from our media key handler wrapper will look like:

However, we will patch instructions to make it look like:

The redirect to “my MK Handler” will be done by patching the jmp sub_100067E40 instruction in the Wrapper to actually be jmp &new MK Handler.

Since jmp tells the CPU to set the new program counter (PC) relative to where it currently is, we need to know what the program counter will be once this instruction has been decoded. This is where the variable reloc_pc comes into play. We set it to insn[i+5].address + insn[i+5].size, the address of the instruction that follows the jmp, because that is the value the offset is added to.

We also need to know the address of the relative offset in the jmp instruction in order to patch it. Since the jmp opcode is only one byte, we set reloc_addr to insn[i+5].address+1.
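The relationship between the jump's address, its offset field, and its destination can be sketched on a plain byte buffer. This is an illustration rather than the patching code itself: the helpers jmp_target and retarget_jmp are hypothetical, and the sketch assumes the five-byte E9 rel32 form of jmp and a little-endian host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { JMP_REL32_OPCODE = 0xE9, JMP_REL32_SIZE = 5 };

/* Decode the absolute destination of a `jmp rel32` located at insn_addr. */
uint64_t jmp_target(const uint8_t *insn, uint64_t insn_addr)
{
    int32_t rel;

    assert(insn[0] == JMP_REL32_OPCODE);
    memcpy(&rel, insn + 1, sizeof(rel));   /* offset field starts one byte in */
    return insn_addr + JMP_REL32_SIZE + (int64_t)rel;
}

/* Rewrite the offset field so that the same jmp lands on new_target. */
void retarget_jmp(uint8_t *insn, uint64_t insn_addr, uint64_t new_target)
{
    uint64_t reloc_pc = insn_addr + JMP_REL32_SIZE; /* next instruction */
    int64_t delta = (int64_t)(new_target - reloc_pc);
    int32_t rel = (int32_t)delta;

    assert(insn[0] == JMP_REL32_OPCODE);
    assert(delta == (int64_t)rel);         /* destination must be within rel32 range */
    memcpy(insn + 1, &rel, sizeof(rel));
}
```

In these terms, reloc_addr is insn_addr + 1 and reloc_pc is insn_addr + 5. The second assertion reflects a real constraint: a rel32 jump can only reach about 2 GB in either direction, so the replacement handler has to be loaded close enough to the patched instruction.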

Patching the jmp

Now that we have the PC after the jump and the address of the offset, we can actually patch the instruction to jump to our own code.

To do this, we will create a dylib and insert an LC_LOAD_DYLIB LC into the Mach-O load commands much like in my iOS post: https://medium.com/@lerner98/skiptracing-part-2-ios-3c610205858b.

Assume in the library constructor, we called our resolve function and got three pieces of information:

mkHandler, the address of Spotify’s media key handler function

mk_reloc_addr, the address of the offset in the jump to mkHandler

mk_reloc_pc, the PC after the aforementioned jump

We now have to adjust the memory protections for the bytes we wish to write, since the __TEXT segment initially has only RX permissions. Thankfully, the max protections specified in the binary are RWX (and if they were not, we could patch that as well). Let’s do this:

uint64_t prot_mask;
uint64_t prot_addr;
size_t prot_size;

prot_mask = ~((1ULL << 12) - 1); // since page size is 4k
prot_addr = mk_reloc_addr & prot_mask;
prot_size = 4 + mk_reloc_addr - prot_addr;

mprotect((void *)prot_addr, prot_size, PROT_WRITE);
/* new offset = destination - address of the instruction after the jmp */
*(int32_t *)mk_reloc_addr = (int32_t)((int64_t)(&new_mkHandler) - (int64_t)mk_reloc_pc);
mprotect((void *)prot_addr, prot_size, PROT_READ | PROT_EXEC);

where new_mkHandler is defined as:

void new_mkHandler(void ***appDelegate, int32_t keyCode);

Note that we have to mask off the lower twelve bits of mk_reloc_addr since mprotect requires that the address we pass be page-aligned. We then need to adjust the size of our protected region from four bytes (since the jump offset is 32 bits) to account for the difference between mk_reloc_addr and prot_addr.
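The same alignment arithmetic can be pulled out into a small helper. This is a sketch, not part of the original code: struct page_span and page_span_for are names I made up, and the page size is passed in rather than hardcoded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: a region that starts on a page boundary. */
struct page_span {
    uint64_t addr;   /* page-aligned start address */
    size_t size;     /* bytes from addr to the end of the target range */
};

/*
 * Compute the region to pass to mprotect so that it starts on a page
 * boundary and still covers [addr, addr + len).
 * page_size must be a power of two.
 */
struct page_span page_span_for(uint64_t addr, size_t len, uint64_t page_size)
{
    struct page_span span;

    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    span.addr = addr & ~(page_size - 1);          /* mask off the low bits */
    span.size = (size_t)(addr - span.addr) + len; /* distance into page + len */
    return span;
}
```

With len set to 4, this yields the same prot_addr and prot_size as the listing above. If the four bytes happen to straddle a page boundary, the computed size exceeds one page, and mprotect then changes every page that contains part of the range. Rather than assuming 4k, the page size could be obtained from sysconf(_SC_PAGESIZE).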

Let’s put some dummy code in new_mkHandler just to see if we hit it:

void new_mkHandler(void ***appDelegate, int32_t keyCode)
{
    printf("here!\n");
    exit(69);
}

To load our dylib, we can use the code from Part 2 to insert an LC_LOAD_DYLIB command into the Mach-O file.

If we do this and run the Spotify application, then sure enough we should see our print statement (if running through the command line) and we should have a nice exit code.

Overwriting the Function Pointers

Calling our own prev and next subprocedures instead of Spotify’s will require some ingenuity. To see why, let’s take a look at the “next” case in the MK handler switch statement:

Note that in the beginning of the function, the app delegate pointer (in rdi) is moved into register r13. The code then dereferences the app delegate twice and moves the function pointer at offset 0x58 of the resulting struct into register r12. This register holds the address of the next subprocedure, which is called at the bottom of the listing.

Looking through the rest of MK handler, we can see that the offset 0x58 into the rax struct is only referenced here, so we can safely overwrite the function pointer at that address so the address of our own next subprocedure will be loaded into r12 and subsequently called.

If we look at the “prev” case, we can see that the exact same steps are taken, except that the function pointer is located at offset 0x50 in the rax struct. Because we can make no assumptions about the address of the rax struct before the MK handler is called, we overwrite these function pointers at call time, inside our new_mkHandler.

The code will look something like this:

typedef void prev_next_func_t(int64_t, int64_t, int64_t);

uint64_t prot_mask;
uint64_t prot_addr;
size_t prot_size;
uint64_t fp_off;
uint64_t *fp;

prot_mask = ~((1ULL << 12) - 1); // since page size is 4k

/* prev sits at offset 0x50 of the doubly dereferenced app delegate, next at 0x58 */
fp_off = (uint64_t)(**appDelegate) + 0x50;
fp = (uint64_t *)fp_off;

/* Save Spotify's originals so our handlers can still call them. */
prevHandler = (prev_next_func_t *)(*fp);
nextHandler = (prev_next_func_t *)(*(fp+1));

/* 16 bytes: two adjacent 8-byte function pointers */
prot_addr = fp_off & prot_mask;
prot_size = 16 + fp_off - prot_addr;

mprotect((void *)prot_addr, prot_size, PROT_WRITE);
*fp = (uint64_t)(&new_prevHandler);
*(fp+1) = (uint64_t)(&new_nextHandler);
mprotect((void *)prot_addr, prot_size, PROT_READ | PROT_EXEC);

where new_prevHandler and new_nextHandler are defined as:

void new_prevHandler(int64_t p1, int64_t p2, int64_t p3);

void new_nextHandler(int64_t p1, int64_t p2, int64_t p3);

It can be seen from the disassembly that the prev and next subprocedures take three 64-bit parameters but we don’t really need to know what they are.

One gotcha is that we should only overwrite the function pointers once. To see why, think about what will happen the second time new_mkHandler is called. We set prevHandler to the first function pointer. However, we have already overwritten this function pointer with &new_prevHandler. Therefore, when we want to actually go to the previous track in new_prevHandler and call (*prevHandler)(p1, p2, p3), we will actually be calling new_prevHandler and will eventually overflow the stack.

Therefore, we add a simple guard at the beginning to check if we have already overwritten the handlers:

if (gHandlersSet)
    goto call_original;

...
overwrite function pointers
...

gHandlersSet = 1;

call_original:
(*mkHandler)(appDelegate, keyCode);

Now in new_prevHandler and new_nextHandler, all we have to do is push/pop a skip when appropriate and call (*prevHandler)(p1, p2, p3) or (*nextHandler)(p1, p2, p3).
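The save-once, then forward pattern can be tried out in isolation. The sketch below is a simulation under assumptions, not the code from the repository: struct handler_slots stands in for the two adjacent function pointers at offsets 0x50 and 0x58, original_prev and original_next stand in for Spotify's subprocedures, and a plain counter replaces the real skip log.

```c
#include <assert.h>
#include <stdint.h>

typedef void prev_next_func_t(int64_t, int64_t, int64_t);

/* Stand-in for the two adjacent function pointers in Spotify's struct. */
struct handler_slots { prev_next_func_t *prev; prev_next_func_t *next; };

prev_next_func_t *prevHandler, *nextHandler; /* saved originals */
int gHandlersSet;
int gSkipCount;          /* hypothetical skip log: just a counter here */
int gOriginalNextCalls;  /* how often the stand-in original was reached */

/* Stand-ins for Spotify's own subprocedures. */
void original_prev(int64_t a, int64_t b, int64_t c) { (void)a; (void)b; (void)c; }
void original_next(int64_t a, int64_t b, int64_t c)
{
    (void)a; (void)b; (void)c;
    gOriginalNextCalls++;
}

void new_nextHandler(int64_t p1, int64_t p2, int64_t p3)
{
    gSkipCount++;                   /* record ("push") the skip ... */
    (*nextHandler)(p1, p2, p3);     /* ... then let the original do the work */
}

void new_prevHandler(int64_t p1, int64_t p2, int64_t p3)
{
    if (gSkipCount > 0)
        gSkipCount--;               /* undo ("pop") the most recent skip */
    (*prevHandler)(p1, p2, p3);
}

/* Swap in our handlers exactly once, remembering the originals. */
void install_handlers(struct handler_slots *slots)
{
    if (gHandlersSet)
        return;                     /* originals already saved; do not clobber them */
    prevHandler = slots->prev;
    nextHandler = slots->next;
    slots->prev = new_prevHandler;
    slots->next = new_nextHandler;
    gHandlersSet = 1;
}
```

Calling install_handlers twice is harmless here. Without the gHandlersSet check, the second call would store new_nextHandler into nextHandler, and the first skip would recurse until the stack overflowed.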

Wrapping Up

All that’s left to do is get the current track and player position. Since these are exposed by Objective-C methods, we can use the functionality of the Objective-C runtime to call the appropriate functions without any patching, much like in Part 2.

Here’s the link to the final repository, which I’ve refactored to include the macOS and iOS code: https://github.com/SamL98/SPSkip.

I hope you enjoyed this exploration in patching and automated reverse engineering — I sure did!