I had to pull out all the stops. I went deep into my bag of tricks and recalled Dr. Tom Murphy VII’s SIGBOVIK entry: a human-readable, fully printable text file that was also a fully functioning DOS executable (no compilation needed).

That’s to say, his human-readable text file consisted of just the right selection of bytes that corresponded to just the right machine code instructions to run a program. All of those bytes are considered printable ( \r , \n , and 0x20 - 0x7E ).

I decided to do the same.

I hadn’t fully read Murphy’s paper, and I considered that half of the fun.

Needing to know exactly what kinds of instructions I could use (can I return? push? pop? add? XOR?), I pulled up my favorite x86_64 reference and ran some handy JavaScript:

JSON.stringify([].slice.call(document.querySelectorAll('table.ref_table tbody tr')).map(row => [row.children[0].innerHTML, row.children[8]?row.children[8].innerHTML:null, row.children[21]?row.children[21].innerHTML:null]))

which produced a JSON string of just the opcodes and their descriptions. I pulled that into a local Node REPL and ran a quick filter on whether each opcode’s corresponding byte was printable.
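That filtering step might look roughly like the sketch below. The row shape ( [opcode hex, mnemonic, description] ) and the sample rows are assumptions made up for illustration, not the actual scraped data; only the printable range is taken from the text above.

```javascript
// Printable bytes as defined earlier: \r (0x0D), \n (0x0A), and 0x20 through 0x7E.
function isPrintable(byte) {
  return byte === 0x0d || byte === 0x0a || (byte >= 0x20 && byte <= 0x7e);
}

// Hypothetical sample rows shaped like [opcodeHex, mnemonic, description].
const sampleRows = [
  ["50", "PUSH", "Push a register onto the stack"],
  ["58", "POP", "Pop a value from the stack"],
  ["C3", "RETN", "Return from procedure"],
  ["79", "JNS", "Jump short if not sign"],
];

// Keep only the rows whose opcode byte is printable.
function printableOpcodes(rows) {
  return rows.filter(([hex]) => isPrintable(parseInt(hex, 16)));
}

console.log(printableOpcodes(sampleRows).map(([hex, name]) => `${hex} ${name}`));
```

With these sample rows, the C3 entry is the one that gets dropped, which matches the trouble with ret described later on.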

I realized two things: opcodes aren’t necessarily the only bytes in the instruction (the bad news), but there are a lot of things you can do with printable opcodes (the good news).

Needing to test these different instructions, I set up a quick little command line tool that would show me the resulting machine code bytes along with their printable text, just so I could experiment.

$ (fasm /dev/stdin /dev/stderr > /dev/null) 2>&1 <<< "use64^Mpush rax" | xxd

00000000: 50 P

Beautiful. The first test was promising; pushing the value in the RAX register onto the stack was a single printable byte P . This was going better than I expected.

I experimented with a few more instructions ( use64^M was removed from fasm’s input string for brevity):

$ (fasm /dev/stdin /dev/stderr >/dev/null) 2>&1 <<< "push rdi" | xxd

00000000: 57 W

$ (fasm /dev/stdin /dev/stderr >/dev/null) 2>&1 <<< "push rcx" | xxd

00000000: 51 Q

$ (fasm /dev/stdin /dev/stderr >/dev/null) 2>&1 <<< "pop rax" | xxd

00000000: 58 X

$ (fasm /dev/stdin /dev/stderr >/dev/null) 2>&1 <<< "pop rcx" | xxd

00000000: 59 Y

$ (fasm /dev/stdin /dev/stderr >/dev/null) 2>&1 <<< "pop rdi" | xxd

00000000: 5f _

Too easy. After about 3 hours of toying around with different instructions, I ended up finding a few that were quite useful:

$ (fasm /dev/stdin /dev/stderr >/dev/null) 2>&1 <<< "sub eax, dword 0x55555555" | xxd

00000000: 2d55 5555 55 -UUUU

$ (fasm /dev/stdin /dev/stderr >/dev/null) 2>&1 <<< "xor [eax], edi" | xxd

00000000: 6731 38 g18

$ (fasm /dev/stdin /dev/stderr >/dev/null) 2>&1 <<< "jns 113" | xxd

00000000: 796f yo

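Yes, that’s right. A short jump-if-not-sign (relative to the current position) whose displacement is an immediate (aka “literal”) operand that fits within a signed 8-bit integer yields the two printable bytes yo . The target here is 113; because the displacement is counted from the end of the two-byte instruction, the byte that actually gets encoded is 111 ( 0x6f ), which is o .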
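The arithmetic behind those two bytes can be sketched as follows. This assumes the instruction sits at address 0, which is consistent with the xxd output above; the function name is made up for illustration.

```javascript
// Encode a short JNS (opcode 0x79) located at `origin` that jumps to `target`.
// The 8-bit displacement is measured from the end of the two-byte instruction.
function encodeJnsShort(target, origin = 0) {
  const displacement = target - (origin + 2);
  if (displacement < -128 || displacement > 127) {
    throw new RangeError("target is out of range for an 8-bit displacement");
  }
  return Uint8Array.of(0x79, displacement & 0xff);
}

const jnsBytes = encodeJnsShort(113);
console.log(String.fromCharCode(...jnsBytes)); // "yo"
```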

I also found out, through farting around enough, that <.< followed by a space is an innocuous pair of AL register comparisons ( cmp al, 0x2e and cmp al, 0x20 ). They only set flags, and I wasn’t using the zero flag at all, so they didn’t affect my code. Of course, I was going to throw that in there as much as I could.

My toolbelt was defined, and I couldn’t have been happier.

Well, I could have, if ret or any of its variants (e.g. pop rax; jmp rax , which yields the bytes 58 ff e0 ) were printable. The closest I could do was simply ret itself, which was \xC3 . Naturally, I went with \xC3P0 .

After a few hours of pacing, thinking and working out exactly what I wanted to do, I decided to use a string as a function and output something simple, like “Hi!”.

Coincidentally, Hi!\0 fits into four bytes. An integer. Way too perfect, especially since a few 64-bit instruction variants yielded unprintable characters. Since we were only dealing with 32 bits overall (the four bytes of our Hi!\0 string), we only needed the lower 32 bits of our registers to work with the data, so eax and the equivalent 32-bit register instructions would suffice (especially since our machines are usually little endian).
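To make the "four bytes, one integer" point concrete, here is a small sketch that packs the string and reads it back as a little-endian 32-bit value. The variable names are illustrative only.

```javascript
// The four bytes of "Hi!\0": 0x48, 0x69, 0x21, 0x00.
const hiBytes = Uint8Array.from("Hi!\0", (ch) => ch.charCodeAt(0));

// Read them back as one unsigned 32-bit integer, little endian (second argument true).
const hiView = new DataView(hiBytes.buffer);
const hiAsInteger = hiView.getUint32(0, true);

// In little-endian order the first byte is the least significant one.
console.log(hiAsInteger.toString(16).padStart(8, "0")); // "00216948"
```

Writing that one value through a 32-bit register puts H, i, ! and the terminating zero into memory in string order on a little-endian machine.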

It was settled. I was going to overwrite the first four bytes of my string “function” with the bytes "Hi!\0" and then return the string itself back to something like puts() .

I drafted up some harness code to cast a string to a function pointer and ended up with the following:

It worked:

W pushes rdi , which is the register the first argument of x86_64 function calls is passed into on System-V-type ABIs. (hint: this is the cheat-sheet I’ve come to rely on). X pops that value into rax , the register used to pass return values back to the caller (also according to the System V AMD64 ABI). \xC3 returns execution to the caller. For the less machine-code inclined, this is the equivalent of popping the return address off of the stack and jmp -ing to it.

LLDB confirmed it as well:

Process 21727 stopped

* thread #1: tid = 0x3ad74, 0x0000000100000fa6 testlol, queue = 'com.apple.main-thread', stop reason = instruction step into

frame #0: 0x0000000100000fa6 testlol

-> 0x100000fa6: pushq %rdi

0x100000fa7: popq %rax

0x100000fa8: retq
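The same three bytes can be decoded by hand. The sketch below is a toy decoder that handles only the one-byte push, pop and ret encodings seen in this post (it is nowhere near a real disassembler, and its names are made up for illustration).

```javascript
// Register order used by the one-byte push/pop encodings (0x50+r and 0x58+r).
const REGISTERS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"];

// Decode the tiny subset of one-byte instructions used in this post.
function decodeByte(byte) {
  if (byte >= 0x50 && byte <= 0x57) return `push ${REGISTERS[byte - 0x50]}`;
  if (byte >= 0x58 && byte <= 0x5f) return `pop ${REGISTERS[byte - 0x58]}`;
  if (byte === 0xc3) return "ret";
  return `(not handled: 0x${byte.toString(16)})`;
}

// Treat each character of the string as one instruction byte.
function decodeString(code) {
  return Array.from(code, (ch) => decodeByte(ch.charCodeAt(0)));
}

console.log(decodeString("WX\xC3").join("; ")); // "push rdi; pop rax; ret"
```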

I cast that string to a function pointer that takes a string and returns a string, and called it, passing in the same string. It is the equivalent of:

The above code transfers execution to the address of the string’s content (its bytes), which the CPU then runs as instructions.