
Solution 1: C (Mac OS X x86_64), 109 bytes

The source for golf_sol1.c

main[]={142510920,2336753547,3505849471,284148040,2370322315,2314740852,1351437506,1208291319,914962059,195};

The above program needs to be compiled with execution access on the __DATA segment.

clang golf_sol1.c -o golf_sol1 -Xlinker -segprot -Xlinker __DATA -Xlinker rwx -Xlinker rwx

Then to execute the program run the following:

./golf_sol1 $(ruby -e 'puts "\xf5\xff\xff\xfe\xff\xff\x44\x82\x57\x7d\xff\x7f"')

Results:

Unfortunately Valgrind does not watch for memory allocated from system calls, so I can't show a nice detected leak.

However we can look at vmmap to see the large chunk of allocated memory (MALLOC metadata).

                                VIRTUAL   REGION
REGION TYPE                        SIZE    COUNT (non-coalesced)
===========                     =======  =======
Kernel Alloc Once                    4K        2
MALLOC guard page                   16K        4
MALLOC metadata                   16.2M        7
MALLOC_SMALL                      8192K        2         see MALLOC ZONE table below
MALLOC_TINY                       1024K        2         see MALLOC ZONE table below
STACK GUARD                       56.0M        2
Stack                             8192K        3
VM_ALLOCATE (reserved)             520K        3         reserved VM address space (unallocated)
__DATA                             684K       42
__LINKEDIT                        70.8M        4
__TEXT                            5960K       44
shared memory                        8K        3
===========                     =======  =======
TOTAL                            167.0M      106
TOTAL, minus reserved VM space   166.5M      106

Explanation

So I think I need to describe what's actually going on here, before moving onto the improved solution.

This main function abuses C's implicit type declaration (it defaults to int, so we don't waste characters writing it), as well as how symbols work. The linker only cares whether or not it can find a symbol called main to call. So here we're making main an array of ints, initialized with the shellcode that will be executed. Because of this, main is not added to the __TEXT segment but rather to the __DATA segment, which is why we need to compile the program with an executable __DATA segment.

The shellcode found in main is the following:

movq 8(%rsi), %rdi
movl (%rdi), %eax
movq 4(%rdi), %rdi
notl %eax
shrq $16, %rdi
movl (%rdi), %edi
leaq -0x8(%rsp), %rsi
movl %eax, %edx
leaq -9(%rax), %r10
syscall
movq (%rsi), %rsi
movl %esi, (%rsi)
ret

What this does is invoke the syscall that allocates a block of memory (the one mach_vm_allocate uses internally). RAX should equal 0x100000a (this tells the kernel which function we want). RDI holds the target for the allocation (in our case mach_task_self()). RSI holds the address to write the pointer to the newly created memory to (we just point it at a slot on the stack). RDX holds the size of the allocation (we pass in the value of RAX, 0x100000a, just to save bytes). R10 holds the flags (we indicate it can be allocated anywhere).

Now it's not plainly obvious where RAX and RDI are getting their values from. We know RAX needs to be 0x100000a, and RDI needs to be the value mach_task_self() returns. Luckily mach_task_self() is actually a macro for a variable (mach_task_self_), which is at the same memory address every time (it should change on reboot, however). In my particular instance mach_task_self_ happens to be located at 0x00007fff7d578244. So to cut down on instructions, we pass this data in through argv instead. This is why we run the program with the expression $(ruby -e 'puts "\xf5\xff\xff\xfe\xff\xff\x44\x82\x57\x7d\xff\x7f"') as the first argument. The string is the two values combined. The first is the RAX value (0x100000a), which is only 32 bits and has had a one's complement applied to it (so there are no null bytes; we just NOT the value to get the original back). The second is the RDI value (0x00007fff7d578244), which has been shifted left by 16 bits with two junk bytes filling the low end (again to exclude the null bytes; we just shift it back to the right to recover the original).

After the syscall we write to our newly allocated memory. The reason is that memory allocated using mach_vm_allocate (or this syscall) consists of VM pages, which are not automatically paged into memory. Rather, they are reserved until data is written to them, and only then are those pages mapped into memory. I wasn't sure if it would meet the requirements if the memory was only reserved.

For the next solution we'll take advantage of the fact that our shellcode has no null bytes, so we can move it outside of our program's code to reduce the size.

Solution 2: C (Mac OS X x86_64), 44 bytes

The source for golf_sol2.c

main[]={141986632,10937,1032669184,2,42227};

The above program needs to be compiled with execution access on the __DATA segment.

clang golf_sol2.c -o golf_sol2 -Xlinker -segprot -Xlinker __DATA -Xlinker rwx -Xlinker rwx

Then to execute the program run the following:

./golf_sol2 $(ruby -e 'puts "\xb8\xf5\xff\xff\xfe\xf7\xd0\x48\xbf\xff\xff\x44\x82\x57\x7d\xff\x7f\x48\xc1\xef\x10\x8b\x3f\x48\x8d\x74\x24\xf8\x89\xc2\x4c\x8d\x50\xf7\x0f\x05\x48\x8b\x36\x89\x36\xc3"')

The result should be the same as before, as we're making an allocation of the same size.

Explanation

This follows much the same concept as solution 1, with the exception that we've moved the bulk of our leaking code outside of the program.

The shellcode found in main is now the following:

movq 8(%rsi), %rsi
movl $42, %ecx
leaq 2(%rip), %rdi
rep movsb (%rsi), (%rdi)

This basically copies the shellcode we pass in argv to just after this code (so once the copy finishes, execution falls through into the inserted shellcode). What works in our favour is that the __DATA segment will be at least a page in size, so even if our code isn't that big we can still "safely" write more. The downside is that the ideal solution wouldn't even need the copy; it would just call and execute the shellcode in argv directly. Unfortunately, that memory does not have execution rights. We could change the rights of this memory, but that would require more code than simply copying it. An alternative strategy would be to change the rights from an external program (but more on that later).

The shellcode we pass to argv is the following:

movl $0xfefffff5, %eax
notl %eax
movq $0x7fff7d578244ffff, %rdi
shrq $16, %rdi
movl (%rdi), %edi
leaq -0x8(%rsp), %rsi
movl %eax, %edx
leaq -9(%rax), %r10
syscall
movq (%rsi), %rsi
movl %esi, (%rsi)
ret

This is much the same as our previous code, the only difference being that we include the values for EAX and RDI directly.

Possible Solution 1: C (Mac OS X x86_64), 11 bytes

The idea of modifying the program externally gives us the possible solution of moving the leaker to an external program. Our actual program (the submission) is then just a dummy program, and the leaker program allocates some memory in our target program. I wasn't certain if this would fall within the rules for this challenge, but I'm sharing it nonetheless.

So if we were to use mach_vm_allocate in an external program with the target set to our challenge program, that could mean our challenge program would only need to be something along the lines of:

main=65259;

That shellcode is simply a short jump to itself (an infinite loop), so the program stays open and we can reference it from an external program.

Possible Solution 2: C (Mac OS X x86_64), 8 bytes

Funnily enough, when I was looking at the Valgrind output, I saw that, at least according to Valgrind, dyld leaks memory. So effectively every program is leaking some memory. With this being the case, we could just make a program that does nothing (simply exits), and it would still be leaking memory.

Source: