I neither play CTFs, nor I do writeups for them. Well, both statements are not true anymore, but don’t expect too much CTF writeups on this blog anyway. The task was worth 500 points and according to my knowledge nobody submitted the flag on time (including me as well). So, enjoy the reading and I hope you will like it.

Analysis: Debugger process

Given executable is packed with ASPack, packed size is around 30kB, unpacked 80kB. It’s a simple console application, which welcome the user with below msg and immediately exits:

x:\>re5-d06bc342.exe ┬╖┬■┬■╞Σ╨▐╘╢┘Γú¼╬ß╜½╔╧╧┬╢°╟≤╦≈ The Ro@d Ahe@d Will Be Long And 0ur Cl1mb Will Be Steep

I won’t describe ASPack unpacking process, as there are plenty of tools to unpack it automatically (at least there were in 2005). Manual unpacking with OllyDbg/LordPE/ImpREC shouldn’t be a concern for anybody as well. Application runs as a 2 processes, first process acts as the debugger, second process is being debugged (protection similar to Armadillo with enabled nanomites). There is one additional functionality in the debugger process, but I’ll describe it later.

When it comes to reverse engineering tasks, I prefer working in dual mode: single stepping in OllyDbg aided with static disassembly/decompilation in IDA. First thing that I wanted to achieve was getting rid of debugger process, so I could easily debug child process. To do it properly I had to check what functionality debugger provides to the debugee. Debugger dispatcher is placed inside the function at address 0x402A17. Except for handling not really interesting events like EXIT_PROCESS_DEBUG_EVENT, it also handles EXCEPTION_DEBUG_EVENT with the exception code STATUS_BREAKPOINT (int3).

if ( ExceptionCode == STATUS_BREAKPOINT ) { _ExceptionAddress = ExceptionAddress ; if ( ExceptionAddress <= ( unsigned int ) main || _ExceptionAddress >= ( unsigned int ) sub_402D6B ) { if ( _ExceptionAddress >= ( unsigned int ) sub_401945 && _ExceptionAddress <= ( unsigned int ) sub_4019B5 ) decryptMemory ( _ExceptionAddress, byte_40DC3C, ( SIZE_T ) sub_4019B5 - _ExceptionAddress ) ; } else { qmemcpy ( & v8, & v11, sizeof ( v8 ) ) ; redirectExecution ( v8, ( DWORD ) & loc_4022DF ) ; } } if ( ExceptionCode == STATUS_BREAKPOINT ) { _ExceptionAddress = ExceptionAddress; if ( ExceptionAddress <= (unsigned int)main || _ExceptionAddress >= (unsigned int)sub_402D6B ) { if ( _ExceptionAddress >= (unsigned int)sub_401945 && _ExceptionAddress <= (unsigned int)sub_4019B5 ) decryptMemory(_ExceptionAddress, byte_40DC3C, (SIZE_T)sub_4019B5 - _ExceptionAddress); } else { qmemcpy(&v8, &v11, sizeof(v8)); redirectExecution(v8, (DWORD)&loc_4022DF); } }

First condition checks if the int3 was executed inside sub_401945 function (0x00401945 – 0x004019B5), if yes it decrypts code of this function with simple xor:

//0x004028F9 void __cdecl decryptMemory ( int a1, char a2, SIZE_T nSize ) { SIZE_T i ; void * lpBuffer ; char v5 ; SIZE_T NumberOfBytesRead ; memset ( & v5, 0 , 0x2C8u ) ; lpBuffer = operator new ( nSize ) ; NumberOfBytesRead = 0 ; ReadProcessMemory ( hProcess, ( LPCVOID ) a1, lpBuffer, nSize, & NumberOfBytesRead ) ; * ( _BYTE * ) lpBuffer ^ = 0x5Cu ; for ( i = 1 ; i < nSize ; ++ i ) * ( ( _BYTE * ) lpBuffer + i ) ^ = a2 ; WriteProcessMemory ( hProcess, ( LPVOID ) a1, lpBuffer, nSize, & NumberOfBytesRead ) ; sub_403616 ( lpBuffer ) ; } //0x004028F9 void __cdecl decryptMemory(int a1, char a2, SIZE_T nSize) { SIZE_T i; void *lpBuffer; char v5; SIZE_T NumberOfBytesRead; memset(&v5, 0, 0x2C8u); lpBuffer = operator new(nSize); NumberOfBytesRead = 0; ReadProcessMemory(hProcess, (LPCVOID)a1, lpBuffer, nSize, &NumberOfBytesRead); *(_BYTE *)lpBuffer ^= 0x5Cu; for ( i = 1; i < nSize; ++i ) *((_BYTE *)lpBuffer + i) ^= a2; WriteProcessMemory(hProcess, (LPVOID)a1, lpBuffer, nSize, &NumberOfBytesRead); sub_403616(lpBuffer); }

Second handler is executed only if the exception takes place inside main() function (0x00402B39 – 0x00402D6B), it’s main responsibility is redirection of execution to 0x004022DF, it is achieved by using SetThreadContext api:

//0x004028A7 BOOL __cdecl redirectExecution ( CONTEXT Context, DWORD a2 ) { GetThreadContext ( hThread, & Context ) ; Context. Eip = a2 ; Context. ContextFlags = CONTEXT_CONTROL ; return SetThreadContext ( hThread, & Context ) ; } //0x004028A7 BOOL __cdecl redirectExecution(CONTEXT Context, DWORD a2) { GetThreadContext(hThread, &Context); Context.Eip = a2; Context.ContextFlags = CONTEXT_CONTROL; return SetThreadContext(hThread, &Context); }

So, it seems that removing debugger/debugee functionality is doable. First phase of required changes:

changing conditional jump at 0x00402C2E to unconditional – this jump is used to change the execution path basing on the result from CreateMutexW call

to unconditional – this jump is used to change the execution path basing on the result from call change call to GetLastError (at 0x00402C32 ) to mov eax, 0xB7 – GetLastError should return 0xB7 ( ERROR_ALREADY_EXISTS ) if the mutex was already created, so to mimic this behavior I just set the eax to this value.

(at ) to – should return ( ) if the mutex was already created, so to mimic this behavior I just set the to this value. change int3 at 0x00402D14 to jmp 0x004022DF – this is actually what the debugger process would do in the int3 exception handler

My aim was to quickly remove debugger functionality, so I haven’t really checked the whole logic of the debugger loop. It’s rather certain that it could be done better (different modifications).

Next step would be decryption of the code at 0x00401951. At this point of analysis I don’t have the key, but it’s one byte xor, so it can be guessed. Usually in x86 code there is a lot of 0x00 bytes, in the encrypted region, there is a lot of 0xE9 values, so this is probably the key. Last byte of the encrypted region is 0x2A, after xor with 0xE9 it’s 0xC3 (ret opcode). This key is stored at 0x0040DC3C, keep that in mind as having this value is an actual hint to obtain the flag (hint that I’ve missed earlier). After successful decryption it’s clearly visible that something with this code is not right, namely targets of all relative calls (0xE8) are wrong by 4 (they points 4 bytes before the actual function). It’s easily fixable, but it suggests that this functions was never supposed to be called.

It’s also good to remove (just nop it) some of the code obfuscation that is placed all over the interesting functions, so hexrays won’t complain during decompilation.

There is still one thing left. Debugger process creates additional thread, I’ll skip detailed analysis of the thread function as it’s very straightforward. Below steps will be more than enough to emulate functionality of this thread:

open c:\[SsCTF]Re.txt file

file connect to localhost on port 2016

on read file content

update 0x0040DC3C value – mentioned hint, it’s a sum of all characters read from file truncated to 8 bits

value – mentioned hint, it’s a sum of all characters read from file truncated to 8 bits decode 80 bytes at 0x0040B050 by xoring with the above sum – purpose of this table will be revealed later

bytes at by xoring with the above sum – purpose of this table will be revealed later send content of the file to localhost:2016

receive some reply

sleep for 4 seconds

print the reply

At this point executable can be run as a single process executable, thread functionality can be emulated with simple python script:

import socket s = socket . socket ( socket . AF_INET , socket . SOCK_STREAM ) s. connect ( ( "127.0.0.1" , 2016 ) ) s. send ( "whatever the flag really is!" ) s. close ( ) import socket s = socket.socket(socket.AF_INET, socket.SOCK_STREAM) s.connect(("127.0.0.1", 2016)) s.send("whatever the flag really is!") s.close()

Analysis: Debuggee process

Without debugger functionality, proper execution starts at 0x004022DF (the place where debugger redirects child process). Function at this address is responsible for creating localhost server that listens on port 2016. Received data is used to calculate the key at 0x0040DC3C (again! how did I miss this clue). Server dispatcher is at address 0x0040203F. Dispatcher function takes two arguments: received data and length of this data. There are 5 different handlers, each of them is executed according to the length of the data:

handlerId = 0 ; if ( dataLen == 3 ) { handlerId = 1 ; } else if ( dataLen <= 3 || dataLen >= 0x14 ) { if ( dataLen <= 0x14 || dataLen >= 0x1E ) handlerId = rand ( ) % 5 ; else handlerId = 2 ; } else { handlerId = 0 ; } handlerId = 0; if ( dataLen == 3 ) { handlerId = 1; } else if ( dataLen <= 3 || dataLen >= 0x14 ) { if ( dataLen <= 0x14 || dataLen >= 0x1E ) handlerId = rand() % 5; else handlerId = 2; } else { handlerId = 0; }

I really like how hexrays decompiled those conditions, completely unreadable! So, simplify:

dataLen handlerId 0 – 2 rand() % 5 3 1 4 – 0x13 0 0x14 rand() % 5 0x15 – 0x1D 2 0x1E – … rand() % 5

Lengths that leads to random handler execution probably can be skipped. Handlers 3 and 4 are probably not analysis worthy as they’re reached randomly. I’ve skipped handler_1 as well, as three bytes input isn’t anything close to the flag length. So, there are two handlers left, I’ve started with handler_2 which turned out to be poor choice, but I’ll briefly describe it anyway:

//0x004019B5 BOOL handler_2 ( ) { char tmpStr [ 0x80 ] ; memcpy ( tmpStr, inputStr, 0x80u ) ; int v3 = strlen ( inputStr ) >> 3 ; for ( int i = 0 ; i < v3 ; ++ i ) mystEncrypt ( & tmpStr [ 8 * i ] , key ) ; return strcmp ( tmpStr, encryptedData ) == 0 ; } //0x004019B5 BOOL handler_2() { char tmpStr[0x80]; memcpy(tmpStr, inputStr, 0x80u); int v3 = strlen(inputStr) >> 3; for (int i = 0; i < v3; ++i) mystEncrypt(&tmpStr[8 * i], key); return strcmp(tmpStr, encryptedData) == 0; }

The function is pretty simple, it just encrypts input data using 8 bytes blocks (64bit) at each iteration. key is given, encryptedData is given, inputStr is a global variable that keeps received data. It’s definitely some block cipher, internally the mystEncrypt() function uses one dword const from address 0x0040B030. The const is equal to 0x97B2CB, but earlier during execution it’s multiplied by 0x10B, so the final value is 0x9E3779B9. Querying google with this value returns references to Tiny Encryption Algorithm (TEA). encryptedData turns out to be “WoyaoDuanKouHaoHeFlag,Pls.”, when this value is inserted to c:\[SsCTF]Re.txt file, application prints “Port:2447”.

The next step, which is common for all handlers is strange as I haven’t found any purpose for it. It reads port number from c:\Port.txt file and tries to send some data to this port. Data is 8 bytes long and is xored with below algorithm:

char data [ ] = { 0x0C , 0x1F , 0x22 , 0x44 , 0x6E , 0x79 , 0x9B , 0xBD , 0x00 } ; for ( i = 0 ; i < 8 ; ++ i ) { data [ i ] ^ = byte_40DC3C ; data [ i ] ^ = portLoByte ; } char data[] = { 0x0C, 0x1F, 0x22, 0x44, 0x6E, 0x79, 0x9B, 0xBD, 0x00 }; for ( i = 0; i < 8; ++i ) { data[i] ^= byte_40DC3C; data[i] ^= portLoByte; }

I tried xoring it with every possible value and it didn’t yield any meaningful string, so I considered it as another decoy.

The only probable handler that left is handler_0. First function called from handler_0 gathers frequencies of the characters in the received data (inputStr).

//0x00401000 void __cdecl getCharactersStats ( char * inpStr, _DWORD * inpLen, int * stats, _DWORD * uniqueChars ) { int k ; int j ; signed int v6 ; int i ; * uniqueChars = 0 ; for ( i = 0 ; inpStr [ i ] ; ++ i ) { v6 = 1 ; for ( j = 0 ; j < i ; ++ j ) { if ( inpStr [ j ] == inpStr [ i ] ) { v6 = 0 ; break ; } } if ( v6 ) { LOBYTE ( stats [ 3 * ++* uniqueChars + 1 ] ) = inpStr [ i ] ; stats [ 3 * * uniqueChars ] = 1 ; for ( k = i + 1 ; inpStr [ k ] ; ++ k ) { if ( inpStr [ i ] == inpStr [ k ] ) ++ stats [ 3 * * uniqueChars ] ; } } } * inpLen = i ; } //0x00401000 void __cdecl getCharactersStats(char *inpStr, _DWORD *inpLen, int *stats, _DWORD *uniqueChars) { int k; int j; signed int v6; int i; *uniqueChars = 0; for ( i = 0; inpStr[i]; ++i ) { v6 = 1; for ( j = 0; j < i; ++j ) { if ( inpStr[j] == inpStr[i] ) { v6 = 0; break; } } if ( v6 ) { LOBYTE(stats[3 * ++*uniqueChars + 1]) = inpStr[i]; stats[3 * *uniqueChars] = 1; for ( k = i + 1; inpStr[k]; ++k ) { if ( inpStr[i] == inpStr[k] ) ++stats[3 * *uniqueChars]; } } } *inpLen = i; }

Due to my past experience with compression algorithms I was pretty sure that the rest of the functions called from handler_0 will be just another stages of the Huffman coding (admins confirmed it later with the hint). Fully hexraysed version of handler_0 should looks like the one below:

signed int handler_0 ( ) { size_t v0 ; signed int result ; size_t uniqueChars ; int inpLen ; int huffTree [ 800 ] ; size_t i ; char * v6 [ 399 ] ; int stats [ 300 ] ; int v8 [ 399 ] ; uniqueChars = 0 ; inpLen = 0 ; getCharactersStats ( inputStr, & inpLen, stats, & uniqueChars ) ; buildTree ( huffTree, stats, uniqueChars ) ; generateCodes ( huffTree, inputStr, v8, stats, inpLen, uniqueChars ) ; encodeInput ( inputStr, v8, v6, stats, uniqueChars, inpLen ) ; for ( i = 0 ; ( signed int ) i < inpLen ; ++ i ) strcat ( byte_40DC50, v6 [ i ] ) ; for ( i = 0 ; ; ++ i ) { v0 = strlen ( aRrrrssrsrrrsss ) ; if ( i >= v0 ) break ; aRrrrssrsrrrsss [ i ] ^ = 0x42u ; } if ( ! strcmp ( byte_40DC50, aRrrrssrsrrrsss ) ) result = 2 ; else result = 0 ; return result ; } signed int handler_0() { size_t v0; signed int result; size_t uniqueChars; int inpLen; int huffTree[800]; size_t i; char *v6[399]; int stats[300]; int v8[399]; uniqueChars = 0; inpLen = 0; getCharactersStats(inputStr, &inpLen, stats, &uniqueChars); buildTree(huffTree, stats, uniqueChars); generateCodes(huffTree, inputStr, v8, stats, inpLen, uniqueChars); encodeInput(inputStr, v8, v6, stats, uniqueChars, inpLen); for ( i = 0; (signed int)i < inpLen; ++i ) strcat(byte_40DC50, v6[i]); for ( i = 0; ; ++i ) { v0 = strlen(aRrrrssrsrrrsss); if ( i >= v0 ) break; aRrrrssrsrrrsss[i] ^= 0x42u; } if ( !strcmp(byte_40DC50, aRrrrssrsrrrsss) ) result = 2; else result = 0; return result; }

So, it compresses the input data with Huffman algorithm and as an output it generates binary string that looks like this:

input : "WoyaoDuanKouHaoHeFlag,Pls." output: "110100111101110011100101011101111101011111110000110000010010001101000101"

Later it decrypts (simple xor 0x42) string that the compressed input is compared to:

encrypted: "rrrrssrsrrrssssrrrsrrrssrsrrrsrssssrssrsrssrssssrssssrrrsssssrrssrsrsrssssrr" decrypted: "0000110100011110001000110100010111101101011011110111100011111001101010111100"

To pass the final strcmp check, one need to decompress above string. Decoding Huffman without the tree or characters frequencies might be possible but there is a lot of inputs that will compress to the same output, so it’s rather not the case this time. CTF’s admins confirmed that the tree is stored inside the executable. Coincidentally, the only data that has no particular use yet is placed just before the “rrrrssrsrrr…” string:

0040B050 99 9B D9 86 80 81 9A 87 D8 8C A4 8E 90 85 8F A9 Ö¢+åÇ.Üç+îñÄ.à.¬ 0040B060 A3 D1 D1 A3 A3 A3 D1 D1 A3 D1 A3 D1 A3 D1 A3 A3 ú--úúú--ú-ú-ú-úú 0040B070 A3 D1 D1 D1 D1 A3 A3 A3 A3 A3 A3 A3 D1 A3 D1 A3 ú----úúúúúúú-ú-ú 0040B080 D1 A3 A3 D1 D1 A3 D1 D1 D1 D1 A3 D1 A3 A3 A3 D1 -úú--ú----ú-úúú- 0040B090 D1 D1 A3 A3 A3 A3 D1 A3 D1 D1 D1 D1 D1 D1 D1 A3 --úúúú-ú-------ú

I mentioned this table earlier in connection with the byte at address 0x0040DC3C, which is the xor key (0xE9) for this table:

0040B050 70 72 30 6F 69 68 73 6E 31 65 4D 67 79 6C 66 40 pr0oihsn1eMgylf@ 0040B060 4A 38 38 4A 4A 4A 38 38 4A 38 4A 38 4A 38 4A 4A J88JJJ88J8J8J8JJ 0040B070 4A 38 38 38 38 4A 4A 4A 4A 4A 4A 4A 38 4A 38 4A J8888JJJJJJJ8J8J 0040B080 38 4A 4A 38 38 4A 38 38 38 38 4A 38 4A 4A 4A 38 8JJ88J8888J8JJJ8 0040B090 38 38 4A 4A 4A 4A 38 4A 38 38 38 38 38 38 38 4A 88JJJJ8J8888888J

This looks promising, but I had a really hard time correlating this data with Huffman tree. Not to mention that I used wrong xor key, so the table looked differently. I was aware that my key is wrong, so I’ve generated all xor possibilities and limited output to only printable characters:

224: y{9f`azg8lDnpeoI C11CCC11C1C1C1CCC1111CCCCCCC1C1C1CC11C1111C1CCC111CCCC1C1111111C 225: xz8ga`{f9mEoqdnH B00BBB00B0B0B0BBB0000BBBBBBB0B0B0BB00B0000B0BBB000BBBB0B0000000B 226: {y;dbcxe:nFlrgmK A33AAA33A3A3A3AAA3333AAAAAAA3A3A3AA33A3333A3AAA333AAAA3A3333333A 227: zx:ecbyd;oGmsflJ @22@@@22@2@2@2@@@2222@@@@@@@2@2@2@@22@2222@2@@@222@@@@2@2222222@ 231: ~|>agf}`?kCiwbhN D66DDD66D6D6D6DDD6666DDDDDDD6D6D6DD66D6666D6DDD666DDDD6D6666666D 232: qs1nhiro0dLfxmgA K99KKK99K9K9K9KKK9999KKKKKKK9K9K9KK99K9999K9KKK999KKKK9K9999999K 233: pr0oihsn1eMgylf@ J88JJJ88J8J8J8JJJ8888JJJJJJJ8J8J8JJ88J8888J8JJJ888JJJJ8J8888888J 234: sq3ljkpm2fNdzoeC I;;III;;I;I;I;III;;;;IIIIIII;I;I;II;;I;;;;I;III;;;IIII;I;;;;;;;I 235: rp2mkjql3gOe{ndB H::HHH::H:H:H:HHH::::HHHHHHH:H:H:HH::H::::H:HHH:::HHHH:H:::::::H 236: uw5jlmvk4`Hb|icE O==OOO==O=O=O=OOO====OOOOOOO=O=O=OO==O====O=OOO===OOOO=O=======O 237: tv4kmlwj5aIc}hbD N<<NNN<<N<N<N<NNN<<<<NNNNNNN<N<N<NN<<N<<<<N<NNN<<<NNNN<N<<<<<<<N 238: wu7hnoti6bJ`~kaG M??MMM??M?M?M?MMM????MMMMMMM?M?M?MM??M????M?MMM???MMMM?M???????M 242: ki+trshu*~V|bw}[ Q##QQQ##Q#Q#Q#QQQ####QQQQQQQ#Q#Q#QQ##Q####Q#QQQ###QQQQ#Q#######Q 244: mo-rtuns,xPzdq{] W%%WWW%%W%W%W%WWW%%%%WWWWWWW%W%W%WW%%W%%%%W%WWW%%%WWWW%W%%%%%%%W 245: ln,sutor-yQ{epz\ V$$VVV$$V$V$V$VVV$$$$VVVVVVV$V$V$VV$$V$$$$V$VVV$$$VVVV$V$$$$$$$V 246: om/pvwlq.zRxfsy_ U''UUU''U'U'U'UUU''''UUUUUUU'U'U'UU''U''''U'UUU'''UUUU'U'''''''U 247: nl.qwvmp/{Sygrx^ T&&TTT&&T&T&T&TTT&&&&TTTTTTT&T&T&TT&&T&&&&T&TTT&&&TTTT&T&&&&&&&T 250: ca#|z{`}"v^tjuSY ++YYY++Y+Y+Y+YYY++++YYYYYYY+Y+Y+YY++Y++++Y+YYY+++YYYY+Y+++++++Y 251: b`"}{za|#w_uk~tR X**XXX**X*X*X*XXX****XXXXXXX*X*X*XX**X****X*XXX***XXXX*X*******X 252: eg%z|}f{$pXrlysU _--___--_-_-_-___----_______-_-_-__--_----_-___---____-_-------_ 253: df${}|gz%qYsmxrT ^,,^^^,,^,^,^,^^^,,,,^^^^^^^,^,^,^^,,^,,,,^,^^^,,,^^^^,^,,,,,,,^

I chose the row 252, my reasoning was that flag should have format of SSCTF{} or maybe flag{} or something similar. Rows 250-253 contains “{}” characters and row 252 additionaly has letters “fl”.

Back to the properly decrypted data, first 16 bytes are letters and next 0x40 bytes looks like binary encoded data with 4 characters (bits) for each letter:

p r 0 o i h s n 1 e M g y l f @ J88J JJ88 J8J8 J8JJ J888 8JJJ JJJJ 8J8J 8JJ8 8J88 88J8 JJJ8 88JJ JJ8J 8888 888J J -> 1, 8 -> 0 1001 1100 1010 1011 1000 0111 1111 0101 0110 0100 0010 1110 0011 1101 0000 0001 J -> 0, 8 -> 1 0110 0011 0101 0100 0111 1000 0000 1010 1001 1011 1101 0001 1100 0010 1111 1110

There are two possible encodings, depends if “J” and “8” are encoded as 1 or 0 respectively or the other way around. Compressed string starts with “0000 1101”, so possible beginings would be “fl” or “sM”, let’s try with “fl”, so the encoding would be J -> 1, 8 -> 0.

0000 1101 0001 1110 0010 0011 0100 0101 1110 1101 0110 1111 0111 1000 1111 1001 1010 1011 1100 f l @ g M y e n g l 1 s h i s p 0 o r

It seems that the flag is “fl@gMyengl1shisp0or”, but I can’t confirm it since the CTF is long over. Putting this flag into c:\[SsCTF]Re.txt file leads to the below results:

x:\re5>re5-d06bc342.exe ┬╖┬■┬■╞Σ╨▐╘╢┘Γú¼╬ß╜½╔╧╧┬╢°╟≤╦≈ The Ro@d Ahe@d Will Be Long And 0ur Cl1mb Will Be Steep 0K!You Got 1t!

I would like to thanks: @1amtom, @_m4gu, @krzywix, @KrzaQ2 for their consultations.