Dynamic shellcode analysis

In this article, we will study a shellcode using dynamic analysis. This analysis includes a description of Miasm internals, which explains its length. The shellcode is in the archive dyn_sc_shellcodes.zip, protected with the password infected. The final script is here: dyn_sc_run.py

This analysis is based on Miasm revision 2cf6970.

First blood

Here is a raw dump of the shellcode:

    00000000  50 59 49 49 49 49 49 49  49 49 49 49 49 49 49 49  |PYIIIIIIIIIIIIII|
    00000010  49 49 37 51 5a 6a 41 58  50 30 41 30 41 6b 41 41  |II7QZjAXP0A0AkAA|
    00000020  51 32 41 42 32 42 42 30  42 42 41 42 58 50 38 41  |Q2AB2BB0BBABXP8A|
    00000030  42 75 4a 49 62 78 6a 4b  64 58 50 5a 6b 39 6e 36  |BuJIbxjKdXPZk9n6|
    00000040  6c 49 4b 67 4b 30 65 6e  7a 49 42 54 46 6b 6c 79  |lIKgK0enzIBTFkly|
    00000050  7a 4b 77 73 77 70 77 70  4c 6c 66 54 57 6c 4f 5a  |zKwswpwpLlfTWlOZ|
    00000060  39 72 6b 4a 6b 4f 59 42  5a 63 48 68 58 63 59 6f  |9rkJkOYBZcHhXcYo|
    00000070  59 6f 4b 4f 7a 55 76 77  45 4f 67 6c 77 6c 43 72  |YoKOzUvwEOglwlCr|
    ...

We can note that this shellcode is pure ASCII. Let’s disassemble its first basic block:

    python miasm/example/disasm/full.py -m x86_32 shellcode.bin --blockwatchdog 1

This gives the following graph (file graph_execflow.dot):

Note the PUSH EAX / POP ECX pair, which mimics a MOV ECX, EAX while keeping a pure ASCII encoding. As we can see, the shellcode starts with some computations, and then XORs a memory cell:

    00000019 XOR BYTE PTR [ECX+0x30], AL

We could analyze it manually or dynamically. For the exercise, we will try to determine which pointer is manipulated here. The question is: where does the value ECX + 0x30 point to? In Miasm, there are at least two ways to answer this:

using a symbolic execution from the beginning to retrieve the equation of ECX at address 0x19

using the DependencyGraph, whose goal is to track all the lines that contribute to the value of a selected variable. We won’t introduce this module here, because a future post will be dedicated to it.
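The pure ASCII observation made on the raw dump can be checked mechanically. The following is a minimal sketch in Python 3 (the article's own scripts target Python 2); the helper name is_pure_ascii is hypothetical, and the bytes are the first 0x30 bytes copied from the dump above.

```python
# First 0x30 bytes of the shellcode, copied from the raw dump.
HEAD = bytes.fromhex(
    "50594949494949494949494949494949"
    "49493751" "5a6a4158" "50304130" "416b4141"
    "51324142324242304242414258503841"
)


def is_pure_ascii(buf):
    """Return True if every byte is a printable ASCII character (0x20..0x7E)."""
    return all(0x20 <= b <= 0x7E for b in bytearray(buf))


print(is_pure_ascii(HEAD))
```

Running the same check on the whole file would tell whether the encoding constraint holds beyond the first basic block.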

Symbolic Execution

Here are the steps to perform a symbolic execution of a basic block:

disassemble the block
translate it into the Miasm intermediate representation (IR)
create an initial state
launch the symbolic execution

The following code disassembles the shellcode from address 0x0 to 0x1C (just after the XOR). It then translates the block into IR and finally runs the symbolic execution, stopping at address 0x1C. Here is the script:

    import sys
    from miasm2.analysis.machine import Machine
    from miasm2.core.bin_stream import bin_stream_str
    from miasm2.ir.symbexec import symbexec

    # Create a bin_stream from a Python string
    bs = bin_stream_str(open(sys.argv[1]).read())
    # Get a Miasm x86 32bit machine
    machine = Machine("x86_32")
    # Retrieve the disassembler and IR analysis
    dis_engine, ira = machine.dis_engine, machine.ira
    # Link the disasm engine to the bin_stream
    mdis = dis_engine(bs)
    # Stop disassembler after the XOR
    mdis.dont_dis = [0x1C]
    # Disassemble one basic block
    block = mdis.dis_bloc(0)
    # Instantiate an IR analysis
    ir_arch = ira(mdis.symbol_pool)
    # Translate asm basic block to an IR basic block
    ir_arch.add_bloc(block)
    # Store IR graph
    open('ir_graph.dot', 'w').write(ir_arch.graph.dot())
    # Initiate the symbolic execution engine
    # regs_init associates EAX to EAX_init, and so on
    sb = symbexec(ir_arch, machine.mn.regs.regs_init)
    # Start execution at address 0
    # IRDst represents the label of the next IR basic block to execute
    irdst = sb.emul_ir_blocs(ir_arch, 0)
    print 'ECX =', sb.symbols[machine.mn.regs.ECX]

The output is:

    ECX = (EAX_init+0xFFFFFFF0)

So at this point, as the XORed memory is located at [ECX + 0x30], the pointer is in fact (EAX_init+0xFFFFFFF0) + 0x30 = EAX_init + 0x20 in 32-bit arithmetic. By the way, EAX_init is the value of EAX in the initial symbolic execution state. Actually, the shellcode has information about the value of EAX when it’s run by the application.
What I didn’t mention is that this shellcode is executed after an exploit that corrupts a vtable, leading to a CALL EAX. Hence the shellcode knows that when its first instruction is executed, EAX points to it.

If you don’t want to bother writing Python code only to run a symbolic execution, the script miasm/example/ida/symbol_exec.py will do the trick. Under IDA, hit Alt-F7 and run the script. Now, select the code you want to execute and hit F3. You should have the following result:

Note: the script only displays modified registers and memory. Here again, the value of ECX is EAX_init+0xFFFFFFF0. Please note that Miasm2 must be in IDA’s Python path for the script to run properly.

So the shellcode will modify itself. Even if we could continue the analysis manually, here we are going to use the Miasm sandbox to run a dynamic execution.
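The pointer computation from the symbolic execution can be replayed with plain 32-bit arithmetic. This is a small Python 3 sketch, not Miasm code; the function name xor_target is hypothetical, and the constant 0xFFFFFFF0 is the one reported for ECX above.

```python
MASK32 = 0xFFFFFFFF


def xor_target(eax_init):
    """Address written by XOR BYTE PTR [ECX+0x30], AL for a given initial EAX."""
    # ECX = EAX_init + 0xFFFFFFF0, i.e. EAX_init - 0x10 modulo 2**32
    ecx = (eax_init + 0xFFFFFFF0) & MASK32
    return (ecx + 0x30) & MASK32


# If EAX points to the start of a shellcode mapped at 0x40000000:
print(hex(xor_target(0x40000000)))
```

The wrap-around is what turns the two additions into a simple +0x20 offset from the start of the shellcode.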

Emulation

To continue the analysis, we will emulate the shellcode in a sandbox. For this, Miasm offers multiple solutions. There is a simple sandbox demonstration in the example miasm/example/jitter/x86_32.py. Here is the core of the script:

    # Create a x86 32bit sandbox
    myjit = Machine("x86_32").jitter()
    # Add memory for the stack, and point ESP to this area
    myjit.init_stack()
    # Read the shellcode
    data = open(args.filename).read()
    # Add memory for the shellcode
    run_addr = 0x40000000
    myjit.vm.add_memory_page(run_addr, PAGE_READ | PAGE_WRITE, data)
    # Trace register values and mnemonics
    myjit.jit.log_regs = True
    myjit.jit.log_mn = True
    # Push special address 0x1337BEEF on the stack
    myjit.push_uint32_t(0x1337beef)
    # Add a breakpoint to special address 0x1337BEEF to stop emulation
    myjit.add_breakpoint(0x1337beef, code_sentinelle)
    # Initialize and start the emulator
    myjit.init_run(run_addr)
    myjit.continue_run()

In this script, we start with an empty sandbox. If you don’t create space for the stack, the first PUSH will trigger an error saying that the code is trying to access an unmapped page, hence the call to myjit.init_stack(). 0x1337BEEF is pushed on the stack to force a potential RET to jump to a special address. We then add a breakpoint at this address in order to spot such a behavior. Here is the trace:

    RAX 0000000000000000 RBX 0000000000000000 RCX 0000000000000000 RDX 0000000000000000
    RSI 0000000000000000 RDI 0000000000000000 RSP 000000000123FFFC RBP 0000000000000000
    zf 0000000000000000 nf 0000000000000000 of 0000000000000000 cf 0000000000000000
    RIP 0000000040000000
    40000000 PUSH EAX
    ...
    40000017 POP EAX
    RAX 0000000000000041 RBX 0000000000000000 RCX 00000000FFFFFFF0 RDX 00000000FFFFFFF0
    RSI 0000000000000000 RDI 0000000000000000 RSP 000000000123FFFC RBP 0000000000000000
    zf 0000000000000000 nf 0000000000000001 of 0000000000000000 cf 0000000000000000
    RIP 0000000040000017
    40000018 PUSH EAX
    RAX 0000000000000041 RBX 0000000000000000 RCX 00000000FFFFFFF0 RDX 00000000FFFFFFF0
    RSI 0000000000000000 RDI 0000000000000000 RSP 000000000123FFF8 RBP 0000000000000000
    zf 0000000000000000 nf 0000000000000001 of 0000000000000000 cf 0000000000000000
    RIP 0000000040000018
    40000019 XOR BYTE PTR [ECX+0x30], AL
    WARNING: address 0x20 is not mapped in virtual memory:
    WARNING: address 0x20 is not mapped in virtual memory:
    ...
    assert(self.get_exception() == 0)
    AssertionError

In this log, the script fails at address 0x40000019: the XOR analyzed previously. The error is that the shellcode tries to access an unmapped memory area at address 0x20. Indeed, the initial state of the sandbox sets EAX to 0x0, whereas the shellcode has been mapped at address 0x40000000, so the lookup fails. To fix it, we set EAX to 0x40000000:

    myjit.cpu.EAX = 0x40000000

Now, the execution is able to continue after the self-modifying code. Note that the logs are very verbose. From now on, we will only activate the block trace (see previous article for more details).

    myjit.jit.log_regs = True
    myjit.jit.log_mn = True

is replaced by:

    myjit.jit.log_newbloc = True

The first basic block displayed:

    loc_0000000040000000:0x40000000
    PUSH EAX
    POP ECX
    DEC ECX
    DEC ECX
    DEC ECX
    DEC ECX
    DEC ECX
    DEC ECX
    DEC ECX
    DEC ECX
    DEC ECX
    DEC ECX
    DEC ECX
    DEC ECX
    DEC ECX
    DEC ECX
    DEC ECX
    DEC ECX
    AAA
    PUSH ECX
    POP EDX
    PUSH 0x41
    POP EAX
    PUSH EAX
    XOR BYTE PTR [ECX+0x30], AL
    INC ECX
    IMUL EAX, DWORD PTR [ECX+0x41], 0x51
    XOR AL, BYTE PTR [ECX+0x42]
    XOR AL, BYTE PTR [EDX+0x42]
    XOR BYTE PTR [EDX+0x42], AL
    INC ECX
    INC EDX
    POP EAX
    PUSH EAX
    CMP BYTE PTR [ECX+0x42], AL
    JNZ loc_000000004000007D:0x4000007d
    -> c_next:loc_0000000040000033:0x40000033 c_to:loc_000000004000007D:0x4000007d

The interesting point is the next basic block displayed:

    loc_000000004000001C:0x4000001c
    INC ECX
    IMUL EAX, DWORD PTR [ECX+0x41], 0x10
    XOR AL, BYTE PTR [ECX+0x42]
    XOR AL, BYTE PTR [EDX+0x42]
    XOR BYTE PTR [EDX+0x42], AL
    INC ECX
    INC EDX
    POP EAX
    PUSH EAX
    CMP BYTE PTR [ECX+0x42], AL
    JNZ loc_000000004000007D:0x4000007d
    -> c_to:loc_000000004000007D:0x4000007d c_next:loc_0000000040000033:0x40000033

Note that this new basic block is in fact a slice of the first basic block. Here is what happened:

Miasm translates the first basic block and starts its execution.
The execution reaches the self-modifying code, which alters the current basic block.
The execution stops and this block is removed from the cache.
The engine resumes the execution, so the rest of the code is handled as a new basic block, disassembled and displayed.

Note that this new basic block is a bit different from the end of the first basic block.

before:

    IMUL EAX, DWORD PTR [ECX+0x41], 0x51

after:

    IMUL EAX, DWORD PTR [ECX+0x41], 0x10
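The change of the IMUL immediate from 0x51 to 0x10 can be reproduced by hand on the bytes of the raw dump. The following Python 3 sketch is not Miasm code: it simply applies the XOR to a copy of the first 0x30 bytes, using offsets relative to the start of the shellcode (so EAX_init is taken as 0) and AL = 0x41 as set by PUSH 0x41 / POP EAX.

```python
# First 0x30 bytes of the shellcode, copied from the raw dump.
code = bytearray(bytes.fromhex(
    "50594949494949494949494949494949"
    "49493751" "5a6a4158" "50304130" "416b4141"
    "51324142324242304242414258503841"
))

AL = 0x41                      # PUSH 0x41 / POP EAX
ecx = -0x10                    # ECX after 16 DEC ECX, relative to the shellcode start
target = ecx + 0x30            # offset patched by XOR BYTE PTR [ECX+0x30], AL

before = code[target]
code[target] ^= AL             # the self-modification
after = code[target]

# Offsets 0x1D..0x20 hold the IMUL instruction; its last byte is the immediate.
print(hex(target), hex(before), hex(after), bytes(code[0x1D:0x21]).hex())
```

The patched byte is the immediate operand of the IMUL located right after the XOR, which is why the cached translation of the first basic block had to be discarded.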

Deeper in the Shellcode

This basic block (loc_000000004000001C) decrypts the next stage. We could stop the execution at 0x40000033 and dump the memory to disk to examine the next stage. But wait! There is more:

    loc_0000000040000040:0x40000040
    MOV ECX, 0x3EB
    LODSB
    XOR AL, 0x1C
    STOSB
    LOOP loc_0000000040000045:0x40000045
    -> c_next:loc_000000004000004B:0x4000004b c_to:loc_0000000040000045:0x40000045

The code above is another deciphering loop. At this point, we will add a breakpoint at address 0x4000004b to dump the shellcode. This breakpoint will trigger a callback which dumps the deciphered code from memory to disk.

    # A breakpoint callback takes the jitter as first parameter
    def dump(jitter):
        # Dump data at address run_addr with a length of len(data)
        new_data = jitter.vm.get_mem(run_addr, len(data))
        # Save to disk
        open('/tmp/dump.bin', 'wb').write(new_data)
        # Stop execution
        return False

    # Register a callback to the breakpoint
    myjit.add_breakpoint(0x4000004b, dump)
    ...
    myjit.cpu.EAX = 0x40000000
    myjit.init_run(run_addr)
    myjit.continue_run()

At this stage, a static analysis of the decrypted code is possible. But we will perform a dynamic analysis to use the Miasm sandbox. Here is the next basic block:

    loc_0000000040000058:0x40000058
    POP ESI
    PUSH EBP
    MOV EBP, ESP
    PUSH 0x6E6F
    PUSH 0x6D6C7275
    PUSH ESP
    PUSH 0xEC0E4E8E
    PUSH 0x6E2BCA17
    CALL loc_00000000400002CA:0x400002ca
    -> c_next:loc_0000000040000076:0x40000076

Spoiler: for the trained eye, this is a code pattern which builds a string on the stack:

    >>> "6D6C7275".decode('hex')[::-1] + "6E6F".decode('hex')[::-1]
    'urlmon'

The logs raise another Miasm error during the execution:

    loc_00000000400002D9:0x400002d9
    PUSHAD
    XOR EAX, EAX
    MOV EDX, DWORD PTR FS:[EAX+0x30]
    MOV EDX, DWORD PTR [EDX+0xC]
    MOV EDX, DWORD PTR [EDX+0x14]
    MOV ESI, DWORD PTR [EDX+0x28]
    XOR EDI, EDI
    XOR EAX, EAX
    LODSB
    INC ESI
    TEST EAX, EAX
    JZ loc_0000000040000300:0x40000300
    -> c_to:loc_0000000040000300:0x40000300 c_next:loc_00000000400002F3:0x400002f3
    WARNING: address 0x30 is not mapped in virtual memory:
    ...
    assert(self.get_exception() == 0)
    AssertionError

There is another access outside of the sandbox virtual memory, at address 0x30, during the execution of this basic block. Note that we don’t know the exact address of the faulty instruction in this case. We can retrieve it by launching the script in interactive mode:

    python -i run_sc.py shellcode.bin
    ...
    assert(self.get_exception() == 0)
    AssertionError
    >>> hex(myjit.cpu.EIP)
    '0x400002dcL'

The faulty instruction is:

    MOV EDX, DWORD PTR FS:[EAX+0x30]

Here, EAX is 0x0, so the memory lookup is at address 0x30, which is not mapped in memory. But there is a trick: the real memory lookup uses the segment selector FS. By default, Miasm doesn’t emulate segmentation, which explains the previous outcome. As we are on Windows, we know that this code retrieves the PEB (Process Environment Block), whose address is stored at FS:[0x30], so we have two choices:

We can map a memory page at address 0x30 and fill it with fake PEB data.
The other solution is to assign a value to the segment selector FS and a corresponding segment descriptor with a custom base address. This base address will be a fresh memory area filled with a fake PEB structure. You also have to activate the segmentation support in Miasm.

Painful, isn’t it? Fortunately, Miasm implements a minimal emulation of Windows structures (miasm2.os_dep.win_api_x86_32_seh.py).
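To make the segmentation issue concrete, here is a minimal Python 3 sketch of how a segment base changes the address that is actually accessed. It is not Miasm code: the function name linear_address and the base value 0x7FFDF000 are purely illustrative assumptions, chosen only to show the arithmetic.

```python
def linear_address(segment_base, offset):
    """Linear address accessed by SEG:[offset] when the segment has the given base."""
    return (segment_base + offset) & 0xFFFFFFFF


# Without segmentation support, FS is ignored: the access lands at 0x30.
print(hex(linear_address(0x0, 0x30)))

# With a (hypothetical) FS base of 0x7FFDF000, the same instruction
# reads inside the memory area reserved for the fake structures.
print(hex(linear_address(0x7FFDF000, 0x30)))
```

This is why the second solution requires both a descriptor with a custom base address and segmentation support in the emulator: the offset in the instruction stays 0x30 in both cases.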
The PEB contains interesting information, such as the linked list of the modules mapped in memory by the loader. By default, if you activate the Windows structures emulation, Miasm will create a PEB with dummy information related to its loader. However, you can force Miasm to load specific modules and use them to create a consistent linked list of loaded modules (see below).

To load all this information automatically, you can use the class miasm2.analysis.sandbox::Sandbox_Win_x86_32, which takes a binary’s path as input and sets up a minimal environment like the one previously described. An example is in miasm/example/jitter/sandbox_pe_x86_32.py.

The PE binary given to the sandbox is iexplore.exe (the exploit target). This binary will serve as a host and will be used by Miasm to build the loader structure. Module dependencies will be loaded as well (they have to be present in the ./win_dll directory). As the shellcode doesn’t interact with this binary, we can also load a dummy binary (like calc.exe). Last but not least, if you don’t have calc.exe, you can build a valid executable from the shellcode using elfesteem:

    import sys
    from elfesteem import pe_init

    # Get the shellcode
    data = open(sys.argv[1]).read()
    # Generate a PE
    pe = pe_init.PE(wsize=32)
    # Add a ".text" section containing the shellcode to the PE
    s_text = pe.SHList.add_section(name=".text", addr=0x1000, data=data)
    # Set the entrypoint to the shellcode's address
    pe.Opthdr.AddressOfEntryPoint = s_text.addr
    # Write the PE to "sc_pe.exe"
    open('sc_pe.exe', 'wb').write(str(pe))

In the next part, we will base our script on miasm/example/jitter/sandbox_pe_x86_32.py. This script is used to load a binary and create a working environment.
Here are the default options:

    $ python run_sc.py -h
    usage: run_sc.py [-h] [-a ADDRESS] [-x] [-b] [-z] [-d] [-g GDBSERVER]
                     [-j JITTER] [-q] [-i] [-s] [-o] [-y] [-l] [-r]
                     filename

    PE sandboxer

    positional arguments:
      filename              PE Filename

    optional arguments:
      -h, --help            show this help message and exit
      -a ADDRESS, --address ADDRESS
                            Force entry point address
      -x, --dumpall         Load base dll
      -b, --dumpblocs       Log disasm blocks
      -z, --singlestep      Log single step
      -d, --debugging       Debug shell
      -g GDBSERVER, --gdbserver GDBSERVER
                            Listen on port @port
      -j JITTER, --jitter JITTER
                            Jitter engine. Possible values are: tcc (default),
                            llvm, python
      -q, --quiet-function-calls
                            Don't log function calls
      -i, --dependencies    Load PE and its dependencies
      -s, --usesegm         Use segments
      -o, --load-hdr        Load pe hdr
      -y, --use-seh         Use windows SEH
      -l, --loadbasedll     Load base dll (path './win_dll')
      -r, --parse-resources
                            Load resources

Here, the interesting options are:

-s (--usesegm) to use segmentation
-y (--use-seh) to generate minimalistic Windows structures (yes, the name is poorly chosen)
-l (--loadbasedll) to arbitrarily load a bunch of modules/DLLs (more on this later)
-b (--dumpblocs) to display a block trace

As mentioned before, we can force the libraries to be loaded from a default list:

    # Sandbox.ALL_IMP_DLL
    ALL_IMP_DLL = ["ntdll.dll", "kernel32.dll", "user32.dll",
                   "ole32.dll", "urlmon.dll",
                   "ws2_32.dll", 'advapi32.dll', "psapi.dll",
                   ]

We will modify the script to load the shellcode and start the execution at its address:

    ...
    # Parse arguments
    parser = Sandbox_Win_x86_32.parser(description="PE sandboxer")
    parser.add_argument("filename", help="PE Filename")
    # Get the shellcode from the second argument
    parser.add_argument("shellcode", help="shellcode file")
    options = parser.parse_args()

    # Create sandbox
    sb = Sandbox_Win_x86_32(options.filename, options, globals())

    # Load the shellcode
    data = open(options.shellcode).read()
    run_addr = 0x40000000
    sb.jitter.vm.add_memory_page(run_addr, PAGE_READ | PAGE_WRITE, data)
    sb.jitter.cpu.EAX = run_addr

    # Run
    sb.run(run_addr)

Here is the command line to run this script (here we use box_upx.exe as the host executable):

    python -i run_sc.py -b -s -l -y miasm/example/samples/box_upx.exe shellcode.bin

Note that you will need a directory named win_dll containing DLLs (for instance, those from Windows XP).
Here is the output:

    [INFO]: Loading module 'ntdll.dll'
    [INFO]: Loading module 'kernel32.dll'
    [INFO]: Loading module 'user32.dll'
    [INFO]: Loading module 'ole32.dll'
    [INFO]: Loading module 'urlmon.dll'
    [INFO]: Loading module 'ws2_32.dll'
    [INFO]: Loading module 'advapi32.dll'
    [INFO]: Loading module 'psapi.dll'
    [WARNING]: Create dummy entry for 'msvcrt.dll'
    [WARNING]: Create dummy entry for 'iertutil.dll'
    [WARNING]: Create dummy entry for 'oleaut32.dll'
    [WARNING]: Create dummy entry for 'rpcrt4.dll'
    [WARNING]: Create dummy entry for 'shlwapi.dll'
    [WARNING]: Create dummy entry for 'gdi32.dll'
    [WARNING]: Create dummy entry for 'ws2help.dll'
    INFO: Add module 0 ''
    INFO: Add module 400000 'box_upx.exe'
    INFO: Add module 45180000 'urlmon.dll'
    INFO: Add module 7c800000 'kernel32.dll'
    INFO: Add module 77da0000 'advapi32.dll'
    INFO: Add module 7c910000 'ntdll.dll'
    INFO: Add module 774a0000 'ole32.dll'
    INFO: Add module 719f0000 'ws2_32.dll'
    INFO: Add module 76ba0000 'psapi.dll'
    INFO: Add module 7e390000 'user32.dll'
    INFO: Ldr 342f00

Here, Miasm tries to load the required modules (ntdll.dll, …). Some of them are present in win_dll/ and are loaded, some are not. For those which are not present, Miasm will create a dummy base address and dummy exported addresses (near 0x7111XXXX). Next, Miasm loads the host binary (box_upx.exe). Here is an extract of the block trace:

    ...
    PUSH 0xEC0E4E8E
    PUSH 0x6E2BCA17
    CALL loc_00000000400002CA:0x400002ca
    -> c_next:loc_0000000040000076:0x40000076
    loc_00000000400002CA:0x400002ca
    POP ECX
    CALL loc_00000000400002D9:0x400002d9
    -> c_next:loc_00000000400002D0:0x400002d0
    loc_00000000400002D9:0x400002d9
    PUSHAD
    XOR EAX, EAX
    MOV EDX, DWORD PTR FS:[EAX+0x30]
    MOV EDX, DWORD PTR [EDX+0xC]
    MOV EDX, DWORD PTR [EDX+0x14]
    MOV ESI, DWORD PTR [EDX+0x28]

This is the part which extracts imports from the PEB structure.
The shellcode finds its dependencies using function and DLL hashes (0xEC0E4E8E and 0x6E2BCA17). This code is typical for a trained eye:

    LODSB
    TEST AL, AL
    JZ loc_0000000040000342:0x40000342
    -> c_to:loc_0000000040000342:0x40000342 c_next:loc_000000004000033B:0x4000033b
    loc_0000000040000337:0x40000337
    TEST AL, AL
    JZ loc_0000000040000342:0x40000342
    -> c_to:loc_0000000040000342:0x40000342 c_next:loc_000000004000033B:0x4000033b
    loc_000000004000033B:0x4000033b
    ROR EDI, 0xD
    ADD EDI, EAX

The resolver walks the loader’s linked list of modules (the list at offset 0x14 of the Ldr structure, i.e. InMemoryOrderModuleList, as read in the previous block) and finds a module whose name’s hash matches the provided one. In this case, it will be kernel32.dll. Then it walks the export directory of this module the same way to find the expected export. For the moment, we don’t know which function is searched for, but let’s look at the next logs:

    ADD EAX, EBP
    MOV DWORD PTR [ESP+0x1C], EAX
    POPAD
    RET 0x8
    loc_00000000400002D6:0x400002d6
    PUSH ECX
    JMP EAX
    [INFO]: kernel32_LoadLibraryA(dllname=0x13ffe0) ret addr: 0x40000076
    loc_0000000040000076:0x40000076

The jitter tells us that the code called the function LoadLibraryA from the module kernel32. This is the resolved function. But how does Miasm know this? In fact, each time you load a library in memory, Miasm adds a breakpoint on each of its exported addresses, and remembers the relation between the address and the exported name. When the emulated program counter reaches one of these breakpoints, the emulation is paused. Miasm then tries to find a Python function whose name has the form ModuleName_ModuleFunction and calls it. Miasm ships a minimalistic set of such Windows functions which, once called, have the same side effects on the sandbox registers/memory as the real functions. For example, if a binary calls rand, we can force its return value to make it less random:

    def msvcrt_rand(jitter):
        ret_ad, _ = jitter.func_args_cdecl(0)
        jitter.func_ret_stdcall(ret_ad, 0x666)

Those default functions are defined in the module miasm2.os_dep.win_api_x86_32. Here is the code of LoadLibraryA:

    def kernel32_LoadLibraryA(jitter):
        # jitter.func_args_stdcall is a helper which knows the current calling
        # convention (stack based here), and will unstack the return address
        # and one parameter (dllname). dllname is a pointer to the dll name
        # string in memory.
        ret_ad, args = jitter.func_args_stdcall(["dllname"])
        libname = get_str_ansi(jitter, args.dllname, 0x100)
        log.info(libname)
        ret = winobjs.runtime_dll.lib_get_add_base(libname)
        log.info("ret %x", ret)
        # jitter.func_ret_stdcall is another helper which will set the program
        # counter to the value ret_ad and the return value (EAX in this
        # convention) to ret.
        jitter.func_ret_stdcall(ret_ad, ret)

The jitter then resumes the execution at the fresh program counter, as if the Windows function had been called. This mechanism allows us to script or simulate any function in Python! By the way, if you implement the previous two helpers for ARM, you can use the same Python code to simulate LoadLibraryA on Windows for this architecture.

Note that if you want to get the module name, you can modify the script to log it, or put a breakpoint at 0x40000076 to stop the execution and retrieve the module name manually. Here is the modification:

    def stop_exec(jitter):
        return False

    sb.jitter.add_breakpoint(0x40000076, stop_exec)
    # Run the shellcode
    sb.run(run_addr)

And the live analysis:

    python -i run_sc.py -b -s -l -y miasm/example/samples/box_upx.exe shellcode.bin
    ...
    >>> sb.jitter.get_str_ansi(0x13ffe0)
    'urlmon'
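The ROR EDI, 0xD / ADD EDI, EAX pair seen in the resolver is the signature of the classic "ROR 13" additive hash. As a hedged illustration, the Python 3 sketch below reimplements that hash for export names, under the assumption that the shellcode hashes the ASCII name byte by byte and stops at the NUL terminator, as the LODSB / TEST AL, AL / JZ sequence suggests. The function names ror32 and ror13_hash are hypothetical. The module-name hash (0x6E2BCA17) is computed on a different string representation and is not covered here.

```python
def ror32(value, count):
    """Rotate a 32-bit value right by count bits."""
    count %= 32
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF


def ror13_hash(name):
    """Hash an export name: for each byte, rotate right by 13 then add the byte."""
    h = 0
    for c in bytearray(name):
        h = (ror32(h, 13) + c) & 0xFFFFFFFF
    return h


print(hex(ror13_hash(b"LoadLibraryA")))
```

Hashing the export names of a DLL this way and comparing them with the constants pushed by the shellcode gives a static way to recover the resolved functions, which complements the dynamic approach used in this article.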

Party Hard

What’s next? Another crash, obviously!

    loc_0000000040000083:0x40000083
    PUSH EAX
    PUSH 0x6
    PUSH 0x0
    PUSH 0xDC8061B
    PUSH 0x2E773AE6
    CALL loc_00000000400002CA:0x400002ca
    -> c_next:loc_0000000040000097:0x40000097
    Traceback (most recent call last):
    ...
    raise ValueError('unknown api', hex(jitter.pc), repr(fname))
    ValueError: ('unknown api', '0x774c1473L', "'ole32_CoInitializeEx'")

What happened here? The function at address 0x400002ca is the one which resolves a function by hash. So the code resolved another function and tries to call it. By the way, if you think that the log output is not really human friendly, you can add some symbols to enhance it. For example:

    ...
    # Link address 0x400002ca to the label name resolve_by_hash
    sb.jitter.ir_arch.symbol_pool.add_label('resolve_by_hash', 0x400002ca)
    # Run the shellcode
    sb.run(run_addr)

Result:

    loc_0000000040000083:0x40000083
    PUSH EAX
    PUSH 0x6
    PUSH 0x0
    PUSH 0xDC8061B
    PUSH 0x2E773AE6
    CALL resolve_by_hash:0x400002ca
    -> c_next:loc_0000000040000097:0x40000097
    Traceback (most recent call last):

That’s a bit clearer. So what’s the problem now? Miasm reaches an internal breakpoint on the function ole32_CoInitializeEx. Unluckily, this function is not implemented in the default library. But are we really stuck here? Not really. According to the MSDN documentation, this function initializes the COM library for the calling thread and returns S_OK (0x0) on success, or S_FALSE (0x1) if COM is already initialized on this thread; both are success codes, so returning 0x1 is good enough here. Fine, let’s implement a minimalistic function in our script. Doesn’t it feel like reimplementing the Windows API using architecture-independent code?

    def ole32_CoInitializeEx(jitter):
        ret_ad, args = jitter.func_args_stdcall(["pvReserved", "dwCoInit"])
        jitter.func_ret_stdcall(ret_ad, 1)

WARNING: the position of the function declaration is important: it must be defined in the script before the instantiation of the sandbox. This way, the declaration belongs to the globals().
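The name-based lookup behind the "unknown api" error can be sketched in a few lines. The following Python 3 code is a simplified model, not Miasm's actual implementation: make_dispatcher is a hypothetical helper, an explicit dictionary stands in for globals(), and the handlers take no jitter argument. The addresses are the ones shown in the article's error messages.

```python
def make_dispatcher(exports, handlers):
    """Build a breakpoint callback.

    exports:  address -> (module name, function name)
    handlers: Python function name -> callable (stand-in for globals())
    """
    def on_breakpoint(pc):
        module, function = exports[pc]
        fname = "%s_%s" % (module, function)
        if fname not in handlers:
            # Same shape as the error raised in the logs above
            raise ValueError('unknown api', hex(pc), repr(fname))
        return handlers[fname]()
    return on_breakpoint


def ole32_CoInitializeEx():
    # Simplified handler: only the return value is modeled
    return 1


exports = {
    0x774c1473: ("ole32", "CoInitializeEx"),
    0x7c936102: ("ntdll", "swprintf"),
}
dispatch = make_dispatcher(exports, {"ole32_CoInitializeEx": ole32_CoInitializeEx})
print(dispatch(0x774c1473))
```

In this model, adding a handler is just a matter of registering one more name, which mirrors why the function only needs to exist in globals() before the sandbox is created.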
The logs are now:

    PUSH 0xDC8061B
    PUSH 0x2E773AE6
    CALL resolve_by_hash:0x400002ca
    -> c_next:loc_0000000040000097:0x40000097
    [INFO]: ole32_CoInitializeEx(a=0x0, b=0x6) ret addr: 0x40000097

Ok, now we have emulated the function. But there is more:

    PUSH 0x91AFCA54
    PUSH 0x6E2BCA17
    CALL resolve_by_hash:0x400002ca
    -> c_next:loc_00000000400000B0:0x400000b0
    [INFO]: kernel32_VirtualAlloc(lpvoid=0x0, dwsize=0x1000, alloc_type=0x1000, flprotect=0x40) ret addr: 0x400000b0

The shellcode resolved and called the function kernel32_VirtualAlloc, which is already implemented in the Miasm library. Then there is a call to another function:

    PUSH 0xCFD98161
    PUSH 0x6E2BCA17
    CALL resolve_by_hash:0x400002ca
    -> c_next:loc_00000000400000C0:0x400000c0
    [INFO]: kernel32_GetVersion() ret addr: 0x400000c0
    loc_00000000400000C0:0x400000c0
    CMP AL, 0x6
    JL loc_00000000400000D4:0x400000d4

Hey, it seems the shellcode behaves differently depending on the Windows version. Note that defining a custom kernel32_GetVersion will override the one defined in the Miasm library, so you can play with its behavior to see the impact on the shellcode. And now, another crash:

    PUSH 0xD7834A7E
    PUSH 0xAD74DBF2
    CALL resolve_by_hash:0x400002ca
    -> c_next:loc_0000000040000184:0x40000184
    Traceback (most recent call last):
    raise ValueError('unknown api', hex(jitter.pc), repr(fname))
    ValueError: ('unknown api', '0x7c936102L', "'ntdll_swprintf'")

The script tries to resolve and execute ntdll_swprintf. This one will be a bit harder. As a first step, let’s only dump the format string:

    def ntdll_swprintf(jitter):
        ret_ad, args = jitter.func_args_stdcall(["dst", "pfmt"])
        fmt = jitter.get_str_unic(args.pfmt)
        print repr(fmt)
        return False

Here is the output:

    PUSH 0xD7834A7E
    PUSH 0xAD74DBF2
    CALL resolve_by_hash:0x400002ca
    -> c_next:loc_0000000040000184:0x40000184
    [INFO]: ntdll_swprintf(dst=0x20000000, pfmt=0x13ffc8) ret addr: 0x40000184
    '%S'

As the format string is really simple, let’s implement a minimalistic version of swprintf:

    def ntdll_swprintf(jitter):
        ret_ad, args = jitter.func_args_stdcall(["dst", "pfmt"])
        fmt = jitter.get_str_unic(args.pfmt)
        print "FMT:", repr(fmt)
        if fmt == "%S":
            # %S in a wide format string: the argument is an ANSI string
            psrc = jitter.pop_uint32_t()
            src = jitter.get_str_ansi(psrc)
            out = "%s" % src
        else:
            raise RuntimeError("unknown fmt %s" % fmt)
        print "OUT:", repr(out)
        jitter.set_str_unic(args.dst, out)
        # Return the number of wide characters written
        # (out is still a plain Python string here, one character per wchar)
        jitter.func_ret_stdcall(ret_ad, len(out))

Let’s have a look at the new output:

    PUSH 0xD7834A7E
    PUSH 0xAD74DBF2
    CALL resolve_by_hash:0x400002ca
    -> c_next:loc_0000000040000184:0x40000184
    [INFO]: ntdll_swprintf(dst=0x20000000, pfmt=0x13ffc8) ret addr: 0x40000184
    FMT: '%S'
    OUT: 'hXXp://efyjlXXXXXXXXXXXXXXXXXXin.net/fXXXXXXXXXXXXXXX8867XXXX5'
    loc_0000000040000184:0x40000184
    ...
    PUSH ESI
    PUSH EDI
    PUSH ECX
    CALL DWORD PTR [EBP+0xFFFFFFFC]
    -> c_next:loc_0000000040000161:0x40000161
    Traceback (most recent call last):
    raise ValueError('unknown api', hex(jitter.pc), repr(fname))
    ValueError: ('unknown api', '0x451b65b3L', "'urlmon_URLDownloadToCacheFileW'")

Note: we deliberately altered the URL in the output of the script to avoid being flagged as a malicious host. Here is a minimalistic implementation of URLDownloadToCacheFileW:

    ...
    def urlmon_URLDownloadToCacheFileW(jitter):
        ret_ad, args = jitter.func_args_stdcall(["lpunkcaller",
                                                 "szurl",
                                                 "szfilename",
                                                 "ccfilename",
                                                 "reserved",
                                                 "pbsc"])
        url = jitter.get_str_unic(args.szurl)
        print "URL:", url
        jitter.set_str_unic(args.szfilename, "toto")
        jitter.func_ret_stdcall(ret_ad, 0)

This tells the shellcode that we have successfully downloaded a binary and stored it in a file named toto. And here is the final log:

    PUSH EDI
    PUSH ECX
    PUSH EAX
    PUSH EAX
    PUSH EAX
    PUSH EAX
    PUSH EAX
    PUSH EAX
    PUSH EAX
    PUSH DWORD PTR [EBP+0x8]
    PUSH 0x16B3FE88
    PUSH 0x6E2BCA17
    CALL resolve_by_hash:0x400002ca
    -> c_next:loc_00000000400002C5:0x400002c5
    Traceback (most recent call last):
    raise ValueError('unknown api', hex(jitter.pc), repr(fname))
    ValueError: ('unknown api', '0x7c802336L', "'kernel32_CreateProcessW'")

Look at the first argument:

    >>> sb.jitter.get_str_unic(sb.jitter.get_stack_arg(1))
    'toto'

The shellcode tries to execute the freshly downloaded binary.
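The handlers above exchange wide strings with the shellcode, so it helps to keep the difference between a byte count and a wide-character count in mind. The Python 3 sketch below shows the layout of a Windows wide string (UTF-16LE with a two-byte terminator). It makes no claim about how Miasm's set_str_unic is implemented; to_wide and wchar_count are hypothetical helpers.

```python
def to_wide(text):
    """Encode text as a NUL-terminated UTF-16LE buffer."""
    return text.encode("utf-16le") + b"\x00\x00"


def wchar_count(text):
    """Number of 16-bit units needed for text, terminator excluded."""
    return len(text.encode("utf-16le")) // 2


buf = to_wide("toto")
print(repr(buf), len(buf), wchar_count("toto"))
```

A function such as swprintf reports its result in wide characters, so the division by two only applies when starting from the encoded buffer, not from the original string.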