Intro

It has been a while since my last post, so I figured it’s time to write something. Recently I stumbled on a piece of malware called Ponmocup, which is an interesting strain, but since plenty has been written about it I won’t go into its details. While analyzing the malware I noticed that all the strings it uses are encrypted and only decrypted at runtime. The decryption loops are inlined all over the code, and it seems to use various methods to decrypt the strings, whereas other malware uses the same routine/algorithm most of the time, since programmers are lazy (fact!). Normally I would write an IDA script to decrypt them, but this time I chose a different approach: using the Unicorn emulator.

String decryption loop

    1000B7C1 > \B9 30897B61       MOV ECX,617B8930
    1000B7C6 .  898D 90F1FFFF     MOV DWORD PTR SS:[EBP-E70],ECX
    1000B7CC .  33C0              XOR EAX,EAX
    1000B7CE >  8985 88F1FFFF     MOV DWORD PTR SS:[EBP-E78],EAX
    1000B7D4 .  83F8 1D           CMP EAX,1D
    1000B7D7 .  73 1E             JNB SHORT 1100_300.1000B7F7
    1000B7D9 .  81C1 5F6BA82E     ADD ECX,2EA86B5F
    1000B7DF .  898D 90F1FFFF     MOV DWORD PTR SS:[EBP-E70],ECX
    1000B7E5 .  33D2              XOR EDX,EDX
    1000B7E7 .  8A1445 14070110   MOV DL,BYTE PTR DS:[EAX*2+10010714]
    1000B7EE .  03D1              ADD EDX,ECX
    1000B7F0 .  885405 C4         MOV BYTE PTR SS:[EBP+EAX-3C],DL
    1000B7F4 .  40                INC EAX
    1000B7F5 .^ EB D7             JMP SHORT 1100_300.1000B7CE

This is a sample of such a decryption loop. It has a pretty simple code flow: some registers are initialized at the beginning, then it enters a loop built from an unconditional jump backwards (the JMP at 1000B7F5) and a conditional jump (the JNB at 1000B7D7) that leaves the loop. The encrypted data is loaded from 10010714.
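To make the arithmetic of this particular loop explicit, here is a minimal Python 3 sketch that mirrors it instruction by instruction. The constants come straight from the listing; the `encrypt_additive` helper and the zero filler bytes at odd indices are my own assumptions, added only so the routine can be exercised without the real bytes at 10010714.

```python
MASK32 = 0xFFFFFFFF


def decrypt_additive(encrypted, length=0x1D, seed=0x617B8930, step=0x2EA86B5F):
    """Mirror the loop at 1000B7C1: one output byte per two input bytes."""
    ecx = seed                              # MOV ECX,617B8930
    out = bytearray()
    for eax in range(length):               # CMP EAX,1D / JNB
        ecx = (ecx + step) & MASK32         # ADD ECX,2EA86B5F
        dl = encrypted[eax * 2]             # MOV DL,[EAX*2+10010714]
        out.append((dl + ecx) & 0xFF)       # ADD EDX,ECX / MOV [EBP+EAX-3C],DL
    return bytes(out)


def encrypt_additive(plain, seed=0x617B8930, step=0x2EA86B5F):
    """Hypothetical inverse, used only to fabricate input for a round trip."""
    ecx = seed
    out = bytearray()
    for ch in plain:
        ecx = (ecx + step) & MASK32
        # The loop only reads even indices; the odd filler byte is an assumption.
        out += bytes([(ch - ecx) & 0xFF, 0x00])
    return bytes(out)


if __name__ == "__main__":
    blob = encrypt_additive(b"%u.%u.%u.%u.%u.%u.%u.%u.%s.%i")
    print(decrypt_additive(blob))
```

Writing one of these by hand is easy enough, but every loop in the sample uses slightly different constants and operations, which is exactly why emulating the original code is attractive.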

Concept

So the idea is to use the Capstone disassembler library to analyze the decryption loop, starting from a given virtual address and locating the first unconditional jump backwards. Along the way we keep track of memory operations that use EBP as the base register; whenever such an instruction is found, we log the displacement value (the offset into the stack frame), which helps us locate the decrypted string later.
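As a small illustration of what "unconditional jump backwards" means at the byte level, the following Python 3 sketch decodes a short `JMP rel8` by hand. It only handles the two-byte `EB xx` encoding and is not how Capstone works internally; the function names are made up for this example.

```python
import struct


def short_jmp_target(address, code):
    """Return the target of a 2-byte `JMP rel8` (opcode 0xEB), else None."""
    if len(code) < 2 or code[0] != 0xEB:
        return None
    # rel8 is a signed byte, relative to the address of the next instruction.
    (rel8,) = struct.unpack("b", code[1:2])
    return (address + 2 + rel8) & 0xFFFFFFFF


def is_backward_jump(address, code):
    """True if the bytes encode a short JMP whose target lies before it."""
    target = short_jmp_target(address, code)
    return target is not None and target < address


if __name__ == "__main__":
    # Last instruction of the sample loop: 1000B7F5  EB D7
    print(hex(short_jmp_target(0x1000B7F5, bytes.fromhex("EBD7"))))
```

For the sample loop this gives 0x1000B7F5 + 2 - 0x29 = 0x1000B7CE, the loop head, while the conditional `73 1E` at 1000B7D7 is ignored because it is not an unconditional JMP.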

Once we have traced the code with the disassembler library, we map the target binary into the emulator’s memory at its known image base and copy each section to its virtual address, so virtual addresses in the binary match up with what’s in memory.
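The mapping step can be sketched without any emulator at all. The Python 3 snippet below builds a flat image from `(rva, raw_bytes)` pairs, which is a stand-in I chose for pefile's section objects, and rounds the size up to a 4 KB page, since Unicorn's `mem_map` requires page-aligned addresses and sizes. The helper names are hypothetical.

```python
PAGE_SIZE = 0x1000


def page_align(size, page=PAGE_SIZE):
    """Round size up to the next multiple of the page size."""
    return (size + page - 1) & ~(page - 1)


def build_image(image_size, sections):
    """Lay out sections at their RVAs inside a zero-filled, page-aligned image.

    sections: iterable of (rva, raw_bytes) pairs (assumed input format).
    """
    image = bytearray(page_align(image_size))
    for rva, raw in sections:
        if rva + len(raw) > len(image):
            raise ValueError("section at RVA 0x%x does not fit" % rva)
        image[rva:rva + len(raw)] = raw
    return image


def read_va(image, imagebase, va, size):
    """Read bytes at a virtual address, as the emulated code would see them."""
    start = va - imagebase
    return bytes(image[start:start + size])


if __name__ == "__main__":
    img = build_image(0x2800, [(0x1000, b"\x90\x90"), (0x2000, b"AB")])
    print(len(img), read_va(img, 0x10000000, 0x10002000, 2))
```

Because every section lands at image base plus RVA, an absolute operand such as `[EAX*2+10010714]` resolves to the right bytes without any relocation work.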

Code analyzer

    from capstone import *
    from capstone.x86 import *

    def code_analyzer(pe, virtualaddress, max_instructions=128):
        # get the raw offset from the virtualaddress
        a_off = pe.get_offset_from_rva(virtualaddress - pe.OPTIONAL_HEADER.ImageBase)

        # init disassembler lib
        caps = Cs(CS_ARCH_X86, CS_MODE_32)
        caps.detail = True

        # init vars
        code_len = 0
        stack_offsets = []
        jmpfound = False

        # disassemble code and analyze the instructions
        for ins in caps.disasm(pe.__data__[a_off:], virtualaddress, max_instructions):
            # increase code_len with current instruction size
            code_len += ins.size

            if verbose:
                print(format_disasembly(ins))

            # process operands
            if ins.operands:
                for ops in ins.operands:
                    # memory access operands
                    if ops.type == X86_OP_MEM:
                        # ebp base register and disp value not 0
                        if ops.value.mem.base == X86_REG_EBP and ops.value.mem.disp != 0:
                            disp = abs(ops.value.mem.disp)
                            # add new disp value
                            if disp not in stack_offsets:
                                stack_offsets.append(disp)

            # process groups
            if ins.groups:
                # jump types
                if ins.group(CS_GRP_JUMP):
                    # JMP backwards
                    if ins.id == X86_INS_JMP and int(ins.op_str, 16) < ins.address:
                        jmpfound = True
                        break
                # return types
                elif ins.group(CS_GRP_RET):
                    break

        # false if max instructions reached
        if not jmpfound:
            print("End decryption loop not found")
            return 0, []

        # paranoid mode
        if len(stack_offsets) == 0:
            print("No stack offsets found")
            return 0, []

        # ...

        for offset in stack_offsets:
            if offset > stacksize:
                print("Stack offset 0x%08x is larger than the stacksize 0x%08x" % (offset, stacksize))
                return 0, []

        # return code length and stackoffsets sorted descending
        return code_len, sorted(stack_offsets, reverse=True)

This code returns the total size in bytes of all instructions up to and including the jmp, plus a list of all displacement values where EBP was used as the base register, e.g. MOV DWORD PTR SS:[EBP-E78],EAX.
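To show where those displacement values come from, here is a hand-rolled Python 3 decoder for one specific encoding, `MOV [EBP+disp32], r32` (opcode 0x89 with ModRM mod=10, rm=101). It is only a sketch for this single instruction form, with a function name of my own; Capstone exposes the same value as the operand's `mem.disp`.

```python
import struct


def ebp_disp32(code):
    """Decode the signed disp32 of `MOV [EBP+disp32], r32`, else None."""
    if len(code) < 6:
        return None
    opcode, modrm = code[0], code[1]
    mod, rm = modrm >> 6, modrm & 0b111
    if opcode != 0x89 or mod != 0b10 or rm != 0b101:
        return None
    # The displacement is a little-endian signed 32-bit value after ModRM.
    (disp,) = struct.unpack_from("<i", code, 2)
    return disp


if __name__ == "__main__":
    samples = ["898D90F1FFFF", "898588F1FFFF"]   # bytes from the sample loop
    disps = [ebp_disp32(bytes.fromhex(s)) for s in samples]
    # Same post-processing as the analyzer: absolute value, sorted descending.
    print([hex(d) for d in sorted({abs(d) for d in disps}, reverse=True)])
```

The raw displacement is negative (locals live below EBP), which is why the analyzer stores `abs(disp)`. For the sample loop the analyzer should end up with a code length of 0x36 (0x1000B7F7 minus 0x1000B7C1) and the offsets 0xe78, 0xe70 and 0x3c.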

Emulator

    from unicorn import *
    from unicorn.x86_const import *

    # Initialize emulator
    emu = Uc(UC_ARCH_X86, UC_MODE_32)

    # map memory at the imagebase and copy each section's
    # data to its virtualaddress
    emu.mem_map(imagebase, imagesize + stacksize)
    for section in pe.sections:
        emu.mem_write(imagebase + section.VirtualAddress, section.get_data())

    # initialize stack registers ebp and esp
    emu.reg_write(UC_X86_REG_ESP, stackaddress + stacksize)
    emu.reg_write(UC_X86_REG_EBP, stackaddress + stacksize)

    # start emulator
    emu.emu_start(virtualaddress, virtualaddress + code_len)

    # use the largest stack_offset value to define the min.
    # amount of stack data to read
    ebp_addr = stackaddress + stacksize - stack_offsets[0]

    # read stack memory, largest stack_offset as size
    data = emu.mem_read(ebp_addr, stack_offsets[0])

The snippet above maps the target binary into memory as explained earlier, sets up some stack memory and registers, and then starts the emulator. Once emulation is done, it reads the stack memory and processes it by trying to locate strings at the known displacement offsets.
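The string lookup itself is not shown here, so the following is a rough Python 3 sketch of how it could work, not the code from the script. It assumes `data` starts at EBP minus the largest offset and ends at EBP, as in the read above, so a string at `[EBP - off]` begins at index `len(data) - off`. All function names and the minimum length of 4 are my own choices.

```python
PRINTABLE = range(0x20, 0x7F)   # plain printable ASCII


def ascii_at(data, offset, min_len=4):
    """Return the printable ASCII run starting at EBP - offset, or None."""
    if not 0 < offset <= len(data):
        return None
    start = end = len(data) - offset
    while end < len(data) and data[end] in PRINTABLE:
        end += 1
    if end - start < min_len:
        return None
    return bytes(data[start:end]).decode("ascii")


def utf16_at(data, offset, min_len=4):
    """Return a UTF-16LE run of ASCII-range characters at EBP - offset, or None."""
    if not 0 < offset <= len(data):
        return None
    start = end = len(data) - offset
    while end + 1 < len(data) and data[end] in PRINTABLE and data[end + 1] == 0:
        end += 2
    if (end - start) // 2 < min_len:
        return None
    return bytes(data[start:end]).decode("utf-16-le")


def find_strings(data, stack_offsets, min_len=4):
    """Try every logged displacement; return (offset, type, length, content)."""
    results = []
    for off in stack_offsets:
        text = ascii_at(data, off, min_len)
        kind = "ASCII"
        if text is None:
            text, kind = utf16_at(data, off, min_len), "UTF-16"
        if text is not None:
            results.append((off, kind, len(text), text))
    return results
```

Displacements that only hold loop counters or keys fall out naturally, because whatever sits there rarely forms a printable run of four or more characters.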

Some results

    C:\>pomno_decrstr.py 1100.3002.dll 0x1000b7c1
    1000B7C1 B930897B61 mov ecx, 0x617b8930
    1000B7C6 898D90F1FFFF mov dword ptr [ebp - 0xe70], ecx
    1000B7CC 33C0 xor eax, eax
    1000B7CE 898588F1FFFF mov dword ptr [ebp - 0xe78], eax
    1000B7D4 83F81D cmp eax, 0x1d
    1000B7D7 731E jae 0x1000b7f7
    1000B7D9 81C15F6BA82E add ecx, 0x2ea86b5f
    1000B7DF 898D90F1FFFF mov dword ptr [ebp - 0xe70], ecx
    1000B7E5 33D2 xor edx, edx
    1000B7E7 8A144514070110 mov dl, byte ptr [eax*2 + 0x10010714]
    1000B7EE 03D1 add edx, ecx
    1000B7F0 885405C4 mov byte ptr [ebp + eax - 0x3c], dl
    1000B7F4 40 inc eax
    1000B7F5 EBD7 jmp 0x1000b7ce

    offset type length content
    ================================
    00003c ASCII 29 %u.%u.%u.%u.%u.%u.%u.%u.%s.%i

    C:\>pomno_decrstr.py 1100.3002.dll 0x1000b547
    1000B547 BA0F000000 mov edx, 0xf
    1000B54C 899520E8FFFF mov dword ptr [ebp - 0x17e0], edx
    1000B552 B8EB000000 mov eax, 0xeb
    1000B557 898524E8FFFF mov dword ptr [ebp - 0x17dc], eax
    1000B55D B9E5000000 mov ecx, 0xe5
    1000B562 898D28E8FFFF mov dword ptr [ebp - 0x17d8], ecx
    1000B568 898D2CE8FFFF mov dword ptr [ebp - 0x17d4], ecx
    1000B56E C78530E8FFFF19000000 mov dword ptr [ebp - 0x17d0], 0x19
    1000B578 C78534E8FFFF23000000 mov dword ptr [ebp - 0x17cc], 0x23
    1000B582 C78538E8FFFFE1000000 mov dword ptr [ebp - 0x17c8], 0xe1
    1000B58C 89853CE8FFFF mov dword ptr [ebp - 0x17c4], eax
    1000B592 C78540E8FFFFEA000000 mov dword ptr [ebp - 0x17c0], 0xea
    1000B59C 898544E8FFFF mov dword ptr [ebp - 0x17bc], eax
    1000B5A2 C78548E8FFFFA1000000 mov dword ptr [ebp - 0x17b8], 0xa1
    1000B5AC 89854CE8FFFF mov dword ptr [ebp - 0x17b4], eax
    1000B5B2 C78550E8FFFFE6000000 mov dword ptr [ebp - 0x17b0], 0xe6
    1000B5BC C78554E8FFFF2F000000 mov dword ptr [ebp - 0x17ac], 0x2f
    1000B5C6 C78558E8FFFFF3000000 mov dword ptr [ebp - 0x17a8], 0xf3
    1000B5D0 C7855CE8FFFFDE000000 mov dword ptr [ebp - 0x17a4], 0xde
    1000B5DA B848840000 mov eax, 0x8448
    1000B5DF 898560E8FFFF mov dword ptr [ebp - 0x17a0], eax
    1000B5E5 32C9 xor cl, cl
    1000B5E7 888D9EE8FFFF mov byte ptr [ebp - 0x1762], cl
    1000B5ED 899518E8FFFF mov dword ptr [ebp - 0x17e8], edx
    1000B5F3 3ACA cmp cl, dl
    1000B5F5 732F jae 0x1000b626
    1000B5F7 0FB7C0 movzx eax, ax
    1000B5FA 8BF0 mov esi, eax
    1000B5FC C1EE04 shr esi, 4
    1000B5FF C1E00C shl eax, 0xc
    1000B602 0BC6 or eax, esi
    1000B604 898560E8FFFF mov dword ptr [ebp - 0x17a0], eax
    1000B60A 0FB6F1 movzx esi, cl
    1000B60D 33DB xor ebx, ebx
    1000B60F 8A9CB524E8FFFF mov bl, byte ptr [ebp + esi*4 - 0x17dc]
    1000B616 03D8 add ebx, eax
    1000B618 885C35D4 mov byte ptr [ebp + esi - 0x2c], bl
    1000B61C FEC1 inc cl
    1000B61E 888D9EE8FFFF mov byte ptr [ebp - 0x1762], cl
    1000B624 EBCD jmp 0x1000b5f3

    offset type length content
    ================================
    00002c ASCII 15 /images2/%s.swf

    C:\>pomno_decrstr.py 1100.3002.dll 0x1000b20d
    1000B20D C685A7E8FFFF1C mov byte ptr [ebp - 0x1759], 0x1c
    1000B214 33C9 xor ecx, ecx
    1000B216 898D74E8FFFF mov dword ptr [ebp - 0x178c], ecx
    1000B21C BF3F000000 mov edi, 0x3f
    1000B221 89BD14E8FFFF mov dword ptr [ebp - 0x17ec], edi
    1000B227 0FBFC7 movsx eax, di
    1000B22A 3BC8 cmp ecx, eax
    1000B22C 7D36 jge 0x1000b264
    1000B22E 0FB685A7E8FFFF movzx eax, byte ptr [ebp - 0x1759]
    1000B235 8BD0 mov edx, eax
    1000B237 C1EA02 shr edx, 2
    1000B23A C1E006 shl eax, 6
    1000B23D 33D0 xor edx, eax
    1000B23F 33C0 xor eax, eax
    1000B241 8AC2 mov al, dl
    1000B243 8885A7E8FFFF mov byte ptr [ebp - 0x1759], al
    1000B249 33D2 xor edx, edx
    1000B24B 8A144D60C40010 mov dl, byte ptr [ecx*2 + 0x1000c460]
    1000B252 2BD0 sub edx, eax
    1000B254 88940D44FFFFFF mov byte ptr [ebp + ecx - 0xbc], dl
    1000B25B 41 inc ecx
    1000B25C 898D74E8FFFF mov dword ptr [ebp - 0x178c], ecx
    1000B262 EBC3 jmp 0x1000b227

    offset type length content
    ================================
    0000bc ASCII 62 %u&%04X&%02X&%u.%u&%u&%s&%s&%u.%u&%u&%x.%x.%x&%s&%04x.%04x&%s&

    C:\>pomno_decrstr.py 1100.3002.dll 0x1000afe5
    1000AFE5 B9B8690000 mov ecx, 0x69b8
    1000AFEA 894D84 mov dword ptr [ebp - 0x7c], ecx
    1000AFED 32C0 xor al, al
    1000AFEF 8845A7 mov byte ptr [ebp - 0x59], al
    1000AFF2 3C39 cmp al, 0x39
    1000AFF4 7326 jae 0x1000b01c
    1000AFF6 0FB7C9 movzx ecx, cx
    1000AFF9 8BD1 mov edx, ecx
    1000AFFB C1EA0E shr edx, 0xe
    1000AFFE C1E102 shl ecx, 2
    1000B001 0BCA or ecx, edx
    1000B003 894D84 mov dword ptr [ebp - 0x7c], ecx
    1000B006 0FB6F0 movzx esi, al
    1000B009 33D2 xor edx, edx
    1000B00B 8A14B530060110 mov dl, byte ptr [esi*4 + 0x10010630]
    1000B012 03D1 add edx, ecx
    1000B014 885435A8 mov byte ptr [ebp + esi - 0x58], dl
    1000B018 FEC0 inc al
    1000B01A EBD3 jmp 0x1000afef

    offset type length content
    ================================
    000058 ASCII 57 Mozilla/5.0 (Windows; U; MSIE 7.0; Windows NT 6.0; en-US)

    C:\>pomno_decrstr.py 1100.3002.dll 0x1000ae47
    1000AE47 B8BF1293F9 mov eax, 0xf99312bf
    1000AE4C 8945AC mov dword ptr [ebp - 0x54], eax
    1000AE4F 32C9 xor cl, cl
    1000AE51 884DBE mov byte ptr [ebp - 0x42], cl
    1000AE54 33D2 xor edx, edx
    1000AE56 8A150C060110 mov dl, byte ptr [0x1001060c]
    1000AE5C 81F2BF000000 xor edx, 0xbf
    1000AE62 8855BF mov byte ptr [ebp - 0x41], dl
    1000AE65 3ACA cmp cl, dl
    1000AE67 7329 jae 0x1000ae92
    1000AE69 8BD0 mov edx, eax
    1000AE6B C1EA19 shr edx, 0x19
    1000AE6E C1E007 shl eax, 7
    1000AE71 0BC2 or eax, edx
    1000AE73 8945AC mov dword ptr [ebp - 0x54], eax
    1000AE76 0FB6F9 movzx edi, cl
    1000AE79 33D2 xor edx, edx
    1000AE7B 8A147D0E060110 mov dl, byte ptr [edi*2 + 0x1001060e]
    1000AE82 33D0 xor edx, eax
    1000AE84 88543DC0 mov byte ptr [ebp + edi - 0x40], dl
    1000AE88 FEC1 inc cl
    1000AE8A 884DBE mov byte ptr [ebp - 0x42], cl
    1000AE8D 8A55BF mov dl, byte ptr [ebp - 0x41]
    1000AE90 EBD3 jmp 0x1000ae65

    offset type length content
    ================================
    000040 ASCII 16 jAhX4n4xQfx8p9P3

I couldn’t find a Unicode sample, but those are handled as well (don’t mind my string lookup code, it sucks, I know).

Finally

I fell in love with the Unicorn emulator library. I tried a few others in the past, and this one is by far the best out there currently. This malware sample was a simple example of how such an emulator can be used in a rather simple way to reach your goals.

The full script can be found here