Hey folks,

Happy New Year, and welcome to 2014!

On a recent trip to Tyson's Corner, VA, I had some time to kill, so I took a careful look at a malware sample that a friend of mine sent to me some time ago, which I believe he originally got off somebody else's hosed system. The plan was for me to investigate it, and I promised him I would; it just took a while!

Anyways, the sample has a few layers of packing, and I thought it'd be fun/interesting to show you how to unwrap the entire thing to obtain the final payload. I am not going to discuss the payload itself in this post, largely because I haven't spent much time reversing it. Perhaps in the future I'll dig a little deeper, but for now we'll focus on the packing.

I called this sample "lcmw". It stood for something interesting, but I don't really remember what—I may have been drinking when I named it. :)



You can find my IDA .idb files and my notes on Github, and samples on my downloads page. WARNING: This is malware! Only run it in a highly controlled environment!! I don't know what the final payload does, so if you run it, it's at your own risk! Use a virtual machine, don't connect it to the Internet, etc. Be careful!

(The password is 'infected')

Fun fact: my ISP got an abuse@ email because I was originally hosting the malware file without putting it in a .zip file:

Dear abuse team, please help to close these offending viruses sites(1) so far. status: As of 2013-11-29 12:51:30 CET [2]http://support.clean-mx.de/clean-mx/viruses.php?email=internet.abuse@noc.vo[..] (for full uri, please scroll to the right end ... We detected many active cases dated back to 2007, so please look at the date column below. You may also subscribe to our MalwareWatch list [3]http://lists.clean-mx.com/cgi-bin/mailman/listinfo/viruswatch This information has been generated out of our comprehensive real time database, tracking worldwide viruses URI's If your review this list of offending site, please do this carefully, pay attention for redirects also! Also, please consider this particular machines may have a root kit installed ! So simply deleting some files or dirs or disabling cgi may not really solve the issue !

That was neat. Thanks to my ISP, Voi Network Solutions, for laughing about it with me rather than shutting off my connection!

Background

Stage1: the time waster

All right, if you're following along, this file is called "sample.bin". The original name was "hlcumrpi.dat". But if you run "file" on it (on *nix or Cygwin):

$ file hlcumrpi.dat

hlcumrpi.dat: PE32 executable (DLL) (GUI) Intel 80386, for MS Windows

It's technically a .dll library, and is typically run with a "rundll32.exe" command on startup. I renamed it to "sample.bin", since it's easier to remember that way. Fire up IDA, load the file, and have a look at the DllEntryPoint function!

Initial inspection

...are you done? Did you get lost in a deep rats' nest of functions? Because that's what this stage is all about—wasting your time. It turns out, there's maybe 10 lines that do anything—the rest just burns CPU and, more importantly, reverse engineer cycles.

Take this sequence for example:

    .text:004014AE    mov     ebx, 0FFFFFh
    .text:004014B3
    .text:004014B3 loc_4014B3:
    .text:004014B3    call    UselessComplicatedFunction
    .text:004014B8    dec     ebx
    .text:004014B9    cmp     ebx, 43h
    .text:004014BC    jnz     short loc_4014B3

I went through that entire function, and determined:

There are no inputs

Nothing is returned

No global variables are touched

No API calls are made

Basically, it does nothing. But it does it with a lot of code!

Then it takes the string length of the entry-point function itself (it pushes the offset of DllEntryPoint and calls lstrlenA), which doesn't make any sense, and ignores the result:

    .text:004014C3    call    ds:lstrlenA

Then it gets the current process ID about 16.8 million times (ebx counts down from 0xFFFFFF to 0x43):

    .text:004014C9    mov     ebx, 0FFFFFFh
    .text:004014CE
    .text:004014CE loc_4014CE:
    .text:004014CE    call    GetCurrentProcessIdWrapper
    .text:004014D3    dec     ebx
    .text:004014D4    cmp     ebx, 43h
    .text:004014D7    jnz     short loc_4014CE

This function has no side effects and the return value is ignored. Once again, wasting time.

Then, we finally get to something interesting! This code:

    .text:004014D9    push    PAGE_EXECUTE_READWRITE
    .text:004014DB    push    3000h
    .text:004014E0    push    30AD8h
    .text:004014E5    push    0
    .text:004014E7    call    ds:VirtualAlloc
    .text:004014ED    mov     edx, eax
    .text:004014EF    mov     edi, eax

This allocates 0x30ad8 bytes of read/write/execute memory and stores the pointer to it in both edx and edi.

Even though malware is malware—and you can never realllllly be sure why it's doing something—allocating a very specific amount of r/w/x memory almost always means one thing—it's going to unpack something into that buffer and then execute it.
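If the magic numbers in that call don't mean much to you, here is a small sketch that turns them into names. The constants are the standard values from the Windows headers; the helper name describe_virtualalloc is my own and only exists for illustration.

```python
# Constants as defined in the Windows SDK (winnt.h).
ALLOCATION_TYPES = {0x1000: "MEM_COMMIT", 0x2000: "MEM_RESERVE"}
PROTECTIONS = {
    0x02: "PAGE_READONLY",
    0x04: "PAGE_READWRITE",
    0x20: "PAGE_EXECUTE_READ",
    0x40: "PAGE_EXECUTE_READWRITE",
}

def describe_virtualalloc(size, alloc_type, protect):
    """Render the size, flAllocationType and flProtect arguments as text."""
    types = [name for bit, name in ALLOCATION_TYPES.items() if alloc_type & bit]
    prot = PROTECTIONS.get(protect, hex(protect))  # fall back to raw hex
    return f"{size:#x} bytes, {'|'.join(types)}, {prot}"

# The values pushed at 0x4014D9..0x4014E0 in stage1.
print(describe_virtualalloc(0x30AD8, 0x3000, 0x40))
```

So 0x3000 is "reserve and commit", and 0x40 is the read/write/execute protection that gives the game away.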

And sure enough, that piece of code is followed by a de-obfuscation loop (I originally called it a "decryption loop", but that was technically wrong since it isn't actually decrypting anything):

    .text:004014F1    mov     esi, offset start_encrypted_code
    .text:004014F6
    .text:004014F6 decrypt_loop:
    .text:004014F6    call    XorAndRotate
    .text:004014FB    shl     al, 2
    .text:004014FE    shr     ax, 2
    .text:00401502    stosb
    .text:00401503    shl     ah, 4
    .text:00401506    shr     eax, 8
    .text:00401509    shl     ax, 2
    .text:0040150D    shr     eax, 6
    .text:00401510    stosw
    .text:00401512    cmp     esi, offset end_encrypted_data
    .text:00401518    jbe     short decrypt_loop

Feel free to look at the function I called XorAndRotate—it's pretty boring. It just twiddles the current uint32 a bit.
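The shifts that follow the XorAndRotate call are easier to read once emulated. The sketch below is my own reading, not something taken from the sample's author: it assumes XorAndRotate leaves one 32-bit value in eax per iteration (I haven't modelled that function here), then replays the shl/shr/stosb/stosw sequence step by step. Under that assumption, the sequence keeps the low 6 bits of each of the four bytes and packs them into 3 output bytes, much like a base64 decoder does.

```python
def repack_emulated(eax):
    """Replay the instructions from 0x4014FB to 0x401510 on a 32-bit eax."""
    eax &= 0xFFFFFFFF
    out = bytearray()
    al = (eax << 2) & 0xFF                          # shl al, 2
    eax = (eax & 0xFFFFFF00) | al
    ax = (eax & 0xFFFF) >> 2                        # shr ax, 2
    eax = (eax & 0xFFFF0000) | ax
    out.append(eax & 0xFF)                          # stosb
    ah = (((eax >> 8) & 0xFF) << 4) & 0xFF          # shl ah, 4
    eax = (eax & 0xFFFF00FF) | (ah << 8)
    eax >>= 8                                       # shr eax, 8
    ax = ((eax & 0xFFFF) << 2) & 0xFFFF             # shl ax, 2
    eax = (eax & 0xFFFF0000) | ax
    eax >>= 6                                       # shr eax, 6
    out += (eax & 0xFFFF).to_bytes(2, "little")     # stosw
    return bytes(out)

def repack_formula(eax):
    """Same result, written as 'four 6-bit fields packed into 24 bits'."""
    b0, b1, b2, b3 = (eax & 0xFFFFFFFF).to_bytes(4, "little")
    value = (b0 & 0x3F) | (b1 & 0x3F) << 6 | (b2 & 0x3F) << 12 | (b3 & 0x3F) << 18
    return value.to_bytes(3, "little")

# The two views agree on a handful of arbitrary inputs.
for sample in (0x00000000, 0x12345678, 0xDEADBEEF, 0xFFFFFFFF):
    assert repack_emulated(sample) == repack_formula(sample)
```

This only covers the bit-packing half of the loop; to de-obfuscate the blob offline you would still need to reimplement XorAndRotate, which is exactly the work the debugger trick later in this post avoids.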

You'll notice that esi—typically the "source" pointer—is initialized to something I called start_encrypted_code. I'm using the term "encrypted" in only the most general sense; "obfuscated" would have been better, but it's harder to type. If you dig deeper, this is what it looks like:

    .text:00401549 start_encrypted_code db 0D6h, 0E6h, 1Bh, 0BAh, 0C0h, 0F1h, 0CDh, 0C2h, 0, 0D5h
    .text:00401549                 db 5Eh, 0B6h, 97h, 0EBh, 0C0h
    .text:00401558                 dd 2 dup(83008300h), 0C2D50033h, 53C02D2Dh, 0C7008300h
    .text:00401558                 dd 6A78300h, 83000F49h, 83008300h, 81C88300h, 0C8C0C700h
    .text:00401558                 dd 4C70081h, 810481C8h, 4E18300h, 8B6F1B04h, 0D6B68300h
    .text:00401558                 dd 8CB98Ch, 830083h, 1F830083h, 0C2B79A97h, 6C001FE6h
    .text:00401558                 dd 53D5004Eh, 1F2512A7h, 1FB44EE6h, 61B8300h, 681B9h, 830083h
    [...]

And so on, for a long, long time. Almost certainly obfuscated code. And since it's being sent through that de-obfuscation function, that pretty much confirms it.

The function ends with:

    .text:0040151A    pop     ebx
    .text:0040151B    pop     edi
    .text:0040151C    pop     esi
    .text:0040151D    jmp     edx

Where edx—at that point—contains a pointer to the decrypted buffer.

So, we see a function that:

Wastes a ton of time

Allocates executable memory

Populates the executable memory

Jumps to the executable memory

A classic decryption/deobfuscation loop!

Let's look at the easiest possible way to own it!

Owning stage1

So, I'm lazy. Really lazy. I'm gonna find the easiest possible way to decrypt this bad boy.

WARNING: I run malware in this section. It's de-clawed, but you never know what clever tricks are used (I used to have a cat that was de-clawed, and believe me: they still have sharp teeth!); only do this in a throw-away virtual machine! Never, NEVER run this on any important system!

All right, so we have a useless function at the top (I enabled the 'code bytes' now so we can see what the machine code looks like):

    .text:004014B3 E8 C0 FE FF FF    call    UselessComplicatedFunction
    .text:004014B8 4B                dec     ebx
    .text:004014B9 83 FB 43          cmp     ebx, 43h

I'm paranoid; mayyybe it's doing something important? So I'm gonna fire up a hex editor (like xvi32), search the sample.bin binary for the machine code "e8 c0 fe ff ff 4b 83 fb 43" (which should be at offset 0x8b3 in the file), and nop out the call ("e8 c0 fe ff ff" -> "90 90 90 90 90").

That way, even if it is doing something sneaky, it's never called anyways.

Next, I'm going to do the same to the GetCurrentProcessId wrapper:

    .text:004014CE E8 D1 FF FF FF    call    GetCurrentProcessIdWrapper
    .text:004014D3 4B                dec     ebx
    .text:004014D4 83 FB 43          cmp     ebx, 43h

The 'call' instruction, which you can find at offset 0x8ce in the file, also needs to be replaced with "90 90 90 90 90".

Finally, we don't want the malware to actually run. That would defeat the entire purpose of de-clawing! So we find the code at the bottom:

    .text:00401518 76 DC             jbe     short decrypt_loop
    .text:0040151A 5B                pop     ebx
    .text:0040151B 5F                pop     edi
    .text:0040151C 5E                pop     esi
    .text:0040151D FF E2             jmp     edx
    .text:0040151D DllEntryPoint endp
    .text:0040151D
    .text:0040151F
    .text:0040151F C3                retn

We want to find the jmp instruction ("ff e2"), which should be at 0x91d in the file, and replace it with "cd 03".

Wait, what's cd 03 !?

It's " int 3 ". Besides being my license plate, it's also the instruction that means "debug breakpoint". In other words, if a running application hits that instruction, it'll fire a debug interrupt. If the application is being debugged, the debugger gets control; if it's not, the application will simply crash. Whatever the case: it will never run the malicious code!
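If you'd rather script the three patches than click around in a hex editor, something like the following should do it. This is a sketch I haven't run against the real sample: the byte patterns are the ones quoted above, the function name declaw is made up, and it deliberately refuses to patch unless each pattern occurs exactly once.

```python
NOP5 = bytes([0x90] * 5)

def declaw(image: bytes) -> bytes:
    """Return a patched copy: both time-wasting calls become NOPs and
    `jmp edx` becomes `int 3` (cd 03)."""
    data = bytearray(image)
    patches = [
        # (bytes to replace, replacement, bytes that must follow them)
        (bytes.fromhex("e8c0feffff"), NOP5, bytes.fromhex("4b83fb43")),
        (bytes.fromhex("e8d1ffffff"), NOP5, bytes.fromhex("4b83fb43")),
        (bytes.fromhex("ffe2"), bytes.fromhex("cd03"), bytes.fromhex("c3")),
    ]
    for old, new, context in patches:
        needle = old + context
        offset = data.find(needle)
        # Bail out rather than guess if the pattern is missing or ambiguous.
        if offset < 0 or data.find(needle, offset + 1) >= 0:
            raise ValueError(f"pattern {needle.hex()} not found exactly once")
        data[offset:offset + len(old)] = new
    return bytes(data)

# Example use (file names as in this post):
# with open("sample.bin", "rb") as f:
#     patched = declaw(f.read())
# with open("sample_safe.bin", "wb") as f:
#     f.write(patched)
```

Matching on the instruction plus a few bytes of context is a cheap way to avoid patching an unrelated "ff e2" somewhere else in the file.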

Save the new .dll—you can find this in the .zip under the name "sample_safe.bin"—and load it in IDA just to make sure. It should now look like this—note that there's only the one call left:

    .text:004014AB DllEntryPoint proc near
    .text:004014AB
    .text:004014AB hinstDLL    = dword ptr  4
    .text:004014AB fdwReason   = dword ptr  8
    .text:004014AB lpReserved  = dword ptr  0Ch
    .text:004014AB
    .text:004014AB    push    esi
    .text:004014AC    push    edi
    .text:004014AD    push    ebx
    .text:004014AE    mov     ebx, 0FFFFFh
    .text:004014B3
    .text:004014B3 loc_4014B3:
    .text:004014B3    nop
    .text:004014B4    nop
    .text:004014B5    nop
    .text:004014B6    nop
    .text:004014B7    nop
    .text:004014B8    dec     ebx
    .text:004014B9    cmp     ebx, 43h
    .text:004014BC    jnz     short loc_4014B3
    .text:004014BE    push    offset DllEntryPoint
    .text:004014C3    call    ds:lstrlenA
    .text:004014C9    mov     ebx, 0FFFFFFh
    .text:004014CE
    .text:004014CE loc_4014CE:
    .text:004014CE    nop
    .text:004014CF    nop
    .text:004014D0    nop
    .text:004014D1    nop
    .text:004014D2    nop
    .text:004014D3    dec     ebx
    .text:004014D4    cmp     ebx, 43h
    .text:004014D7    jnz     short loc_4014CE
    .text:004014D9    push    40h
    .text:004014DB    push    3000h
    .text:004014E0    push    30AD8h
    .text:004014E5    push    0
    .text:004014E7    call    ds:VirtualAlloc
    .text:004014ED    mov     edx, eax
    .text:004014EF    mov     edi, eax
    .text:004014F1    mov     esi, offset byte_401549
    .text:004014F6
    .text:004014F6 loc_4014F6:
    .text:004014F6    call    sub_401520
    .text:004014FB    shl     al, 2
    .text:004014FE    shr     ax, 2
    .text:00401502    stosb
    .text:00401503    shl     ah, 4
    .text:00401506    shr     eax, 8
    .text:00401509    shl     ax, 2
    .text:0040150D    shr     eax, 6
    .text:00401510    stosw
    .text:00401512    cmp     esi, offset byte_43201D
    .text:00401518
    .text:00401518 loc_401518:
    .text:00401518    jbe     short loc_4014F6
    .text:0040151A    pop     ebx
    .text:0040151B    pop     edi
    .text:0040151C    pop     esi
    .text:0040151D    int     3
    .text:0040151F    retn
    .text:0040151F DllEntryPoint endp

Awesome! Now let's write a quick app to run it:

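    #include <windows.h>
    #include <stdio.h>

    int main(int argc, char *argv[])
    {
        /* Loading the DLL runs its DllEntryPoint, which now ends in int 3. */
        HMODULE h = LoadLibraryA("C:\\Documents and Settings\\Administrator\\Desktop\\sample_safe.bin");

        if (h == NULL)
            printf("LoadLibrary failed: %lu\n", GetLastError());

        return 0;
    }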

And compile it, then run it in a debugger (I'm going to use windbg, since that's my favourite debugger):

    C:\Program Files\Debugging Tools for Windows (x86)>windbg 'c:\Documents and Settings\Administrator\My Documents\Visual Studio 2008\Projects\test_malware\Debug\test_malware.exe'

    Executable search path is:
    ModLoad: 00400000 0041b000   test_malware.exe
    ModLoad: 7c800000 7c8c0000   ntdll.dll
    ModLoad: 77e40000 77f42000   C:\WINDOWS\system32\kernel32.dll
    ModLoad: 10200000 10323000   C:\WINDOWS\WinSxS\x86_Microsoft.VC90.DebugCRT_1fc8b3b9a1e18e3b_9.0.21022.8_x-ww_597C3456\MSVCR90D.dll
    (1b8.2d4): Break instruction exception - code 80000003 (first chance)
    eax=10400000 ebx=7ffda000 ecx=00000003 edx=00000008 esi=7c8877f4 edi=00151f38
    eip=7c81a3e1 esp=0012fb70 ebp=0012fcb4 iopl=0         nv up ei pl nz na po nc
    cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000202
    *** ERROR: Symbol file could not be found.  Defaulted to export symbols for ntdll.dll -
    ntdll!DbgBreakPoint:
    7c81a3e1 cc              int     3
    0:000> g
    ModLoad: 00350000 00389000   C:\Documents and Settings\Administrator\Desktop\sample_safe.bin
    ModLoad: 77380000 77411000   C:\WINDOWS\system32\user32.dll
    ModLoad: 77c00000 77c48000   C:\WINDOWS\system32\GDI32.dll
    ModLoad: 77f50000 77feb000   C:\WINDOWS\system32\ADVAPI32.dll
    ModLoad: 77c50000 77cef000   C:\WINDOWS\system32\RPCRT4.dll
    ModLoad: 76f50000 76f63000   C:\WINDOWS\system32\Secur32.dll
    (1b8.2d4): Break instruction exception - code 80000003 (first chance)
    eax=00035000 ebx=003514ab ecx=77e64590 edx=003b0000 esi=0012f7d0 edi=00000001
    eip=0035151e esp=0012f7c0 ebp=0012f7dc iopl=0         nv up ei pl nz na po nc
    cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00000202
    *** ERROR: Module load completed but symbols could not be loaded for C:\Documents and Settings\Administrator\Desktop\sample_safe.bin
    sample_safe+0x151e:
    0035151e 03c3            add     eax,ebx

Note that it hits a "break instruction"! Perfect!

We know that the original instruction was "jmp edx", and therefore the code is pointed at by edx. Sure enough, if we dump edx, we get something that looks like code:

    0:000> u edx
    003b0000 55              push    ebp
    003b0001 89e5            mov     ebp,esp
    003b0003 83ec04          sub     esp,4
    003b0006 56              push    esi
    003b0007 57              push    edi
    003b0008 53              push    ebx
    003b0009 e800000000      call    003b000e
    003b000e 5b              pop     ebx

Perfect! We also know the length of the buffer, from the VirtualAlloc() call, so dump that many bytes to a file:

    0:000> .writemem c:\\stage2.bin 0x3b0000 L0x30AD8
    Writing 30ad8 bytes..................................................................................................

And open the file in IDA—yup, that's code!

And thus we're finished with stage1. Congrats!

Stage2: going raw

If you load Stage2 in IDA, it's going to complain that it isn't actually a PE file—it's raw code. That's fine—load it as raw 32-bit code.

If you scroll around (you may need to use 'c' to mark stuff as code), you'll see a small amount of code (with interspersed strings), followed by another block that looks encrypted/obfuscated, including something that looks suspiciously—but not exactly—like part of a PE header:

    seg000:0000040E    db 54h
    seg000:0000040F    db 68h
    seg000:00000410    db 69h
    seg000:00000411    db 73h
    seg000:00000412    db 20h
    seg000:00000413    db 0Eh
    seg000:00000414    db 70h
    seg000:00000415    db 72h
    seg000:00000416    db 6Fh
    seg000:00000417    db 67h
    seg000:00000418    db 67h
    seg000:00000419    db 61h
    seg000:0000041A    db 6Dh
    seg000:0000041B    db 87h
    seg000:0000041C    db 63h
    seg000:0000041D    db 47h
    seg000:0000041E    db 6Eh

Well, time to start all over!

By now, you know that my usual strategy is to let the program own itself, rather than spending a lot of time owning it. As a result, I don't really know how the obfuscation works; I just know how to bypass it!

If you're following along, I'm not going to be a ton of help on how to get the function readable. It's a mixture of "c" (for "code" sections), and "u" (to undefine non-code portions). After you see a short "call" that jumps over some weird looking code, that code probably needs to be undefined (or defined as an "a"scii string).

If you do everything right, it should wind up looking like this:

    seg000:00000000    push    ebp
    seg000:00000001    mov     ebp, esp
    seg000:00000003    sub     esp, 4
    seg000:00000006    push    esi
    seg000:00000007    push    edi
    seg000:00000008    push    ebx
    seg000:00000009    call    $+5
    seg000:0000000E    pop     ebx
    seg000:0000000F    sub     ebx, 40100Eh
    seg000:00000015    mov     eax, dword ptr fs:loc_2C+4
    seg000:0000001B    mov     eax, [eax+0Ch]
    seg000:0000001E    mov     eax, [eax+1Ch]
    seg000:00000021
    seg000:00000021 loc_21:
    seg000:00000021    mov     esi, [eax+8]
    seg000:00000024    cmp     byte ptr [eax+1Ch], 18h
    seg000:00000028    mov     eax, [eax]
    seg000:0000002A    jnz     short loc_21
    seg000:0000002C
    seg000:0000002C loc_2C:
    seg000:0000002C    call    loc_40
    seg000:00000031 aGetProcAddress db 'GetProcAddress',0
    seg000:00000040
    seg000:00000040 loc_40:
    seg000:00000040    push    esi
    seg000:00000041    call    sub_188
    seg000:00000046    mov     [ebx+4013BCh], eax
    seg000:0000004C    call    loc_5E
    seg000:00000051 aLoadlibrarya db 'LoadLibraryA',0
    seg000:0000005E
    seg000:0000005E loc_5E:
    seg000:0000005E    push    esi
    seg000:0000005F    call    dword ptr [ebx+4013BCh]
    seg000:00000065    mov     [ebx+4013C0h], eax
    seg000:0000006B    call    loc_80
    seg000:00000070 aUnmapviewoffile db 'UnmapViewOfFile',0
    seg000:00000080
    seg000:00000080 loc_80:
    seg000:00000080    push    esi
    seg000:00000081    call    dword ptr [ebx+4013BCh]
    seg000:00000087    mov     [ebx+4013C4h], eax
    seg000:0000008D    call    loc_9F
    seg000:00000092 aVirtualalloc db 'VirtualAlloc',0
    seg000:0000009F
    seg000:0000009F loc_9F:
    seg000:0000009F    push    esi
    seg000:000000A0    call    dword ptr [ebx+4013BCh]
    seg000:000000A6    mov     [ebx+4013C8h], eax
    seg000:000000AC    call    loc_BD
    seg000:000000B1 aVirtualfree db 'VirtualFree',0
    seg000:000000BD
    seg000:000000BD loc_BD:
    seg000:000000BD    push    esi
    seg000:000000BE    call    dword ptr [ebx+4013BCh]
    seg000:000000C4    mov     [ebx+4013CCh], eax
    seg000:000000CA
    seg000:000000CA loc_CA:
    seg000:000000CA    push    4
    seg000:000000CC    push    3000h
    seg000:000000D1    push    0A00000h
    seg000:000000D6    push    0
    seg000:000000D8    call    dword ptr [ebx+4013C8h]
    seg000:000000DE    test    eax, eax
    seg000:000000E0    jz      short loc_CA
    seg000:000000E2    mov     [ebp-4], eax
    seg000:000000E5    push    eax
    seg000:000000E6    lea     eax, [ebx+4013D0h]
    seg000:000000EC    mov     ecx, [eax+4]
    seg000:000000EF    add     eax, ecx
    seg000:000000F1    push    eax
    seg000:000000F2    call    sub_313
    seg000:000000F7    pop     eax
    seg000:000000F8    pop     eax
    seg000:000000F9    mov     esi, [ebp-4]
    seg000:000000FC    add     esi, [esi+3Ch]
    seg000:000000FF    mov     edi, [esi+34h]
    seg000:00000102    mov     eax, [ebp+10h]
    seg000:00000105    test    eax, eax
    seg000:00000107    jnz     short loc_114
    seg000:00000109    mov     eax, [ebp+0Ch]
    seg000:0000010C    dec     eax
    seg000:0000010D    test    eax, eax
    seg000:0000010F    jnz     short loc_114
    seg000:00000111    mov     edi, [ebp+8]
    seg000:00000114
    seg000:00000114 loc_114:
    seg000:00000114    push    edi
    seg000:00000115    call    dword ptr [ebx+4013C4h]
    seg000:0000011B    mov     eax, [esi+50h]
    seg000:0000011E    push    40h
    seg000:00000120    push    3000h
    seg000:00000125    push    eax
    seg000:00000126    push    edi
    seg000:00000127    call    dword ptr [ebx+4013C8h]
    seg000:0000012D    mov     ecx, [esi+54h]
    seg000:00000130    mov     esi, [ebp-4]
    seg000:00000133    rep movsb
    seg000:00000135    mov     edi, eax
    seg000:00000137    push    dword ptr [ebp-4]
    seg000:0000013A    push    edi
    seg000:0000013B    call    sub_1E9
    seg000:00000140    push    edi
    seg000:00000141    call    sub_219
    seg000:00000146    push    edi
    seg000:00000147    call    sub_28A
    seg000:0000014C    push    8000h
    seg000:00000151    push    0
    seg000:00000153    push    dword ptr [ebp-4]
    seg000:00000156    call    dword ptr [ebx+4013CCh]
    seg000:0000015C    mov     eax, [ebx+4013CCh]
    seg000:00000162    lea     ecx, [ebx+401000h]
    seg000:00000168    mov     edx, [edi+3Ch]
    seg000:0000016B    add     edx, edi
    seg000:0000016D    mov     edx, [edx+28h]
    seg000:00000170    add     edx, edi
    seg000:00000172    push    edx
    seg000:00000173    push    edi
    seg000:00000174    call    sub_2E5
    seg000:00000179    pop     ebx
    seg000:0000017A    pop     edi
    seg000:0000017B    pop     esi
    seg000:0000017C    leave
    seg000:0000017D    push    8000h
    seg000:00000182    push    0
    seg000:00000184    push    ecx
    seg000:00000185    push    edx
    seg000:00000186    jmp     eax

One of the first things I recommend doing is to re-base the program (using edit->segments->rebase or something like that). I re-based to 0x3b0000, because that's the address that VirtualAlloc() returned on my system, and therefore is where the in-memory version ended up.

Some reversing

The first part took me some time to figure out:

    seg000:003B0015    mov     eax, large fs:30h
    seg000:003B001B    mov     eax, [eax+0Ch]
    seg000:003B001E    mov     eax, [eax+1Ch]
    seg000:003B0021
    seg000:003B0021 loc_3B0021:
    seg000:003B0021    mov     esi, [eax+8]
    seg000:003B0024    cmp     byte ptr [eax+1Ch], 18h
    seg000:003B0028    mov     eax, [eax]
    seg000:003B002A    jnz     short loc_3B0021

I actually googled parts of this, and eventually found an identical function online. Its purpose was to get a handle to the in-memory version of kernel32.dll. Sweet!
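For the curious, here is my best reading of why that loop lands on kernel32.dll. The structure offsets below are what I understand the 32-bit layouts to be, so treat them as assumptions rather than gospel: fs:[0x30] is the PEB, the loop walks the loader's initialization-order module list, and the 0x18 it compares against is the byte length of the module's base name. "kernel32.dll" is 12 characters, which is 24 (0x18) bytes in UTF-16.

```python
# Offsets as I understand the 32-bit structures; assumptions, not verified
# against this particular Windows build.
PEB_LDR = 0x0C                  # PEB -> Ldr (PEB_LDR_DATA*)
LDR_INIT_ORDER_LIST = 0x1C      # PEB_LDR_DATA -> InInitializationOrderModuleList
# Relative to each entry's InInitializationOrderLinks field:
ENTRY_FLINK = 0x00              # next entry        (mov eax, [eax])
ENTRY_DLL_BASE = 0x08           # DllBase           (mov esi, [eax+8])
ENTRY_BASENAME_LENGTH = 0x1C    # BaseDllName.Length, in bytes (cmp ..., 18h)

def basename_length(name: str) -> int:
    """UNICODE_STRING.Length is the UTF-16 byte count, without a terminator."""
    return len(name.encode("utf-16-le"))

# Only the module whose base name is 0x18 bytes long ends the loop.
for dll in ("ntdll.dll", "kernel32.dll", "user32.dll"):
    print(f"{dll:14} {basename_length(dll):#04x}")
```

Note that the check is on the length only, so any other early-loaded module with a 12-character name would match too; the code seems to rely on kernel32.dll being the first one it meets.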

You'll then see this code:

    seg000:003B002C loc_3B002C:
    seg000:003B002C    call    loc_3B0040
    seg000:003B0031 aGetProcAddress db 'GetProcAddress',0
    seg000:003B0040
    seg000:003B0040 loc_3B0040:
    seg000:003B0040    push    esi
    seg000:003B0041    call    find_function
    seg000:003B0046    mov     [ebx+test.addr_GetProcAddress], eax

(Note that I defined a struct for test.addr_GetProcAddress—it involves generous use of the 'structs' tab in a way it was never intended to be used in IDA).

The find_function() function was actually a guess that turned out to be right. This sequence of code gets a handle to the GetProcAddress() function, and stores it on line 0x3b0046.
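One detail worth spelling out: the "call" that hops over the string is how the string gets passed as an argument. A call pushes the address of the next byte as the return address, and here the next byte is the start of 'GetProcAddress', so after the call the stack already holds a pointer to the name. The sketch below finds these call-over-string pairs in a raw buffer; the function name and the printable-ASCII heuristic are mine, purely for illustration.

```python
import struct

def inline_string_after_call(code: bytes, offset: int):
    """If code[offset] is a `call rel32` that jumps over a NUL-terminated
    printable-ASCII string, return (string, target offset); else None."""
    if offset + 5 > len(code) or code[offset] != 0xE8:
        return None
    (rel,) = struct.unpack_from("<i", code, offset + 1)
    start = offset + 5          # first byte after the call = the "return address"
    target = start + rel        # where execution actually continues
    if not (start < target <= len(code)):
        return None             # e.g. `call $+5`, which skips nothing
    blob = code[start:target]
    if blob[-1] != 0 or not all(0x20 <= b < 0x7F for b in blob[:-1]):
        return None
    return blob[:-1].decode("ascii"), target

# The bytes at offset 0x2C of stage2: call +0x0F, the string, then push esi.
snippet = b"\xe8\x0f\x00\x00\x00GetProcAddress\x00\x56"
print(inline_string_after_call(snippet, 0))
```

Running something like this over stage2.bin is also a quick way to find the spots that need to be marked as strings instead of code in IDA.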

Then there are a bunch of sequences that basically look like:

    seg000:003B004C    call    loc_3B005E
    seg000:003B0051 aLoadlibrarya db 'LoadLibraryA',0
    seg000:003B005E
    seg000:003B005E loc_3B005E:
    seg000:003B005E    push    esi
    seg000:003B005F    call    [ebx+test.addr_GetProcAddress]
    seg000:003B0065    mov     [ebx+test.addr_LoadLibraryA], eax

Basically, it calls GetProcAddress() with "LoadLibraryA" as a parameter, and stores the result. It does this for a bunch of functions—basically, get pointers to a host of useful functions:

GetProcAddress

LoadLibraryA

UnmapViewOfFile

VirtualAlloc

VirtualFree

VirtualAlloc(), as you'll recall, was used in the last section to allocate space for decrypted memory. At this point, we can guess that it does the exact same thing again! Sure enough, it allocates memory; but surprisingly, it's not executable (the protection argument is 4, PAGE_READWRITE)! Here's the call:

    seg000:003B00CA loc_3B00CA:
    seg000:003B00CA    push    4
    seg000:003B00CC    push    3000h
    seg000:003B00D1    push    0A00000h
    seg000:003B00D6    push    0
    seg000:003B00D8    call    [ebx+test.addr_VirtualAlloc]
    seg000:003B00DE    test    eax, eax
    seg000:003B00E0    jz      short loc_3B00CA

Note how it keeps attempting to allocate memory until it works. It's shit like this, malware...

Anyway, the memory is allocated!

Then a function is called:

    seg000:003B00E2    mov     [ebp-4], eax
    seg000:003B00E5    push    eax
    seg000:003B00E6    lea     eax, [ebx+test.field_4013D0]
    seg000:003B00EC    mov     ecx, [eax+4]
    seg000:003B00EF    add     eax, ecx
    seg000:003B00F1    push    eax
    seg000:003B00F2    call    sub_3B0313

The "encrypted data"—which, as we saw earlier, looks suspiciously like a PE file—is passed in, along with the allocated memory.

A fairly complex function is called, that I looked through but didn't reverse. It's complicated, but ultimately harmless.

Active analysis

With clever use of breakpoints and sweating bullets, I let that function run. If you're interested, this is how I did it: I ran sample_safe.bin in windbg; when the breakpoint fired, I moved eip to where the jump would have gone using "r eip=edx"; I set a breakpoint at address 0x3b00f7 using "bp 0x3b00f7"; I used "g" to continue the program; and Bob's your uncle.

Running malware like this, once again, is *dangerous*! If you're following along, please be careful!

Anyway, once that function finishes, I check out the allocated memory:

    0:000> db 900000
    00900000  4d 5a 90 00 03 00 00 00-04 00 00 00 ff ff 00 00  MZ..............
    00900010  b8 00 00 00 00 00 00 00-40 00 00 00 00 00 00 00  ........@.......
    00900020  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................
    00900030  00 00 00 00 00 00 00 00-00 00 00 00 e8 00 00 00  ................
    00900040  0e 1f ba 0e 00 b4 09 cd-21 b8 01 4c cd 21 54 68  ........!..L.!Th
    00900050  69 73 20 70 72 6f 67 72-61 6d 20 63 61 6e 6e 6f  is program canno
    00900060  74 20 62 65 20 72 75 6e-20 69 6e 20 44 4f 53 20  t be run in DOS
    00900070  6d 6f 64 65 2e 0d 0d 0a-24 00 00 00 00 00 00 00  mode....$.......

It's a PE file! w00t!

I have a value that I think is the length of the file (0x20a00), and it might just be right! I dump that many bytes to a file to see if IDA recognizes it:

    0:000> .writemem "c:\\stage3.bin" 0x900000 L0x00020a00
    Writing 20a00 bytes..................................................................

Read it with IDA, and confirm that it's a valid PE.
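If you want a quick sanity check before firing up IDA, the two signatures are easy to test by hand. This is only a sketch (the function name is mine): it checks the "MZ" magic, follows the e_lfanew field at offset 0x3c (0xe8 in the dump above), and looks for "PE\0\0" there. It says nothing about whether the rest of the file is sane.

```python
import struct

def looks_like_pe(data: bytes) -> bool:
    """Check the 'MZ' magic, follow e_lfanew, and look for 'PE\\0\\0'."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    # An out-of-range e_lfanew just yields a short slice, which fails the test.
    return data[e_lfanew:e_lfanew + 4] == b"PE\0\0"

# Example use:
# with open("stage3.bin", "rb") as f:
#     print(looks_like_pe(f.read()))
```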

But... there's more code after the PE is decrypted. What's going on?

Going above and beyond in stage2

So, this part is strictly unnecessary to figuring out how the malware works. I was simply curious, and wanted to make sure that nothing weird was going on.

Some variables are moved around after the decryption, but at this point I'm carefully stepping through with a debugger. I find this code:

    seg000:003B0114    push    edi
    seg000:003B0115    call    [ebx+test.addr_UnmapViewOfFile]

And determine that edi is pointing to stage1. So, stage1 is unloaded.

Then, some executable memory is allocated:

    seg000:003B011E    push    40h
    seg000:003B0120    push    3000h
    seg000:003B0125    push    eax
    seg000:003B0126    push    edi
    seg000:003B0127    call    [ebx+test.addr_VirtualAlloc]

Some incredibly complicated functions are called. I surmised—correctly—that these are taking the PE file in memory—that we just decrypted—and preparing it to be run. Basically, do the relocations and other stuff that is involved with making a PE file actually runnable.
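The offsets that stage2 reads while doing this line up with standard PE header fields, which supports the "it's a hand-rolled PE loader" theory. Relative to the NT headers (the address at e_lfanew), +0x28 is AddressOfEntryPoint, +0x34 is ImageBase, +0x50 is SizeOfImage and +0x54 is SizeOfHeaders for a 32-bit image. Here is a sketch that pulls the same four fields out of a dumped file; the function name is hypothetical and it assumes a PE32 (not PE32+) image.

```python
import struct

def loader_fields(image: bytes) -> dict:
    """Read the header fields stage2 touches, assuming a 32-bit PE."""
    (e_lfanew,) = struct.unpack_from("<I", image, 0x3C)

    def dword(offset):
        # Offsets are relative to the 'PE\0\0' signature, as in the listing.
        return struct.unpack_from("<I", image, e_lfanew + offset)[0]

    return {
        "AddressOfEntryPoint": dword(0x28),  # mov edx, [edx+28h]
        "ImageBase": dword(0x34),            # mov edi, [esi+34h]
        "SizeOfImage": dword(0x50),          # mov eax, [esi+50h]
        "SizeOfHeaders": dword(0x54),        # mov ecx, [esi+54h]
    }

# Example use:
# with open("stage3.bin", "rb") as f:
#     for name, value in loader_fields(f.read()).items():
#         print(f"{name:20} {value:#x}")
```

Read this way, the second VirtualAlloc() asks for SizeOfImage bytes at the preferred base, and the "rep movsb" copies SizeOfHeaders bytes of headers before the sections are mapped.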

Then the decrypted PE file (the raw, pre-relocation copy that was just used as the source for the loaded image) is freed:

    seg000:003B0153    push    dword ptr [ebp-4]
    seg000:003B0156    call    [ebx+test.addr_VirtualFree]

Then finally, this sequence is found:

    seg000:003B0174    call    sub_3B02E5
    seg000:003B0179    pop     ebx
    seg000:003B017A    pop     edi
    seg000:003B017B    pop     esi
    seg000:003B017C    leave
    seg000:003B017D    push    8000h
    seg000:003B0182    push    0
    seg000:003B0184    push    ecx
    seg000:003B0185    push    edx
    seg000:003B0186    jmp     eax

This is actually really cool. It calls sub_3b02e5, which returns a pointer to VirtualFree(). At address 0x3b0186, VirtualFree() is jumped to. That leads to two questions: why a jmp and not a call? And which of those four pushes are actually arguments?

Well, here's what's happening: a called function returns to whatever was on the top of the stack when it was entered. If you jmp to a function that expects to be called, it returns to the last thing pushed onto the stack. Since that happens to be edx (pushed at 0x3b0185), that becomes the return address.

(edx happens to be the entrypoint of the new .dll file, stage3)

The other three pushes are the arguments to VirtualFree(lpAddress, dwSize, dwFreeType), the same three that the earlier VirtualFree() call at 0x3b0156 received. ecx, pushed at 0x3b0184, is lpAddress: the starting address of the current code, 0x3b0000. The 0 is dwSize, and 0x8000 is dwFreeType (MEM_RELEASE, which requires a size of 0). Because VirtualFree() is stdcall, it pops all three on its way out, so the stage3 entrypoint presumably sees the stack just as the loader originally set it up for stage1.

To summarize: this piece of code frees itself, then returns into the loaded .dll file. That's really cool!
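To make the trick concrete, here is a toy model of the stack at the moment of the jmp. It is a sketch under one assumption: that the target is a stdcall function taking three arguments, which is how VirtualFree(lpAddress, dwSize, dwFreeType) is documented. The entrypoint value below is a placeholder I made up; only the 0x3b0000, 0 and 0x8000 come from the listing.

```python
def jmp_to_stdcall(pushes, nargs):
    """Model `jmp func` where func is stdcall and ends with `ret 4*nargs`.

    `pushes` lists the dwords in the order they were pushed, so the last
    element is the top of the stack when the function is entered.
    """
    stack = list(pushes)
    return_address = stack.pop()                 # what the final `ret` jumps to
    args = [stack.pop() for _ in range(nargs)]   # first argument sits just below
    return return_address, args, stack           # stack = what the callee leaves

STAGE2_BASE = 0x3B0000          # ecx in the listing
STAGE3_ENTRY = 0x10001234       # edx; placeholder value, not from the sample

# push 8000h / push 0 / push ecx / push edx / jmp eax
ret, args, leftover = jmp_to_stdcall(
    [0x8000, 0, STAGE2_BASE, STAGE3_ENTRY], nargs=3)

print(f"returns to {ret:#x}")
print("VirtualFree args:", [hex(a) for a in args])
```

In this model the function frees the buffer at 0x3b0000 with MEM_RELEASE and then "returns" into stage3, with none of the four pushed values left on the stack.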

Very little malware will actually clean up after itself like we see here. This tells me that the malware was written by somebody who actually cares about code quality. I'm impressed!

Stage3: The final frontier

Stage3 is actually pretty straightforward, although it does a lot of stuff that I haven't actually reversed. I've also made a lot of educated guesses on how it works that I've validated. If you're following along, this is in stage3.bin.

Essentially, it's a compressed payload stored in a PE resource. Let's look at what that means...

First off, look at the 'strings' window (shift-f12 in IDA). Looking at the strings window is almost always the first thing I do, with malware and also legit software. In this case, you'll see some interesting strings:

    .rdata:100054A4 aAplibV1_01TheS db 'aPLib v1.01  -  the smaller the better :)',0Dh,0Ah
    .rdata:100054A4                 db 'Copyright (c) 1998-2009 by Joergen Ibsen, All Rights Reserved.',0Dh,0Ah
    .rdata:100054A4                 db 0Dh,0Ah
    .rdata:100054A4                 db 'More information: http://www.ibsensoftware.com/',0Dh,0Ah
    .rdata:100054A4                 db 0Dh,0Ah,0

Immediately, I know it's using compression. That's handy! If you follow the DllEntryPoint() function to its calls, you'll quickly find this:

    .text:10001C2E sub_10001C2E proc near
    .text:10001C2E
    .text:10001C2E hModule     = dword ptr  8
    .text:10001C2E arg_4       = dword ptr  0Ch
    .text:10001C2E
    .text:10001C2E    push    ebp
    .text:10001C2F    mov     ebp, esp
    .text:10001C31    push    esi
    .text:10001C32    push    0Ah
    .text:10001C34    push    65h
    .text:10001C36    push    [ebp+hModule]
    .text:10001C39    call    ds:FindResourceA
    .text:10001C3F    mov     esi, eax
    .text:10001C41    test    esi, esi
    .text:10001C43    jz      short loc_10001C9E
    .text:10001C45    push    edi
    .text:10001C46    push    esi
    .text:10001C47    push    [ebp+hModule]
    .text:10001C4A    call    ds:SizeofResource
    .text:10001C50    mov     edi, eax
    .text:10001C52    test    edi, edi
    .text:10001C54    jz      short loc_10001C9D
    .text:10001C56    push    ebx
    .text:10001C57    push    esi
    .text:10001C58    push    [ebp+hModule]
    .text:10001C5B    call    ds:LoadResource
    .text:10001C61    mov     ebx, eax
    .text:10001C63    test    ebx, ebx
    .text:10001C65    jz      short loc_10001C9C
    .text:10001C67    push    ebx
    .text:10001C68    call    ds:LockResource
    .text:10001C6E    mov     esi, eax
    .text:10001C70    test    esi, esi
    .text:10001C72    jnz     short loc_10001C78

Note the calls: FindResourceA(), SizeofResource(), LoadResource(), and LockResource(). If you're interested in what these are doing exactly, you can find plenty of info in MSDN. But suffice to say, it loads a resource from the PE, identified by the values passed into FindResourceA(): name 0x65 (101) and type 0x0A (RT_RCDATA, i.e. raw data). If you load a resource viewer such as PEExplorer, you can view the resource section and dump resource 0x65 into a file. That file looks like:

    $ xxd -g1 stage4_compressed.bin | head
    0000000: 41 50 33 32 18 00 00 00 a1 c9 01 00 0b e4 d7 66  AP32...........f
    0000010: 0b 51 03 00 f2 8d 91 b3 0b 38 51 03 1c 49 01 38  .Q.......8Q..I.8
    0000020: 37 b7 0e 0f 8c 07 09 7b d0 1a 01 be bc 55 1c 8b  7......{.....U..
    [...]

The file starts with AP32, and earlier we saw a compression library called "aPLib" referenced. Compressed payload anyone?
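The first 24 bytes of that dump can be decoded by hand. As far as I know, the "AP32" header used by aPLib's safe-packing example is six little-endian dwords: the tag, the header size, the packed size, a CRC of the packed data, the original size, and a CRC of the original data. Treat that layout as my assumption; the field names below are mine.

```python
import struct
from collections import namedtuple

AP32Header = namedtuple(
    "AP32Header",
    "tag header_size packed_size packed_crc orig_size orig_crc")

def parse_ap32_header(data: bytes) -> AP32Header:
    """Decode the assumed 24-byte 'AP32' header at the start of `data`."""
    header = AP32Header(*struct.unpack_from("<4s5I", data, 0))
    if header.tag != b"AP32":
        raise ValueError("not an 'AP32' container")
    return header

# The first 24 bytes of the xxd output above.
first_bytes = bytes.fromhex(
    "41503332 18000000 a1c90100 0be4d766 0b510300 f28d91b3")
header = parse_ap32_header(first_bytes)
print(header.header_size + header.packed_size, "->", header.orig_size)
```

The header size plus the packed size comes to 117177 bytes, and the original size is 217355 bytes, which matches what appack.exe reports when it decompresses the file.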

As of this writing, you can download the official AP32 sample application here. You can unpack it with the appack.exe utility:

    $ ./appack.exe d ./stage4_compressed.bin stage4.bin
    ===============================================================================
    aPLib example                   Copyright (c) 1998-2009 by Joergen Ibsen / Jibz
                                                                All Rights Reserved

                                                      http://www.ibsensoftware.com/
    ===============================================================================

    decompressed 117177 -> 217355 bytes in 0.00 seconds

(That tool is super buggy, you might have to move directories and stuff to get it to work; it's just a sample, after all)

Decompressed... now what?

Once decompressed, it looks like:

    0000000: 0b 51 03 00 49 01 00 00 b7 03 00 00 0b 07 00 00  .Q..I...........
    0000010: 0b 7b 01 00 00 00 00 00 00 00 00 00 00 00 00 00  .{..............
    0000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    0000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    0000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    [...]

Hmm.. it's clearly not compressed/encrypted, but it doesn't look like anything special.

If you scroll around, you'll quickly find a PE header. If you scroll down a lot more, you'll find another PE header. After looking at the numbers in the first 20 bytes, I realized that they are a length followed by offsets into the file. Two of those offsets point at PE files! Note that I haven't even *started* reversing the code that processes this (and I never did); I determined all this by simple guessing and checking.

Here are the first few int32 values (note: little endian), and what they seem to mean:

0x0003510b - the length of the entire file

0x00000149 - at this offset, there's some raw code, 55 8b ... (a raw binary function). I initially assumed a 16-bit version, but that doesn't seem likely

0x000003b7 - at this offset, nothing special, but it appears to be code. Not sure what its deal is...

0x0000070b - A proper PE file, that we're gonna call "stage4.bin"

0x00017b0b - Another PE file—I guessed that this is a 64-bit version of the same thing, which seems likely—upon inspection, it has the same imports/strings

0x00000000

I called the file at 0x70b the actual payload. There are also some loader functions and a 64-bit payload that I'm going to ignore.
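Carving the payload out is a few lines of scripting. The sketch below reads the five dwords from the dump above and slices out the region between two offsets. The field names are mine, and the idea that each item runs right up to the next offset is a guess that matches what I saw rather than something I confirmed in the code.

```python
import struct

FIELDS = ("total_length", "raw_code_1", "raw_code_2", "pe32_payload", "pe_second")

def parse_container_header(data: bytes) -> dict:
    """Read the five little-endian dwords at the start of the blob."""
    return dict(zip(FIELDS, struct.unpack_from("<5I", data, 0)))

def carve(data: bytes, start: int, end: int) -> bytes:
    """Return data[start:end], refusing ranges that fall outside the blob."""
    if not (0 <= start < end <= len(data)):
        raise ValueError("range outside of file")
    return data[start:end]

# The first 20 bytes of the decompressed file, as shown in the hex dump.
header = parse_container_header(bytes.fromhex(
    "0b510300 49010000 b7030000 0b070000 0b7b0100"))
for name, value in header.items():
    print(f"{name:14} {value:#010x}")

# Example use, assuming the 32-bit PE ends where the next one begins:
# with open("decompressed.bin", "rb") as f:   # the appack.exe output
#     blob = f.read()
# payload = carve(blob, header["pe32_payload"], header["pe_second"])
```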

Odds and ends of stage3

If you want to know more about stage3, keep reading! This section is very light—the code is complex, and doesn't really add much, so I'm going to give a quick high-level overview of it.

It actually creates a .dll file whose name is based on the hard drive serial number and an implementation of a standard pseudo-random number generator. This means that, if installed on the same machine, the .dll will have the same name every time.

It injects the .dll file into every running process, by the looks of it.

It puts a lot of effort into determining whether to use the 64- or 32-bit version for each running executable (including correctly detecting the use of Wow64).

Once again, because of the cleanness and the fact that it handles 32- and 64-bit systems, as well as Wow64 processes, appropriately, I feel like this was written by somebody who clearly knows what they are doing.

Conclusion

Once you extract the 32-bit .dll file from the de-compressed data, you have what I called stage4.bin. This is the final stage, and it implements the actual malicious functionality. As I said initially, I haven't reversed it. But if you look at it in IDA, you'll see a ton of command-and-control-like functionality. It contacts servers over HTTPS, it modifies Web sites, and lots more interesting stuff.

When I have more time, I'll look at it in more detail!

Hope you enjoyed this!