Analysing CryptoLocker with unpack.py: Network Communications (part 3)

Previous posts in this series demonstrated how unpack.py, when run against a CryptoLocker variant, extracts the malicious PE file injected into explorer.exe, and how it can also be used to analyse that injected PE file. This post demonstrates how unpack.py can now be used to analyse our CryptoLocker variant’s network communications by dumping the cleartext traffic sent over its HTTPS connections.

Background

I’m investigating whether my automated unpacking and analysis script (unpack.py) is still useful, by using it to analyse a malware sample (a CryptoLocker variant) which was created after my script was. The first two posts in this series (Initial Analysis and The unpacked payload) identified and analysed some unpacked code which was used to overwrite another process. Let’s see if we can do anything with our malware sample’s encrypted network communications.

I’ve been musing, for some time now, about the possibility of extracting cleartext from SSL/TLS connections by examining the memory buffers being passed to the encryption/decryption API calls. My main problem was finding information on what API calls were actually used to encrypt/decrypt traffic for TLS/SSL connections.

It turns out that EncryptMessage() and DecryptMessage() (from secur32.dll) are used to encrypt and decrypt data sent and received over TLS/SSL (and hence over HTTPS) connections. I’ve added extra code to unpack.py to hook these calls and log the cleartext data passed to/received from them — this gives us the cleartext data sent and received over HTTPS connections. Before we start on that though, let’s go back to basics to analyse our malware sample’s network traffic.

First we’ll need to create a fake Internet environment so that our malware sample thinks it has Internet access, without us having to give it Internet access. Before anything can connect to a host on the Internet, it has to either use the destination host’s IP address, or resolve its host name.

Faking Internet servers

If you remember back to part two, we saw this sample spitting out DNS queries for weird looking host names, so before we see any connections from the sample, those DNS queries are going to have to succeed and give the sample an IP address to connect to.

There is a handy way of making these DNS queries succeed without giving the malware sample access to the Internet — dnsspoof. dnsspoof is part of Dug Song’s dsniff package.

dnsspoof will take a hosts file and will respond to DNS queries for any hosts mentioned therein. If a hosts file isn’t specified, then dnsspoof will answer all DNS queries with the local machine’s IP address. This means that we can create a hosts file covering the whole DNS domain that the malware was issuing DNS queries for (dnsspoof will take wildcards in the hosts file), and specify the IP address of a host which can handle subsequent connections. Alternatively, we can omit the hosts file, in which case the malware sample will attempt to connect to the local machine (the one running dnsspoof). Which approach you take will depend on available hardware/resources and your appetite for risk.

Now, if we get dnsspoof to respond to the malware sample’s DNS queries with the IP address of the local machine, the IP address of an existing machine on the local network, or an IP address that needs to be routed via the local gateway, then we will be able to see what connections the malware sample is trying to make. If we don’t use one of those options for the DNS reply then we will more than likely not see a connection attempt because the malware machine will need to ARP to find the MAC address to which the IP packet should be sent and we won’t see an IP packet unless ARP succeeds.

I’m going to run dnsspoof without a hosts file so that it returns the local machine’s IP address in response to all DNS queries, but I will specify an interface for dnsspoof to listen on so that dnsspoof only responds to DNS queries on my sandbox network. I’m also going to use the Linux iptables command to block all inbound connections on the sandbox network. This is because I don’t yet know what connections the malware sample is going to make. I want to see that first, then I’ll allow the connections once I’ve configured some appropriate software to handle them. I’ll allow the sample to connect to netcat (or some other monitoring software), for instance, but I don’t want it connecting to Apache (or anything real which may give information out, or present vulnerabilities to the malware).

Let’s start dnsspoof and then run the 1300.exe sample again with unpack.py and see what happens when the sample receives a valid DNS response to its DNS queries. Attempt to run dnsspoof as a non-root user, as that’ll limit what dnsspoof can do should it be exploited.

That IP address (555.25.21.1, obviously obfuscated) is the IP address of the local host on the specified network interface. Now let’s run the malware sample again and see what we get. My Windows 7 sandbox is 64-bit and so is its default Python installation, but the malware sample (1300.exe) is 32-bit, so I need to explicitly run unpack.py with the 32-bit Python installation:

c:\Python27_32\python.exe .\unpack.py 1300.exe > inf.log 2> inf-stderr.log

If you launch unpack.py using the 64-bit version of Python, then you’ll see unhandled EXCEPTION_WX86_BREAKPOINT exceptions, presumably because this is a 32-bit sample.
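If you want to check for that mismatch before launching a sample, the following is a minimal sketch (it is not part of unpack.py). It reads the Machine field from the sample’s PE header and compares it with the pointer size of the running interpreter. The function names pe_bits() and interpreter_bits() are my own, and only the i386 and AMD64 machine types are recognised.

```python
import struct

# COFF "Machine" values from the PE format: i386 and AMD64 only.
MACHINE_BITS = {0x014C: 32, 0x8664: 64}


def pe_bits(data):
    """Return 32 or 64 for a PE image held in `data`, or None if unrecognised."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    # e_lfanew (offset 0x3c in the DOS header) points at the "PE\0\0" signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        return None
    if len(data) < pe_offset + 6:
        return None
    # The COFF header follows the signature; Machine is its first field.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINE_BITS.get(machine)


def interpreter_bits():
    """Pointer size of the running Python interpreter, in bits."""
    return struct.calcsize("P") * 8
```

Reading the first few kilobytes of 1300.exe in binary mode and passing them to pe_bits() should be enough; if the result differs from interpreter_bits(), switch to the other Python installation.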

Firstly, let’s look at the network traffic. We notice that dnsspoof logged DNS queries that it saw and responded to. Note that the source address and domain name have been obfuscated, and that the destination address (of the DNS server) has been changed to that of Google’s public DNS server:

555.25.21.3.1026 > 8.8.8.8.53: 42496+ A? itylixelo.domain.malicious
555.25.21.3.1026 > 8.8.8.8.53: 20302+ A? iwybojiju.domain.malicious
555.25.21.3.1026 > 8.8.8.8.53: 15587+ A? ohibac.domain.malicious

Looking at the packet capture we see:

So we can see from that output that it is just dying to open a connection to port 443/tcp on whichever host belongs to those weird host names in the DNS queries. In that packet capture we can see three attempts to resolve a host name and connect to it on port 443/tcp. If we look in unpack.py’s log output, we see a pattern of two calls to InternetOpen() (from 0x771bafca and from 0x917d5) repeated three times:

Those flags that are passed as the fifth argument to InternetOpen() (the 0x8404c700) correspond to the following constants from wininet.h:

INTERNET_FLAG_RELOAD | INTERNET_FLAG_NO_CACHE_WRITE | INTERNET_FLAG_NO_AUTH | INTERNET_FLAG_IGNORE_REDIRECT_TO_HTTP | INTERNET_FLAG_IGNORE_REDIRECT_TO_HTTPS | INTERNET_FLAG_HYPERLINK | INTERNET_FLAG_NO_UI | INTERNET_FLAG_PRAGMA_NOCACHE
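If you’d rather not decode flag values by hand, here is a small sketch that does the lookup. The table only holds the eight constants listed above (with their values from wininet.h), so any other bits are reported as a leftover hex value rather than named. decode_flags() is a name I’ve made up for illustration.

```python
# Values as defined in wininet.h; only the flags seen in this post are listed.
INTERNET_FLAGS = {
    0x80000000: "INTERNET_FLAG_RELOAD",
    0x04000000: "INTERNET_FLAG_NO_CACHE_WRITE",
    0x00040000: "INTERNET_FLAG_NO_AUTH",
    0x00008000: "INTERNET_FLAG_IGNORE_REDIRECT_TO_HTTP",
    0x00004000: "INTERNET_FLAG_IGNORE_REDIRECT_TO_HTTPS",
    0x00000400: "INTERNET_FLAG_HYPERLINK",
    0x00000200: "INTERNET_FLAG_NO_UI",
    0x00000100: "INTERNET_FLAG_PRAGMA_NOCACHE",
}


def decode_flags(value):
    """Return the names of the known flags set in `value`, highest bit first."""
    names = [name
             for bit, name in sorted(INTERNET_FLAGS.items(), reverse=True)
             if value & bit]
    # Report any bits that are not in the table instead of silently dropping them.
    leftover = value & ~sum(INTERNET_FLAGS)
    if leftover:
        names.append(hex(leftover))
    return names


print(" | ".join(decode_flags(0x8404C700)))
```

The eight values sum to exactly 0x8404c700, so no leftover bits are reported for the value logged by unpack.py.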

Right, let’s move on to something more interesting and give our malware sample something to connect to. First though, we’ll tweak our security a tad:

Then we’ll run netcat to listen on port 4443/tcp which, because of our iptables NAT rules will receive packets destined for port 443/tcp:

nc -l -p 4443
�L3 ��� @db��c��9�!-0�R(��

As expected, we see binary data come out from netcat. This is because our malware sample is attempting to open an HTTPS (port 443/tcp) connection which involves establishing SSL/TLS. SSL and TLS exchange binary data, not ASCII data. The binary data is in fact the SSL client hello message, which can be seen in a packet capture (Wireshark will decode the non-encrypted SSL/TLS data). To get any further with this, we need a valid SSL/TLS server — enter OpenSSL.
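If you don’t have Wireshark to hand, the first few bytes that netcat receives are enough to confirm that you’re looking at a client hello. The sketch below is an illustration only: classify_hello() is a hypothetical helper, and it recognises just two layouts, a TLS handshake record and the older SSLv2-compatible hello. I haven’t checked which of the two this sample’s Windows host sends.

```python
def classify_hello(data):
    """Return (kind, (major, minor)) for a client hello, or None."""
    b = bytearray(data)  # gives integer indexing on Python 2 and 3
    # TLS record: content type 0x16 (handshake), 2-byte version,
    # 2-byte length, then handshake type 0x01 (ClientHello).
    if len(b) >= 6 and b[0] == 0x16 and b[5] == 0x01:
        return ("tls", (b[1], b[2]))
    # SSLv2-compatible hello: high bit set in a 2-byte length header,
    # then message type 0x01 (CLIENT-HELLO) and the offered version.
    if len(b) >= 5 and b[0] & 0x80 and b[2] == 0x01:
        return ("sslv2-compatible", (b[3], b[4]))
    return None
```

A version of (3, 1) corresponds to TLS 1.0 and (3, 0) to SSL 3.0. To use it, save netcat’s output to a file and pass the first handful of bytes to classify_hello().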

OpenSSL can act as both an SSL/TLS client and an SSL/TLS server. An SSL/TLS server requires a certificate though (it ties the public key used for encryption to an identity: that of the server and its owner), so we’ll have to create one before we can go much further. Conveniently, OpenSSL can also help us with that. Here’s how we can generate a self-signed certificate for an SSL server. It’s not going to be trusted by anything, but it should suffice for what we want to use it for:

C’est bon. Now we can start OpenSSL’s SSL/TLS server:

OpenSSL’s s_server command defaults to listening on port 4433/tcp, so we could have just translated port 443/tcp to port 4433/tcp and then we wouldn’t have had to add ‘-accept 4443’ to OpenSSL’s command line.

Run the malware sample (1300.exe) again and this time you should see connections to port 443/tcp in the packet capture. These connections are being translated by the iptables NAT rules to port 4443/tcp, which is where we’re running OpenSSL’s SSL/TLS server.

Extracting the cleartext data

Now tcpdump can’t see the cleartext data that is sent in that connection because, well, that is the purpose of SSL/TLS. Also, I couldn’t find a way of getting OpenSSL to behave like netcat and actually dump the cleartext data either. All is not lost, however: remember that unpack.py has some new functionality, namely the ability to dump data buffers before encryption and after decryption. Looking at unpack.py’s output we notice that it is now logging calls to EncryptMessage() (along with some debug messages which I didn’t remove):

unpack.py has also created some new files:

Now, these files are the contents of the cleartext memory buffers passed to the EncryptMessage() API call being used to encrypt data destined for an SSL/TLS connection and, if you’re anything like me, you’ll be dying to know what’s in them, so let’s have a look:

POST /topic.php HTTP/1.1
Accept: */*
Host: ezutiduduri.domain.malicious
Content-Length: 160
Cache-Control: no-cache

̫�qd�U�*�+�T�i����

Hmm, an HTTP request containing binary data. Let’s see if we can extract the binary data from each of those encmsg files:

I should probably explain that. The binfile assignment is simply (or not, depending on your level of UNIX experience I guess) modifying the ‘1300.exe.encmsg0x…’ file names by removing everything up to and including the ‘.encmsg’ string (sed ‘s/^.*encmsg//’), and prepending the string ‘bin-’ to the result. This variable now holds the output file name for the particular encmsg file that the for loop has assigned to the f variable. Oh, I should probably also mention that all of my UNIX scripting is Bourne/bash shell. If you run that same script snippet in csh or tcsh, you’ll more than likely encounter problems (that’s something that I should probably have mentioned in a number of other posts too).

As for the second sed command, I’m actually surprised that it worked to be honest. The encmsg files contain the HTTP requests that the malware sample issued, before they were encrypted and sent over the SSL/TLS connection. HTTP requests contain headers which are terminated by a blank line. These are normally easy to remove, and a csplit command can usually be used to separate everything up to and including the first blank line, leaving us with the HTTP data.

These files (HTTP requests) however, contain binary data. The second sed command is an attempt to remove all of the lines up to and including the first line that contains only a carriage return character.

Using ‘/^$/’, that is, the regular expression to match a blank line (enclosed in ‘//’ so that csplit treats it as a regular expression), didn’t work so I examined the encmsg files and noticed that the HTTP headers were using the MS-DOS end-of-line convention of carriage return/line feed. The ‘\r’ matches the carriage return character and is necessary because UNIX uses a single line feed character to denote the end of a line and hence treats the carriage return as just another character that just happens to appear on the end of every line.

I didn’t want to remove carriage-returns with something like ‘tr -d \015’ because that would have also removed any such characters from the binary data, and we don’t want to modify the binary data.
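For anyone who would rather not rely on sed’s handling of carriage returns, the same idea can be sketched in Python 3. This is not the shell snippet I used, just an equivalent under the same assumptions: the headers use CR/LF line endings, the file starts with a request, and everything after the first empty line is binary data that must not be altered. The function names are mine, and the ‘*.encmsg*’ glob pattern is an assumption based on the file names mentioned above.

```python
import glob
import os
import re


def output_name(encmsg_path):
    """Mirror sed 's/^.*encmsg//' on the file name and prepend 'bin-'."""
    name = os.path.basename(encmsg_path)
    return "bin-" + re.sub(r"^.*encmsg", "", name)


def strip_headers(request):
    """Return the bytes after the first CR/LF CR/LF, byte for byte."""
    _head, sep, body = request.partition(b"\r\n\r\n")
    # Like the sed range delete, return nothing if no blank line is found.
    return body if sep else b""


def extract_all(directory="."):
    """Write a 'bin-...' file next to every '*.encmsg*' file in `directory`."""
    for path in sorted(glob.glob(os.path.join(directory, "*.encmsg*"))):
        with open(path, "rb") as src:
            body = strip_headers(src.read())
        with open(os.path.join(directory, output_name(path)), "wb") as dst:
            dst.write(body)
```

Because the files are opened in binary mode and split on the exact byte sequence, any carriage returns inside the binary data are left alone, which is the same reason I avoided tr.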

Right — let’s see what that’s given us:

So all of those files, except the last one, are the same. Although unpack.py dumps the EncryptMessage() and DecryptMessage() memory buffers to the encmsg and decmsg files (respectively), it does so by appending the memory buffer to a file which is named based only on the memory buffer address and process id. That means that multiple EncryptMessage() (or DecryptMessage()) calls that use the same memory buffer in the same process will have their memory buffer contents concatenated together. Is that why the last file in the list above is different to the others, and also larger in size? Let’s have a look at it:

Bingo — notice the start of another HTTP request at offset 160 (0xa0)? Also note that 160 is the ‘Content-Length:’ mentioned in the HTTP headers, so anything after offset 160 has been appended to the original request. That is, it will be a second HTTP request that unpack.py has appended to the first because they were encrypted from the same memory buffer in the same process (and were hence written to the same file). Let’s separate the two and process the second request:

Smashing: that is the same as the other ‘bin-’ files. As for what the binary data actually means, that is going to require manual analysis of the malware sample.
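Since unpack.py can concatenate several requests into one encmsg file, the ‘Content-Length:’ header is the natural way to pull them apart again. Here is a rough Python 3 sketch of that approach. split_requests() is a hypothetical helper, and it assumes that the file starts with a request, that every request has CR/LF-terminated headers, and that the whole body made it into the file.

```python
import re


def split_requests(data):
    """Split concatenated HTTP requests into (headers, body) tuples."""
    requests = []
    pos = 0
    while pos < len(data):
        end = data.find(b"\r\n\r\n", pos)
        if end == -1:
            break  # trailing bytes with no header terminator are ignored
        headers = data[pos:end]
        match = re.search(br"(?im)^Content-Length:[ \t]*(\d+)", headers)
        length = int(match.group(1)) if match else 0
        body_start = end + 4
        requests.append((headers, data[body_start:body_start + length]))
        # The next request, if any, starts right after this body.
        pos = body_start + length
    return requests
```

Only the headers are searched for the blank line; the body is skipped by length, so binary data that happens to contain CR/LF pairs does not confuse the split. Run against the whole encmsg file, this should return one tuple per request, and the bodies can then be compared directly.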

Conclusion

It would be interesting to allow this sample to communicate with the Internet so that we could see the response from the server. However, it has taken me so long to find the time to analyse this sample that the domain name it is looking for, while still showing up in whois, no longer resolves (the parent domain does not have an NS record for it).

Still, this final post in the series managed to demonstrate how unpack.py can be used to dump SSL/TLS data before encryption, and how you can trick malware samples into talking to you rather than to their intended target.

So there you have it. My unpack.py script is still useful. This series has shown how it can:

Part 1: Initial Analysis

Extract unpacked code and identify the unpacking loop

Dump memory buffers that were decompressed using RtlDecompressBuffer()

Identify process hollowing where a process is created in a suspended state, overwritten, and then resumed

Part 2: The unpacked payload

Show new processes being created along with their command line

Dump memory buffers that are decrypted using CryptDecrypt()

Dump memory buffers that are written to other processes using WriteProcessMemory()

Provide information useful for further manual analysis using a debugger

Part 3: Network Communications

Dump SSL/TLS data before it is encrypted (it also dumps received data after decryption but this post was unable to demonstrate that)