Edit 27-5-2015: I added a download link to the memory dump of the machine I used in this article for others to learn from and play around with volatility.



On April 1st FireEye released a report on “MWI” and “MWISTAT”, which together form a sort of exploit kit for Word documents, if you will: A New Word Document Exploit Kit

In the article FireEye goes over MWI, which is short for ‘Microsoft Word Intruder’, coded by an actor going by the handle ‘Objekt’. MWI is a ‘kit’ that people can use to spread malware. It can generate malicious Word documents exploiting any of the following CVEs:

CVE-2010-3333

CVE-2012-0158

CVE-2013-3906

CVE-2014-1761

The builder, named MWI, generates these documents, which call back to a server to download malicious payloads. Together with the MWI builder the author has also released MWISTAT, a statistics backend and optional downloader component for MWI documents that tracks campaigns and the spread of the documents.





In this article my analysis focusses on the following sample: OZLA155_C0E6C24A_X234_investigate.doc (MD5: c0f7fd333131ceca4292419e207f83fc) VirusTotal & Malwr (Downloadable).

If you would like to do the same analysis, just grab the sample from Malwr, where it has been made downloadable. More downloadable samples are listed at the end of this blog entry; check those out if you want to do some more MWI sample analysis.





Analyzing an MWI generated document

I will start my analysis by looking at VirusTotal. Looking at the hits, we are told it is CVE-2012-0158, CVE-2012-2539, CVE-2013-3906 and CVE-2014-1761. Besides a mismatch with the list presented by FireEye, it looks like we have a multi-exploit document. Another security researcher named @ropchain responded to a tweet I sent about the multi-CVE identifier from VT with a good clue: https://twitter.com/ropchain/status/584036788550967296



It seems embedding exploits for multiple vulnerabilities in a single RTF is possible. Let’s open the document and see what kind of document we are dealing with (OLE or RTF). Opening the document shows the following first line:



RTF it is. The brackets { and } define a group in the RTF file format, and these groups can be nested within each other. A backslash \ is followed by a so-called ‘control code’; in our case the document starts with a group bracket and the RTF control code ‘\rtf’. This control code indicates the start of an RTF file, and the number behind it indicates the version of the specification this RTF file follows.
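The OLE-or-RTF check can also be scripted. The following is a minimal Python sketch, not part of the original analysis: it looks at the first bytes of a file and reports whether they match the OLE compound file signature or the `{\rtf` header described above. The function name `identify_document` and the return format are made up for this example.

```python
import re

# Signature at the start of every OLE compound file (e.g. legacy .doc).
OLE_MAGIC = bytes.fromhex("d0cf11e0a1b11ae1")

# Opening group bracket, the \rtf control code, and an optional number.
RTF_HEADER = re.compile(rb"\{\\rtf(\d*)")


def identify_document(data: bytes):
    """Return (kind, rtf_number) where kind is 'ole', 'rtf' or 'unknown'.

    rtf_number is the integer following the rtf control code, or None when
    the file is not RTF or no number is present.
    """
    if data.startswith(OLE_MAGIC):
        return ("ole", None)
    match = RTF_HEADER.match(data)  # match() only matches at offset 0
    if match:
        digits = match.group(1)
        return ("rtf", int(digits) if digits else None)
    return ("unknown", None)
```

Reading only the first few hundred bytes of the sample is enough as input. Note that this only inspects the very start of the file, so a document with junk before the header would be reported as unknown.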



Now that we know we have an RTF document we can start working out how it performs communication and loads the malicious payload on the affected machine.





MWISTAT communication

The MWI samples communicate with a backend called MWISTAT. This backend keeps statistics on who opens the malicious documents and serves the required payload, based on a certain identifier, when an exploit is successful.

The statistics check-in looks like this in Wireshark:

The server responds with a 1x1 white JPEG image. Nothing else of interest happens as a result of this request.

The second step, after the statistics check-in and successful exploitation of the user’s Microsoft Word installation, is downloading a payload. This request looks similar, only with an extra GET parameter added: ‘&act=1’. If the identifier in the request is correct, the server responds with a plain Windows PE; no encryption is used anywhere here, as seen in Wireshark:
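The difference between the two requests is small enough to express in a few lines. The Python sketch below is an illustration only: it labels a URL as a statistics check-in or a payload retrieval based purely on the presence of `act=1` in the query string. The labels, the host name and the `id` parameter in the usage example are hypothetical; only `act=1` and the `image.php` script name come from the traffic discussed in this post.

```python
from urllib.parse import parse_qs, urlsplit


def classify_mwistat_request(url: str) -> str:
    """Label a request URL as 'payload retrieval' or 'statistics check-in'.

    The only signal used is the extra GET parameter act=1 that the payload
    request carries; everything else about the URL is ignored.
    """
    query = parse_qs(urlsplit(url).query)
    if query.get("act") == ["1"]:
        return "payload retrieval"
    return "statistics check-in"
```

For example, `classify_mwistat_request("http://c2.example/image.php?id=1234&act=1")` is labeled as a payload retrieval, while the same URL without `&act=1` is labeled as a check-in.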

MWISTAT communication: Statistics check-in

Time to find the origin of both of these requests; first we will go after the statistics check-in which is quite easy to find. Simply hitting CTRL-F for the URL in the RTF file shows the following:

Based on the control code names we can guess what happens here: a remote picture is loaded into a picture field. This is standard behaviour and documented in a lot of places (except for official Microsoft ones; I couldn’t find any). However, all documentation specifies that local paths can be used, not URLs as seen here; but it works anyway.
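To hunt for this kind of remote picture reference in other RTF files, something like the following Python sketch could be used. It rests on an assumption that is not confirmed by the text above: that the picture is pulled in through Word’s INCLUDEPICTURE field. The regular expression and the function name are made up for this example, and a sample using a different field or obfuscation would not be caught.

```python
import re

# INCLUDEPICTURE followed by an optionally quoted http(s) URL. The URL ends
# at whitespace, a quote, a backslash or a closing group bracket.
REMOTE_PICTURE = re.compile(
    rb'INCLUDEPICTURE\s+\\?"?\s*(https?://[^\s"\\}]+)',
    re.IGNORECASE,
)


def remote_picture_urls(rtf_data: bytes):
    """Return all http(s) URLs referenced by INCLUDEPICTURE fields."""
    return [
        match.group(1).decode("ascii", "replace")
        for match in REMOTE_PICTURE.finditer(rtf_data)
    ]
```

A non-empty result means the document will make Word fetch something from the network when it is opened, whether or not any exploit in the document works.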

The fact that this include method is standard behaviour does mean that this check-in will always happen; you don’t have to be running a vulnerable version of Word. It means the bad guys are getting some proper stats on who opens their documents and with what version of Office. Should a new vulnerability for Word hit the web, they could quickly check if it’s ‘worth it’ to implement based on the number of hits for that specific (newly) exploitable version of Office/Word.

The User-Agent tag used by Microsoft Word is determined Office-wide. The following is a list of the Office versions with their User-Agent tag:

Microsoft Office 2007: ’MSOffice 12’

Microsoft Office 2010: ’MSOffice 14’

Microsoft Office 2013: ’MSOffice 15’

From the user-agent in the PCAP we can see that my virtual machine was running Microsoft Office 2007; quite vulnerable as it was unpatched.
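The mapping above is easy to apply to captured traffic. Here is a small Python sketch, assuming the tag appears somewhere inside the User-Agent header as the literal text `MSOffice` followed by a number; the function name and the example header in the usage note are illustrative and not taken from the PCAP.

```python
import re

# Office versions and their User-Agent numbers, as listed above.
OFFICE_VERSIONS = {
    12: "Microsoft Office 2007",
    14: "Microsoft Office 2010",
    15: "Microsoft Office 2013",
}

MSOFFICE_TAG = re.compile(r"MSOffice (\d+)")


def office_version_from_user_agent(user_agent: str):
    """Return the Office version for a User-Agent, or None without a tag."""
    match = MSOFFICE_TAG.search(user_agent)
    if match is None:
        return None
    number = int(match.group(1))
    return OFFICE_VERSIONS.get(number, f"unknown Office version ({number})")
```

Calling it with a header such as `"Mozilla/4.0 (compatible; MSOffice 12)"` returns `"Microsoft Office 2007"`. A result of `None` is also informative, since a request without the tag was not made by Word itself.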

Now, a lot of hits for a specific version of Office doesn’t always mean those installations are vulnerable; people can still apply patches, of course.



MWISTAT communication: Payload retrieval

Now we need to find the actual payload retrieval request. The problem here is that it will not be present in plain form in the RTF; it’s made after Word is exploited in some way. This is also the reason why the ‘MSOffice’ User-Agent tag is missing: it is not Word itself downloading the payload but the attacker’s shellcode.

I decided not to analyze the whole exploit chain in the document; there is evidence of multiple exploits. If you want to follow one, the easiest is CVE-2014-1761, located all the way at the bottom of the file. You can follow a writeup by McAfee running through it: [A Close Look at RTF Zero-Day Attack CVE-2014-1761 Shows Sophistication of Attackers]

I waited for the document to exploit the vulnerable Office installation in the virtual machine. When the screen looked like this I made a memory dump:

The memory dump was made with MoonSols’ DumpIt tool. You can get it here: [MoonSols DumpIt goes mainstream]; the download link is at the bottom.



You can download the memory dump I analyzed in this blog entry here: MWI_exploited_machine.dmp



After making the dump, the first step is to check whether there is any injected code in Word specifically, as this is what is being exploited. For this I use Volatility, which is a Python-based memory forensics toolkit. I grabbed the dump and ran the following Volatility command: ‘vol.py -f MWI_exploited_machine.dmp --profile=Win7SP1x86 -D ./ malfind’. The ‘malfind’ argument makes Volatility check for injected code in the memory dump specified with the ‘-f’ flag, and the ‘--profile’ flag tells it to interpret the memory as that of a 32-bit Windows 7 Service Pack 1 machine. The ‘-D’ flag specifies where to dump any injected code it finds, which I set to the current working directory. The output shows some interesting hits, namely injected code in ‘WINWORD.exe’, which is the Microsoft Word process:



Time to find our C2 download request. The C2 payload retrieval structure is the same as for the stats check-in, so simply running a grep for a section of the URL in the dumped code pinpoints the dump we want: grep -nHa image.php process.0x*.0x*.dmp:

Underlined in orange is the filename of the specific dump of injected code, underlined in red is the C2 request URL for the payload, and the green line seems to indicate a file path, most likely where the payload is written to disk after download. We now know which dump contains the request to the C2, and we also learn where the downloaded payload goes: in this case a path that starts with ‘C:\Users\user1\AppData\Local\Temp’ and ends in ‘txobj.exe’. If you load the dump in IDA you can also see some artifacts of what seems to be import-resolving shellcode grabbing imports from kernel32:

We can also spot the payload URL in IDA as we did in the raw dump of course:

And there we go, we found the payload download as well as (with some luck) the payload location on the affected machine without diving into the exploit itself.
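The grep step used earlier to pinpoint the right dump can also be done in Python, which makes it easy to search for wide-character strings at the same time. The sketch below is an illustration rather than the method used in this post: it scans the files dumped by malfind for a piece of text in both ASCII and UTF-16LE form and reports the offsets. The function names are made up for this example.

```python
import glob


def find_offsets(data: bytes, needle: bytes):
    """Return every offset at which needle occurs in data."""
    offsets = []
    position = data.find(needle)
    while position != -1:
        offsets.append(position)
        position = data.find(needle, position + 1)
    return offsets


def search_dumps(file_glob: str, text: str):
    """Yield (path, encoding, offset) for each hit in the matching files."""
    needles = {
        "ascii": text.encode("ascii"),
        # Windows often stores strings as UTF-16LE, which a plain
        # ASCII search does not match.
        "utf-16-le": text.encode("utf-16-le"),
    }
    for path in sorted(glob.glob(file_glob)):
        with open(path, "rb") as handle:
            data = handle.read()
        for encoding, needle in needles.items():
            for offset in find_offsets(data, needle):
                yield path, encoding, offset


if __name__ == "__main__":
    for path, encoding, offset in search_dumps("process.0x*.0x*.dmp", "image.php"):
        print(f"{path}: {encoding} match at offset {offset:#x}")
```

Run from the directory that malfind dumped into, this prints one line per hit. The reported offset can then be used to jump straight to the URL in a hex editor or in IDA.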

On the C2 my infected VM also shows up with the ‘OPEN’ state, which is the term used to describe the statistics check-in, and the ‘LOAD’ state, which is the payload retrieval state:



While the concept of MWI is simple, it seems to work quite well. Sadly I cannot say much about the popularity of the MWI kit, as I don’t know all of Objekt’s customers who are using it. All I can say is that since December last year samples have been appearing more and more often.





Detecting MWI

Detecting MWI payloads is quite easy, both on network and filesystem level. For filesystem level I’ve created a YARA rule which flags the statistics check-in RTF tags:



For detection on network level the company I work for (Fox-IT) has released Snort IDS rules: [fox-srt / fox-srt-mwi.rules] Snort coverage for Microsoft Word Intruder







MWI Samples

The following is a list of samples I used while researching MWI. Each sample listed here comes with a link to Malwr, where it can be downloaded. The sample at the top of the list is the one analyzed in this article.

