Building on yesterday's results, I was able to get most of the data I cared about into JSON. Having JSON for each grouping of data was great, but it didn't do me much good, because I could never get it into MongoDB the way I wanted without some crazy queries. What I needed instead was one object made up of all the data. Getting there required chaining a couple of tools together, but everything seems to work great.
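
To make the "one object" idea concrete, here is a minimal sketch of folding the per-group results into a single document. It is not the tool's actual code: the function name build_report, the key names, and the placeholder values are all assumptions for illustration.

```python
import json


def build_report(structure, hashes, score, scans):
    """Combine the per-group results into one nested document.

    The four top-level keys are assumed names, not the tool's real schema.
    """
    return {
        "structure": structure,
        "hashes": hashes,
        "score": score,
        "scans": scans,
    }


# Placeholder values purely for illustration; a real run would fill these
# in from the individual analysis steps.
report = build_report(
    structure={"filename": "sample.pdf", "isPdf": True},
    hashes={"file": {"md5": "<md5 hex digest>"}},
    score={"primary": 0, "secondary": 0, "total": 0},
    scans={"wepawet": {}, "virustotal": {}},
)

# Serialize the whole thing as a single JSON object.
report_json = json.dumps(report, indent=2, sort_keys=True)
print(report_json)
```

Because everything lives in one nested object, it can be stored as a single document and queried by field, rather than stitched together from several separate records at query time.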

I am releasing the tool now because it ultimately produces a single object with a good deal of information. All further updates will just add more functionality and more data to the object without changing the core use of the tool. As of right now the tool produces a single JSON object with the following:

structure: filename, header, isPdf, version, total entropy, count after EOF, count characters after last EOF, stream entropy, non-stream entropy, all name calls count, name hexdecoded count, dates

hashes: file (md5, sha1, sha256), object (id, version, length, md5)

score: primary, secondary, total

scans: wepawet, virustotal (results, scan url, date scanned)
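
A few of the fields listed above (the file hashes, total entropy, and the EOF-related counts) can be computed with nothing but the standard library. The sketch below is one way to do that, not necessarily how the tool does it: the function names, the camelCase key names, and the 1024-byte header check are my own assumptions.

```python
import hashlib
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 for empty input)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def file_summary(data: bytes) -> dict:
    """Build a small subset of the report from raw file bytes."""
    marker = b"%%EOF"
    last_eof = data.rfind(marker)
    # Bytes that follow the final %%EOF marker (0 if there is no marker).
    after_last = len(data) - (last_eof + len(marker)) if last_eof != -1 else 0
    return {
        "structure": {
            # Simplistic check: look for the PDF header near the start.
            "isPdf": b"%PDF-" in data[:1024],
            "totalEntropy": shannon_entropy(data),
            "eofCount": data.count(marker),
            "charactersAfterLastEof": after_last,
        },
        "hashes": {
            "file": {
                "md5": hashlib.md5(data).hexdigest(),
                "sha1": hashlib.sha1(data).hexdigest(),
                "sha256": hashlib.sha256(data).hexdigest(),
            }
        },
    }


# Tiny made-up input, just to exercise the function.
sample = b"%PDF-1.4\n1 0 obj\n<<>>\nendobj\n%%EOF\ntrailing"
summary = file_summary(sample)
print(summary["structure"])
```

Data trailing the last %%EOF marker is worth recording because a well-formed PDF normally ends at that marker, so extra bytes there can be a sign that something was appended.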



In the next few days I plan on adding the following features and data objects:

Submit file for re-scan to VirusTotal (current method pulls last scan)

Submit a zip file full of files

Store decoded object data within structure

Store generic exploits that the file uses

Identify objects that hold the malicious content (label them)
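
As a rough idea of what the zip submission item could look like, the sketch below walks an in-memory zip archive and hashes every member. This is only a guess at the shape of that planned feature: the function name hash_zip_members and the choice to key results by member filename are assumptions, and a real version would run each member through the full analysis rather than just hashing it.

```python
import hashlib
import io
import zipfile


def hash_zip_members(zip_bytes: bytes) -> dict:
    """Return {member filename: sha256 hex digest} for each file in a zip."""
    results = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip directory entries
            data = archive.read(info)
            results[info.filename] = hashlib.sha256(data).hexdigest()
    return results


# Build a small archive in memory so the example is self-contained.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("a.pdf", b"%PDF-1.4\n%%EOF")
    archive.writestr("b.pdf", b"%PDF-1.5\n%%EOF")

results = hash_zip_members(buffer.getvalue())
for name, digest in sorted(results.items()):
    print(name, digest)
```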

All progress on the tool can be tracked here.