If you are interested in PDF file analysis, we might soon have something for you. We have developed a nifty little application that not only parses PDF files but also helps you analyze them very quickly. The main features include:

The ability to view PDF files as content trees as well as hex data.

Decoding and display of embedded JavaScript.

Refactoring functions for JavaScript code, such as variable renaming.

An integrated JavaScript interpreter for malicious script debugging.

An extensible Adobe Reader emulator to simulate arbitrary versions and configurations of Adobe Reader.

Interception of all called functions, so that you can log calls or modify arguments and return values.

Automated exploit recognition.
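
To give a rough idea of what function interception means in practice, here is a minimal Python sketch. It is not how our tool is implemented; the class `CallInterceptor`, its hook dictionaries, and the stand-in function `get_viewer_version` are all hypothetical names made up for this illustration. The idea is simply that every emulated function is wrapped, so each call can be logged and its arguments or return value can be rewritten on the way through.

```python
from typing import Any, Callable, Dict, List, Tuple


class CallInterceptor:
    """Hypothetical helper: wraps functions so calls are logged and can be rewritten."""

    def __init__(self) -> None:
        # One (name, final arguments, final return value) entry per call.
        self.log: List[Tuple[str, tuple, Any]] = []
        # Optional per-function hooks, keyed by function name.
        self.arg_hooks: Dict[str, Callable[[tuple], tuple]] = {}
        self.return_hooks: Dict[str, Callable[[Any], Any]] = {}

    def wrap(self, name: str, func: Callable[..., Any]) -> Callable[..., Any]:
        def wrapper(*args: Any) -> Any:
            # Let an optional hook rewrite the arguments before the call.
            if name in self.arg_hooks:
                args = tuple(self.arg_hooks[name](args))
            result = func(*args)
            # Let an optional hook rewrite the return value after the call.
            if name in self.return_hooks:
                result = self.return_hooks[name](result)
            self.log.append((name, args, result))
            return result

        return wrapper


def get_viewer_version() -> float:
    """Stand-in for an emulated reader function (assumption, not a real API)."""
    return 9.0


interceptor = CallInterceptor()
viewer_version = interceptor.wrap("get_viewer_version", get_viewer_version)

# Pretend to be an older reader so a version-checking script takes its exploit path.
interceptor.return_hooks["get_viewer_version"] = lambda value: 8.1
version = viewer_version()
```

After this runs, `interceptor.log` holds one entry for the call, including the rewritten return value. Argument hooks work the same way, for example to cap a suspiciously large repeat count before the wrapped function sees it.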

To see it all in action, you can watch a preview video by clicking this link.

There are a few things we want to explore in the coming weeks:

We will improve the PDF parser.

We will add more JavaScript refactoring functions.

We need to figure out how to limit the memory available to scripts, because if a script you are analyzing sprays the heap, the PDF analysis tool gets heap-sprayed too, which is uncool.

We will add a plugin API so that you can automatically process large quantities of files. This should be very useful for everyone who wants to do batch analysis of PDF files.
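
One possible direction for the memory limit mentioned above is to give every script an allocation budget and abort it once the budget is used up. The Python sketch below only illustrates that idea; we have not settled on an approach, and `MemoryBudget`, `guarded_concat` and `simulate_spray` are hypothetical names invented for this example. It also simplifies heavily: it counts characters as a stand-in for bytes and never credits freed memory back.

```python
class MemoryLimitExceeded(Exception):
    """Raised when a script tries to allocate past its budget."""


class MemoryBudget:
    """Hypothetical per-script allocation counter (cumulative, never released)."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0

    def charge(self, size: int) -> None:
        # Refuse the allocation before it happens, not after.
        if self.used + size > self.limit:
            raise MemoryLimitExceeded(
                f"requested {size}, used {self.used}, limit {self.limit}"
            )
        self.used += size


def guarded_concat(budget: MemoryBudget, left: str, right: str) -> str:
    # Charge for the resulting string before building it.
    budget.charge(len(left) + len(right))
    return left + right


def simulate_spray(budget: MemoryBudget, chunk: str, rounds: int) -> int:
    """Doubles a string repeatedly, as a typical spray loop does.

    Returns the number of rounds completed before the budget stopped it.
    """
    block = chunk
    completed = 0
    try:
        for _ in range(rounds):
            block = guarded_concat(budget, block, block)
            completed += 1
    except MemoryLimitExceeded:
        # The script is stopped here; the host tool keeps its memory.
        pass
    return completed


completed_rounds = simulate_spray(MemoryBudget(limit=1024), "AAAA", rounds=100)
```

In a real interpreter the charge would have to sit inside the string and array allocation paths rather than in a helper function, but the principle is the same: the script runs out of budget long before the analysis tool runs out of memory.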