

Windows 10 prefetch files (*.pf) show a different file format compared to previous ones. At first glance you'll spot no textual strings inside, and this was the initial reason that make me try to understand how they changed.





quick&dirty journey

calculator

MAM

MEM O

MEM 0

ReWolf

MEM

compressed

RtlDecompressBuffer

an actionable lead





windbg inside a Windows 10 virtualized guest (Pro Insider Preview 10.0.10074) and I put a couple of bp on RtlDecompressBuffer and RtlDecompressBufferEx functions: the target process is the SuperFetch process, the svchosted sysmain.dll . To be short, I landed on the moon, which in the case is the sysmain! SmDecompressBuffer method: in the next picture you can see some green boxes (I signed following the decompression branch) and some yellow boxes (checksum check, more on this later).





The routine core is represented in the next picture, where you can (could, if I correctly re-sized the image) easily spot the call to the target method RtlDecompressBufferEx.



I launchedinside a Windows 10 virtualized guest (Pro Insider Preview 10.0.10074) and I put a couple ofon RtlDecompressBuffer and RtlDecompressBufferEx functions: the target process is the SuperFetch process, the. To be short, I landed on the moon, which in the case is themethod: in the next picture you can see some green boxes (I signed following the decompression branch) and some yellow boxes (checksum check, more on this later).The routine core is represented in the next picture, where you can (could, if I correctly re-sized the image) easily spot the call to the target method

preceding

RtlDecompressBufferEx

//

// Compression package types and procedures.

//

#define COMPRESSION_FORMAT_NONE (0x0000) // winnt

#define COMPRESSION_FORMAT_DEFAULT (0x0001) // winnt

#define COMPRESSION_FORMAT_LZNT1 (0x0002) // winnt

#define COMPRESSION_FORMAT_XPRESS (0x0003) // winnt

#define COMPRESSION_FORMAT_XPRESS_HUFF (0x0004) // winnt





#define COMPRESSION_ENGINE_STANDARD (0x0000) // winnt

#define COMPRESSION_ENGINE_MAXIMUM (0x0100) // winnt

#define COMPRESSION_ENGINE_HIBER (0x0200) // winnt





decompress-ing





Python Windows native script: with native I pinpoint that you can't use on other OSes different from Windows, since it uses native api calls. Moreover you need Windows 8.1 at least, since the RtlDecompressBufferEx was introduced starting from that OS version. You could use the script to decompress prefetch files, if in need: but you'll get a better solutions at the end of the post. I tweeted about this script some days ago, pointing to a gist I made and that you can find at w10pfdecomp.py.





To replicate and double checking that findings, I created a small: withI pinpoint that you can't use on other OSes different from Windows, since it uses native api calls. Moreover you need1 at least, since thewas introduced starting from that OS version. You could use the script to decompress prefetch files, if in need: but you'll get a better solutions at the end of the post. Iabout this script some days ago, pointing to aI made and that you can find at

checksum-ing





checksum assembly branch, but then I realized that SuperFetch Windows 10 files show the same MAM signature: by applying the previous script, decompression fails. Previously I introduced the fact that a checksum could be present in the prefetch file, when in the third byte (algorithm) you get the most significant bit set: in those cases you see 0x84 as the byte value.



Reconsidering the checksum branch, here is it what happens: the (prefetch|superfetch) file will have 4 more bytes set to the calculated checksum, those bytes stored after the decompresion size (so, bytes 8-11, starting to count from 0): that bytes must be skipped during the decompression phase.



The checksum is a simple CRC32, calculated on the whole file, zeroing out the current file checksum: you can then realize why in the dis-assembly RtlComputeCrc32 is called three times. I'll updated my Python script to consider that checksum, both on the gist and on the hotoloti github repository.





In the first instance I ignored theassembly branch, but then I realized thatfiles show the samesignature: by applying the previous script, decompression fails. Previously I introduced the fact that a checksum could be present in the prefetch file, when in the third byte (algorithm) you get the: in those cases you seeas the byte value.Reconsidering the checksum branch, here is it what happens: the (prefetch|superfetch) file will have 4 more bytes set to the calculated checksum, those bytes storedthe decompresion size (so, bytes 8-11, starting to count from 0): that bytes must be skipped during the decompression phase.The checksum is a simple CRC32, calculated on the whole file, zeroing out the current file checksum: you can then realize why in the dis-assemblyis called three times. I'll updated my Python script to consider that checksum, both on theand on thegithub repository.

yaJUl





yajul (it sounds nice) means Yet Another Joachim Uber Library. Joachim Metz published and currently maintains, among many others, the "Windows Prefetch File (PF) format" document, where he describes the various formats those file use: if you couple it with his "Windows SuperFetch database format", you'll get all the intimate details of Prefetch and SuperFetch files, compression containers included.



Moreover, his libssca (Prefetch files) and libagdb ( SuperFetch files) libraries, with the help of libfwnt , are able to correctly handle the decompression and parsing of MAM compression containers (well, the libraries handles all the variants), and that is damned cool!



I want to personally thank Joachim for his prompt support when I reached him with my findings: among other things I got very good suggestions and observations on my short research. I want to share with you an interesting link he provided to me, link to a work made by Jeff Bush on Microsoft Compression Formats. No, I'm not drunk.(it sounds nice) means Yet AnotherLibrary.published and currently maintains, among many others, the "" document, where he describes the various formats those file use: if you couple it with his "", you'll get all the intimate details of Prefetch and SuperFetch files, compression containers included.Moreover, hisfiles) andfiles) libraries, with the help of, are able to correctly handle the decompression and parsing of MAM compression containers (well, the libraries handles all the variants), and that is damned cool!I want to personally thank Joachim for his prompt support when I reached him with my findings: among other things I got very good suggestions and observations on my short research. I want to share with you an interesting link he provided to me, link to a work made byon

conclusions



In the end we get that starting from Windows 10, Prefetch and SuperFetch files are compressed with the XPRESS HUFFMAN algorithm, actually a.k.a. the MAM format. Which is not new: Windows 8.1 uses it to compress SuperFetch files, but not Prefetch files. Moreover from what I see checksum is present only for SuperFetch files and never for Prefetch files.









Anyway with the excellent work made by Joachim we'll be able to understand and to handle those file without any problem. Last but not least, his work his Open Source: not bad, especially in the DFIR world, isn't it?



If you'd need my Python script you can download it from gist.



It remains unclear why Prefetch and SuperFetch files are compressed. Usually compression means space saving (IO reduction?) and computational effort, but it could mean obfuscation too: if you have any clue, I'd be happy to get it.Anyway with the excellent work made by Joachim we'll be able to understand and to handle those file without any problem. Last but not least, his work his Open Source: not bad, especially in the DFIR world, isn't it?If you'd need my Python script you can download it from hotoloti or from the





First, what aprefetch file has to say? Check the first bytes in the next figure, which shows a prefetch file for calc... sorry, now it's(sad, I'll miss you dear!). They should remind you some other "prefetch folder" related file: can't you find yourAll kidding aside, what thesignature recalls is also used by thefile format, which onexhibits the very similarsignature. A great old (gosh, 2011!) post addressing the SuperFetch file format (and so MEM[0|O] format) is "" by, a worth reading. From there, you can get that thefiles are, in the first instance,containers and that the Windows API in charge to decompress them isWhen applied to ourcase, in the end you get three bytes with the magic signature, one byte that identifies the compressionand, eventually, the presence of a: the next 4 bytes are theof the original buffer, then if checksum is in place, you'll get 4 more bytesafter [] the uncompressed size that contain the. The remaining data is what must be decompressed with. Which algorithm in used?The followings are theas they are inSo considering that thesignature usually is followed by 0x4 (or 0x84), the algorithm is