Application Description

I have an offline data processing tool. This tool loads hundreds of thousands of files. For each one it performs some calculations and when done writes a single index file. It is all C++ (all IO is via standard library objects/functions), and is being compiled with Visual Studio 2013 targeting amd64.

Performance

My test dataset has 115,757 files that need to be processed. Files total 731MB in size, and the Median file size is 6KB.

First run: 12 seconds

Second run: ~18 minutes

Thats 90x slower! The second run is extrapolated from one minute of run time. All runs after that, as i've experienced thus far, are equally slow.

Surprise!

If I rename the folder with the files in it, and then rename it back to what it originally was, the next time I run the application it will again perform quickly!

Its the same app, machine, and source data. The only difference is that one folder was temporarily renamed.

So far I can reproduce this 100% of the time.

Profiling

Naturally the next step was to profile. I profiled the quick run and the slow run and compared the hot spots. In the slow version about 86% of the application was spent in a function called NtfsFindPrefix . The quick version spends about 0.4% of its time here. This is the call stack:

Ntfs.sys!NtfsFindPrefix<itself> Ntfs.sys!NtfsFindPrefix Ntfs.sys!NtfsFindStartingNode Ntfs.sys!NtfsCommonCreate Ntfs.sys!NtfsCommonCreateCallout ntoskrnl.exe!KySwitchKernelStackCallout ntoskrnl.exe!KiSwitchKernelStackContinue ntoskrnl.exe!KeExpandKernelStackAndCalloutEx Ntfs.sys!NtfsCommonCreateOnNewStack Ntfs.sys!NtfsFsdCreate fltmgr.sys!FltpLegacyProcessingAfterPreCallbacksCompleted fltmgr.sys!FltpCreate ntoskrnl.exe!IopParseDevice ntoskrnl.exe!ObpLookupObjectName ntoskrnl.exe!ObOpenObjectByName ntoskrnl.exe!NtQueryAttributesFile ntoskrnl.exe!KiSystemServiceCopyEnd ntdll.dll!NtQueryAttributesFile KernelBase.dll!GetFileAttributesW DataGenerator.exe!boost::filesystem::detail::status

The boost call in question is an exists call. It will test for the zipped version of a file, fail to find it, and then test for the unzipped one and find it.

Profiling also showed that the disk did not get hit by either runs of the application, however File IO was expectedly high. I believe this indicates that the files were already paged to memory.

File IO also showed that the duration of file "Create" events were on average MUCH higher in the slow version. 26 us vs 11704 us.

Machine

Samsung SSD 830 Series

Intel i7 860

Windows 7 64 bit

NTFS file system.

32GB Ram

Summary

On the second run the calls into NtfsFindPrefix take much longer.

take much longer. This is a function in the NTFS driver.

The disk didn't get hit in either profiles, the Files were served from pages in memory.

A rename operation seems to be enough to stop this issue occurring on the next run.

Question

Now that the background info is out of the way, Does anyone recognise what is going on and know how to fix it?

It seems like I could work around it by renaming the folder myself, but that seems...dirty. plus im not sure why that even works.

Is the rename invalidating the pages in memory and causing them to get updated before the next run? Is this a bug in the NTFS driver?

Thanks for reading!

Update!!

After some more profiling it looks like the part that is performing slower is testing to see if the non-existent zipped file exists. If I remove this test everything seems to get quicker again.

I have also managed to reproduce this issue in a small C++ app for everyone too see. Note that The sample code will create 100k 6KB files on your machine in the current directory. Can anyone else repro it?

// using VS tr2 could replace with boost::filesystem #include <filesystem> namespace fs = std::tr2::sys; //namespace fs = boost::filesystem; #include <iostream> #include <string> #include <chrono> #include <fstream> void createFiles( fs::path outDir ) { // create 100k 6KB files with junk data in them. It doesn't matter that they are all the same. fs::create_directory( outDir ); char buf[6144]; for( int i = 0; i < 100000; ++i ) { std::ofstream fout( outDir / fs::path( std::to_string( i ) ), std::ios::binary ); fout.write( buf, 6144 ); } fs::rename( outDir, fs::path( outDir.string() + "_tmp" ) ); fs::rename( fs::path( outDir.string() + "_tmp" ), outDir ); } int main( int argc, const char* argv[] ) { fs::path outDir = "out"; if( !fs::exists( outDir ) ) createFiles( outDir ); auto start = std::chrono::high_resolution_clock::now(); int counter = 0; for( fs::recursive_directory_iterator i( outDir ), iEnd; i != iEnd; ++i ) { // test the non existent one, then the other if( !fs::exists( fs::path( i->path().string() + "z" ) ) && fs::exists( i->path() ) ) counter += 1; if( counter % 100 == 0 ) std::cout << counter << std::endl; } std::cout << counter << std::endl; auto end = std::chrono::high_resolution_clock::now(); std::chrono::duration< double, std::milli > s( end - start ); std::cout << "Time Passed: " << s.count() << "ms" << std::endl; return 0; }

Update 2

I have logged an issue with MS here. Hopefully they can help shed some light on the issue.