Benchmark: Deep directory structure vs. flat directory structure to store millions of files on ext4

hartator, Dec 22, 2018

It seems to be common knowledge that you should use a deep (also called tree) directory structure (e.g., files/00/01/123.data) instead of a flat directory (e.g., files/123.data) when you want to store millions of files. That might have been true for older filesystems like ext3, but is it still true for more modern ones like ext4?

Let’s verify that.

We’ll use Ruby to generate and benchmark both storage strategies. First, we need a way to generate fake files. We want 10 million of them. To do that, we’ll use a hash with random MD5 keys (the file names) and random MD5 values (the file contents). This way, we make sure each file holds unique content that the system can’t simply cache once and reuse:

require 'digest'

# Build 10 million random key/value pairs (file name => file content).
hash = {}
10_000_000.times do
  key = Digest::MD5.hexdigest(rand.to_s)
  value = Digest::MD5.hexdigest(rand.to_s)
  hash[key] = value
end

We then need some code to write and then read these files using a flat directory storage strategy:

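require 'benchmark'

# Write every file into a single directory (./dir_flat must already exist).
puts Benchmark.measure {
  hash.each do |key, value|
    File.write "./dir_flat/#{key}", value
  end
}

# Read every file back.
puts Benchmark.measure {
  hash.each do |key, value|
    File.read "./dir_flat/#{key}"
  end
}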
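Before committing to a 10-million-file run, it can help to try the same write-then-read loop at a much smaller scale. The following is a minimal sketch, not the script used for the results in this post: `flat_benchmark` is a hypothetical helper, and it writes into a throwaway directory from `Dir.mktmpdir` rather than `./dir_flat`.

```ruby
require 'benchmark'
require 'digest'
require 'tmpdir'

# Hypothetical helper: scaled-down flat benchmark with n files.
# Returns the write and read timings plus the number of files created.
def flat_benchmark(n)
  files = {}
  n.times do
    files[Digest::MD5.hexdigest(rand.to_s)] = Digest::MD5.hexdigest(rand.to_s)
  end

  # The temporary directory is removed when the block exits.
  Dir.mktmpdir('dir_flat') do |dir|
    write = Benchmark.measure do
      files.each { |key, value| File.write(File.join(dir, key), value) }
    end
    read = Benchmark.measure do
      files.each_key { |key| File.read(File.join(dir, key)) }
    end
    { write: write, read: read, count: Dir.children(dir).size }
  end
end

result = flat_benchmark(1_000)
puts "files: #{result[:count]}"
puts "write: #{result[:write].real.round(4)}s, read: #{result[:read].real.round(4)}s"
```

Timings at this scale mostly measure the page cache and Ruby overhead, so they only confirm that the loop works; they say little about how a directory with millions of entries behaves.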

And some code to write and then read these files using a deep directory storage strategy. We chose two directory levels, each named with two hexadecimal characters. That gives 256 × 256 = 65,536 leaf directories, so 10,000,000 files should average about 152.6 files per leaf directory (10,000,000 / 65,536):

require 'fileutils'

# Write every file, creating its two-level directory on the fly.
puts Benchmark.measure {
  hash.each do |key, value|
    dir_path = "./dir_deep/#{key[0..1]}/#{key[2..3]}/"
    FileUtils.mkdir_p dir_path
    File.write dir_path + key, value
  end
}

# Read every file back.
puts Benchmark.measure {
  hash.each do |key, value|
    dir_path = "./dir_deep/#{key[0..1]}/#{key[2..3]}/"
    File.read dir_path + key
  end
}
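The two-level, two-character layout used here is one point in a larger design space. As a rough sketch, the path computation and the expected number of files per leaf directory can be parameterized. Both `shard_dir` and `files_per_leaf` are hypothetical helpers, and the average assumes keys are uniformly distributed hexadecimal strings (which MD5 digests approximate).

```ruby
# Hypothetical helper: directory for `key`, using `levels` directory levels
# of `width` hexadecimal characters each.
def shard_dir(base, key, levels: 2, width: 2)
  parts = (0...levels).map { |i| key[i * width, width] }
  File.join(base, *parts)
end

# Average number of files per leaf directory, assuming uniformly
# distributed hex keys: total files / 16^(levels * width).
def files_per_leaf(total_files, levels: 2, width: 2)
  total_files.to_f / (16**(levels * width))
end

key = 'd41d8cd98f00b204e9800998ecf8427e' # example key
puts shard_dir('./dir_deep', key)        # prints ./dir_deep/d4/1d
puts files_per_leaf(10_000_000).round(2) # prints 152.59
```

Adding a level or widening each level multiplies the number of leaf directories by a power of 16, so the per-directory file count drops quickly while the number of directories (and inodes) grows.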

Write performance is probably affected by creating directories dynamically. Let’s pre-create the directory structure instead:

hash.each_key do |key|
  dir_path = "./dir_deep/#{key[0..1]}/#{key[2..3]}/"
  FileUtils.mkdir_p dir_path
end

puts Benchmark.measure {
  hash.each do |key, value|
    dir_path = "./dir_deep/#{key[0..1]}/#{key[2..3]}/"
    File.write dir_path + key, value
  end
}

puts Benchmark.measure {
  hash.each do |key, value|
    dir_path = "./dir_deep/#{key[0..1]}/#{key[2..3]}/"
    File.read dir_path + key
  end
}
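Pre-creating the tree does not have to go through all 10 million keys: the set of possible leaf directories is known in advance (256 × 256 for this layout). One possible alternative, sketched under the assumption that directory names are lowercase hexadecimal (as in `Digest::MD5.hexdigest` output), is to enumerate them directly. `precreate_tree` is a hypothetical helper, not part of the original benchmark.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: create every possible leaf directory for `levels`
# levels of `width` lowercase hex characters each.
# Returns the number of leaf directories.
def precreate_tree(base, levels: 2, width: 2)
  # All names for one level, e.g. "00".."ff" when width is 2.
  names = (0...16**width).map { |n| n.to_s(16).rjust(width, '0') }
  # Cartesian product of the names across all levels.
  leaves = names.product(*Array.new(levels - 1) { names })
  leaves.each { |parts| FileUtils.mkdir_p(File.join(base, *parts)) }
  leaves.size
end

# Small demonstration in a temporary directory: 16 * 16 leaf directories.
Dir.mktmpdir do |dir|
  puts precreate_tree(dir, levels: 2, width: 1) # prints 256
end
```

With the defaults, `precreate_tree('./dir_deep')` would create the 65,536 leaf directories of the layout above, including any that happen to receive no files.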

Here are the final benchmark results:

Results from a Vultr 16-core VM (400GB SSD)

Writes are 44% faster with a flat directory structure than with a deep/tree directory structure. Reads are even 7.8x faster.

In conclusion, just use a flat directory structure. It’s easier to use. Faster for writes. Much faster for reads. It saves on inodes. And it doesn’t need the branch folders to be pre-created or dynamically generated.

References: source — raw results

[Edit] After publishing this article, I found out that the ext4 limit is around 10,118,651 (or ~10,233,706) files per directory for MD5-length filenames.

I was trying to run the above benchmark with 20 million files, but I was getting an Errno::ENOSPC: No space left on device @ rb_sysopen error in Ruby. That was weird because both disk space and inodes were fine.
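Because `Errno::ENOSPC` did not mean a full disk in this case, it can be useful to attach a hint where the error surfaces. This is a minimal sketch, and `with_enospc_hint` is a hypothetical helper name rather than anything from the benchmark code.

```ruby
# Hypothetical helper: run the block and, if it raises ENOSPC, print a hint
# about the less obvious causes before re-raising the original error.
def with_enospc_hint(io = $stderr)
  yield
rescue Errno::ENOSPC
  io.puts 'ENOSPC: if `df -h` and `df -i` look fine, ' \
          'check `dmesg` for ext4 "Directory index full" warnings'
  raise
end
```

In the flat benchmark it would wrap the write, for example `with_enospc_hint { File.write "./dir_flat/#{key}", value }`. The error is still raised, so the run stops exactly as before.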

In the dmesg log, I actually had “Directory index full” errors:

ext4_dx_add_entry:2235: inode #258713: comm pry: Directory index full

[1718.956797] EXT4-fs warning (device vda1): ext4_dx_add_entry:2184: Directory (ino: 384830) index full, reach max htree level :2

[1718.956798] EXT4-fs warning (device vda1): ext4_dx_add_entry:2188: Large directory feature is not enabled on this filesystem

[10788.316073] EXT4-fs warning (device vda1): ext4_dx_add_entry:2184: Directory (ino: 384830) index full, reach max htree level :2

[10788.316075] EXT4-fs warning (device vda1): ext4_dx_add_entry:2188: Large directory feature is not enabled on this filesystem

The capacity of a directory index in ext4 depends on both filename length and the number of files, so this limit may vary on your system.
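The kernel warning says the large directory feature is not enabled on this filesystem. As a rough sketch, the enabled features can be read from the `Filesystem features:` line of `tune2fs -l`. The helper name `large_dir_enabled?` is hypothetical, and the sample text below is illustrative, not output from the benchmark VM.

```ruby
# Hypothetical helper: true if the given `tune2fs -l` output lists the
# ext4 large_dir feature on its "Filesystem features:" line.
def large_dir_enabled?(tune2fs_output)
  line = tune2fs_output.lines.find { |l| l.start_with?('Filesystem features:') }
  return false unless line

  line.split(':', 2).last.split.include?('large_dir')
end

# Illustrative sample only.
sample = <<~OUT
  Filesystem volume name:   <none>
  Filesystem features:      has_journal ext_attr resize_inode dir_index filetype extent
OUT

puts large_dir_enabled?(sample) # prints false

# On a real system (usually needs root; vda1 is the device named in the
# dmesg lines):
#   large_dir_enabled?(`tune2fs -l /dev/vda1`)
```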

Following some commenters’ advice, I redid the 10M benchmark with real JSON files. It yields similar results, though:

Results from a Vultr 16-core VM (400GB SSD)

Reads are still 2x faster and writes are still faster by 20%.

References: source v2 — raw results v2 — tune2fs -l output

Dmke also made an awesome fork of the original code. He added benchmarks comparing how different directory depths perform against each other:

Results from Dmke script

Conclusion 2: stick to common wisdom and use a deep directory structure. However, be wary of the performance cost of too many directory levels.