This post is based on material from Docker in Practice, available on Manning’s Early Access Program. Get 39% off with the code: 39miell

Storage Drivers?

In case you didn’t know, Docker has various options for how it stores its data. Originally it used AUFS (a layered filesystem), but this was not beloved by all, so as the likes of Red Hat got interested, alternatives appeared. There are now various options, including Devicemapper, VFS and Overlay(FS).

Here’s a deck from a great talk on the subject by Jérôme Petazzoni.

OK, So What?

Docker is sexy, this is not.

But it’s going to be important to think about this if Docker is to be used in production. The selling point of Docker (and XaaSes in general) is more efficient use of resources. A bad decision on storage drivers (or no decision) could cost you in compute resources, or operational cost.

I’ve put together this high-level, incomplete, and probably wrong view of storage drivers here, as I couldn’t find such a table anywhere else. I’d welcome corrections and improvements, and hope to update as I go.

Driver        High-density?  Big files?  Encryption?  SELinux?  Space limits?  Page cache sharing?
AUFS          Y              N           N(?)         Y         N              Y
DeviceMapper  Y              Y           N(?)         Y         Y              N
BTRFS         Y              Y           N(?)         N         N              N
OverlayFS     Y              N/A(?)      N(?)         N         N(?)           Y
VFS           N              N/A(?)      Y(?)         Y         N              N

Key:

High-density: is it designed to have lots of containers on the same disk (i.e. copy-on-write)?

Big Files: does it handle big files gracefully (i.e. block-level rather than file-level)?

Encryption: does it support encryption of the files?

SELinux: is there SELinux support?

Space Limits: will the container hit space limits (before standard FS limits are hit)?

Page Cache Sharing: can the OS share page caches between different containers?
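If you want to query the table rather than squint at it, here is a minimal sketch that transcribes it into Python. The names (FEATURES, TABLE, drivers_with) are mine and purely illustrative, and the cells carry over the same question marks as the table, so treat the answers as equally tentative.

```python
# Transcription of the table in this post. A trailing "?" marks a cell the
# post itself is unsure about. All names here are illustrative.
FEATURES = [
    "high_density",
    "big_files",
    "encryption",
    "selinux",
    "space_limits",
    "page_cache_sharing",
]

TABLE = {
    "AUFS":         ["Y", "N",    "N?", "Y", "N",  "Y"],
    "DeviceMapper": ["Y", "Y",    "N?", "Y", "Y",  "N"],
    "BTRFS":        ["Y", "Y",    "N?", "N", "N",  "N"],
    "OverlayFS":    ["Y", "N/A?", "N?", "N", "N?", "Y"],
    "VFS":          ["N", "N/A?", "Y?", "Y", "N",  "N"],
}


def drivers_with(*wanted, include_uncertain=False):
    """Return the drivers marked 'Y' for every feature in `wanted`.

    With include_uncertain=True, a tentative 'Y?' also counts as a yes.
    """
    accepted = {"Y", "Y?"} if include_uncertain else {"Y"}
    matches = []
    for driver, row in TABLE.items():
        cells = dict(zip(FEATURES, row))
        if all(cells[feature] in accepted for feature in wanted):
            matches.append(driver)
    return matches


if __name__ == "__main__":
    print(drivers_with("high_density", "page_cache_sharing"))
    print(drivers_with("big_files"))
```

Asking for high density plus page cache sharing narrows the field to AUFS and OverlayFS, which is the trade-off discussed below.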

Discussion

Page Cache Sharing

As someone who works for a corp with the capacity to run a private Docker environment, the column I find most interesting is the “page cache sharing” one. If you’re running hundreds of thousands of containers over your estate and you have a limited number of blessed images to work from, then the savings in memory from sharing page caches across containers will be compelling.
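To put a rough number on that, here is a back-of-the-envelope sketch. It assumes every container on a host reads the same set of image files, and that without sharing each container ends up with its own cached copy. The function name and the figures (100 containers per host, 200 MB of hot image pages) are made up for illustration, not measurements.

```python
def image_page_cache_mb(containers, hot_mb_per_image, images=1, sharing=True):
    """Estimate page cache used by image files on one host, in MB.

    Assumption: with sharing, each image's hot pages are cached once per
    host; without it, once per container.
    """
    if sharing:
        return images * hot_mb_per_image
    return containers * hot_mb_per_image


if __name__ == "__main__":
    # Hypothetical host: 100 containers from one blessed image.
    without = image_page_cache_mb(100, 200, sharing=False)
    shared = image_page_cache_mb(100, 200, sharing=True)
    print(without, shared)  # 20000 200
```

The real saving depends on how much of each image is actually read, so take this as an upper bound on what sharing could buy you.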

Big Files

I’ve experienced first hand the pain of having a system that copies large files on write. If you have a monolithic database running within a container (I’m talking several Gig), then it’s painful to wait for the copy of a single massive data file to update one row while your container is running.
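The difference between file-level and block-level copying is easy to sketch. The 64 KiB block size and the 200-byte row below are assumptions chosen for illustration, and the block-level figure is a best case where the change is contiguous and block-aligned.

```python
import math


def bytes_copied_file_level(file_size, bytes_changed):
    """File-level copy-on-write: the first write copies the whole file."""
    return file_size if bytes_changed > 0 else 0


def bytes_copied_block_level(bytes_changed, block_size=64 * 1024):
    """Block-level copy-on-write: only the touched blocks are copied.

    Best case: assumes the change is contiguous and block-aligned.
    The default block size is an assumption, not a driver setting.
    """
    if bytes_changed <= 0:
        return 0
    return math.ceil(bytes_changed / block_size) * block_size


if __name__ == "__main__":
    four_gib = 4 * 1024 ** 3
    print(bytes_copied_file_level(four_gib, 200))  # 4294967296
    print(bytes_copied_block_level(200))           # 65536
```

Under these assumptions, updating one row in a 4 GiB data file copies the full 4 GiB at file level, against a single 64 KiB block at block level.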

VFS

As VFS is “copy-on-copy” (it takes a full copy of the filesystem for each container rather than using copy-on-write), it may be useful if you are OK taking the filesystem hit when starting up your containers, and don’t care about disk space. In return, you get (presumably) near-native performance. I’ve not used this.
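The disk space cost of that trade can be sketched as follows. It assumes a full-copy driver stores the whole image once per container, while a copy-on-write driver stores it once plus each container’s own writes. The function name and the numbers are hypothetical.

```python
def disk_usage_mb(containers, image_mb, writes_mb_per_container, copy_on_write=True):
    """Rough disk usage in MB for containers started from one image.

    Assumption: copy-on-write keeps one shared image plus per-container
    changes; a full-copy driver duplicates the image for every container.
    """
    if copy_on_write:
        return image_mb + containers * writes_mb_per_container
    return containers * (image_mb + writes_mb_per_container)


if __name__ == "__main__":
    # Hypothetical: 50 containers, 1000 MB image, 100 MB written by each.
    print(disk_usage_mb(50, 1000, 100, copy_on_write=True))   # 6000
    print(disk_usage_mb(50, 1000, 100, copy_on_write=False))  # 55000
```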

Space Limits

By default, Devicemapper has a 10G limit for containers. It’s surprisingly difficult to resize this out of the box, so it can get operationally annoying if you’ve not seen this before.
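A cheap way to avoid being surprised is to check which driver a host is using before you deploy to it. The sketch below reads the “Storage Driver:” line that docker info prints and flags Devicemapper. The function names and the sample text are my own, hand-written for illustration rather than captured from a real host.

```python
def storage_driver(docker_info_text):
    """Return the value of the 'Storage Driver:' line, or None if absent."""
    for line in docker_info_text.splitlines():
        key, sep, value = line.strip().partition(":")
        if sep and key.strip().lower() == "storage driver":
            return value.strip()
    return None


def space_limit_warning(driver):
    """Return a warning string for drivers with a per-container limit."""
    if driver is not None and driver.lower() == "devicemapper":
        return "devicemapper: containers may hit the default 10G limit"
    return None


if __name__ == "__main__":
    # Hand-written, trimmed sample of `docker info` output (illustrative only).
    sample = "Containers: 3\nImages: 12\nStorage Driver: devicemapper\n"
    print(space_limit_warning(storage_driver(sample)))
```

In practice you would feed it the real output, for example by capturing docker info with subprocess on each host.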

Maturity

The area of storage drivers is still not mature within Docker. While overlay(FS) looks promising (and is reputedly dogfooded at Docker itself), it may not be the last word, and it may not be supported everywhere.

Feedback Wanted

Please send me feedback via Twitter (@ianmiell), or if you want to mail me privately, go via LinkedIn (Ian Miell).
