BuildGrid is Cloud 66's cluster of servers that we use to build our customers' Docker images from their code. These servers run on AWS EC2 and use mounted EBS volumes as their data store.

Some of the most disruptive errors we saw in BuildGrid were related to the storage driver:

Driver devicemapper failed to remove root filesystem ... : Device is Busy

Cannot destroy container: Driver aufs failed to remove root filesystem

To fix this, we tried almost everything suggested in various communities, without much luck.

The last suggestion was to use OverlayFS as the storage driver. But OverlayFS was not in the upstream Linux kernel, and using it meant losing all automatic updates and security patches for the kernel, which does not seem right for a production environment.

After a long wait, OverlayFS finally landed in the upstream Linux kernel, so we started switching the BuildGrid servers' backend storage to enjoy its long list of benefits.

The instructions for using OverlayFS are simple enough:

Upgrade to Linux kernel 3.18

Run the Docker daemon with the '-s overlay' option
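Before switching drivers it can help to confirm that the running kernel is new enough. The sketch below is an illustration rather than part of our setup: kernel_has_overlayfs is a hypothetical helper that only compares the version string from uname -r against 3.18. It does not check whether the overlay module is actually built or loaded.

```shell
# Hypothetical helper: succeed if a kernel release string such as
# "3.18.1-generic" is 3.18 or newer (the first upstream kernel with OverlayFS).
kernel_has_overlayfs() {
    release=$1
    major=${release%%.*}          # text before the first dot
    rest=${release#*.}            # text after the first dot
    minor=${rest%%[!0-9]*}        # leading digits of the remainder
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "${minor:-0}" -ge 18 ]; }
}

if kernel_has_overlayfs "$(uname -r)"; then
    echo "kernel $(uname -r) should include OverlayFS"
else
    echo "kernel $(uname -r) is older than 3.18; upgrade first"
fi
```

Once the daemon is running with '-s overlay', docker info should report the driver on its "Storage Driver" line.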

The first week went as well as expected: OverlayFS was fast and we had no more weird "Cannot destroy container" errors.

But it was not going to be as easy as that!

I started the second week of using OverlayFS by investigating errors raised over the weekend, complaining about disk space on one of the BuildGrid servers.

The strange thing was that I could see lots of free space:

$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1      197G   25G  164G  14% /

After some investigation I found out that we were running out of inodes on the server:

$ df -i
Filesystem       Inodes    IUsed IFree IUse% Mounted on
/dev/xvda1     13107200 13107200     0  100% /

That means more than 13 million files had been created on the filesystem. But where were they?
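Since df -h looked healthy the whole time, a check on inode usage would have caught this earlier. As a rough sketch (inodes_over_threshold is a made-up helper name and the 90% threshold is an arbitrary assumption), the output of df -i can be filtered like this:

```shell
# Hypothetical helper: read `df -i` style output on stdin and print
# "<mount point> <IUse%>" for filesystems at or above the given percentage.
inodes_over_threshold() {
    awk -v limit="$1" '
        NR > 1 {
            pct = $5
            sub(/%/, "", pct)              # "100%" -> "100"
            # skip pseudo filesystems that report "-" instead of a percentage
            if (pct ~ /^[0-9]+$/ && pct + 0 >= limit + 0)
                print $6, $5
        }'
}

# Feed it the df -i output shown above:
printf '%s\n' \
    'Filesystem Inodes IUsed IFree IUse% Mounted on' \
    '/dev/xvda1 13107200 13107200 0 100% /' |
    inodes_over_threshold 90
```

On a live server the same filter would be used as df -i | inodes_over_threshold 90. The parsing assumes device names and mount points without spaces.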

The next hour was spent on the painful process of finding the source of those 13 million files.

I went through folders and narrowed down the search by running scripts like this one:

$ for i in /*; do echo "$i"; find "$i" | wc -l; done

You can replace /* in the script above with another glob, such as /var/*, to count the files under that path.
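The same idea can be wrapped into a small function that sorts the result, so the biggest consumer shows up first. This is a sketch, not the script I ran at the time: count_files_per_dir is a hypothetical name, and counting the entries reported by find only approximates inode usage, because hard-linked files are counted once per name.

```shell
# Hypothetical helper: print "<entries> <path>" for everything directly
# under the given directory (default: /), largest first.
count_files_per_dir() {
    root=${1:-/}
    root=${root%/}                       # "/" becomes "", so "$root"/* is /*
    for entry in "$root"/*; do
        [ -e "$entry" ] || continue      # skip the unexpanded glob in an empty dir
        # -xdev stops find from descending into other filesystems mounted below $entry
        count=$(find "$entry" -xdev 2>/dev/null | wc -l)
        printf '%s %s\n' "$((count))" "$entry"
    done | sort -rn
}

# Example (may take a while on a large tree):
#   count_files_per_dir /var/lib/docker
```

Running it first on / and then on the largest directory it reports narrows things down in a few rounds.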

As expected, the culprit was OverlayFS. OverlayFS implements a union mount: the host server ends up mounting file systems for every image and container, which means millions of small files and, eventually, a shortage of inodes.

Just check the /var/lib/docker/overlay path and you will see folders for each image and its mapped filesystems.

The bad news about inodes is that the maximum number can only be set when the filesystem is created, so I ended up creating and configuring a new volume for OverlayFS storage.

You can set the number of inodes at filesystem creation time with mkfs.
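To pick a value for the inode count, it helps to see where the old limit came from. The 13107200 inodes reported by df -i are what you get for a 200 GiB volume at one inode per 16 KiB, which to my knowledge is the usual mke2fs default ratio for ext4. The arithmetic below is only a sketch: inodes_for is a hypothetical helper, the 200 GiB size is inferred from the df output, and the 4096 bytes-per-inode alternative is an example value, not the one used on BuildGrid.

```shell
# Hypothetical helper: approximate inode count for a volume of <size_gib> GiB
# when the filesystem allocates one inode per <bytes_per_inode> bytes.
inodes_for() {
    size_gib=$1
    bytes_per_inode=$2
    echo $(( size_gib * 1024 * 1024 * 1024 / bytes_per_inode ))
}

inodes_for 200 16384   # prints 13107200, the limit we hit
inodes_for 200 4096    # prints 52428800, four times as many
```

The result can be passed to mkfs with -N. mke2fs also accepts the ratio directly with -i <bytes-per-inode>, and it may round the final count slightly.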

It seems our servers' issues were fixed by the new inode configuration. Let's see what the next surprise from the container file system zoo will be!

To sum up, these are the steps you need to set up a Docker server with OverlayFS as backend storage on AWS/EC2:

1. Fire up an 'ubuntu 14.04' server on AWS/EC2

2. Upgrade kernel to 3.18

3. Install Docker on server

4. Create an EBS volume

5. Attach new volume to created server

6. Create a file system on volume:

6.1. Connect to your instance and run lsblk to find available disk devices and their mount points.

6.2. From the output of lsblk, choose the device that is attached but not yet mounted (e.g. xvdf). devicename is the selected device prefixed with /dev/ (e.g. /dev/xvdf).

6.3. Create a filesystem on the volume (replace number_of_inodes and devicename with your values):

$ sudo mkfs -t ext4 -N <number_of_inodes> <devicename>

7. Create a mount point for the storage. Docker uses the /var/lib/docker/overlay path to store OverlayFS data and I couldn't find any way to configure it, so I'm going to use that folder as the mount point:

$ sudo mkdir -p /var/lib/docker/overlay

8. Mount the volume (replace devicename with your value):

$ sudo mount <devicename> /var/lib/docker/overlay

9. Make the volume available after every system reboot.

9.1. Open /etc/fstab with sufficient permissions (e.g. with sudo).

9.2. Add a record in the format below to the end of the file:

device_name mount_point file_system_type fs_mntops fs_freq fs_passno

e.g.

<devicename> /var/lib/docker/overlay ext4 defaults,nofail 0 2
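Editing /etc/fstab by hand is easy to get wrong, so here is a hedged sketch of the same step as a function. add_fstab_entry is a hypothetical helper, not a standard tool. It takes the fstab path as a parameter so it can be tried on a copy first, and it only appends the record when the mount point is not already listed. The /dev/xvdf device is just the example name from step 6.2.

```shell
# Hypothetical helper: append "<device> <mount_point> ext4 defaults,nofail 0 2"
# to <fstab_file> unless a non-comment record for the mount point already exists.
add_fstab_entry() {
    fstab_file=$1
    device=$2
    mount_point=$3
    if awk -v mp="$mount_point" \
        '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' "$fstab_file"
    then
        echo "already present: $mount_point" >&2
        return 0
    fi
    printf '%s %s ext4 defaults,nofail 0 2\n' "$device" "$mount_point" >> "$fstab_file"
}

# Try it on a scratch file rather than the real /etc/fstab.
scratch=$(mktemp)
add_fstab_entry "$scratch" /dev/xvdf /var/lib/docker/overlay
add_fstab_entry "$scratch" /dev/xvdf /var/lib/docker/overlay   # no second record
cat "$scratch"
rm -f "$scratch"
```

After changing the real file, sudo mount -a mounts everything listed in fstab, which surfaces a malformed record before the next reboot does.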