This post is also available in Japanese.

I’ve been playing a little bit with ZFS, Oracle’s (previously Sun’s) next-generation file system. Originally developed for Solaris, but since it’s open source also ported to Linux (as of 0.6.1 considered stable for production use) and Mac. While called a file system, ZFS is also a volume manager, so also takes over the job of partitioning your disk as well. Why is ZFS cool? It includes protection against data corruption, built-in support for RAID, snapshots and copy-on-write clones, and flexible and efficient ways of transferring data, e.g. for backups. To show what’s possible and push the limits somewhat, I’ll show how we get implement various features of Git, the version control system (or any version control system, for that matter) using ZFS. Of course, I’m not seriously suggesting you’d ditch a “proper” version control system, but it gives a good sense of what’s possible at the file system level.

Installing ZFS is not hard: on Mac go to the OpenZFS On OS X site and install the package. On Ubuntu Linux:

$ sudo apt-add-repository ppa:zfs-native/stable

$ sudo apt-get update

$ sudo apt-get install ubuntu-zfs

Pools and file systems

Now you’re able to create new ZFS storage pools and file systems. If you have a drive available you can use that, or, if you don’t and just want to play around a little bit, you can create one or more files to represent the disks. For instance, to create a 10G file you can use dd:

$ dd if=/dev/zero of=/tmp/disk1.img bs=1024 count=10485760

If you want to test out a RAID setup, create a second one with a different name than disk1.img. The next step is to create a storage pool, for this we’ll use zpool create If you have one or more disks available you can use their drive label (e.g. /dev/sda or /dev/sdb) or better yet: by id (/dev/disk/by-id/…), in our case we’ll use absolute paths to our regular files.

We can create various types of pools, for instance to create a mirror raid:

$ sudo zpool create mypool mirror /tmp/disk1.img /tmp/disk2.img

This will create a pool named “mypool” that mirrors across the two “devices” and mount it under /mypool (on Linux, or /Volumes/mypool on Mac). To see how much space we have available use zfs list:

$ sudo zfs list

NAME USED AVAIL REFER MOUNTPOINT

mypool 433Ki 9,78Gi 370Ki /Volumes/mypool

Alternatively, we can pool up the space from all devices and treat it as one big drive. If you created mypool already, destroy it first:

$ sudo zpool destroy mypool

Then, to create the non-mirrored pool:

$ sudo zpool create mypool /tmp/disk1.img /tmp/disk2.img

$ sudo zfs list

NAME USED AVAIL REFER MOUNTPOINT

mypool 439Ki 19,6Gi 370Ki /Volumes/mypool

Now we have a total of about 20G available.

There’s much more you can do with storage pools, like adding disks on the fly, replacing them on the fly etc. But let’s stick to this simple setup for now.

While we can now start writing files to the /Volumes/mypool or /mypool mount, this is not the recommended way of using ZFS. Instead, we will create separate file systems in the pool. For each of these file systems we can then set various properties, such as whether to enable encryption, compression or quotas. We can also take snapshots of each file system individually, or share the file systems via Samba or NFS, or transfer file system snapshots to other pools, possibly on other servers.

So… file systems are kind of the shit.

ZFS filesystems are managed using the zfs command line tool (as opposed to zpool used for pools).

$ sudo zfs create mypool/test

This will create and mount a new filesystem under /mypool/test (or /Volumes/mypool/test on Mac). Incidentally, we can mount file systems (and pools) anywhere we like by passing in the -m switch, or, even more fun: by changing the mountpoint on the fly:

$ sudo zfs set mountpoint=/test mypool/test

which remounts the filesystem under /test. To see all properties of the filesystem, use zfs get all:

$ sudo zfs get all mypool/test

NAME PROPERTY VALUE SOURCE

mypool/test type filesystem -

mypool/test creation di aug 20 14:47 2013 -

mypool/test used 442Ki -

mypool/test available 9,78Gi -

mypool/test referenced 442Ki -

mypool/test compressratio 1.00x -

mypool/test mounted yes -

mypool/test quota none default

mypool/test reservation none default

mypool/test recordsize 128Ki default

mypool/test mountpoint /test local

mypool/test checksum on default

mypool/test compression off default

mypool/test atime on default

mypool/test devices on default

mypool/test exec on default

mypool/test setuid on default

mypool/test readonly off default

mypool/test snapdir hidden default

mypool/test canmount on default

mypool/test copies 1 default

mypool/test version 5 -

mypool/test utf8only on -

mypool/test normalization formD -

mypool/test casesensitivity sensitive -

mypool/test refquota none default

mypool/test refreservation none default

mypool/test primarycache all default

mypool/test secondarycache all default

mypool/test usedbysnapshots 0 -

mypool/test usedbydataset 442Ki -

mypool/test usedbychildren 0 -

mypool/test usedbyrefreservation 0 -

mypool/test logbias latency default

mypool/test sync standard default

There’s a bunch of useful stuff here, for instance, let’s enable compression:

$ sudo zfs set compression=on mypool/test

Anything we write to this filesystem from this point onwards will be compressed.

Who needs Git?

Using ZFS as a replacement of Git for is probably not a good idea, but just to give you a sense of what ZFS supports at the file system level, let me go through a few typical git-like operations:

Creating a repository

Committing or tagging a version

Branching

Pushing and pulling changes from other storage pools, possibly on other machines

Notably missing is support for merging, which ZFS does not have direct support for as far as I’m aware.

Creating a repository

First, let’s create a filesystem for our projects, with a specific nested filesystem for our project, which we’ll call “zfsgit”. Ues, you can nest filesystems as deep as you like. And then we’ll chown the root of the filesystem to our current user so that we don’t have to sudo for creating, editing and removing files.

$ sudo zfs create mypool/projects

$ sudo zfs create mypool/projects/zfsgit

$ sudo chown $(whoami) /Volumes/mypool/projects/zfsgit

$ cd /Volumes/mypool/projects/zfsgit

Alright, we now have the equivalent of a repository, or checkout thereof.

Let’s create a file and put some content in it:

$ echo "Hello" > file.txt

“Committing” and “Tagging”

In order to create a “commit” or “tag”, i.e. something that is kept in our project’s history and you can revert to, you can use a ZFS snapshot. ZFS snapshots have to be explicitly named. Let’s create our first one “firstcommit”. We do this by adding @ and the snapshot name to our filesystem name.

$ sudo zfs snapshot mypool/projects/zfsgit@firstcommit

Now, let’s change our file slightly:

$ echo "world" >> file.txt

Let’s see what changed:

$ sudo zfs diff mypool/projects/zfsgit@firstcommit

M /Volumes/mypool/projects/zfsgit/file.txt

Sadly it won’t really get to see a textual diff, but at least it indicates which file changed. We can now create a new commit:

$ sudo zfs snapshot mypool/projects/zfsgit@secondcommit

To list our current snapshots:

$ sudo zfs list -t snapshot

NAME USED AVAIL REFER MOUNTPOINT

mypool/projects/zfsgit@firstcommit 146Ki - 370Ki -

mypool/projects/zfsgit@secondcommit 0 - 386Ki -

Now, let’s make another change:

$ echo "ladies..." >> file.txt

That was a bad idea, let’s roll back to our previous snapshot:

$ sudo zfs rollback mypool/projects/zfsgit@secondcommit

$ cat file.txt

Hello

world

And now we got our previous version back.

Branching

Functionality similar to branching can be achieved using zfs clone, which allows you to clone a filesystem based on a particular snapshot:

$ sudo zfs clone mypool/projects/zfsgit@firstcommit mypool/projects/zfsgit_branch

This creates a new copy-on-write filesystem, mounted under mypool/projects/zfsgit_branch which is a very light-weight operation because no copying is involved, and initially barely any extra diskspace is consumed.

Pushing and pulling repositories

You can send filesystems, even incrementally to other storage pools, both local and remote. To demonstrate, let’s say we created another storage pool called “mypool2” locally. We can now “push” any snapshot to our the other storage pool as follows (as root):

$ zfs send mypool/projects/zfsgit@firstcommit | zfs receive mypool2/zfsgit

You can imagine, this works just as well via SSH, for instance:

$ zfs send mypool/projects/zfsgit@firstcommit | ssh root@myserver zfs receive mypool/zfsgit

This pushes the entire filesystem as it looked at the time of the snapshot. Alternatively, if we already pushed a previous snapshot before, we can also just push the difference between the previous snapshot and the current one using the -i option:

$ zfs send -i mypool/projects/zfsgit@firstcommit mypool/projects/zfsgit@secondcommit | zfs receive mypool2/zfsgit

This is useful for incrementally backing up large file systems. Of course, this is just using Unix pipes, so we can also write the result of zfs send to a file and upload it to S3, for instance:

$ zfs send mypool/projects/zfsgit@firstcommit > backup.dump

To pull a filesystem, instead of pushing it, you’d do the reverse, over SSH that could look something like this:

ssh root@myserver zfs send mypool/zfsgit@secondcommit | zfs receive mypool/zfsgit

Should you use ZFS?

ZFS is pretty cool and pretty stable, at least on Solaris and Linux. I’m not sure of the stability on Mac at this time. Using ZFS as a root file system on Linux is still slightly problematic at this moment, but those issues will likely be resolved soon. I don’t have extensive experience with its reliability and performance myself, but the Internets has good things to say.

However, ZFS is not the only game in town. There’s also Linux’ Btrfs, which offers many similar features. However, Btrfs is newer and less mature, it may not be as stable yet. Either way, these file systems are a lot of fun to play with. To learn more about ZFS, I’d recommend reading through Oracle’s ZFS Administration Guide, which is pretty readable and much of it applies to Linux and Mac as well.