In mid-August, the first commercially available ZFS cloud replication target became available at rsync.net. Who cares, right? As the service itself states, "If you're not sure what this means, our product is Not For You."

Of course, this product is for someone—and to those would-be users, this really will matter. Fully appreciating the new rsync.net (spoiler alert: it's pretty impressive!) means first having a grasp on basic data transfer technologies. And while ZFS replication techniques are burgeoning today, appreciating them means first examining the technology that ZFS is slowly supplanting.

A love affair with rsync

Revisiting a first love of any kind makes for a romantic trip down memory lane, and that's what revisiting rsync—as in "rsync.net"—feels like for me. It's hard to write an article that's inevitably going to end up trashing the tool, because I've been wildly in love with it for more than 15 years. Andrew Tridgell (of Samba fame) first announced rsync publicly in June of 1996. He used it for three chapters of his PhD thesis three years later, about the time that I discovered and began enthusiastically using it. For what it's worth, the earliest record of my professional involvement with major open source tools—at least that I've discovered—is my activity on the rsync mailing list in the early 2000s.

Rsync is a tool for synchronizing folders and/or files from one location to another. Adhering to true Unix design philosophy, it's a simple tool to use. There is no GUI, no wizard, and you can use it for the most basic of tasks without being hindered by its interface. But somewhat rare for any tool, in my experience, rsync is also very elegant. It makes a task which is humanly intuitive seem simple despite being objectively complex. In common use, rsync looks like this:

root@test:~# rsync -ha --progress /source/folder /target/

Invoking this command ensures that once it finishes, there will be a /target/folder, and it will contain all of the same files that the original /source/folder contains. Simple, right? Since we invoked the argument -a (for archive), the sync is recursive, and the timestamps, ownership, permissions, and all other attributes of the files and folders involved remain the same on the target as they are on the source. Since we invoked -h, we'll get human-readable units (like G, M, and K rather than raw bytes, as appropriate). And --progress means we'll get a nice per-file progress bar showing how fast the transfer is going.

So far, this isn't much more than a kinda-nice version of copy. But where it gets interesting is when /target/folder already exists. In that case, rsync will compare each file in /source/folder with its counterpart in /target/folder, and it will only update the latter if the source has changed. This keeps everything in the target updated with the least amount of thrashing necessary. This is much cleaner than doing a brute-force copy of everything, changed or not!
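The skip-unchanged behavior is easy to see for yourself. Here's a minimal sketch using throwaway temp directories (none of these paths come from the examples above); the -i flag itemizes exactly what rsync does on each pass:

```shell
# Demonstrate rsync skipping unchanged files on a second pass.
# SRC and DST are throwaway temp directories, invented for illustration.
SRC=$(mktemp -d)
DST=$(mktemp -d)
mkdir -p "$SRC/folder"
echo "hello" > "$SRC/folder/hello.txt"

rsync -ha "$SRC/folder" "$DST/"    # first pass: copies folder and its contents
rsync -hai "$SRC/folder" "$DST/"   # second pass: -i itemizes changes, and it
                                   # prints nothing because nothing changed
```

Run the second command again after editing hello.txt and the itemized output comes back, flagging exactly which attributes differ.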

It gets even better when you rsync to a remote machine:

root@test:~# rsync -ha --progress /source/folder user@targetmachine:/target/

When rsyncing remotely, rsync still looks over the list of files in the source and target locations, and the tool only messes with files that have changed. It gets even better still—for each changed file, rsync computes checksums of the file's blocks on each end and exchanges those checksums to figure out which blocks actually differ. Rsync then only moves those individual blocks across the network. (Holy saved bandwidth, Batman!)
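You can watch the delta algorithm at work without a remote machine. A sketch with made-up temp paths: --no-whole-file forces delta transfer (which rsync normally skips for local destinations), and --stats reports "Literal data" (bytes actually sent) versus "Matched data" (bytes reconstructed from blocks the target already had):

```shell
# Sketch: change one byte of a large file, then re-sync and let --stats
# show how little literal data actually moves. Paths are throwaway.
SRC=$(mktemp -d)
DST=$(mktemp -d)
head -c 10485760 /dev/urandom > "$SRC/big.bin"   # 10MB of random data

rsync -a "$SRC/" "$DST/"                           # initial full copy
printf 'x' | dd of="$SRC/big.bin" bs=1 seek=0 conv=notrunc 2>/dev/null
rsync -a --no-whole-file --stats "$SRC/" "$DST/"   # "Literal data" stays tiny;
                                                   # "Matched data" covers the rest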

You can go further and further down this rabbit hole of "what can rsync do." Inline compression to save even more bandwidth? Check. A daemon on the server end to expose only certain directories or files, require authentication, only allow certain IPs access, or allow read-only access to one group but write access to another? You got it. Running "rsync" without any arguments gets you a "cheat sheet" of valid command line arguments several pages long.
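To give a flavor of that daemon mode: a hypothetical /etc/rsyncd.conf might look like the sketch below. The module name, network, user, and paths are invented for illustration, but the parameters themselves are standard rsyncd.conf options:

```ini
# /etc/rsyncd.conf -- a hypothetical config, not from any real deployment.
uid = nobody
gid = nogroup

[backups]
    path = /srv/backups
    comment = nightly backups
    read only = true
    hosts allow = 10.0.0.0/24
    auth users = backupuser
    secrets file = /etc/rsyncd.secrets
```

Clients would then reach this module as rsync://servername/backups, seeing only what the module exposes.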

To Windows-only admins whose eyes are glazing over by now: rsync is "kinda like robocopy" in the same way that you might look at a light saber and think it's "kinda like a sword."

If rsync's so great, why is ZFS replication even a thing?

This really is the million dollar question. I hate to admit it, but I'd been using ZFS myself for something like four years before I realized the answer. In order to demonstrate how effective each technology is, let's go to the numbers. I'm using rsync.net's new ZFS replication service on the target end and a Linode VM on the source end. I'm also going to be using my own open source orchestration tool syncoid to greatly simplify the otherwise-tedious process of ZFS replication.
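Under the hood, syncoid is orchestrating plain zfs send and zfs receive over SSH. Roughly, the flow looks like the sketch below—the dataset and hostname follow this article's test setup, but the snapshot names are illustrative. The key point: because ZFS's copy-on-write snapshots already record exactly which blocks changed, an incremental send has nothing to scan or checksum.

```shell
# Full replication: send a complete snapshot to a new target filesystem
zfs snapshot test/linodetest@first
zfs send test/linodetest@first | ssh root@myzfs.rsync.net zfs receive test/linodetest

# Incremental replication: send only the blocks written since @first
zfs snapshot test/linodetest@second
zfs send -i test/linodetest@first test/linodetest@second \
    | ssh root@myzfs.rsync.net zfs receive test/linodetest
```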

First test: what if we copy 1GB of raw data from Linode to rsync.net? First, let's try it with the old tried and true rsync:

root@rsyncnettest:~# time rsync -ha --progress /test/linodetest/ root@myzfs.rsync.net:/mnt/test/linodetest/
sending incremental file list
./
1G.bin
          1.07G 100%    6.60MB/s    0:02:35 (xfr#1, to-chk=0/2)

real    2m36.636s
user    0m22.744s
sys     0m3.616s

And now, with ZFS send/receive, as orchestrated by syncoid:

root@rsyncnettest:~# time syncoid --compress=none test/linodetest root@myzfs.rsync.net:test/linodetest
INFO: Sending oldest full snapshot test/linodetest@1G-clean (~ 1.0 GB) to new target filesystem:
1GB 0:02:32 [6.54MB/s] [=================================================>] 100%
INFO: Updating new target filesystem with incremental test/linodetest@1G-clean ... syncoid_rsyncnettest_2015-09-18:17:15:53 (~ 4 KB):
1.52kB 0:00:00 [67.1kB/s] [===================>                              ] 38%

real    2m36.685s
user    0m0.244s
sys     0m2.548s

Time-wise, there's really not much to look at. Either way, we transfer 1GB of data in two minutes, 36 seconds and change. It is a little interesting to note that rsync ate up 26 seconds of CPU time while ZFS replication used less than three seconds, but still, this race is kind of a snoozefest.

So let's make things more interesting. Now that we have our 1GB of data actually there, what happens if we change it just enough to force a re-synchronization? In order to do so, we'll touch the file, which doesn't do anything but change its timestamp to the current time.

Just like before, we'll start out with rsync:

root@rsyncnettest:/test# touch /test/linodetest/1G.bin
root@rsyncnettest:/test# time rsync -ha --progress /test/linodetest/ root@myzfs.rsync.net:/mnt/test/linodetest
sending incremental file list
1G.bin
          1.07G 100%  160.47MB/s    0:00:06 (xfr#1, to-chk=0/2)

real    0m13.248s
user    0m6.100s
sys     0m0.296s

And now let's try ZFS:

root@rsyncnettest:/test# touch /test/linodetest/1G.bin
root@rsyncnettest:/test# time syncoid --compress=none test/linodetest root@myzfs.rsync.net:test/linodetest
INFO: Sending incremental test/linodetest@syncoid_rsyncnettest_2015-09-18:16:07:06 ... syncoid_rsyncnettest_2015-09-18:16:07:10 (~ 4 KB):
6.73kB 0:00:00 [ 277kB/s] [==================================================] 149%

real    0m1.740s
user    0m0.068s
sys     0m0.076s

Now things start to get real. Rsync needed 13 seconds to get the job done, while ZFS needed less than two. This problem scales, too. For a touched 8GB file, rsync will take 111.9 seconds to re-synchronize, while ZFS still needs only 1.7.

Touching is not even the worst-case scenario. What if, instead, we move a file from one place to another—or even just rename the folder it's in? For this test, we have synchronized folders containing 8GB of data in /test/linodetest/1. Once we've got that done, we rename /test/linodetest/1 to /test/linodetest/2 and resynchronize. Rsync is up first:

root@rsyncnettest:/test# mv /test/linodetest/1 /test/linodetest/2
root@rsyncnettest:/test# time rsync -ha --progress --delete /test/linodetest/ root@myzfs.rsync.net:/mnt/test/linodetest/
sending incremental file list
deleting 1/8G.bin
deleting 1/
./
2/
2/8G.bin
          8.59G 100%    5.56MB/s    0:24:34 (xfr#1, to-chk=0/3)

real    24m39.267s
user    3m15.944s
sys     0m30.056s

Ouch. What's essentially a subtle change requires nearly half an hour of real time and nearly four minutes of CPU time. But with ZFS...

root@rsyncnettest:/test# mv /test/linodetest/1 /test/linodetest/2
root@rsyncnettest:/test# time syncoid --compress=none test/linodetest root@myzfs.rsync.net:test/linodetest
INFO: Sending incremental test/linodetest@syncoid_rsyncnettest_2015-09-18:16:17:29 ... syncoid_rsyncnettest_2015-09-18:16:19:06 (~ 4 KB):
9.41kB 0:00:00 [ 407kB/s] [==================================================] 209%

real    0m1.707s
user    0m0.072s
sys     0m0.024s

Yep—it took the same old 1.7 seconds for ZFS to re-sync, no matter whether we touched a 1GB file, touched an 8GB file, or even moved an 8GB file from one place to another. In the last test, that's almost three full orders of magnitude faster than rsync: 1.7 seconds versus 1,479.3 seconds. Poor rsync never stood a chance.
