Many of my colleagues at Atomic Object use Time Machine to back up their laptops. By default, Time Machine makes incremental backups hourly, but only when the external backup disk is attached. As a result, this default Time Machine backup system has two limitations:

1. It is not automatic: the user must attach the external hard drive (assuming the laptop travels frequently).
2. The gap between backups can be relatively large.

For most people, these limitations are minor. For example, the gap between backups can be mitigated by the diligent use of git or some other version control system. However, I am not a typical developer: I’m an absent-minded professor who celebrates when I go a whole week without forgetting to bring my laptop to work. Relying on my remembering to attach an external hard disk to run a backup is not a good idea.

There are several ways to automate a Time Machine backup and run it over a network; however, I found a lower-tech solution: rsync. I use rsync and cron to back up my Documents and Library directories hourly. These directories contain all of my files whose changes would be difficult or impossible to re-create by hand. This “workspace” data is less than 10 GB and fits easily on the shared file server. The amount of “workspace” data that changes every hour tends to be very small, so the rsync backup is rarely noticeable.

The potentially long gaps between Time Machine backups pose less of a risk for other files (applications, music, system files, etc.) because they rarely change; therefore, I don’t back them up with rsync. In the event of a failure, the few “non-workspace” files that had changed since the previous Time Machine backup can reasonably be restored from original media.

My Script

I have cron run the following script hourly:

    #!/bin/bash
    # Back up Documents and Library to the file server when a network is up.
    # remote_server is a placeholder for the file server login (user@host);
    # ID_RSA is a placeholder for the path to the passphrase-less private key.

    # Look for an active status within 5 lines of an en* interface record.
    if /sbin/ifconfig | egrep -A 5 '^en?' | grep -q 'status: active'
    then
        echo -n "Network detected. Starting backup at "
        date
        cd "$HOME"
        /opt/local/bin/rsync -e 'ssh -i ID_RSA' -a Documents Library --filter ': .rsync-filter' \
            --exclude="/Library/Caches" --exclude '/Library/Mail/IMAP*' "${remote_server}:Backup"
        echo -n "Backup complete at: "
        date
    else
        echo -n "No network detected at "
        date
    fi
    echo "----------------------------"

The if statement checks whether the machine is connected to a network. It is not sufficient for me to simply grep for status: active, because my ifconfig output contains two virtual NICs that are always active. (I think Parallels uses them.) This line relies on two assumptions:

1. The entries for my two network cards (wired and wireless) are the only two lines matching ^en? .
2. The status line for a particular network device falls between 3 and 5 lines after the first line of its record. (In other words, egrep -A 5 grabs enough lines to find the NIC’s status, but doesn’t generate a “false positive” by detecting a succeeding NIC’s status.)
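To make the interface check easier to try out, here is a minimal sketch that runs the same kind of pipeline against canned text instead of live ifconfig output. The function name has_active_nic and the sample records are made up for illustration. Note that the sketch uses the stricter pattern ^en[0-9] rather than the script's ^en? , because as a regular expression ^en? also matches any line that merely starts with "e".

```shell
#!/bin/bash
# Sketch only: succeed if an en0/en1/... record reports "status: active".
# Reads ifconfig-style text on stdin so it can be tested without a real network.
has_active_nic() {
    grep -E -A 5 '^en[0-9]' | grep -q 'status: active'
}

# Hypothetical sample, loosely modeled on macOS ifconfig output.
sample_output() {
    cat <<'EOF'
lo0: flags=8049 mtu 16384
    inet 127.0.0.1 netmask 0xff000000
en0: flags=8863 mtu 1500
    ether 00:11:22:33:44:55
    media: autoselect
    status: inactive
en1: flags=8863 mtu 1500
    ether 00:11:22:33:44:66
    inet 192.168.1.20 netmask 0xffffff00
    media: autoselect
    status: active
vnic0: flags=8843 mtu 1500
    ether 00:11:22:33:44:77
    media: autoselect
    status: active
EOF
}

if sample_output | has_active_nic; then
    echo "network detected"
else
    echo "no network detected"
fi
```

The check is only as good as the assumption about record length: if a physical interface's record were very short, the five lines of context could reach into the next record and pick up a virtual NIC's status.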

ID_RSA is the name of a file containing a private key with no passphrase. The key allows rsync to authenticate with the file server. The key must not have a passphrase (because I don’t think there is a way to configure cron to use one). Because the key contains no passphrase, I configure the server to run only rsync when a client attempts to use this key. (To do this, add the public key for ID_RSA to the server’s authorized_keys file, then add command="rsync --server -logDtpr . Backup" to the beginning of the line containing the public key.)
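The authorized_keys change described above can be sketched as a small filter that prefixes a public key line with the forced command. The function name restrict_key and the shortened key are hypothetical; only the command string comes from my setup.

```shell
#!/bin/bash
# Sketch only: turn a public key line into a restricted authorized_keys entry.
# $1 is the forced command; the public key line is read from stdin.
restrict_key() {
    while IFS= read -r key_line; do
        printf 'command="%s" %s\n' "$1" "$key_line"
    done
}

# Hypothetical, truncated public key used purely as an illustration.
echo 'ssh-rsa AAAAB3NzaC1yc2E...example laptop-backup' |
    restrict_key 'rsync --server -logDtpr . Backup'
```

On the server, the resulting line would be appended to the authorized_keys file of the backup account. With the forced command in place, sshd runs that command no matter what the client asks for, so a stolen key should only be able to push files into Backup.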

The --filter ': .rsync-filter' option tells rsync to check each directory for a .rsync-filter file containing lists of files to exclude.
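For illustration, a per-directory filter file might look like the one written by this sketch. The patterns and the helper name write_filter_file are made-up examples, not a recommendation; the demo writes into a scratch directory so nothing real is touched.

```shell
#!/bin/bash
# Sketch only: create a .rsync-filter file in the directory given as $1.
write_filter_file() {
    cat > "$1/.rsync-filter" <<'EOF'
# Example rules only. A leading "- " marks an exclude rule.
- *.o
- *.tmp
- build/
EOF
}

# Demo in a temporary directory.
demo_dir=$(mktemp -d)
write_filter_file "$demo_dir"
cat "$demo_dir/.rsync-filter"
rm -r "$demo_dir"
```

Rules in such a file apply to the directory that contains it and, by default, to its subdirectories, so each project can carry its own exclusions without changing the backup script.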

The crontab entry for the backup is 5 * * * * my_rsync.sh >> /tmp/backup.log 2>&1. If you don’t redirect the output to a file, cron will e-mail it to your account on localhost.
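If you prefer not to edit the crontab by hand, one possible approach is a filter that adds the entry only when it is missing. This is a sketch: the function name add_backup_entry is hypothetical, and the demo runs on a made-up crontab rather than installing anything.

```shell
#!/bin/bash
# Sketch only: the entry runs the backup at minute 5 of every hour.
backup_entry='5 * * * * my_rsync.sh >> /tmp/backup.log 2>&1'

# Reads an existing crontab on stdin and prints it with the backup entry
# appended, unless an identical line is already there.
add_backup_entry() {
    existing=$(cat)
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    if ! printf '%s\n' "$existing" | grep -qxF -- "$backup_entry"; then
        printf '%s\n' "$backup_entry"
    fi
}

# Dry run on a made-up crontab; nothing is installed.
printf '0 3 * * * /usr/bin/true\n' | add_backup_entry
```

To actually install the result, the output could be piped back with something like crontab -l | add_backup_entry | crontab - . Because cron usually runs jobs with a minimal PATH, my_rsync.sh may need to be given as an absolute path.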

At this point, I need only occasionally check backup.log for problems. (For example, when the server gets re-imaged, I need to log in once by hand to accept the new fingerprint.) This script never deletes files on the server; therefore, I delete the backup directory and run a “full” backup twice a year.
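Checking the log can be partly mechanized. The sketch below assumes the log contains only the messages printed by the script plus whatever rsync or ssh wrote to the terminal; the function name summarize_backup_log and the sample log are made up. Since the script prints “Backup complete” whether or not rsync succeeded, the unexpected lines are the ones worth reading.

```shell
#!/bin/bash
# Sketch only: summarize the backup log given as $1.
summarize_backup_log() {
    started=$(grep -c '^Network detected' "$1" || true)
    skipped=$(grep -c '^No network detected' "$1" || true)
    echo "runs with network: $started, runs skipped: $skipped"
    # Whatever remains after dropping the script's own messages is most
    # likely rsync or ssh output and deserves a look.
    grep -v -E '^(Network detected|Backup complete|No network detected|-+$)' "$1" || true
}

# Demo on a made-up log.
demo_log=$(mktemp)
cat > "$demo_log" <<'EOF'
Network detected. Starting backup at Mon Jan  1 10:05:00 EST 2024
Host key verification failed.
Backup complete at: Mon Jan  1 10:05:01 EST 2024
----------------------------
No network detected at Mon Jan  1 11:05:00 EST 2024
----------------------------
EOF
summarize_backup_log "$demo_log"
rm "$demo_log"
```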

Alternatives