Cloud backups these days are all the rage—for good reason. Rather than dealing with shuffling physical media offsite, you can simply back up the data offsite, where it can be stored in one of many professionally monitored data centers.

Unfortunately, this kind of service isn’t free, and the cost can be a barrier. However, there is a cost-effective way to store your cloud backups: Usenet. With access to a Usenet news server, you can simply upload your backup there, and it will be stored redundantly in news servers all over the world. Best of all, this approach typically costs considerably less than a cloud backup service.

If you’re not an IT graybeard, you may not be familiar with Usenet. To put it within a frame of reference any Ars reader may recognize, Yahoo once described the Ars Technica forums as "the successor to Usenet and precursor of Reddit." And while that's not 100 percent accurate, Usenet is kinda, sorta like an ancient Reddit. It's a collection of forums, organized by subject, where anyone can anonymously discuss nearly any topic. Where Usenet differs from Reddit, however, is in many of the technical details.

For starters, Usenet is decentralized; there is no central authority you need to register with before posting, and at least in the unmoderated groups, no central committee controls the posts. News posts are stored on thousands of servers worldwide, and any or all posts can be downloaded locally using a newsreader.

Posts are organized into groups and sub-groups, whose names specify the subject matter of the posts within them. For example, comp.os.linux.admin is a group devoted to discussions of Linux administration.

Most groups are devoted to text-based discussions, but the groups in the alt.binaries hierarchy are devoted to storing binary files. One in particular, alt.binaries.backup, is devoted to perhaps the oldest type of cloud backup.

Obviously, this is not your typical method of cloud storage. Many may snicker or find it plain weird to tap into this vintage part of the Internet in such a modern way. But oddly enough, we have experience experimenting with this alternative offsite storage when it comes to backing up a Linux system. And backing up your Linux system to Usenet ultimately requires only a handful of steps. It's not only possible—it's scriptable, too.

Manually backing up a Linux system

First and foremost, we have to determine what needs to be backed up. Luckily, this is pretty easy in Linux. For my desktop systems, a typical list is:

Compile a list of the installed packages

Compile a list of third-party repositories

Locate all user data to back up

The idea here is to only back up those things that are unique to the system. There’s usually no need to back up the whole system, and the majority of components can simply be reinstalled. So the goal is to get a list of all the things that need to be installed, get a list of custom sources for those things, and identify all unique data.

For the first step, you simply need to use your package manager to get a list of installed packages. In Ubuntu, the command for doing this is dpkg --get-selections. However, just running the command sends the list to standard out, which is the screen. That isn’t super helpful for what we are trying to do. Instead, redirect this output to a text file by using the command dpkg --get-selections > ~/Package.list. This sends the list to a file called Package.list under your home directory.
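Later, when rebuilding the system, that same list can be fed back to the package manager. A minimal sketch of the restore side (assuming a fresh Ubuntu install with Package.list copied back into your home directory) looks like this:

sudo dpkg --set-selections < ~/Package.list   # mark every package in the list for installation
sudo apt-get dselect-upgrade                  # install anything marked but not yet present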

Next, you need a list of any third-party repositories in use on your system. For this in Ubuntu, you can simply copy the sources.list file out of the /etc/apt folder with the command sudo cp /etc/apt/sources.list ~/sources.list. However, you also need all of the authentication keys for the repositories, which you can export to a text file with the command sudo apt-key exportall > ~/Repo.keys.
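These two files are what let you re-add the same repositories to a rebuilt machine. A rough sketch of that restore step (assuming sources.list and Repo.keys have been copied back into your home directory) would be:

sudo apt-key add ~/Repo.keys       # re-import the repository signing keys
sudo cp ~/sources.list /etc/apt/   # put the repository list back in place
sudo apt-get update                # refresh the package indexes from the restored sources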

The final step can often be the most complex if your users are not consistent with where they store data. Essentially, you just need to copy the user data into a common location to be archived. For this, the command rsync is very useful. However, if you have hundreds of directories with unique data in your environment, the command can become quite long and convoluted.

Rsync’s basic syntax is as follows:

rsync <options> <source path> <destination path>

For a backup, at the very least, you want the -a option to archive the files, retaining permissions settings, ownership settings, and so forth. You also should specify the -r option, just to make sure all files in the path are recursively backed up. I generally also like to include the --progress option so that I can monitor the backup. With these options, the rsync command would look something like this:

rsync -ra --progress <source path> <destination path>

If your environment is very organized with only a few source folders containing unique files, then the easiest way to handle this is often just to add multiple source folders in the rsync command directly. For example, if all unique files were in the user’s home folder, then the rsync command would simply be:

rsync -ra --progress /home/user /backup

But what if you had three locations? For instance, imagine that this system has unique data in the user’s home folder, as well as the logs stored in /var/logs and data from a custom application stored in /srv/testapp. In this case, the easiest option would be to add all paths in as source paths in the rsync statement, like so:

rsync -ra --progress /home/user /var/logs /srv/testapp /backup

If you find yourself in a situation with a very large number of source paths, you can use one final option: --files-from. This option allows you to specify a text file with a list of directories to copy from.
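As a quick sketch of how that might look (the list file name here is hypothetical), put one path per line in a text file and point rsync at it. Note that with --files-from, paths in the list are read relative to the source argument, so listing them without a leading slash and using / as the source copies them from the root of the filesystem:

# /tmp/backup-paths.txt contains one path per line, relative to /:
#   home/user
#   var/logs
#   srv/testapp
rsync -ra --progress --files-from=/tmp/backup-paths.txt / /backup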

For the destination, I like to specify a custom directory to back up the files into. For starters, I generally want to make sure the backup is going on a separate disk from any disks heavily used by the system normally. This is partially due to my desire to not impact normal system functions on a production system and partially due to a desire to increase speeds by having the main system disks primarily reading and the backup disk primarily writing. Also, backing the files up to a generic temp folder (like /tmp) can come back to haunt you if this folder is size-restricted in some way. For all of these reasons, I’ll often create a /backup folder or something similar to store the files as I back them up.
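Setting that folder up is a one-time job. As a minimal sketch, assuming the spare disk has a partition at /dev/sdb1 (a hypothetical device name), it would look something like this:

sudo mkdir /backup              # create the staging directory
sudo mount /dev/sdb1 /backup    # mount the spare disk there for the backup run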

One final thing to note with rsync is that you may want to exclude certain files or file types from the backup. For example, I commonly exclude all ISO disk images, as they are large and generally non-unique.

To exclude files, you can simply use the --exclude option, followed by the specific file or wildcard string you want to exclude. You can add as many --exclude options as you'd like to specify multiple files or file types to exclude. For example, to exclude both .iso files and .ISO files, you would use the following command:

rsync --progress -a --exclude '*.iso' --exclude '*.ISO' /home/user /backup

Once you’ve got all of this stuff copied into a single folder structure, you can now roll it up into a tarball and compress it. To do this, just use the tar command. The basic syntax for tar is:

tar <options> <destination file name> <source path>

The general options I like to use for backups are z, c, v, and f. The ‘z’ option tells tar to create a zipped or compressed file, the ‘c’ option tells it to create a new tarball, the ‘v’ option tells it to give you verbose progress information, and the ‘f’ option lets you specify the name of the output tarball file. So the final tar command for the backup should look something like this:

tar -zcvf /tmp/backup.12-20-16.tar.gz /backup
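Before moving on, it doesn’t hurt to confirm that the tarball is readable. Swapping the ‘c’ option for ‘t’ lists the archive’s contents without extracting anything, which makes for a quick sanity check:

tar -ztvf /tmp/backup.12-20-16.tar.gz    # list the contents of the compressed archive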

Once the tarball is built, you need to encrypt it. This makes it very difficult for anyone but you to gain access to the contents of the file. This is especially important because the location you are uploading it to is a public cloud—in other words, everyone will have access to the file without encryption. So you need to ensure that access to the file does not allow access to the contents of the file.

The utility I use for this process is GNU Privacy Guard (GPG). It’s a free utility that encrypts files using the OpenPGP standard. Normally, gpg encrypts with asymmetric encryption using public/private key pairs. But for this task, it’s both easier and less fraught with error to use a password to encrypt the file. While this is a less secure method overall, by using a password you do not have to worry about keeping up with a private key, making it less likely that you will completely lose a backup. And if you choose a long, very complex password, it is still reasonably secure.

The basic syntax for gpg is below:

gpg <options> <file to encrypt>

The options I choose are as follows:

-z to specify the compression level. I set this to zero, since we will compress the file with the tar command prior to running gpg.

--passphrase to specify the password used as a symmetric key.

-c to specify that gpg should use a symmetric cipher and a passphrase.

In the end, the command for gpg should look something like this:

gpg -z0 --passphrase '<your passphrase>' -c /tmp/backup.12-20-16.tar.gz

This results in a compressed, encrypted tarball containing all of the unique data as well as all of the text files describing the applications installed on the system and custom repositories used to retrieve them.
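Restoring works in reverse. A minimal sketch of the decryption step, assuming you still have that long, complex password, looks like this:

gpg -o /tmp/backup.12-20-16.tar.gz -d /tmp/backup.12-20-16.tar.gz.gpg   # prompts for the passphrase, then writes the decrypted tarball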


Next, it’s time to prepare the file for Usenet by creating error recovery files using par2. This is a typical way to post files to newsgroups because it allows for easier recovery if part of the file is damaged. If you upload the backup as one monolithic file and even a single byte is damaged, you have to re-download the entire backup, and that download will probably have to come from a different news server, since the old one will still be holding the damaged copy. But if you include par2 files, you only have to download those par2 files. As long as only a small portion of the file is damaged, par2 can often repair it.

This process is done with the par2 command. The basic syntax for par2 is below:

par2 <command> <options> <file>

For this task, we use the c (create) command, and set the -r option to provide for our desired redundancy level (more redundancy means larger par files) and the -s option to determine the size of the individual blocks. The final command, par2 c -s2250000 -r10 /tmp/backup.12-20-16.tar.gz.gpg, creates par files where each block is equivalent to 2.25MB and provides for a 10 percent data redundancy level.
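On the restore side, those same par files are what make a damaged download salvageable. A minimal sketch, assuming the downloaded .gpg file and its .par2 files sit in the same directory, would be:

par2 v backup.12-20-16.tar.gz.gpg.par2    # verify the file against the recovery data
par2 r backup.12-20-16.tar.gz.gpg.par2    # if damage is found, attempt a repair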
