If Tarsnap costs $0.25 / GB of storage, how is it possible to store "archives adding up to several terabytes" while paying less than $10/month?

Please see our page about deduplication efficiency.

Since the cost depends on "encoded bytes", how can I predict how much Tarsnap will cost before signing up?

Starting with tarsnap 1.0.36, you can test the deduplication and compression without an account:

tarsnap --dry-run --no-default-config --print-stats --humanize-numbers -c /MY/DATADIR

Replace /MY/DATADIR with the name of the directory which contains the files you wish to back up. You may also list multiple directories to back up. For example, you may wish to back up: /etc /home /var/www

This will produce output in the form:

tarsnap: Performing dry-run archival without keys (sizes may be slightly inaccurate)
tarsnap: Removing leading '/' from member names
                                       Total size  Compressed size
All archives                               2.2 GB           1.8 GB
  (unique data)                            2.1 GB           1.7 GB
This archive                               2.2 GB           1.8 GB
New data                                   2.1 GB           1.7 GB

The value which matters for the cost is the "Compressed size" column of the "(unique data)" row, which represents the "encoded bytes" that are stored on the Tarsnap servers. In the above example, this is 1.7 GB, so it will cost approximately $0.43 (= 1.7 * $0.25) to upload the data, and $0.43 per month for storage.

To calculate the precise cost, we would omit the --humanize-numbers argument and use bytes instead of GB. Tarsnap's internal accounting uses attodollars (10^-18 dollars).
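As a rough illustration of that arithmetic, the following sketch extracts the "(unique data)" compressed size from byte-level statistics and converts it to dollars. The function names are hypothetical and not part of tarsnap, the sample byte counts are made up to match the 1.7 GB figure above, and the sketch assumes that $0.25 / GB means $0.25 per 10^9 bytes (250 picodollars per byte).

```shell
# Hypothetical helpers for illustration; they are not part of tarsnap.

# Read --print-stats output (without --humanize-numbers) on stdin and
# print the "Compressed size" column of the "(unique data)" row.
unique_compressed_bytes() {
    awk '/\(unique data\)/ { print $NF }'
}

# Convert a byte count into dollars, assuming $0.25 per 10^9 bytes
# (250 picodollars per byte). Integer arithmetic avoids rounding surprises.
estimate_cost() {
    units=$(( $1 * 250 / 100000000 ))   # units of $0.0001
    printf '%d.%04d\n' $(( units / 10000 )) $(( units % 10000 ))
}

# Illustrative statistics laid out like the example above; the byte
# counts are invented, not real measurements.
sample_stats='                                       Total size  Compressed size
All archives                           2200000000       1800000000
  (unique data)                        2100000000       1700000000
This archive                           2200000000       1800000000
New data                               2100000000       1700000000'

size=$(printf '%s\n' "$sample_stats" | unique_compressed_bytes)
estimate_cost "$size"   # prints 0.4250
```

With a real dry run, the statistics would be piped into unique_compressed_bytes instead of the sample text. We assume here that tarsnap writes the statistics to standard error, in which case 2>&1 is needed before the pipe.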

Note that deduplication is most effective when creating multiple snapshots (e.g., daily backups), so it will not help much for the initial snapshot. We have a few examples of deduplication with multiple snapshots.

Is Tarsnap storage reliable?

Yes. Data archived via Tarsnap is stored on the Amazon S3 storage service (the original version, not the "reduced redundancy" version introduced in 2010).

Why doesn't Tarsnap --list-archives print archives in alphabetical (or chronological) order?

The archive metadata which contains Tarsnap archive names and creation times is encrypted, so it's impossible for the Tarsnap client code to figure out in what order the archives should be listed until it downloads and decrypts the metadata. Once it has done so, it might as well just print out the information immediately; if you want a particular order, sort(1) is your friend.
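A minimal sketch of sorting with sort(1) follows. The helper names are hypothetical, the listing is made up, and the chronological variant assumes that "tarsnap --list-archives -v" prints each archive name followed by a tab and a creation time of the form YYYY-MM-DD HH:MM:SS.

```shell
# Hypothetical helpers; they only post-process text and do not call tarsnap.

# Alphabetical order: one archive name per line on stdin.
sort_archives_by_name() {
    sort
}

# Chronological order: assumes "NAME<TAB>YYYY-MM-DD HH:MM:SS" lines on stdin,
# so sorting on the second tab-separated field orders by creation time.
sort_archives_by_time() {
    sort -t "$(printf '\t')" -k 2
}

# Made-up listing for illustration.
printf 'web-b\t2024-03-02 04:00:00\nweb-a\t2024-03-03 04:00:00\nweb-c\t2024-03-01 04:00:00\n' |
    sort_archives_by_time
```

In practice the input would come from tarsnap itself, for example by piping "tarsnap --list-archives" into sort_archives_by_name.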

Can I move my Tarsnap setup to a new computer?

Yes, no problem! Tarsnap doesn't care about the physical hardware; only the data, key file, and cache directory that it is given to work with.

To confirm that everything is set up correctly, we recommend the following steps after you have copied your data to the new system:

1. Create an archive on the old system with:
   tarsnap -c --print-stats MY_OPTIONS
2. Transfer the cache directory to the new system.
3. Simulate creating an archive on the new system:
   tarsnap -c --dry-run --print-stats MY_OPTIONS

The "new data" size should be quite small (consisting of archive metadata), and the "this archive" size should be approximately the same as the old statistics (machines can present metadata in a slightly different manner, and can list files within a directory in a new order which could alter the compression efficiency).

How can I investigate network problems?

We have a series of tips about debugging Tarsnap network problems.

What does "Pathname in pax header can't be converted to current locale" mean?

This message arises if you created an archive that contains filenames with characters that can't be represented in the current locale. We recommend that if you would like to use non-ASCII characters, your locale should support UTF-8.

For example, consider this archive:

tarsnap -tf kana
kana/
kana/kana.txt
kana/カナ.txt

With an environment which cannot print Japanese characters, we get:

LANG=C tarsnap -tf kana
kana/
kana/kana.txt
tarsnap: Pathname in pax header can't be converted to current locale.
kana/\343\202\253\343\203\212.txt
tarsnap: Error exit delayed from previous errors.
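To check in advance whether the current environment is likely to hit this message, one option is to look at the active character encoding. The sketch below uses a hypothetical helper, is_utf8_charmap, together with the standard "locale charmap" command. It only inspects the locale and does not call tarsnap.

```shell
# Hypothetical helper: succeed if the given charmap name denotes UTF-8.
is_utf8_charmap() {
    case "$1" in
        UTF-8|utf-8|UTF8|utf8) return 0 ;;
        *) return 1 ;;
    esac
}

# "locale charmap" prints the character encoding of the current locale.
# If the locale utility is unavailable, the check falls through to "else".
if is_utf8_charmap "$(locale charmap 2>/dev/null)"; then
    echo "locale supports UTF-8"
else
    echo "locale is not UTF-8; non-ASCII filenames may trigger the pax header message"
fi
```

If the check fails, running tarsnap with LANG set to a UTF-8 locale that is installed on the system should avoid the message. Which locale names are available varies between systems.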