Having already outlined my reasons for using tarsnap for online backups, this post will detail exactly how I'm using it.
The instructions on the tarsnap site are very easy to follow. I was momentarily caught out by not importing the code signing key, but after getting that sorted out it was fine. I did need to use sha256sum rather than the suggested sha256. Installation went well and I then had a little play with creating, listing, deleting and recovering data from backups. It was at this point that my only real gripes with the software became obvious: you can't humanize the data size figures when using --list-archives, and there is no shortcut for --list-archives. As gripes go these are fairly minor though, and everything else works nicely.
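Since tarsnap itself won't humanize the byte counts, a small shell wrapper can do it instead. This is just a sketch of my gripe's workaround, not part of tarsnap: an awk filter (the function name humanize_stats is my own) that converts any large numeric field in the stats output into KiB/MiB/GiB.

```shell
# Hypothetical helper: pipe tarsnap's statistics output through awk
# and rewrite any numeric field of 1024 or more as a human-readable size.
humanize_stats() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^[0-9]+$/ && $i + 0 >= 1024) {
                n = $i + 0
                split("B KiB MiB GiB TiB", u, " ")
                p = 1
                # divide down until the value fits the next unit
                while (n >= 1024 && p < 5) { n /= 1024; p++ }
                $i = sprintf("%.1f%s", n, u[p])
            }
        }
        print
    }'
}
```

Usage would be something like `sudo tarsnap --print-stats | humanize_stats`. Note that rewriting fields collapses the column alignment, which is the price of a one-liner.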
With the tarsnap client running on my server it was time to automate my backups. I put together a small script which dumps my database and then creates a new backup with the tarsnap client.
#!/bin/bash
dateString=`date +%F`

echo "Beginning backup for $dateString" >> /home/streety/sources/backup/tarsnap.log

# dump the mysql database
rm -f /home/streety/mysql-backup.sql
mysqldump --user=backup -ppassword --all-databases > /home/streety/mysql-backup.sql

# backup to tarsnap
tarsnap -c -f linode-jscom-$dateString /home/streety /etc/apache2

echo "Backup complete for $dateString" >> /home/streety/sources/backup/tarsnap.log
That script worked fine when I ran it from the shell, but cron didn't seem to be running it. I needed to set the PATH in my crontab so cron could find the tarsnap binary. Easily enough done.
PATH=/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/bin
MAILTO=email@example.com
# m h dom mon dow command
5 0 * * * /home/streety/sources/backup/backup.sh >> /home/streety/sources/backup/output.log 2>&1
With everything working I wanted to set up permissions. Again, this was very easy.
tarsnap-keymgmt --outkeyfile /root/limited-tarsnap.key -r -w /root/tarsnap.key
The original key is then removed from the server and kept somewhere secure. The new limited key allows the server to create and read backups but not to delete them.
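If the limited key is stored somewhere other than the default location, tarsnap can be pointed at it explicitly with its --keyfile option. A sketch (the archive name here is hypothetical):

```shell
# Use the restricted key instead of the default /root/tarsnap.key,
# which should no longer exist on this machine
sudo tarsnap --keyfile /root/limited-tarsnap.key -c -f test-archive /home/streety
```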
streety@jonathanstreet:~$ tarsnap -c -f anothertestbackup /home/streety
tarsnap: fopen(/root/tarsnap.key): Permission denied
tarsnap: Cannot read key file: /root/tarsnap.key
streety@jonathanstreet:~$ sudo !!
sudo tarsnap -c -f anothertestbackup /home/streety
[sudo] password for streety:
tarsnap: Removing leading '/' from member names
                                       Total size  Compressed size
All archives                           1804387231        685263319
  (unique data)                         481384333        178645934
This archive                            746610352        296516102
New data                                   721055           196300
streety@jonathanstreet:~$ tarsnap --list-archives
tarsnap: fopen(/root/tarsnap.key): Permission denied
tarsnap: Cannot read key file: /root/tarsnap.key
streety@jonathanstreet:~$ sudo !!
sudo tarsnap --list-archives
testbackup
anothertestbackup
linode-jscom-2009-11-30
streety@jonathanstreet:~$ sudo tarsnap -d -f anothertestbackup
tarsnap: The delete authorization key is required for -d but is not available
As you can see I keep forgetting to use sudo, but it all works. I can create backups and list the existing ones, but I can't delete them, at least not from this server. Success.
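For completeness, recovery works like plain tar extraction. A sketch, using one of the archive names from the listing above:

```shell
# Extract everything from one archive into the current directory.
# Because tarsnap stripped the leading '/' at backup time, this
# restores under ./home/streety rather than overwriting /home/streety.
sudo tarsnap -x -f linode-jscom-2009-11-30

# Or pull back just a single file
sudo tarsnap -x -f linode-jscom-2009-11-30 home/streety/mysql-backup.sql
```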
I've been running this script for a little more than a month now and so far I'm very happy with it.
I've heard enough horror stories about lost data to know that backups are important. For the files that mainly live on my laptop I use Jungledisk to automatically back up the important ones daily. At the time I signed up the program cost $20 and my storage costs are about $0.50 a month. Today you have to pay at least $2/month plus the storage fees. Not bad, but after a couple of years those fees are going to add up.

When I used shared hosting I sporadically backed up the files and emailed a database dump to my gmail account daily. This worked perfectly well: the files rarely changed and gmail was able to hold several hundred copies of the database for my little blog. When I needed to I could simply go in and delete last year's backup emails.

Recently though I've started renting a VPS from Linode, and I'm now in the position where both the files and the database are frequently changing. I need a way to back up both, and as I'm lazy I want it automated. I started looking around for information on how other people were handling this.
The Plan

I came across a post from John Eberly discussing how he automates his backups to amazon s3. This looked like a good place to start, but I was sceptical about how rsync would work with amazon s3 as described, and it kept only a single backup. Based on this I formulated the following plan:

- At the start of each week, copy the directories to be backed up to a temporary directory using rsync, encrypt with gnupg, and push the resulting file to amazon s3.
- On each subsequent day, make a differential backup using the batch mode of rsync, encrypt, and push to s3.
- Repeat for the start of the next week.

After putting together a surprisingly short script I had a working approach. Except nothing was actually being pushed to s3. I still need to investigate why, but it isn't at the top of my list of things to do, as I have since found a far better way to handle my backups.
Tarsnap

I'm not an expert at backups. Nor am I a security expert. Nor am I interested in becoming an expert at either. This means someone has likely already built a better backup utility than I could. I believe I have found it in Tarsnap. Below is a list of what tarsnap does. I've highlighted the features which take it above and beyond my approach.
- Multiple backups
- Backups on my schedule
- Files are encrypted
- Utility pricing - pay only for what you use with no standing charges
- Open source - I can check that only what I want to happen is really happening
- Efficient - backups take up no more space than my full+differential strategy and yet each backup can be manipulated independently of any other backup
- Permissions - With tarsnap I can allow my server to create and read backups but not delete them