Setting up backups with tarsnap

Having already outlined my reasons for using tarsnap for online backups, I'll use this post to detail exactly how I'm using it.

The instructions on the tarsnap site are easy to follow. I was momentarily caught out by not having imported the code signing key, but once that was sorted out everything was fine. I did need to use sha256sum rather than sha256 as suggested. Installation went well, and I then had a little play with creating, listing, deleting and recovering data from backups. It was at this point that my only real gripes with the software became obvious: you can't humanize the data size figures when using --list-archives, and there is no shortcut for --list-archives. As gripes go these are fairly minor, though, and everything else works nicely.
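Both gripes can be papered over in the shell. The following is only a minimal sketch, assuming bash and awk are available; the names humanize and tsls are hypothetical helpers, not part of tarsnap.

```shell
#!/bin/bash
# humanize BYTES: print a byte count with a binary unit suffix.
# (hypothetical helper, not a tarsnap feature)
humanize() {
    awk -v bytes="$1" 'BEGIN {
        split("B KiB MiB GiB TiB", unit, " ")
        i = 1
        # divide down until the value fits the current unit
        while (bytes >= 1024 && i < 5) { bytes /= 1024; i++ }
        printf "%.1f %s\n", bytes, unit[i]
    }'
}

# tsls: a short name for the long --list-archives option.
# (hypothetical shortcut; extra arguments are passed through to tarsnap)
tsls() {
    tarsnap --list-archives "$@"
}
```

For example, humanize 1804387231 prints 1.7 GiB, and humanize 1024 prints 1.0 KiB.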

With the tarsnap client running on my server it was time to automate my backups. I put together a small script which creates a dump of my database and then creates a new backup with the tarsnap client.

#!/bin/bash
dateString=$(date +%F)
logFile=/home/streety/sources/backup/tarsnap.log
dumpFile=/home/streety/mysql-backup.sql
echo "Beginning backup for $dateString" >> "$logFile"
# dump the mysql database, replacing the previous dump
rm -f "$dumpFile"
mysqldump --user=backup -ppassword --all-databases > "$dumpFile"
# back up to tarsnap; the dump sits under /home/streety so it is included
if tarsnap -c -f "linode-jscom-$dateString" /home/streety /etc/apache2; then
    echo "Backup complete for $dateString" >> "$logFile"
else
    # only report success when tarsnap actually succeeded
    echo "Backup FAILED for $dateString" >> "$logFile"
fi

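That script worked fine when I ran it from the shell, but under cron it didn't seem to run. Cron starts jobs with a much smaller PATH than a login shell, so the script couldn't find the tarsnap binary. Setting PATH at the top of the crontab fixed it. Easily enough done.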

PATH=/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/bin
MAILTO=jonathan@jonathanstreet.com
# m h  dom mon dow   command
5 0 * * * /home/streety/sources/backup/backup.sh >> /home/streety/sources/backup/output.log 2>&1
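One way to catch this sort of problem without waiting for the next cron run is to look a command up under a stripped-down environment. This is a rough sketch only: the function name found_in_path is hypothetical, and the minimal PATH of /usr/bin:/bin in the example is an assumption, since cron's default varies between systems.

```shell
#!/bin/bash
# found_in_path CMD PATH_VALUE
# Succeeds if CMD can be located using only PATH_VALUE,
# with the rest of the environment emptied (roughly what cron provides).
found_in_path() {
    env -i PATH="$2" /bin/sh -c 'command -v "$1" >/dev/null 2>&1' sh "$1"
}

# Example usage (not run here):
#   found_in_path tarsnap /usr/bin:/bin || echo "tarsnap is not on the minimal PATH"
#   found_in_path tarsnap /usr/sbin:/usr/bin:/sbin:/bin:/usr/local/bin && echo "found"
```

If the first check fails and the second succeeds, the PATH line in the crontab is doing its job.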

With everything working I wanted to get permissions set up. Again this was very easy.

tarsnap-keymgmt --outkeyfile /root/limited-tarsnap.key -r -w /root/tarsnap.key

The original key is then removed from the system and kept in a secure place. The new limited key should allow us to create and read from backups but not to delete them.
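The -r and -w flags choose which permissions the new key file carries; leaving out -d is what withholds the delete permission. Below is a sketch of the same step wrapped in a function, with a write-only variant for comparison. The function names and the KEYMGMT override (handy for a dry run with echo) are hypothetical, not something tarsnap provides.

```shell
#!/bin/bash
# Command used to build key files. Set KEYMGMT=echo to print the
# arguments instead of running tarsnap-keymgmt (hypothetical override).
KEYMGMT=${KEYMGMT:-tarsnap-keymgmt}

# make_limited_key MASTER_KEY NEW_KEY
# Read and write permissions, no delete: the setup described in this post.
make_limited_key() {
    "$KEYMGMT" --outkeyfile "$2" -r -w "$1"
}

# make_writeonly_key MASTER_KEY NEW_KEY
# Write permission only: can create archives but not read or delete them.
make_writeonly_key() {
    "$KEYMGMT" --outkeyfile "$2" -w "$1"
}
```

Whichever key stays on the server has to be where tarsnap looks for it, either at the path given in the tarsnap configuration or passed explicitly with --keyfile.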

streety@jonathanstreet:~$ tarsnap -c -f anothertestbackup /home/streety
tarsnap: fopen(/root/tarsnap.key): Permission denied
tarsnap: Cannot read key file: /root/tarsnap.key
streety@jonathanstreet:~$ sudo !!
sudo tarsnap -c -f anothertestbackup /home/streety
[sudo] password for streety:
tarsnap: Removing leading '/' from member names
                                       Total size  Compressed size
All archives                           1804387231        685263319
  (unique data)                         481384333        178645934
This archive                            746610352        296516102
New data                                   721055           196300
streety@jonathanstreet:~$ tarsnap --list-archives
tarsnap: fopen(/root/tarsnap.key): Permission denied
tarsnap: Cannot read key file: /root/tarsnap.key
streety@jonathanstreet:~$ sudo !!
sudo tarsnap --list-archives
testbackup
anothertestbackup
linode-jscom-2009-11-30
streety@jonathanstreet:~$ sudo tarsnap -d -f anothertestbackup
tarsnap: The delete authorization key is required for -d but is not available

As you can see, I keep forgetting to use sudo, but it all works. I can create backups and list the existing ones, but I can't delete them, at least not from this server. Success.
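Pruning old archives therefore has to happen on a machine that holds the full key. Here is a minimal sketch of picking out the candidates, assuming the linode-jscom-YYYY-MM-DD naming used by the backup script; the function name archives_before is hypothetical.

```shell
#!/bin/bash
# archives_before CUTOFF
# Reads archive names on stdin and prints those named
# linode-jscom-YYYY-MM-DD whose date is earlier than CUTOFF (YYYY-MM-DD).
# ISO dates sort correctly as plain strings, so no date arithmetic is needed.
archives_before() {
    awk -v cutoff="$1" -v prefix="linode-jscom-" '
        index($0, prefix) == 1 {
            stamp = substr($0, length(prefix) + 1)
            if (stamp ~ /^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$/ &&
                (stamp "") < (cutoff "")) print
        }'
}

# Example usage on the machine with the full key (not run here):
#   tarsnap --list-archives | archives_before 2009-12-01 |
#       while read -r name; do tarsnap -d -f "$name"; done
```

Names that don't match the pattern, such as testbackup, are ignored rather than deleted.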

I've been running this script for a little more than a month now and so far I'm very happy with it.

Ditching the custom wheel in backups

I've heard enough horror stories about lost data to know that backups are important. For the files that mainly live on my laptop I use Jungledisk to back up the important ones automatically every day. When I signed up the program cost $20, and my storage costs are about $0.50 a month. Today you have to pay at least $2/month plus the storage fees. Not bad, but after a couple of years those fees are going to add up.

When I used shared hosting I backed up the files sporadically and emailed a database dump to my Gmail account daily. This worked perfectly well, as the files rarely changed and Gmail was able to hold several hundred copies of the database for my little blog. When I needed to I could simply go in and delete last year's backup emails.

Recently, though, I've started renting a VPS from Linode, and I'm now in the position where both the files and the database change frequently. I need a way to back up both and, as I'm lazy, I want it to be automated. I started looking around for information on how other people were handling this.

The Plan

I came across a post from John Eberly discussing how he automates his backups to Amazon S3. This looked like a good place to start, but I was sceptical about how rsync would work with S3 as described, and it kept only one backup. Based on this I formulated the following plan. At the start of each week, copy the directories to be backed up to a temporary directory using rsync, encrypt the copy using gnupg and push the resulting file to S3. On each subsequent day, make a differential backup using the batch mode of rsync, encrypt it and push it to S3. Repeat at the start of the next week.

After putting a surprisingly short script together I had a working approach. Except nothing was actually being pushed to S3. I still need to investigate why this was happening, but it isn't at the top of my list of things to do as I have since found a far better way to handle my backups.
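For the curious, the plan can be sketched roughly as follows. This is not the script I wrote, just an illustration under stated assumptions: the scratch directory, the GPG recipient, the function names and the choice of Monday as the start of the week are all hypothetical, and the upload step is left out because it depends on the S3 client in use.

```shell
#!/bin/bash
SRC=/home/streety                  # directory to back up
WORK=/var/tmp/backup-work          # assumed scratch directory
GPG_RECIPIENT=backup@example.com   # assumed key to encrypt to

# backup_kind DAY_OF_WEEK: full on day 1 (Monday in `date +%u`), else differential.
backup_kind() {
    if [ "$1" -eq 1 ]; then echo full; else echo diff; fi
}

# make_full DATE: refresh the local copy, then archive and encrypt it.
make_full() {
    rsync -a --delete "$SRC/" "$WORK/full/"
    tar -C "$WORK" -cf - full |
        gpg --encrypt --recipient "$GPG_RECIPIENT" --output "$WORK/full-$1.tar.gpg"
}

# make_diff DATE: record changes since the weekly copy as an rsync batch file.
# --only-write-batch leaves the weekly copy untouched, so every daily
# batch is relative to the same full backup.
make_diff() {
    rsync -a --delete --only-write-batch="$WORK/diff-$1.batch" "$SRC/" "$WORK/full/"
    gpg --encrypt --recipient "$GPG_RECIPIENT" \
        --output "$WORK/diff-$1.batch.gpg" "$WORK/diff-$1.batch"
}

# run_backup: pick the right kind for today; uploading the .gpg file is omitted.
run_backup() {
    today=$(date +%F)
    if [ "$(backup_kind "$(date +%u)")" = full ]; then
        make_full "$today"
    else
        make_diff "$today"
    fi
}
```

Restoring a given day would mean decrypting the weekly archive, unpacking it, and applying that day's batch with rsync --read-batch.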

Tarsnap

I'm not an expert at backups. Nor am I a security expert. Nor am I interested in becoming an expert at either backups or security. This means someone has likely already built a better backup utility than I could. I believe I have found it in Tarsnap. Below is a list of what tarsnap does. I've highlighted the features which take it above and beyond my approach.
  • Multiple backups
  • Backups on my schedule
  • Files are encrypted
  • Utility pricing - pay only for what you use with no standing charges
  • Open source - I can check that only what I want to happen is really happening
  • Efficient - backups take up no more space than my full+differential strategy and yet each backup can be manipulated independently of any other backup
  • Permissions - With tarsnap I can allow my server to create and read backups but not delete them
The efficiency is nice, but the difference between $0.50/month and $0.60/month isn't a massive deal. What is a big deal is the permissions. Backing up my files anywhere with an online connection has always made me slightly uneasy. Email works well because once an email is sent it can't be called back. If you want to back up to Amazon S3 you have to give unrestricted access to read, edit and delete, which means it is possible to lose all your backups. Tarsnap is not vulnerable to this weakness, and that is a big deal. It's one less thing to worry about, which is certainly worth the $0.15/GB premium over S3 alone. My next post will detail how I have implemented backups using tarsnap.

Developing the Alectruino clock - Part 1

Side-on view of the breadboard setup

I have recently returned to a project I began last winter. The project is to design an alarm clock which eases waking up on these dark and dismal winter mornings. The idea is fairly simple. About 30 minutes before I want to wake up, a bank of LEDs will begin to shine, simulating the dawn. This will hopefully prime me for the alarm itself. Last winter I got the timekeeping working with a DS1307 real-time clock and was able to control the LEDs through an Arduino.

Although I'm keeping the real-time clock, I'm switching the LEDs from white to blue light and adding an LCD to display the time and, most likely, to allow programming. So far I have soldered up the bank of LEDs so they are almost ready to go, and set up the Arduino to pull the current time from the DS1307 and display it on the LCD. The project is going well. Hopefully this year I'll actually finish it while it is still of some use. When spring comes around I'll let nature do the work.

Aerial view of the breadboard setup

The code is straightforward and does little more than combine code written by others, specifically the tutorials for the LiquidCrystal library and this tutorial on interfacing with the DS1307. The code can be viewed here.
