iA


Back-up

by Jens-Christian Fischer. Average Reading Time: about 6 minutes.

This is one of the things that I should have been setting up a long time ago, but haven’t. Yesterday was the day. Remote Backup of my PowerBook.

Goal:

Continuous off-site backup of the most important files (my home directory) on my laptop

Ingredients:

  • Strongspace (I got a 4 GiB account for $8 / month)
  • rsync (comes bundled with OS X)
  • Ruby (comes bundled with OS X)
  • SSHKeychain

Setting up a Strongspace account is easy. Strongspace gives you secure offline storage that you access through SFTP or anything with SSH (like rsync). The cost is negligible if you care about what you have on your laptop.

Set up key based login

Drop to a terminal on OS X. Create a dsa key (if you have one, you will know what to do anyway)

$ ssh-keygen -t dsa -b 1024

Leave the default options for where the key is stored (in ~/.ssh/id_dsa) and pick a password.

If you want to use the automated backup as described below, you must leave the password empty (I’m trying to find a better solution to this, but for the time being, that’s it)

You could leave the password empty, but this poses a security risk, because the key you just created identifies your machine to strongspace. Anyone getting access to the private key (the file id_dsa) could impose you. So protect it with a passphrase and let SSHKeychain handle the password handling (using the normal OSX Keychain)

$ cp ~/.ssh/id_dsa.pub authorized_keys

Now you have a file called authorized_keys containing the public key part of the key you just created. Copy this file to Strongspace using the web interface. In your Strongspace account, create a directory called .ssh (note the . dot in front of ssh) and copy that file to there. Now rsync will not ask for a password.

SSHKeychain (that lives in your menu bar) will take care of protecting your private key and make the keys password available to SSH

Create a Strongspace backup directory

I wanted to set up one directory that contains all the stuff I want to backup and have some control over what is included. The simplest thing would have been to backup the complete /Users/name directory, but I’m not backing up my videos, images and music to strongspace. Therefore I created a directory inside my home directory and use Unix file-system links to link in any thing I want:

$ mkdir ~/strongspace
$ cd ~/strongspace
$ ln -s ../Documents
$ ln -s ../Desktop
$ ln -s ../.ssh
$ ln -s ../any_other_directory_you_want__to_backup

First Sync

Let’s try and see if everything works as expected: Type this into a shell

$ rsync -azvCL –exclude=’*.mpg’ –exclude=’Cache/’ –exclude=’Caches/’ –progress ~/strongspace/ account@account.strongspace.com:/home/account/backup/computer/users/name/

all the bold items in that command line are those you should adapt to your environment:

account is your strongspace login

computer is the name of your computer

name is your user name

The options for rsync:

a: archive mode (recurse into subdirectories and preserve ownership and permission of files)

z: compress the data over the wire

v
: verify the data

C
: ignore files CVS style (don’t worry if you aren’t using CVS. This setting just ignores common artefacts from programming and various temporary files)

L
: follow links (we absolutely want this, because this is what makes our strongspace directory work. Otherwise, we’d only have a directory backed up with the links, but not the directories hiding behind those links)

The –exclude option excludes files (all files ending with mpg) and directories ( I’m not particularly interested in the various cache directories)

–progress gives you a bit more information to look at while the initial rsync is running – it could take quite some time

~/strongspace/ tells rsync from which directory to sync, while account@account.strongspace.com:/home/account/backup/computer/users/name/ tells it where to sync to

Prepare to wait quite a while while the initial rsync takes place (depending of course how many files and directories you have linked into your strongspace directory)

Setting up regular sync

Any backup that needs manual intervention is bad. Therefore we want to set this up to work in the background whenever the computer is online.

cron will handle this for us. We could just enter the commandline we just saw above into crontab (the file that controls which programs are started when), but we’ll get a bit more fancy and write a wrapper around this command that gives us some more flexibility.

I expanded a script I found over at textsnippets.com:

The key difference to the one found on textsnippets is the simple configuration (by expanding the EXCLUDES and DIRS variables). And important for me: This script checks if it is connected to the internet before it starts the rsync process. In addition it checks if the computer is connected through a dial up interface (in my case GPRS EDGE using my mobile phone. Syncing large quantities of data over the air is both slow and expensive) and aborts if that’s the case.

Download and save

ss.rb

this script to ~/bin/ss.rb or any other directory/file in your home directory. Open it with an editor and change the following lines to match your account:

# configuration
ACCOUNT = 'account'
EXCLUDE = [ '*.mpg', 'Cache/', 'Caches/']

DIRS = { '~/strongspace' => 'backup/computer/users/name',
         '/etc' => 'backup/computer/etc' }

instead of account use your strongspace account name

I use the distinction computer and name to separate backups from multiple computers and users, feel free to change that

You can try it out by dropping to a shell and typing:

ruby ~/bin/ss.rb

That should print out something like:

06.10.05 10:18: syncing ~/strongspace -> backup/arwen/users/jcf
/usr/bin/rsync -azvCL --delete --exclude='*.mpg' --exclude='Cache/' --exclude='Caches/' ~/strongspace/ account@account.strongspace.com:/home/account/backup/computer/users/name/

lots and lots of info about files it has backed up

06.10.05 10:18: syncing /etc -> backup/arwen/etc
/usr/bin/rsync -azvCL --delete --exclude='*.mpg' --exclude='Cache/' --exclude='Caches/' /etc/ account@account.strongspace.com:/home/account/backup/computer/etc/

Scheduling

Remember: Only backups that you don’t have to do yourself are good backups. So let’s get the computer doing some work for us. I like to run this script every hour, but you can of course change that to whatever cycle you like.

We use cron, the daemon available on any unix system, to trigger the script when it’s time:

$ crontab -e

This opens an editor window. Normally that’s vi, the visual editor. Mastery of vi is considered a black art – and I’m not going to teach it to you: just do the following:

press i to get into insert mode.

Type the following (changing name to your user name of course):

01 * * * * ruby /Users/name/bin/ss.rb >> /Users/name/var/log/ss.log

then press ESC to get into command mode. Now press the following keys: : w q (colon w q). That saves the file and exits vi. You now get a message that the crontab has been altered. At the shell prompt, type

$ crontab -l

and you should see the listing of what you just typed.

We told cron that we want our script to run 1 minute after every hour, every day, every month and every weekday. Check out a quick reference for more options. Also we told cron, that we want a logfile in the directory ~/var/log. This needs to be created:

$ mkdir -p ~/var/log

(the -p option tells mkdir to create any intermediate directories (var in this case) too)

Now you are all set and regular backups should occur 1 minute after the hour.

Have fun!

Technorati Tags: , , , ,

16 comments on ‘Back-up’

  1. Jonathan says:

    Take a look at rsnapshot. Its a wrapper around rsync for incremental backups that holds several versions to the past.

    http://blog.innerewut.de/articles/2005/05/25/remote-filesystem-snapshots-with-rsnapshot

  2. Actually I looked at rsnapshot (and would have loved to use it), but Strongspace is the remote machine. And rsnapshot does not allow me to to a backup TO a remote machine, only pull data FROM a remote machine.

    In that case, it does not help me.

    I have heard that the Strongspace people are looking at supporting some kind of incremental backup solution though. I’m the first to sign up for it

  3. Great guide,
    worked immediately for me. I have on question though, I want to do this also the other way around; updating my Mac from Strongspace after I’ve worked on another computer.

    I tried it by reversing the source and destination in the rsync command, which worked, but with one caveat: he overwrites the symbolic links with the real directories… You happen to know how to make it so rsync doesn’t do this? (with one command, I could also execute rsync in each directory but that would defeat the elegance of the symbolic links)

  4. Charles says:

    Thanks for taking the time to post such a clear set of instructions. I had my backups up and running in a short time.

  5. magpiebrain says:

    Trouble free backups, Part Two – SSH Keys for trouble free authentication

    Where I introduce SSH keys into our backup solution and talk about SSHKeychain

  6. magpiebrain says:

    Trouble free backups, Part Two – SSH Keys for trouble free authentication

    Where I introduce SSH keys into our backup solution and talk about SSHKeychain

  7. Robert Andrews says:

    Your script at http://blog.invisible.ch/ss.rb seems to be 404.

    Shame. I was hoping to try it out. :( Had got that far.

  8. The 404 is fixed. Thanks for pointing this out.

  9. Michael LeBarron says:

    I tried to do this and all I get is

    rsync: Failed to exec ss: No such file or directory (2)
    rsync error: error in IPC code (code 14) at /SourceCache/rsync/rsync-20/rsync/pipe.c(83)
    rsync: connection unexpectedly closed (0 bytes received so far) [sender]
    rsync error: error in rsync protocol data stream (code 12) at /SourceCache/rsync/rsync-20/rsync/io.c(359)

    For the life of me I haven’t been able to figure out what “file or directory” there is “no such” of :)

    unless there’s just something wrong with the version of rsync I have… Anyone? Please?

  10. Hugh Macmillan says:

    I notice that you can get this error by only putting a single hyphen infront of non single character switches. e.g. ‘progress’ requires a double hyphen!

  11. Cate says:

    That error is a result of the fact that the ss.rb file says “# normally, don’t change after here” and after that has a PID file path defined which you MUST change to this:

    (AFTER creating the directory /Users/username/var/run/ )

    PID = ‘/Users/username/var/run/ss.pid’

    Obviously ‘username’ is your local username in OS X.

    . . .

    In other news, I could not get cron to work either, but I did get iCal to do it, after creating an Applescript application that iCal could open as its “alarm” that said:

    on run

    set carriage_return to return as text

    tell application “Terminal”

    set theCommand to “ruby” & ” ” & “/Users/username/ss.rb”

    do script with command theCommand & carriage_return

    end tell

    end run

    …..of course you’ll have to edit the “/Users/username/ss.rb” path to wherever you put ss.rb.

  12. Cate says:

    Obviously above, anywhere you see a double-quote escaped with a backslash, remove the backslash — and replace the & character entities with real ampersands again.

    And don’t forget “on run” at the beginning and “end run” at the end. They weren’t properly included in the fixed-width text of the rest of the script as seen here.

  13. nsidney says:

    I am soooo confused on the “password-less” rsync part. I’ve created the keys and copied them to the server… but I still get asked for my password. I feel like I’m missing something that has to do with SSHKeychain (i.e. do I ever need to touch the GUI for SSHKeychain?)

    Any ideas?

    Thanks!

  14. Administrator says:

    nsidney, when you have copied the keys, you should be able to connect to the server directly. The test is to use the shell and try something like:

    scp somefile myaccout.strongspace.com

    and that should work fine.

    Did you create a directory called .ssh over on strongspace? Did you copy the correct key?

    SSHKeychain needs to know about the stronbspace key. Go to the configuration and add that key on the tab “SSH Keys”

  15. Jeroen janssen says:

    I have a weird problem.. everything works fine a couple of times but after that it stops working. When I delete ss.pid everything works a couple for a while..

    I have no idea why this is.

  16. [...] We use Joyent (affiliate link) for our mail (born out of Textdrive). Last weekend, the Joyfolks updated the application to a new version (and had quite a few problems in the process). One of the new features is Strongspace integration. Strongspace is a secure storage on the Net (accessible through SFTP, SCP and the like) that I have used for online backups for quite a while. [...]

Leave a Reply

  1. Subscribe to this comment