
Website Backup Over SSH

I recently enabled ssh shell access to my Hostmonster-hosted website account. Previously, I'd relied on cPanel for all my maintenance. Shell access puts all the power of shell commands and scripting at your fingertips. I've come up with a nice (for me) automated backup process for the site, which I present after the jump. But be gentle; my bash scripting fu is weak.

In order for this to work, ssh access has to use public/private key authentication so the scripts can connect without a password prompt. What I've done is create a bash script on my host account that builds a backup archive of my website, which is then downloaded to my computer here at home. The whole process is kicked off by a local bash script that I run via cron. Both scripts consist of only a few lines of commands.
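If you haven't set up key authentication yet, the usual recipe looks something like this (the username and hostname here are placeholders, not my actual login details):

ssh-keygen                       # generate a key pair; leave the passphrase empty for unattended cron runs
ssh-copy-id username@mysite.org  # install the public key on the host account
ssh username@mysite.org          # should now log in without a password prompt

A passphrase-less key (or one loaded into an agent) is what makes the automated, unattended runs possible.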

First, the remote script:

#!/bin/bash

# Paths and database credentials (edit these to match your account)
SITEFILES=/path/to/sitefiles
DB_DUMP=backup_dump
ARCHIVE=backup-website.tar
DB_USER=username
DB_PW=password
DB_NAME=database

# Dump the site database to a plain SQL file; bail out if the dump fails
mysqldump --user=$DB_USER --password=$DB_PW $DB_NAME > $DB_DUMP
rc=$?
if [ $rc -ne 0 ]; then
    exit $rc
fi

# Archive the site files, append the dump, compress, then remove the dump
# (-f lets bzip2 overwrite a stale .bz2 left behind by an interrupted run)
tar -cf $ARCHIVE $SITEFILES && tar -rf $ARCHIVE $DB_DUMP && bzip2 -f $ARCHIVE && rm $DB_DUMP

exit

This script goes into the home directory on the host account; for the purposes of this post it's named remote-backup-script.sh (and it needs to be marked executable with chmod +x).

The mysqldump command creates a file that can be used to restore the database with only a few lines of mysql commands (I've tested and confirmed this). After making sure the dump succeeded, the script builds the archive with tar, appends the database dump to it, compresses the archive into a bz2 file, and finally removes the dump file. Notice the fixed archive name (no timestamp or other variable naming element). That'll be key for the second part of the process.
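For reference, restoring from that dump is essentially the dump run in reverse. Assuming the same credentials and an existing (possibly empty) database, something along these lines does the job:

mysql --user=username --password=password database < backup_dump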

The script that runs locally is similarly simple:

#!/bin/bash

# The remote host (as ssh knows it) and the archive the remote script produces
HOST=mysite.org
RFILE=backup-website.tar.bz2

# Build a fresh archive on the remote host; bail out if that fails
ssh $HOST './remote-backup-script.sh'
rc=$?
if [ $rc -ne 0 ]; then
    exit $rc
fi

# Download the archive, remove the remote copy, then save a timestamped local snapshot
rsync $HOST:$RFILE ~/ && ssh $HOST rm $RFILE && cp ~/$RFILE ~/$RFILE.$(date +%I%M%m%d%Y)

exit

This script takes advantage of ssh's ability to pass a shell command to the remote host. The command always runs on the remote machine; what the quoting controls is where variables get expanded first. In the first ssh command the single quotes pass the string through untouched, so ./remote-backup-script.sh refers to the script sitting in the home directory on the host. In the second ssh command the argument is deliberately left unquoted so that $RFILE is expanded by the local shell before the command is sent; if it were wrapped in single quotes, the remote shell would try to expand the variable instead, and it isn't defined there.
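A quick way to see the difference is a pair of one-off commands like these (not part of either script, just an illustration):

RFILE=backup-website.tar.bz2
ssh mysite.org ls -l $RFILE      # unquoted, as in the script: the local shell expands $RFILE first
ssh mysite.org 'ls -l $RFILE'    # single-quoted: the remote shell expands $RFILE, and it's empty there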

The nice part here is the use of rsync. The first time this script runs it will take a while, depending on the size of the archive, because the whole thing has to come across and the local copy of backup-website.tar.bz2 has to be created. Once that initial copy exists, though, subsequent runs execute much more quickly.

Why? Because the local script doesn't remove the local copy. It keeps it around in order to take advantage of rsync's delta-transfer algorithm, which only sends the differences between the remote file and the local one. (As noted above, the remote script always creates a file with the same name.)
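If you want to see how much that saves, rsync can tell you. Adding --stats (or just -v) to the rsync call in the local script prints transfer totals, and on runs after the first the byte counts it reports should be far smaller than the archive itself:

rsync --stats $HOST:$RFILE ~/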

The final steps delete the archive on the remote host and copy the local archive to a file with a timestamp in its name. The end result is an up-to-date copy of the archive plus a series of timestamped files that build up a history.

A final note on both scripts: both essentially abort on errors. If the remote script fails, for whatever reason, the local script catches the non-zero exit status in its if statement and skips the rest of the run. The final lines use the && operator for a similar effect: each command in the chain runs only if the one before it succeeded.

All that's left to do is insert the local script into the cron table for regular execution (a sample crontab entry is at the end of the post). Now I get regular, automated backups of my site. Because of the error handling, cron will send me an email notification if something goes wrong. The only part left is to occasionally prune the destination directory so it doesn't take up too much space. I can handle that manually, I think.

For now.
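For completeness, here's the sort of crontab entry that runs the local script nightly. The 2 a.m. schedule, the path, and the local-backup-script.sh name are all placeholders rather than my actual setup; crontab -e opens the table for editing:

0 2 * * * /home/username/local-backup-script.sh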
