Using Dirvish for Disk-to-Disk Backups: Part II
Keith Lofstrom
Dirvish is a disk-to-disk backup program, a Perl wrapper around
the rsync file system network copy program. Using Unix-style hard
links, dirvish and rsync allow successive backup images to share
identical data files and occupy the same disk space, greatly reducing
backup time and backup disk usage.
My first article about dirvish in the January issue of Sys
Admin (http://www.samag.com/documents/s=9464/sam0501b/)
described how I inherited this open source program and where I plan
to take it. In this article, I'll describe how I use dirvish for
my Linux systems. If you are running Windows or Mac clients, you
will need to follow the directions at the rsync site for configuring
rsync and ssh on those platforms. However, the setup for the dirvish
backup server is almost exactly the same.
The Backup Server
Once your clients are provisioned with rsync and configured for
ssh, they can be backed up by dirvish, unmodified and without further
configuration. The dirvish software and configuration reside completely
on the backup server. For your backup server, you will need:
- A Linux backup server with two (and preferably three) half-height,
5.25-inch drive bays available and a fairly recent kernel. OpenBSD
will also work well, but the instructions here require some adaptation.
- An IDE main hard drive.
- An IDE backup drive (two or three drives of 250 GB each are recommended).
- A duplicate IDE main hard drive, for full restore tests.
- A ViPower IDE UDMA 133 Super Rack (VP-10LSFU-133) for the main
drive.
- A ViPower IDE to USB 2.0 SwapRACK (VP-1028LSF) for the backup
drive.
- Extra ViPower IDE drive trays (VP-15F-100/133).
- A third rack (IDE or USB2) if you will be using your server
for client rebuilds.
- A USB2 card or internal USB2 connection. Older USB-1 connections
are far too slow.
- Ssh for the server and all clients, set up for private-keyed,
password-free communication. The latest ssh is available from:
http://www.openssh.org/.
- Rsync for the server and all clients. Since September 2004,
I have been using rsync 2.6.3. The latest version (for Unix, Linux,
Windows, and Mac platforms) can be downloaded from: http://samba.anu.edu.au/rsync/.
- Perl, with modules POSIX, Getopt::Long, Time::ParseDate, and
Time::Period. I am using Perl version 5.8.0. The most recent Perl
may be obtained at: http://www.perl.org/.
Hardware Setup
The hardware setup is easy. I suggest the ViPower swap racks
because they are commonly available and work well. A USB2-based
swap cage permits hot-swapping of hard drives without a reboot.
Direct IDE hotswap was added to the later 2.4.X series of Linux
kernels by Alan Cox and removed from the 2.6.X kernels (through
2.6.7) for reasons unknown. Until this reappears in 2.6.X, use
the USB2 interface for hot swap.
However, be very careful which USB2 cage you use. Numerous
external drive boxes, as well as the SanMax Inclose internal
USB2 cage, use the Cypress CY7C68013 chipset, and these have
an incompatibility with current Linux kernels that causes them
to hang during large data transfers. The kernel code will eventually
get fixed, but meanwhile you should use a USB2 cage with known
good behavior.
SCSI and SATA drives and cages may also work with hot swap;
I have not tried them. If you have experience with this, please
let me know at:
http://wiki.dirvish.org/
Parallel IDE drive controllers come in two flavors: the older
LBA-28, and the newer LBA-48. LBA-28 is limited to hard drives
with fewer than 2^28 512-byte sectors, or 137 GB, while LBA-48 permits
up to 2^48 sectors, or 144 million GB. Some external USB2 enclosures
will not work with the larger drives needed for backup, and some
older motherboards will not either. An LBA-48 IDE drive controller
card, such as the Promise TX2-133, will work with the larger drives
and run faster as well.
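The two capacity limits follow directly from the sector counts. Here is a minimal shell sketch of the arithmetic, assuming 512-byte sectors and decimal gigabytes (10^9 bytes). The function names are only illustrative:

```shell
#!/bin/bash
# Capacity limit, in bytes, of an LBA scheme with the given number of
# address bits, assuming 512-byte sectors.
lba_limit_bytes() {           # arg: number of address bits
    echo $(( (1 << $1) * 512 ))
}

# The same limit in decimal gigabytes, rounded down.
lba_limit_gb() {              # arg: number of address bits
    echo $(( $(lba_limit_bytes "$1") / 1000000000 ))
}

echo "LBA-28: $(lba_limit_gb 28) GB"
echo "LBA-48: $(lba_limit_gb 48) GB"
```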
The main drive and the backup drive in the backup server should
be in identical removable trays. This permits rapid swapping
for restores. I like the version of SuperRack with the slide
switch, rather than the key, because it speeds up backup drive
swapping. Every second shaved off the backup drive swap makes
it more likely to get done.
Partitions
With the hardware set up and the drives installed, format
the backup drive with three partitions: a 5-GB bootable partition,
a 2-GB swap partition, and the remainder for the backup images.
Just about any Linux distro will work in the 5-GB partition,
as long as it is reasonably secure. I use Fedora Core 1, with
all services except ssh turned off. I might want to access the
Web, recompile a program, print something out, or add other
features in the future, so a large boot partition allows for
a lot of future flexibility. You can either do a from-CD install
for the boot partition or copy the images from another machine.
The main thing is to make sure that there are suitable X drivers,
USB drivers, and Ethernet drivers for all the different hardware
setups that the bootable partition might be expected to work
with in the future.
The backup partition will be huge and will have enormous numbers
of inodes and directories. When it is populated with images,
it will take a very long time to fsck. If you are using Linux,
you may have the best luck with version 3 of reiserfs (version
4 is still experimental). Reiser makes much more efficient use
of space for directories and inodes. However, if you would rather
use a traditional file system like ext3, with a fixed inode
count, be sure to build the partition with more than the default
number of inodes. One inode per 8 KB data is the default, while
1 inode per 4 KB is better. Otherwise, you will run out of inodes
long before you have used up your data space. For a Linux ext3
partition, you can set the number of data bytes per inode with
the -i 4096 option to mkfs.
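To see what the ratio means in practice, the following sketch estimates the inode count at both settings. It assumes a 243-GB backup partition (what is left of a 250-GB drive after the 5-GB and 2-GB partitions) and treats the count as simply size divided by bytes per inode; the real mkfs rounds somewhat differently. The mkfs command is only printed, not run:

```shell
#!/bin/bash
# Approximate number of inodes for a partition of the given size in
# bytes at a given bytes-per-inode ratio (the value passed to -i).
inode_count() {               # args: partition_bytes bytes_per_inode
    echo $(( $1 / $2 ))
}

# Assumed size of the backup partition, in bytes.
part_bytes=$(( 243 * 1000000000 ))

echo "default (-i 8192): $(inode_count "$part_bytes" 8192) inodes"
echo "denser  (-i 4096): $(inode_count "$part_bytes" 4096) inodes"

# Printed only; /dev/sda3 is the example backup partition used later
# in this article.
echo "mkfs -t ext3 -i 4096 /dev/sda3"
```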
Once created and populated, copying the files in a dirvish
backup partition to add space or inodes is impractical. While
a "dd" or other direct partition copy is reasonably fast, a
file-level copy above the virtualization layer can take a week
or more, due to all the chasing of hard links. Such file-level
copies (such as those performed by tar, dump, cp, rsync, or
cpio) are the equivalent of reading and writing every overlaid
image, separately and in sequence. Thus, a 250-GB hard disk
containing 200 100-GB image sets will read and write 20,000
GB while copying. So, it is critical to create the right kind
of backup filesystem the first time.
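The size of that penalty is easy to estimate. This sketch follows the reasoning above, in which every image is read and written in full. The 30 MB/s sustained throughput is purely an assumed figure for illustration:

```shell
#!/bin/bash
# Data volume in GB that a file-level copy must move when every image
# is handled in full, regardless of hard-link sharing.
copy_volume_gb() {            # args: image_count gb_per_image
    echo $(( $1 * $2 ))
}

# Whole days needed at a sustained rate in MB/s, rounded down.
copy_days() {                 # args: total_gb mb_per_sec
    echo $(( $1 * 1000 / $2 / 86400 ))
}

total=$(copy_volume_gb 200 100)
echo "volume: $total GB"
# 30 MB/s is an assumption, not a measurement.
echo "days at 30 MB/s: $(copy_days "$total" 30)"
```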
After you have built the backup drive (and you can always
tweak it while booted from your main drive), it is time to prepare
for dirvish. Your distribution probably came with an older version
of rsync, but it is important to download and install the latest
version from the rsync Web site. This will take care of the
most recent security problems. You should also make sure your
ssh and sshd packages are up to date.
Setting Up the Software
Let's assume we are preparing one backup server named "Mom"
and two backup client machines named "Dick" and "Jane". Mom
can be almost any flavor of Unix, while Dick and Jane can be
Linux, Unix, Windows, or OS X Macintosh -- anything that will
run rsync through an ssh connection. For simplicity, we will
assume Mom, Dick, and Jane are all Linux machines. Once you
have ssh and rsync running on other types of machines, backing
them up with dirvish works almost identically.
Mom will have a main drive, /dev/hda, and a backup drive,
/dev/sda. For convenience, we will symbolically link /dev/backup
to the backup partition on the backup drive, for example /dev/backup
-> /dev/sda3. The backup drive will get nightly backup images
from Jane and Dick, and Mom as well, at 4 a.m. every night.
Two 250-GB backup drives, labeled A250 and B250, will be alternately
rotated to a fire-resistant safe.
We have set up basic networking between the machines, and
Mom must be able to ping Dick and Jane by name. Note that Mom
does not need to be ping-able by Dick and Jane. Our backup
server may be more secure if it is firewalled to ignore incoming
traffic. However, all machines must have running sshd daemons,
including Mom, even if her firewall rules let only Mom ssh to herself.
Why does Mom ssh to herself? Rsync is designed to copy between
networked machines and uses ssh/sshd for everything. Although
a special case could be created for copying between partitions
on the same machine, the bottleneck is usually disk speed. The
overhead of ssh/sshd is small by comparison.
Preparing ssh
Setting up ssh and sshd is beyond the scope of this article.
For key-authenticated (no password) ssh to work, there must
be copies of the host keys for Mom, Jane, and Dick in the /root/.ssh/known_hosts
file on Mom. These are created the first time you do an ssh
from Mom to Jane and Dick. No-password authentication is more
difficult; there must be a copy of Mom's root public key on Mom, Jane,
and Dick in /root/.ssh/authorized_keys2. The original key is
in Mom's file /root/.ssh/id_rsa.pub or /root/.ssh/id_dsa.pub,
generated by the ssh-keygen command.
When ssh and sshd are set up correctly, you should be able
to ssh from Mom into any machine (including Mom) without supplying
a password.
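A quick way to confirm this is a loop like the one below, run as root on Mom. It is only a sketch: the function names are made up for this example, and it relies on the ssh option BatchMode=yes, which makes ssh fail rather than prompt when a password or an unknown host key would otherwise stop it:

```shell
#!/bin/bash
# Sketch: confirm that root on the backup server can log in to each
# machine without a password. Host names follow the article's example.
SSH_CMD=${SSH_CMD:-ssh}

# BatchMode=yes makes ssh fail instead of asking for a password.
check_passwordless() {        # arg: host name
    $SSH_CMD -o BatchMode=yes "root@$1" true
}

# Report the result for all three example machines.
check_all() {
    for host in mom dick jane; do
        if check_passwordless "$host"; then
            echo "$host: ok"
        else
            echo "$host: FAILED"
        fi
    done
}
```

Calling check_all should print "ok" for every host before you go any further.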
Preparing Perl
Most distros include Perl and install it as part of the base
system. Dirvish is a collection of Perl scripts, and we will
use the CPAN archive to download needed modules to the local
Perl library:
root@mom# perl -MCPAN -e shell
cpan> install POSIX
cpan> install Getopt::Long
cpan> install Time::ParseDate
cpan> install Time::Period
cpan> quit
(much verbiage omitted)
Any or all of these modules may already be installed in our Perl
libraries. If so, we will be informed, and the CPAN install will
continue. We now have all the Perl modules needed to operate
dirvish.
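To check the result without going back into the CPAN shell, a short loop can try to load each module. This is a sketch with made-up function names; it assumes only that perl is on the PATH:

```shell
#!/bin/bash
# Succeeds if the named Perl module can be loaded.
have_module() {               # arg: Perl module name
    perl -M"$1" -e 1 >/dev/null 2>&1
}

# Report the state of the four modules dirvish needs.
check_modules() {
    for mod in POSIX Getopt::Long Time::ParseDate Time::Period; do
        if have_module "$mod"; then
            echo "$mod: installed"
        else
            echo "$mod: missing"
        fi
    done
}
```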
Installing and Configuring Dirvish
Download the gzipped tar file for dirvish itself, from the
dirvish site:
http://www.dirvish.org/
Put it into /tmp. This article assumes dirvish version 1.2, but
later versions will be back-compatible. Untar the tar file into
/usr/local/lib, then install it:
root@mom# cd /usr/local/lib
root@mom# tar -zxvf /tmp/dirvish_1.2.orig.tar.gz
root@mom# cd Dirvish-1.2
root@mom# sh install.sh
perl to use (/usr/bin/perl)
What installation prefix should be used? () /usr/local
Directory to install executables? (/usr/local/sbin)
Directory to install MANPAGES? (/usr/local/man)
Configuration directory (/etc/dirvish)
Perl executable to use is /usr/bin/perl
Dirvish executables to be installed in /usr/local/sbin
Dirvish manpages to be installed in /usr/local/man
Dirvish will expect its configuration files in /etc/dirvish
Is this correct? (no/yes/quit) yes
Executables created.
Install executables and manpages? (no/yes) yes
installing /usr/local/sbin/dirvish
installing /usr/local/sbin/dirvish-runall
installing /usr/local/sbin/dirvish-expire
installing /usr/local/sbin/dirvish-locate
installing /usr/local/man/man8/dirvish.8
installing /usr/local/man/man8/dirvish-runall.8
installing /usr/local/man/man8/dirvish-expire.8
installing /usr/local/man/man8/dirvish-locate.8
installing /usr/local/man/man5/dirvish.conf.5
Installation complete
Clean installation directory? (no/yes) no
Now we need to configure dirvish and create some cron scripts,
etc. Our three machines have the following partitions, which we
will back up into separate dirvish "vaults":
Mom: /boot /usr /home / (containing the rest of the root directories)
Dick: / (one main partition)
Jane: / /home (two partitions)
We will only back up the normal OS directories on Mom's main disk.
The backup directory and the redundant boot directory on the second
(backup) hard drive should not be backed up, of course.
Mount /dev/backup to /backup. In the /backup partition, we
will create the following directory tree with seven vaults:
/backup/dirvish/mom/mom-boot
.../mom/mom-usr
.../mom-home
.../mom-root
.../dirvish/dick/dick-root
.../jane/jane-root
.../jane-home
Each directory will contain a "dirvish" directory, and all directories
will be permission 700 (root only). The following commands do
this:
root@mom# cd /backup/dirvish
root@mom# for i in boot usr home root
> do
> mkdir -p mom/mom-$i/dirvish
> done
root@mom# mkdir -p dick/dick-root/dirvish
root@mom# mkdir -p jane/jane-root/dirvish
root@mom# mkdir -p jane/jane-home/dirvish
root@mom# chmod -R 700 /backup/
Note: Do NOT do this chmod after populating with data!
Next, we populate each of those directories with a default.conf
file. Here are the contents of /backup/dirvish/jane/jane-root/dirvish/default.conf:
client: jane
tree: /
xdev: true
index: gzip
image-default: %Y-%m%d-%H%M
exclude:
    /proc/
    /mnt/
    isoimage
    *.iso
    /var**/tmp/
    /var/*/*.oldlog
This tells dirvish Jane's Internet address, where to start the
backup tree, not to cross mount points, use gzip to store the
daily backup indexes, use YYYY-MMDD-HHMM to label the images,
and to exclude the proc directory from the image, as well
as any subdirectories or files named "isoimage". If we wanted
to combine both the / and /home partitions into one backup image,
we would permit dirvish to cross mount points with "xdev: false".
The exclude directories are listed one per line on indented,
separate lines from the exclude: statement itself. While
the "list" format reflects dirvish's Perl heritage, the entries
are not Perl regular expressions but rsync exclude patterns.
If preceded by a slash, these are matched at the root of the
vault. If not, these match the end. So the /proc/ line would
match the /proc directory in the root file system or the /home/proc/
directory in the /home file system. The asterisk pattern match
is a little strange -- a single asterisk matches any string
of characters not containing a /, while the double asterisk
matches strings that can contain zero or more /.
So, the string isoimage will match all files named isoimage,
while *.iso will match all filenames with an .iso at the end.
/var**/tmp/ will match any directory named tmp that is a subdirectory
of the rooted directory /var, such as /var/tmp/, /var/foo/tmp,
/var/foo/fum/tmp, etc. /var/*/*.oldlog will match and exclude
files ending in .oldlog in the directories /var/foo/ and /var/fie/,
but not in /var/foo/fum/. There are more complicated features;
see the rsync man page.
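Because mistakes in exclude patterns are silent, it helps to try them on a scratch tree first. The sketch below builds a small, entirely hypothetical directory tree in a temporary directory and asks rsync, in dry-run mode, which files it would copy. It assumes a reasonably modern rsync that supports the --out-format option:

```shell
#!/bin/bash
# Sketch: list the files that survive a set of rsync exclude patterns.
# The sample tree and file names are hypothetical.
demo_excludes() (
    src=$(mktemp -d) || exit 1
    dst=$(mktemp -d) || exit 1
    mkdir -p "$src/var/foo/fum" "$src/var/fie" "$src/var/foo/tmp" "$src/home"
    touch "$src/var/foo/a.oldlog" "$src/var/fie/b.oldlog" \
          "$src/var/foo/fum/deep.oldlog" "$src/var/foo/tmp/scratch" \
          "$src/home/disc.iso" "$src/home/notes.txt"
    # -r recurses, -n is a dry run (nothing is copied); --out-format
    # prints the name of each item rsync would transfer.
    rsync -rn --out-format='%n' \
        --exclude='*.iso' \
        --exclude='/var**/tmp/' \
        --exclude='/var/*/*.oldlog' \
        "$src/" "$dst/"
    rm -rf "$src" "$dst"
)
```

Running demo_excludes should list home/notes.txt and var/foo/fum/deep.oldlog, but none of the .iso file, the tmp directory contents, or the .oldlog files one level below var.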
This command shows dirvish's Perl heritage; we are constructing
a "list" instead of a "scalar", and this simple approach to
constructing a Perl list is easy to write code for. I hope to
make future versions of dirvish more forgiving of configuration
style.
Similar default.conf files are constructed for all seven vaults.
It is all right to exclude non-existent directories, so we can
leave /proc/ and isoimage in all of them if convenient, or we
can add other excluded files or directories as needed. However,
always test your excludes, and keep the names in mind when you
create new files and directories.
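Writing seven nearly identical files by hand invites typos, so a small generator can help. The following is a sketch, not part of dirvish: the function names and the BASE variable are invented for this example, and it assumes every vault uses the same exclude list as jane-root. The tree for each vault follows the partition list given earlier:

```shell
#!/bin/bash
# Sketch: write a default.conf for each vault. BASE defaults to the
# article's /backup/dirvish; override it to experiment elsewhere.
BASE=${BASE:-/backup/dirvish}

write_default_conf() {        # args: client vault tree
    mkdir -p "$BASE/$1/$2/dirvish" || return 1
    cat > "$BASE/$1/$2/dirvish/default.conf" <<EOF
client: $1
tree: $3
xdev: true
index: gzip
image-default: %Y-%m%d-%H%M
exclude:
    /proc/
    /mnt/
    isoimage
    *.iso
    /var**/tmp/
    /var/*/*.oldlog
EOF
}

# One call per vault: client name, vault name, tree to back up.
write_all_confs() {
    write_default_conf mom  mom-boot  /boot
    write_default_conf mom  mom-usr   /usr
    write_default_conf mom  mom-home  /home
    write_default_conf mom  mom-root  /
    write_default_conf dick dick-root /
    write_default_conf jane jane-root /
    write_default_conf jane jane-home /home
}
```

Run write_all_confs once as root, then review and adjust each file for its vault.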
The backup disk is complete and ready to roll. Next, we prepare
the other big configuration file, /etc/dirvish/master.conf.
Here is a typical master configuration file:
bank:
    /backup/dirvish/mom
    /backup/dirvish/dick
    /backup/dirvish/jane
exclude:
    lost+found/
    core
Runall:
    jane-home 03:00
    jane-root 03:00
    dick-root 03:00
    mom-home  03:00
    mom-usr   03:00
    mom-root  03:00
    mom-boot  03:00
expire-default: never
# keep the Sunday backups forever, the dailies for 3 months
expire-rule:
#   MIN HR DOM MON DOW STRFTIME_FMT
    *   *  *   *   *   +3 months
    *   *  *   *   1   never
pre-server: /usr/local/sbin/dirvish-pre
post-server: /usr/local/sbin/dirvish-post
There are three banks, the same as our three machines, but we
can put the vaults into one bank, or many, dividing them as needed
to fit on multiple backup disk media.
We can also exclude files and directories from backup in the
master.conf file as well; these are added to the per-vault excludes
in our default.conf files.
Runall lists the vaults by name, in the order that they are
processed, and the second column gives the hour used to timestamp
the daily image name. This is not the run time for dirvish
but is a reference time a little in advance of it. Vaults should
be sequenced so the most important ones are backed up first,
and so that the machines needed earliest are finished first.
If Jane is your laptop machine, you want backups for Jane finished
early, so you can take her on that early-morning flight.
The expire-default specifies what to do if we do not match
the expire-rule just below. It is set to "never", because we
only expire if the expire rule tells us to. Expire is performed
by the /usr/local/sbin/dirvish-expire program, and you may choose
not to schedule that in cron, since dirvish-expire is time consuming
and disk space is cheaper than compute time.
The expire-rule list shown above is a bit complicated. It
tells us to expire all images older than three months, made
at any time, except for images made at any time on day 1, Sunday.
Again, this is interpreted by the dirvish-expire program only
when that is run; dirvish-runall ignores it.
The last two commands, pre-server and post-server, call user
scripts on the server, before and after every vault update.
There are two similar commands called pre-client and post-client,
run for each client. These scripts are called in this order:
pre-server, pre-client, rsync, post-client, post-server.
In the following example, dirvish-pre does nothing except
return with an exit 0, but in dirvish-post we will run a shell
script to gather some useful information:
#!/bin/bash
# /usr/local/sbin/dirvish-post
# Save each client's disk usage and partition map next to its image.
SFDISK='/sbin/sfdisk -d /dev/hdmain'
DF='/bin/df'
SSH='/usr/bin/ssh'
# variables
# DIRVISH_CLIENT provided from dirvish
# DIRVISH_DEST provided from dirvish
$SSH "$DIRVISH_CLIENT" $DF > "$DIRVISH_DEST/../df.out"
$SSH "$DIRVISH_CLIENT" $SFDISK > "$DIRVISH_DEST/../sfdisk.out"
exit 0
This saves the output of df and sfdisk along with the daily image,
which will help us rebuild the client disk if necessary.
I run dirvish from a master shell script, /usr/local/sbin/dirvish-daily,
called in the early morning hours by cron. Here is one way to
write it:
#!/bin/bash
# /usr/local/sbin/dirvish-daily
# this is called by /etc/cron.daily/backup
PATH=/sbin:/usr/sbin:/bin:/usr/bin:/usr/local/sbin
/sbin/hdparm -zb 1 /dev/backup
/bin/mount /dev/backup /backup
/usr/local/sbin/dirvish-runall
/bin/umount /dev/backup
/sbin/hdparm -b 0 /dev/backup
exit 0
The dirvish-daily shell script runs the actual dirvish Perl program
dirvish-runall and specifies its runtime environment. The daily
script takes the disk online and offline with hdparm to
make sure it is ready to be physically swapped during the day.
The hdparm command may not be necessary with some kernels
and hot-plug configurations. Disconnecting the disk also helps
protect it from rogue software and hides it from system crackers.
The previous two scripts have been simplified from my actual
configuration. I keep track of every machine that has been successfully
backed up, as well as which backup drive was used (A250 or B250),
and append that information to a file in /var/log. This helps
me determine which disk contains which evening's backups.
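A bookkeeping helper of that kind might look like the sketch below. It is not my actual script: the log file name, the function name, and the idea of passing the drive label as an argument are all assumptions made for illustration:

```shell
#!/bin/bash
# Sketch: append one line per backed-up vault to a log file.
# LOGFILE is a hypothetical name; adjust to taste.
LOGFILE=${LOGFILE:-/var/log/dirvish-drives.log}

log_backup() {                # args: drive_label vault status
    echo "$(date '+%Y-%m-%d %H:%M') drive=$1 vault=$2 status=$3" >> "$LOGFILE"
}
```

For example, log_backup A250 jane-root ok records that tonight's jane-root image went to drive A250.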
If Mom is a Red Hat system, there is a cron script that runs
updatedb, which constructs the database used by slocate. You
do not want to add all the zillions of files on the backup drive
to the slocate database, so you should add /backup to the -e
directories:
#!/bin/sh
# /etc/cron.daily/slocate.cron
renice +19 -p $$ >/dev/null 2>&1
/usr/bin/updatedb -f "nfs,smbfs,ncpfs,proc,devpts" \
-e "/tmp,/var/tmp,/usr/tmp,/afs,/net,/backup"
The runtime software is complete. Now we can make the initial
backup images. This will take hours (we are copying every single
byte on our small network), so we initialize all the vaults from
one last script:
#!/bin/bash
# /usr/local/sbin/dirvish-init
DIRV="/usr/local/sbin/dirvish --vault"
$DIRV mom-root
$DIRV mom-boot
$DIRV mom-home
$DIRV mom-usr
$DIRV dick-root
$DIRV jane-root
$DIRV jane-home
exit 0
Be sure to start this with plenty of time to complete; for 100
GB, this initial rsync copy can take about 8 hours and keep your
network very busy. After a dirvish backup, most processes on Mom,
Dick, and Jane will be swapped out to disk, so you will notice
some temporary delay in your desktop responsiveness.
We are just about done with configuration! We need a script
to schedule dirvish in the crontab. If Mom is a Red Hat-style
Linux system, we create another script, /etc/cron.daily/backup,
which is run at 4 a.m. with the rest of the daily scripts. This
file is very simple:
#!/bin/bash
# /etc/cron.daily/backup, run dirvish
/usr/local/sbin/dirvish-daily
Voila! You are done setting up backups. Your first images will
appear in the vault-level directories, timestamped with directory
names like 2005-0215-0300. In each image, there will be five files
and a directory:
sfdisk.out   # made by dirvish-post, contains partition map
df.out       # made by dirvish-post, contains partition usage
log          # a list of every file looked at by dirvish
index.gz     # a list of all the files that changed
summary      # describes what dirvish did with this image
tree         # contains the actual directory tree.
Tree will contain an identical image of the client filesystem,
except for the excluded directories and files. One small discrepancy
will be the modification dates for the symbolic links; rsync (like
cp and tar) cannot modify permissions or modification times for
the symbolic links it creates.
Our example used seven vaults; you will probably want to do
your first experiments with just one vault backing up one small
image, then expand it to your full network, which might use
hundreds of vaults. If, as part of your experimentation, you
call dirvish-runall twice in one day, it will refuse to write
over an identically named image. You can delete the target image
manually, however, and it will act just like an expire, allowing
you to repeat your experiments.
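When experimenting, it is handy to have the path of today's image printed for you. This sketch assumes the image-default format and the 03:00 Runall time from the examples above; the function name and the BASE variable are invented for illustration. The removal command is only printed, so nothing is deleted until you run it yourself:

```shell
#!/bin/bash
# Sketch: print the path of today's image for a vault, assuming the
# image-default format %Y-%m%d-%H%M and a Runall time of 03:00.
BASE=${BASE:-/backup/dirvish}

todays_image() {              # args: client vault
    echo "$BASE/$1/$2/$(date +%Y-%m%d)-0300"
}

# Print (do not run) the command that would remove today's image.
echo "rm -rf $(todays_image jane jane-root)"
```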
You can look in the tree directory and find all the files
in your original image. Unlike tape, a dump image, or a tar
file, these files are honest-to-goodness data files and directories,
and you can copy them back out of the image using normal Linux
tools. The downside is that you can write over them, delete
them (or at least one hard-link to them), or move them, and
destroy the integrity of the backup. This is the downside of
the hard-linking; you really only have one copy of each file.
This is another good reason for the swap-to-the-safe rotation
scheme, and you may decide to use three or even four rotating
backup drives if you are ultra-careful.
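The danger is easy to demonstrate in a scratch directory. In this sketch, two hypothetical "images" named monday and tuesday share one file through a hard link, and overwriting the file through one name changes what the other name shows:

```shell
#!/bin/bash
# Sketch: two "images" sharing one file through a hard link.
hardlink_demo() (
    dir=$(mktemp -d) || exit 1
    mkdir "$dir/monday" "$dir/tuesday"
    echo "original" > "$dir/monday/file"
    # Second name for the same inode; no data is copied.
    ln "$dir/monday/file" "$dir/tuesday/file"
    # Overwriting in place through one name alters the shared data.
    echo "damaged" > "$dir/tuesday/file"
    # Monday's copy now shows the overwritten contents.
    cat "$dir/monday/file"
    rm -rf "$dir"
)
```

Calling hardlink_demo prints the contents of Monday's file after Tuesday's was overwritten.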
Restore
Ah, but we are not REALLY done, are we? We do not have a backup
system until we verify that we can completely restore a disk
from our backups. I will discuss restoring files and whole drives
from backups in the third and final article on dirvish, in an
upcoming issue of Sys Admin. So collect the hardware
and the software, get the system set up, try running a few dirvish
backups, and keep making tapes or CDs until you learn how to
get your drives and files rebuilt!
References
Dirvish -- http://www.dirvish.org
Rsync -- http://rsync.samba.org
Perl CPAN -- http://www.cpan.org
ViPower -- http://www.vipower.com
Keith Lofstrom (http://www.keithl.com) owns an integrated
circuit design consultancy in Beaverton, Oregon. His specialty
is mixed-signal and statistical design for deep submicron processes,
as well as design for testability using the IEEE 1149.x standards.
Keith has been using some flavor of Unix since 1980 and although
he admits to a brief flirtation with DOS and Windows, he has
seen the error of his ways. He is currently managing the dirvish
backup program until a better leader comes along.