Windows Client Backups with rsync and FreeBSD
Geoff Breach
We've all heard of those new "fast and unreliable"
hard disks that have replaced the "slow and unreliable"
ones that we used in the past. Well, during this past year, all
of the unreliable ones seemed to be happening to me, and so I needed
a solution to keep data loss to a minimum. My environment is a little
unique -- most of the computers involved are laptops, and some
of them never appear on my LAN. Also, I'm not in a position
to demand that my customers make backups. Some may choose not to,
for their own reasons. My backup solution must let the customer
choose if and when to back up.
Generally speaking, the only way to ensure that regular backups
(whether they're your own or someone else's) happen is
to make them either automatic or effortless! My customers are quite
competent with self-managing versions of their work, so my primary
requirement is for a backup system that guards against total loss
-- by theft, fire, or failure -- rather than an ability
to recover old versions of files.
A solution in this environment has a number of unusual requirements:
- Must be client-initiated -- The system must accept backup
data from client machines at any time, on no set schedule.
- Must be portable -- While Win32 clients comprise the majority,
client machines may also be Unix, Unix-like, or Mac/OSX.
- Must be easy to operate -- one click. The end customer
must not be required to enter passwords or otherwise answer questions
in the course of the backup.
- Must be efficient -- End customers are not always located
in their offices on a fast corporate network. The system must
be operable over the corporate network but also via other public
networks such as broadband, public wireless, and even dial-up.
- Must be secure -- If the system is to be operated over
public networks, then it must transfer data and communications
in a secure manner.
- Return of archived data to end customers must be on their terms,
in a format that they can readily recover on their own, without
technical assistance, and without special media or equipment.
- Provision must be made for the usual requirements of a backup
system: permanent archives, kept indefinitely, made as full or
incremental backups as required.
- Must be inexpensive -- We have no budget for this!
The Solution
My solution uses rsync over an encrypted ssh connection to synchronize
a subset of the files on the customer machines (we don't need
to archive applications and operating systems, just working files)
with a live copy on a FreeBSD server. The server keeps the live
copy on a RAID-5 array of disks. The RAID-5 is designed to protect
the server copy from the very failures from which we are trying
to protect the clients. Customers have a single "Backup Now"
icon on their Windows Start Menu, and they initiate backups at will,
background the process, then go right back to work.
This isn't a high-performance solution, but it doesn't
need to be. The primary write bottleneck tends to be at the client
in comparing, compressing, and encrypting the data. The server merely
needs to hold a large filesystem, and keep it live.
Storage
In pilot testing, the average disk space requirement was around
1.5 gigabytes per individual customer. Your mileage will vary depending
on the nature of your customers' work and the decisions you
make about which file types to archive.
In the following example, three 200-GB disks are placed in a software
RAID-5 array using the FreeBSD native Vinum Volume Manager. Vinum
requires dedicated disk partitions with filesystem type "vinum".
Using RAID-5 will provide on the order of 400 gigabytes of usable
disk space.
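As a rough check on that figure: RAID-5 gives up one disk's worth of space to parity, so usable capacity is the number of disks minus one, multiplied by the size of each member. The sketch below does the arithmetic in sh. The function name is purely illustrative, and the 190651 MB figure is the per-disk partition size that appears in the Disklabel Editor listing later in this article.

```sh
#!/bin/sh
# raid5_usable_mb DISKS SIZE_MB
# Prints the usable capacity, in MB, of a RAID-5 set built from DISKS
# members of SIZE_MB each (hypothetical helper, not part of vinum).
raid5_usable_mb() {
    disks=$1
    size_mb=$2
    if [ "$disks" -lt 3 ]; then
        echo "RAID-5 needs at least three disks" >&2
        return 1
    fi
    # One member's worth of space is consumed by parity.
    echo $(( (disks - 1) * size_mb ))
}

usable=$(raid5_usable_mb 3 190651)
echo "usable: ${usable} MB (about $(( usable / 1024 )) GB)"
```

Disk vendors count in decimal gigabytes while vinum reports binary ones, which is why "400 gigabytes" of nominal space shows up as 372 GB in the vinum listings.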
Setting up vinum
Use the FreeBSD installation and configuration tool to access
fdisk and the Disklabel Editor via the Custom -> Partition and
Custom -> Label menu options:
lucy# /stand/sysinstall
Use fdisk to allocate to FreeBSD the portion of each physical disk that
you want to use; remember to use the "W" (Write Changes) option
if you are not allocating the space as part of the initial installation.
In most cases, three keystrokes, "A" (Use Entire Disk),
"W" (Write Changes), and "Q" (Quit) are all that
are required for each disk.
Partition the FreeBSD slices using the FreeBSD Disklabel Editor.
The following example shows the three disks, each with a 128-MB
swap partition (optional) and the remainder of
the disk allocated to dummy mount points. The filesystem type will
be changed manually in the next step, so use the "T" (Toggle
Newfs) option to disable newfs execution and "W" (Write)
to write the new disklabels:
FreeBSD Disklabel Editor
Disk: ad5 Partition name: ad5s1 Free: 0 blocks (0MB)
Disk: ad6 Partition name: ad6s1 Free: 0 blocks (0MB)
Disk: ad7 Partition name: ad7s1 Free: 0 blocks (0MB)
Part Mount Size Newfs
---- ----- ---- -----
ad5s1b swap 128MB SWAP
ad5s1e /mnt/x0 190651MB UFS+S N
ad6s1b swap 128MB SWAP
ad6s1e /mnt/x1 190651MB UFS+S N
ad7s1b swap 128MB SWAP
ad7s1e /mnt/x2 190651MB UFS+S N
Manually edit the disklabels on each disk to change the partition
type to "vinum":
lucy# disklabel -r -e ad5
The disklabel command provides the disk label for editing in
your default text editor. Locate the partition you want to use for
vinum, usually the very last line in the file:
8 partitions:
# size offset fstype [fsize bsize bps/cpg]
b: 262144 0 swap # (Cyl. 0 - 16*)
c: 390716802 0 unused 0 0 # (Cyl. 0 - 24320*)
e: 390454658 262144 4.2BSD # (Cyl. 16*- 24320*)
Note that the "c:" partition represents the whole disk and
must not be changed. Edit the filesystem type of the backup partition,
in this case "e:", to be "vinum" and save the
file. Disklabel automatically rewrites the label to the disk and
updates the kernel's in-memory version of the label.
Repeat the edit for each disk; afterward, all three disks have the
following layout:
8 partitions:
# size offset fstype [fsize bsize bps/cpg]
b: 262144 0 swap # (Cyl. 0 - 16*)
c: 390716802 0 unused 0 0 # (Cyl. 0 - 24320*)
e: 390454658 262144 vinum # (Cyl. 16*- 24320*)
Note that this example includes a small swap partition at the beginning
of each physical disk. This is an artifact from an old habit of mine
and is by no means a requirement for this system. In fact, if you
plan to experiment with hot-swapping of disks, you will find that
the system will probably be more stable without swap partitions on
the disks you plan to hot swap.
Vinum requires a configuration file to initialize a new virtual
volume. This configuration names the three drives as "d5",
"d6", and "d7" and defines a virtual volume
in line 4 named "backups0". (This volume will later be
presented to the operating system as a device called "/dev/vinum/backups0".)
Line 5 specifies a RAID-5 plex with 400-KB stripe size. Vinum performs
best with a stripe size between 256 KB and 512 KB, but powers of 2
tend to cause all of the filesystem's superblocks to be placed
on the first disk and so should be avoided. Finally, all of the
previously named drives are added to the plex as subdisks, using
the whole of each named drive:
drive d5 device /dev/ad5s1e
drive d6 device /dev/ad6s1e
drive d7 device /dev/ad7s1e
volume backups0 setupstate
plex org raid5 400k
sd length 0 drive d5
sd length 0 drive d6
sd length 0 drive d7
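If you expect to build more than one array, the configuration above is regular enough to generate. The following is only a sketch: the function name is hypothetical, and it numbers the drives d0, d1, d2 in the order given rather than reusing the d5, d6, d7 names chosen by hand above.

```sh
#!/bin/sh
# make_vinum_conf VOLUME STRIPE PARTITION...
# Prints a RAID-5 vinum configuration with the same layout as the
# hand-written file: drives first, then the volume, plex, and subdisks.
make_vinum_conf() {
    volume=$1
    stripe=$2
    shift 2
    n=0
    for part in "$@"; do
        echo "drive d$n device $part"
        n=$((n + 1))
    done
    echo "volume $volume setupstate"
    echo "plex org raid5 $stripe"
    i=0
    while [ "$i" -lt "$n" ]; do
        # "length 0" asks vinum to use all of the named drive.
        echo "sd length 0 drive d$i"
        i=$((i + 1))
    done
}

make_vinum_conf backups0 400k /dev/ad5s1e /dev/ad6s1e /dev/ad7s1e
```

Redirect the output to a file and review it by eye before handing it to vinum.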
Execute the vinum create command to load the configuration.
Vinum responds with a summary of the newly created configuration:
lucy# vinum create -f vinum.backup0.conf
3 drives:
D d5 State: up Device /dev/ad5s1e Avail: 0/190651 MB (0%)
D d6 State: up Device /dev/ad6s1e Avail: 0/190651 MB (0%)
D d7 State: up Device /dev/ad7s1e Avail: 0/190651 MB (0%)
1 volumes:
V backups0 State: up Plexes: 1 Size: 372 GB
1 plexes:
P backups0.p0 R5 State: up Subdisks: 3 Size: 372 GB
3 subdisks:
S backups0.p0.s0 State: up PO: 0 B Size: 186 GB
S backups0.p0.s1 State: up PO: 400 kB Size: 186 GB
S backups0.p0.s2 State: up PO: 800 kB Size: 186 GB
lucy#
The "setupstate" keyword in the vinum configuration file
causes the volume and its components to be created in an "up"
state. Vinum RAID-5 volumes must be formally initialized before use,
however, so you should start the vinum control program and issue the
init instruction. The list command will allow you to
see the state of each component of the array:
lucy# vinum
vinum -> init backups0.p0
vinum -> vinum[309]: initializing subdisk /dev/vinum/sd/backups0.p0.s1
vinum[308]: initializing subdisk /dev/vinum/sd/backups0.p0.s0
vinum[310]: initializing subdisk /dev/vinum/sd/backups0.p0.s2
vinum -> list
3 drives:
D d5 State: up Device /dev/ad5s1e Avail: 0/190651 MB (0%)
D d6 State: up Device /dev/ad6s1e Avail: 0/190651 MB (0%)
D d7 State: up Device /dev/ad7s1e Avail: 0/190651 MB (0%)
1 volumes:
V backups0 State: down Plexes: 1 Size: 372 GB
1 plexes:
P backups0.p0 R5 State: initializing Subdisks: 3 Size: 372 GB
3 subdisks:
S backups0.p0.s0 State: I 15% PO: 0 B Size: 186 GB
S backups0.p0.s1 State: I 12% PO: 400 kB Size: 186 GB
S backups0.p0.s2 State: I 12% PO: 800 kB Size: 186 GB
vinum ->
If you are a coffee drinker, now would be a good time to get some;
the initialization of the plex will take quite a while. Once the state
of each of the subdisks and the backups0.p0 plex returns to "up",
you can treat the new device as a regular filesystem. So, newfs it,
mount it, link it to its "public" top-level directory, and
add a suitable entry to /etc/fstab. Finally, add a line that reads
start_vinum="YES" to /etc/rc.conf to ensure that vinum loads
at boot time:
lucy# newfs -v /dev/vinum/backups0
lucy# mount /dev/vinum/backups0 /mnt/backups0
lucy# ln -s /mnt/backups0 /backups
lucy# mount -p
/dev/vinum/backups0 /mnt/backups0 ufs rw 2 2
lucy#
Install rsync
rsync is not included in the base FreeBSD distribution but is,
of course, available from the FreeBSD ports collection and as a
precompiled package on the release CD image. The following series
of commands may be used to install rsync from the FreeBSD 4.10-RELEASE
CD-ROM or directly from ftp.freebsd.org:
lucy# mount /cdrom
lucy# cd /cdrom/packages/All
lucy# pkg_add rsync-2.6.1.tgz
Or:
lucy# pkg_add ftp://ftp.freebsd.org/pub/FreeBSD/ports/i386/packages-4-stable/All/rsync-2.6.2_1.tgz
rsync Wrapper Script
If your customers will access the backup server only to make backups,
then you might choose to limit their access to a wrapper script that
allows them only to execute the commands needed for backup. This sample
wrapper is placed in /backups/bin/rsync-wrapper.sh. We use the OpenSSH
forced-commands feature to execute the wrapper script later:
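#!/bin/sh
# Log every request, then run it only if it is an rsync or scp command.
/usr/bin/logger -p local0.notice "rsync-wrapper $SSH_CONNECTION $SSH_ORIGINAL_COMMAND"
if echo "$SSH_ORIGINAL_COMMAND" | grep -e "^rsync " >/dev/null 2>&1; then
    # Deliberately unquoted: the shell splits the command into words
    # (and expands wildcards) but does not act on ";" or "|".
    $SSH_ORIGINAL_COMMAND
elif echo "$SSH_ORIGINAL_COMMAND" | grep -e "^scp " >/dev/null 2>&1; then
    $SSH_ORIGINAL_COMMAND
else
    /usr/bin/logger -p security.warn "rsync-wrapper Denied $SSH_CONNECTION $SSH_ORIGINAL_COMMAND"
    echo "Access denied."
    exit 1
fi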
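Before relying on the wrapper, it may be worth checking which command strings it would accept. The sketch below lifts the two grep tests into a function (the name is_allowed is hypothetical and not part of the wrapper) and feeds it a few made-up command strings. The sample strings only approximate what rsync and scp really send.

```sh
#!/bin/sh
# is_allowed COMMAND-STRING
# Returns 0 if the wrapper's grep tests would accept the command string.
is_allowed() {
    printf '%s\n' "$1" | grep -e "^rsync " -e "^scp " >/dev/null 2>&1
}

# Illustrative inputs: two plausible client requests and two that
# should be refused.
for cmd in "rsync --server -logDtpr . c/" "scp -f rsync-include.txt" \
           "rm -rf /" "rsyncd"; do
    if is_allowed "$cmd"; then
        echo "allow: $cmd"
    else
        echo "deny:  $cmd"
    fi
done
```

Note that the trailing space in each pattern matters: without it, any program whose name merely begins with "rsync" or "scp" would pass the test.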
Security Considerations
To ensure that customers will never be asked to type a password
or otherwise interact with the backup process, this implementation
places a passphrase-less private key on the client machine's
local hard disk. Then, ssh uses that key for authentication. This
is a trade-off between security and convenience, but with a suitably
secured backup server, access to the client's private key should
provide no more access than an intruder must already have gained
to hold the private key. If this calculated risk is one that you
are not prepared to take, you should consider protecting the private
key with a passphrase, storing the key on a portable device, or
even using a different form of authentication.
Configuring OpenSSH
On the server, the OpenSSH configuration is altered to disallow
password authentication, to accept only version 2 of the ssh protocol,
and to disable port forwarding to and from clients. You should carefully
examine the /etc/ssh/sshd_config file in its entirety to confirm
that the settings are suitable for your security requirements. Here
is a subset of /etc/ssh/sshd_config:
Protocol 2
PermitRootLogin no
PasswordAuthentication no
AllowTcpForwarding no
GatewayPorts no
FreeBSD Login Classes
FreeBSD login classes may be used to further restrict access. In this
example, settings are appended to /etc/login.conf to limit the path
and umask for members of the "backupclients" login class.
Further limits (for example, to process count or memory usage) might
also be applied in this way. Be aware that FreeBSD users have the
option to override some login class settings with a .login_conf file
in their home directory. So, if you rely on login classes for security,
you must also take steps to prevent unwanted overrides. Use cap_mkdb
/etc/login.conf to rebuild the system database from the config
file:
backupclients:Backup Clients:\
:path=/backups/bin:\
:umask=077:\
:tc=default:
Creating FreeBSD User Accounts
The quickest and easiest way to add a new user account
to FreeBSD is with a single "pw" command. The install.sh script
(listed later in this article) assembles a pw command ready to be
pasted onto a command line on the server. It uses the following options:
-n -- Username
-d -- Home directory
-L -- FreeBSD login class, set to backupclients
-g -- Group, also set to backupclients
-c -- Comment, the customer's name
-m -- Instructs pw to create the home directory
-s -- Sets the shell to /backups/bin/rsync-wrapper.sh
Because the backupclients accounts have their path set to /backups/bin,
you must place copies of (or links to) rsync, scp, grep, logger,
and rsync-wrapper.sh in that directory on the server. Setting the
user account's login shell to point to the rsync wrapper script
is really just a second level of safety. In practice, the rsync
wrapper script is executed because it is a forced command in the
authorized_keys2 file, and the login process should never actually
consider the shell in the passwd file.
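To make the pw options listed above concrete, here is a sketch that prints the command for the example customer used later in this article (login douglasb, full name "Douglas the Cat"). The function name is hypothetical; the option values are the ones install.sh uses.

```sh
#!/bin/sh
# build_pw_cmd LOGIN "FULL NAME"
# Prints (does not run) the pw useradd command for one backup customer.
build_pw_cmd() {
    login=$1
    fullname=$2
    printf '/usr/sbin/pw useradd -n %s -d /backups/%s -L backupclients -g backupclients -c "%s" -m -s /backups/bin/rsync-wrapper.sh\n' \
        "$login" "$login" "$fullname"
}

build_pw_cmd douglasb "Douglas the Cat"
```

pw useradd expects the backupclients group to exist already. If it does not, something like pw groupadd backupclients would need to be run first; that step is an assumption, since it is not shown in this article.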
rsync Include and Exclude Files
If configuring vinum was the most time-consuming part of this
odyssey, then tinkering with rsync's include and exclude functionality
will be the trickiest. In this implementation, I use separate include
and exclude files, although rsync is capable of drawing all of its
include/exclude information from one file.
When copying in --archive mode, rsync includes all files
that it has not been specifically told to exclude. It processes
include and exclude rules a bit like a packet-filtering firewall
-- searching from the top down and aborting the search at the
first match. So, if rsync copies everything that isn't specifically
excluded, why bother with an include list? When in --archive
mode, --recursive is implied, and rsync applies the include/exclude
list recursively to each sub-tree. If it finds an exclude match
in a path, it aborts checking for all subdirectories underneath
it. If there is a chance that an exclude rule might match a directory
containing files you want to keep, then you'd better make sure
those files are matched with an include rule first!
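The first-match behavior is easier to see in miniature. The sketch below is a deliberately simplified model written in sh, not rsync itself: it checks one name against an ordered list of rules and stops at the first hit. Real rsync matches against path components and has anchoring rules that are not modelled here, and the function name and the "+"/"-" rule notation are only illustrative.

```sh
#!/bin/sh
# decide NAME RULE...
# Each RULE is "+ pattern" (include) or "- pattern" (exclude).
# The first rule whose glob matches NAME wins; a name that matches
# no rule at all is included.
decide() {
    name=$1
    shift
    for rule in "$@"; do
        action=${rule%% *}      # text before the first space
        pattern=${rule#* }      # text after the first space
        case "$name" in
            $pattern)
                if [ "$action" = "+" ]; then
                    echo include
                else
                    echo exclude
                fi
                return 0
                ;;
        esac
    done
    echo include
}

decide "Eudora.exe" "+ *Eudora*" "- *.exe"   # include rule is seen first
decide "Eudora.exe" "- *.exe" "+ *Eudora*"   # exclude rule is seen first
decide "notes.txt"  "+ *.doc" "- *.exe"      # no rule matches
```

The first two calls differ only in rule order, which is the whole point: the same file is kept in one case and dropped in the other.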
Beware the case-sensitive match! rsync does The Right Thing and
considers case when comparing and copying files. Windows filesystems
preserve case, but are not case sensitive. (This means that if I
have a file called "FILENAME.TXT" and I ask Windows "Do
you have a file called 'filename.txt'?", it will
answer "yes". At the same time, if I create the file as
"FiLeNaMe.TxT", then Windows will remember the original
case that I specified, it just won't honor it!) If you can't
be sure whether the filenames you want to match will be in uppercase
or lowercase, then you need to specify both. Obviously, specifying
all possible permutations and combinations could get pretty crazy
pretty fast, so you need to approach this one with a level head.
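One level-headed compromise is to list each extension in all-lowercase and all-uppercase only, and to generate the pairs rather than type them. A minimal sketch follows; the function name is hypothetical.

```sh
#!/bin/sh
# both_cases EXT...
# Prints "*.ext" and "*.EXT" for each extension given, one per line,
# ready to be pasted into an rsync include or exclude file.
both_cases() {
    for ext in "$@"; do
        lower=$(printf '%s' "$ext" | tr '[:upper:]' '[:lower:]')
        upper=$(printf '%s' "$ext" | tr '[:lower:]' '[:upper:]')
        printf '*.%s\n*.%s\n' "$lower" "$upper"
    done
}

both_cases sxw stc sxc doc xls pdf
```

This does not catch mixed-case names such as "Report.Doc". rsync's patterns also accept character classes, so a pattern like *.[dD][oO][cC] would cover every combination at the cost of readability.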
In the following example, I specify that I want to keep word processor
and spreadsheet files from OpenOffice and Microsoft along with PDFs.
Generally, I don't want to archive executables and libraries,
because they can be re-installed from original source disks. Many
of my customers, however, use an email client that will quite happily
soldier on if it is transplanted in its entirety with all of the
files in its install directory, so I archive everything in that
directory including executables.
Two of the biggest files on any Windows system will be the swap
file and the hibernation file. Since they're pretty much completely
useless anywhere other than on a running system, there is no point
in archiving them. Many pre-installed Windows systems keep a complete
copy of the OS install set in the \I386 directory. I can get that
on CD too, so I won't be archiving that either. Here is a sample
include file:
*.sxw
*.SXW
*.stc
*.STC
*.sxc
*.SXC
*.doc
*.DOC
*.xls
*.XLS
*.pdf
*.PDF
*Eudora*
*Eudora Pro*
And, here is a sample exclude file:
Temporary*
System Volume Information
i386
I386
*.dll
*.DLL
*.exe
*.EXE
PAGEFILE.SYS
hiberfil.sys
A check of the rsync line in my backup batch file will show that I
refer to two include files and one exclude file. In this implementation,
I use a common include and exclude file for all customers, then a
second include file that is unique to each customer, usually empty,
in case particular customizations are required.
All three files are stored on the backup server and copied at
the beginning of each backup. Thus, I can modify them in the comfort
and privacy of my own server and have the clients refer to the latest
versions for each new backup run. The backup batch file also creates
a text file listing all filenames on the client system, and that
file is conveniently delivered to the backup server on every run.
While fine-tuning the include and exclude rules, I can compare these
file lists to the files that arrive on the server and tweak the
rules as required.
Installing Win32 Client Software
In keeping with the very Unix-like flavor of this solution, Cygwin
(http://www.cygwin.com/) binaries are used on the Win32 clients
to make up the client end of the bargain. There are two ways to
achieve this. If you have other uses for a Unix-like environment
on your Win32 machines, then you might as well install the whole
Cygwin environment. If this backup solution is your only requirement,
however, then you may choose to simply install the small subset
of the Cygwin distribution that is required to achieve this goal.
Specifically, we require the rsync, ssh, scp,
ssh-keygen, and mount commands. If you don't
require a full Cygwin installation, then you can make a temporary
installation on one machine and pick out the executables and libraries
you need. To run a backup from a Win32 client to the FreeBSD server,
the following binaries are required on the Win32 machine:
rsync.exe
scp.exe
ssh.exe
cygcrypto-0.9.7.dll
cygminires.dll
cygpopt-0.dll
cygwin1.dll
cygz.dll
Additionally, to use ssh-keygen and mount (only required for installation),
simply copy their respective binaries. The cat, mkdir,
mv, nice, and rm commands and the Bourne-style shell (sh.exe)
are added so that they may be used in the install and backup scripts.
These could be removed after they have been used by the install script
if you're concerned about the extra space they use. Here are
some additional Cygwin binaries for installation and scripting:
cat.exe
mkdir.exe
mount.exe
mv.exe
nice.exe
rm.exe
sh.exe
ssh-keygen.exe
cygiconv-2.dll
cygintl-1.dll
cygintl-2.dll
The files must be placed somewhere in the Windows path. Either place
them in their own directory and modify the PATH environment variable
or drop them in an existing location, perhaps C:\WINDOWS\ or C:\WINDOWS\SYSTEM32\.
Setting up the Windows Clients
Before a backup can be initiated, a number of prerequisites must
be satisfied:
- The Cygwin rsync and scp executables expect to find ssh in
the /usr/bin directory, and ssh expects to record the public keys
of known hosts in the customer's home directory in /home/<username>/.ssh/known_hosts.
Mount points must be created to connect the Unix-style paths to
their Win32 equivalents.
- The customer requires a public/private key pair to authenticate
with the backup server and, of course, the customer's public
key must be installed on the server along with an actual user
account on the server. The install.sh script delivers a series
of commands ready to be pasted onto a server command line.
- A script is required to carry out the backup process.
Execute the install script from a Windows command line with the
form sh install.sh douglasb "Douglas the Cat". Windows 98
has different ideas about some of the paths used in this script,
so it will require a bit of tweaking to run there.
#!/usr/bin/sh
# Simple install script to configure rsync/ssh backups
# on Windows NT hosts...
#
# Usage: install.sh <customer login name> <full customer name>
#
# Set Cygwin mount points
mount -f -s -t "C:\Documents and Settings" /home
mount -f -s -t $SYSTEMROOT /usr/bin
# Create customer's public/private key pair..
cd /usr/bin
mkdir .ssh
ssh-keygen.exe -N "" -q -b 1024 -C "$2" -t rsa -f .ssh/id_rsa
# Change back to the system directory, and insert the
# customer's FreeBSD username into the backup batch file.
mv backup-c.bat backup-c.bat.src
echo set USERNAME=$1 > backup-c.bat
cat backup-c.bat.src >> backup-c.bat
rm backup-c.bat.src
# A Windows Shortcut to the backup batch file on the Start menu
# may be a nice touch. Creating a Windows link file from
# DOS/shell is possible but complex. It's far easier to
# pre-create a shortcut to "%WINDIR%\backup-c.bat", and simply
# move it into place.
mv "Backup C Drive.lnk" "$ALLUSERSPROFILE/Start Menu/Backup C Drive.lnk"
# Create command strings for execution on the FreeBSD backup
# server to create the customer account and populate .ssh/authorized_keys2.
# A backslash at the end of a line inside double quotes continues the
# string, so each echo still produces a single command line.
echo "/usr/sbin/pw useradd -n $1 -d /backups/$1 -L backupclients \
-g backupclients -c \"$2\" -m -s /backups/bin/rsync-wrapper.sh" > tempfile.txt
echo mkdir "/backups/$1/.ssh" >> tempfile.txt
# The backquotes embed the text of the new public key in the command.
echo "echo command=\\\"/backups/bin/rsync-wrapper.sh\\\" \
`cat .ssh/id_rsa.pub` >/backups/$1/.ssh/authorized_keys2" >> tempfile.txt
echo "/bin/ln -s /backups/rsync-include.txt \
/backups/$1/rsync-include.txt" >> tempfile.txt
echo "/bin/ln -s /backups/rsync-exclude.txt \
/backups/$1/rsync-exclude.txt" >> tempfile.txt
echo "/usr/bin/touch /backups/$1/rsync-local-include.txt" >> tempfile.txt
# Present the command strings in a text editor for cut/paste
# to the host (this shell can actually execute windows binaries!)
/usr/bin/System32/notepad.exe tempfile.txt
# remove the temporary file.
rm tempfile.txt
The install.sh script prepends command="/backups/bin/rsync-wrapper.sh"
to the customer's public key before offering it up for insertion
in the authorized_keys2 file. If the customer authenticates by public/private
key (and in this implementation, it is the only way a customer can
gain access) then OpenSSH will ignore any command line sent by the
client and instead execute this forced command. The rsync-wrapper.sh
script records the client's command to syslog then confirms that
it is either scp or rsync before allowing it to be executed. If the
client sends any other command, it is logged to syslog's security
facility and rejected.
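For reference, the entry that ends up in authorized_keys2 is simply the forced-command option followed by the public key on one line. The sketch below builds such a line from a key file. The function name is hypothetical, and the key used in the demonstration is a dummy string, not a real key.

```sh
#!/bin/sh
# forced_command_line WRAPPER PUBKEY_FILE
# Prints an authorized_keys entry that forces WRAPPER to run whenever
# the key stored in PUBKEY_FILE is used to log in.
forced_command_line() {
    wrapper=$1
    pubkey_file=$2
    printf 'command="%s" %s\n' "$wrapper" "$(cat "$pubkey_file")"
}

# Demonstration with a placeholder key written to a temporary file.
tmp=$(mktemp)
echo "ssh-rsa AAAAB3NzaC1yc2EXAMPLE Douglas the Cat" > "$tmp"
forced_command_line /backups/bin/rsync-wrapper.sh "$tmp"
rm -f "$tmp"
```

If the line in your own authorized_keys2 file does not begin with command=, the key will be granted whatever the account's login shell allows.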
Finally, here is the code for the backup-c.bat script that is
executed by the Start menu shortcut inserted by the install.sh script:
C:
cd %WINDIR%
dir \ /a-d /s /b >all-files.txt
scp -i .ssh/id_rsa %USERNAME%@<my.backup.server>:rsync-include.txt rsync-include.txt
scp -i .ssh/id_rsa %USERNAME%@<my.backup.server>:rsync-exclude.txt rsync-exclude.txt
scp -i .ssh/id_rsa %USERNAME%@<my.backup.server>:rsync-local-include.txt rsync-local-include.txt
nice -n 19 rsync.exe --archive --stats --progress --modify-window=5 --include-from=rsync-local-include.txt --include-from=rsync-include.txt --exclude-from=rsync-exclude.txt --rsh="ssh -i .ssh/id_rsa" /cygdrive/c/* %USERNAME%@<my.backup.server>:c/
pause
Network Time
To decide whether to copy a particular file, rsync compares the
size and the timestamp of the files. If your clients and your server
have differing opinions on what the current time is, then you'll
find a lot of unnecessary file transfers going on when your customers
execute their backup scripts.
Many people are not aware that Windows 2000 ships with a perfectly
serviceable NTP client -- it only made it into the GUI in Windows
XP. The network time client is installed as a service named "Windows
Time", but it does not start automatically by default in Windows
2000. Use the Services control panel (Start -> Run -> services.msc
-> OK) to set it to start automatically.
Your backup server will also need a reliable time source. You
could simply configure a cron job to run ntpdate every hour or so.
If you have five minutes to spare instead of just one, configure
xntpd and keep your server properly synchronized to a number of
other servers. Be sure to add the line xntpd_enable="YES"
to /etc/rc.conf if you do. With xntpd running on the server, your
clients can synchronize to it, and they need never disagree on the
time. Here's how to configure and start the Windows Time client:
C:\> net time /SETSNTP:ntp.mytimeserver.com
C:\> net start W32Time
Here's a sample ntp.conf for FreeBSD:
driftfile /var/db/ntp.drift
server ntp.atimeserver.com
server ntp.ticktock.com
server ntp.cuckoo.com
server ntp.hourglass.com
Windows 95/98/Me clients that don't ship with their own network
time client might use the excellent open source NetTime. NetTime is
available from:
http://nettime.sourceforge.net/
and is included on TheOpenCD from:
http://www.theopencd.org/
Even with a nicely synchronized clock, Windows' FAT filesystems
cannot be relied upon to record timestamps with less than two seconds
of granularity, so it is necessary to run rsync with the --modify-window
option set to at least a second or two to avoid repeat copying of
files.
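The effect of --modify-window can be sketched as a simple comparison: two timestamps count as equal when they differ by no more than the window. The function below is an illustration of that rule in sh, not rsync's code, and the timestamps are made-up values in seconds since the epoch.

```sh
#!/bin/sh
# same_mtime T1 T2 WINDOW
# Returns 0 if the two timestamps differ by no more than WINDOW seconds.
same_mtime() {
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    [ "$diff" -le "$3" ]
}

# A file whose recorded time is one second apart on client and server.
server_mtime=1100000000
client_mtime=1100000001
for window in 0 2; do
    if same_mtime "$server_mtime" "$client_mtime" "$window"; then
        echo "window=$window: treated as unchanged"
    else
        echo "window=$window: treated as modified"
    fi
done
```

With a window of 0 the one-second difference looks like a change and the file would be sent again; with a window of 2 it does not.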
Finally, always remember that tradition dictates that before you
help yourself to someone else's network time service you should
send them a quick email requesting permission. It's the polite
thing to do and won't take much of your time!
Offline Backups
Once you have this system in place, making more permanent archives
of the data from the comfort of your FreeBSD filesystems will be
relatively easy. I've chosen to fulfill my "customer self-restore"
goal by using mkisofs and cdrecord to write customers'
data to CDs and DVDs. I use gzip to compress each individual file,
so the customer is still working with a familiar filesystem, and
many Windows-based zip packages happily speak gzip.
Commodity media might be an unattainable luxury in a larger implementation,
so a more conventional backup to tape might be more appropriate.
Amanda and Bacula are your friends here. Both support a wide array
of tape drives and auto-changers.
If you have disk space to burn, rsnapshot might be of interest.
Rsnapshot uses hard links to give the impression of multiple full
backups, all neat snapshots at regular intervals in time. You'll
need enough disk space to hold one full backup, plus changes, but
the potential for offering self-restore capabilities to your customers,
possibly over Samba shares, is an attractive prospect.
Traps for Young Players
The standard backup-system traps for young players apply here
as with any other. Two in particular are important here. Before
you put this system into production, you should satisfy yourself
that you have good answers to two questions:
1. Does my RAID-5 setup work? In other words, can I replace a
failed disk and have the array rebuild itself successfully?
Experiment with this one. Consider making a trial run, perhaps
with a smaller array. Set the partitions to 100 MB instead of 200+ GB
to save yourself some time. Build your RAID-5 array, init
and newfs it, mount it, and fill it up with data. Once that
is done, forcibly fail one disk in the array -- perhaps use the
atacontrol detach command. (Be careful to only down one disk -- RAID-5
won't help you if you lose more than one.) Or, if you're
feeling a little crazy, power down one of your drives.
The vinum list command should report that your volume is up and the plex is
in a degraded state. You will be able to continue to read from and
write to the array, albeit at a slower rate than usual. Replacing
a failed disk in a vinum RAID-5 array requires that you prepare
another disk with a partition the same size as the original, give
it the same name, and bring it back into the array using the vinum
start <diskname> command. Vinum will recalculate the data
that should be on the disk from parity and bring it back into the
array. While FreeBSD does support hot swapping of ATA disks using
the atacontrol command, vinum is happier with disks that
have been present since boot time.
2. Can I recover my offline and online backups?
This may sound like a silly question, but many people forget.
The most elaborate and carefully crafted backup system in the world
is useless if you can't recover the data. So test this, too.
Back up a client machine, then attempt to restore the backups. Recovering
the online backup should be as simple as rsyncing the data back
in the opposite direction. Offline backups are often trickier. Did
you really keep the data you need? Is the media you chose 5 years
ago still readable by current equipment? Has the media degraded
to the point where it can no longer be read?
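The parity rebuild mentioned under the first question is less mysterious than it sounds. In a three-disk RAID-5 stripe, the parity block is the exclusive-OR of the two data blocks, so either data block can be recomputed from the other one and the parity. The sketch below shows this with two arbitrary one-byte values; it illustrates the principle only and is not a description of vinum's internals.

```sh
#!/bin/sh
# Two data values and the parity a three-disk RAID-5 stripe would store.
d0=165                      # data on the first disk  (0xA5)
d1=60                       # data on the second disk (0x3C)
parity=$(( d0 ^ d1 ))       # XOR of the data blocks

# Pretend the second disk has failed: rebuild its data from the
# surviving data block and the parity block.
rebuilt=$(( d0 ^ parity ))
echo "parity=$parity rebuilt=$rebuilt"
```

The same arithmetic explains why losing two disks is fatal: with only one block left, there is nothing to XOR it against.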
You need to convince yourself that you can comfortably manage
your backup system, particularly in the arguably inevitable event
of failure. Play devil's advocate and think worst case --
imagine the horror scenarios and have a tested and working plan
for getting yourself out of them unscathed. Power failures, hard
drive failures, theft, and fires -- plan for them all. There
are not many things in life more difficult than explaining to your
boss that the backup system you built didn't work because of
some minor technical oversight five years ago.
Resources
FreeBSD
Techniques described in this article were implemented on FreeBSD
version 4.10-RELEASE, available from the main site and local mirrors
everywhere:
http://www.freebsd.org/
rsync
rsync is available in source form from:
http://rsync.samba.org/
It is included in the FreeBSD ports collection (cd to /usr/ports/net/rsync,
then make install clean) and in the packages directory on the
4.10-RELEASE CD. I use the rsync.exe binary from the Cygwin distribution
for Win32 systems.
Cygwin
The Cygwin distribution can be found at:
http://www.cygwin.com/
Download the Cygwin setup.exe from that site, run it, then follow
the bouncing ball. The setup program walks you through choosing a
mirror to install from, and choosing which components you need. rsync
and OpenSSH aren't defaults; you need to select them, but most
of the other tools I have used are part of the base Cygwin install.
If you require commercial support, Red Hat will happily sell it to
you at:
http://www.redhat.com/software/cygwin/
OpenSSH
OpenSSH is included in the standard FreeBSD distribution and works
perfectly well out of the box even without the few tweaks I've
mentioned here. Confirm that your /etc/rc.conf file contains the
line sshd_enable="YES" to ensure that sshd is started at
boot time. OpenSSH source and documentation are available from the
OpenBSD folks at:
http://www.openssh.com/
Amanda, Bacula, cdrtools, and rsnapshot
All excellent tools that you might use to make permanent offline
backups of your data; they are available from the following sites,
respectively:
http://www.amanda.org/
http://www.bacula.org/
http://ftp.berlios.de/pub/cdrecord/
ftp://ftp.berlios.de/pub/cdrecord/
Many are in the FreeBSD ports collection, so save yourself some time
and check there first.
Samba
Samba allows Unix and Unix-like systems to offer SMB and CIFS
file services to Windows (and other Unix) clients. Samba is available
from:
http://www.samba.org/
Note also that FreeBSD has support for SMBFS, though you'll need
to configure support for it into your kernel.
Geoff Breach, geoff@breach.com.au, is Technical Officer
to the School of Management at the University of Technology, Sydney
in Australia. He has administered AIX, HP-UX, SunOS, Solaris, BSDI,
and FreeBSD in commercial environments and currently balances his
time among postgraduate studies in management, research on the application
of agent-based systems to supply chain problems, and a twice-daily
motorcycle battle in Sydney's peak-hour traffic. Geoff's
only dependent child is an 11-month-old kitten, Douglas, named for
the late great Douglas Adams.