Image-Based Versus File-by-File Backup
Tim Jones
When backing up data from a computer system, there are many different methods available. Some more reliable than others. The most popular method is to copy the files on your system, one by one, to your archive device. A file-by-file backup. There are many variations on this method, which are referred to by different names: incremental, differential, full, grandfather-father-child, and others. Another method, less often used, is based on the process of copying the entire contents of a given disk drive. Instead of backing up just one file (i.e., C:\DOS\FORMAT.COM) you would back up each sector on the disk as raw data, creating an image of the entire disk contents. This method is called image-based backup.
When Should I Use Image-Based Backup? Image-based backup provides a mechanism that allows you to more completely recover a crashed system without having to spend time dealing with partitions, disk geometries, drive letter assignments, or drive formatting. Whatever partitions or formats were defined on the original disk will be restored when the image is rewritten onto the new disk drive.
This way, you have rescued everything that was on the drive in both the user-accessible data areas as well as the system areas, including the master boot record (MBR), partition table, and any partition-based boot records. This last, but important step, is almost always overlooked in a file-by-file backup scheme unless special steps are taken to ensure that they are backed up. Examples of such steps include special boot recovery disks or methods that copy these system areas into files on the user-accessible areas.
Within the UNIX or Linux world, there are a number of utilities that are part of your normal system install that allow you to perform an image backup of your system. These include dd, of course, but many people don't realize that simply using the cp command can offer the same results. An example operation using dd could be as simple as:
dd if=/dev/rdsk/c0t3d0s0 of=/dev/rst0 bs=10k
for a straight image backup to the /dev/rst0 tape drive, or as elaborate as:
dd if=/dev/hda2 bs=10k count=1000 |gzip |tee /dev/rst0 \
|sum >/etc/images/dd_image`date +%b`.sum
To create the backup, compress it with GNUzip and create a checksum file named with the month.
In addition to the default tools, there are commercial applications available for PC-based systems that include EST's QuickStart Data Rescue, Symantec's Ghost, CDP's SnapBack, and others. These commercial tools require that you reboot your system using their respective boot diskettes to perform the actual backup operation. Although this creates the shutdown issue that I will discuss below, you are assured that the image produced by these tools will be complete and stable.
What Special Situations Does Image-Based Backup Create? Since image-based backup only copies raw data, backups made using this method have no concept of the files involved or mechanism for dealing with the restoration of a single file. This means that restoring a system based on an image-based backup is an all or nothing operation - there is no way to recover a single file from within the image-based backup.
Also, when using this type of backup, be sure that the tool used to create the backup provides a mechanism for adjusting certain disk parameters in the event that the disk that was backed up has different geometries than the replacement disk. These differences could include a difference in the size, number of heads, number of tracks per cylinder, or even the numbers or size of the sectors in a given track.
Finally, image-based backup almost always requires that your system is booted into maintenance mode to ensure that the disk contents don't change during the backup operation. Because this requires shutting your server down to perform the backup operation, it is not the best option for daily backups when your servers are running in a 7x24 environment.
When Is File-By-File Backup Better? Whenever you may need to recover subsets of data from a backup (e.g., a single file, a user's directories, a system configuration file), file-by-file backup offers the best mechanism for recovery. You should not have to restore an entire image just to recover a few files.
And, as mentioned above, when you are concerned with backing up server or high-availability systems, file-by-file backup allows you to backup the system without interfering with the normal operation of the system.
Which Should I Use? This is not an either/or situation. Each method has its strengths and weaknesses. However, a combination of both daily file-by-file backup for your simple recovery needs ("Hey, Don, I deleted the '97 payroll files by mistake. Can we recover them?") and weekly or monthly image-based backup for serious system disasters ("Hey, Don, why is the system making that screeching noise? And what's that smell?") will provide you with the peace of mind that a complete system backup and disaster recovery solution can deliver.
Which Tools Are Right for Me? Select your image-based tool based on a few simple rules:
- Does the product provide data verification to ensure that the data was properly written to your backup device?
- Does it work with the backup device you use, or do you need to add new hardware to support it?
- Does it get all necessary information to successfully recover your system in the event of a fatal system crash?
- Is the backup and recovery process easy enough for a non-technical user to operate?
- Does it take into account any geometry differences in the event that the replacement drive is not of the same size/geometry/manufacture as the original drive?
Select your file-by-file backup tool based on similar simple rules:
- Does the product provide data verification to ensure that the data was properly written to your backup device?
- Is the product easy to use?
- Does it provide the ability to perform backups based on file dates (incremental/differential)?
- Does it allow you to run your backups in a scheduled mode?
As you can see, image-based backup and file-by-file backup are not competitive technologies. Together, they help create a backup and recovery strategy that will make your recovery, whether from a simple file deletion or a full system crash, a non-event. n |