Oracle
Automated Recovery
Evan Jenkins and Cameron Fischer
An Oracle database represents a significant investment for any
company, and often the data supports mission-critical applications.
Unix systems administrators must ensure that data is protected and
recoverable. In this article, we will describe a simple yet effective
way to harden your Oracle database backup process.
Business Justification
The backup of our Oracle implementation, now about 3 years old,
recently underwent a complete overhaul. It was converted from a
file system-based cold copy of the database to a SAN-based,
low-level synchronize and fracture, otherwise known as LUN cloning.
A Storage Area Network (SAN) is a Fibre Channel network of attached
mass storage devices. LUNs are Logical Unit Numbers, each roughly
equivalent to one logical disk presented to a Unix host.
With the new-style backup, the Unix administrators gained about
4 hours in their backup window, because the clone fracture process
of backing up LUNs is nearly instantaneous, especially when compared
to the laboriously slow cpio file system-based backup. We'll provide
a detailed explanation of the SAN-style backup later in this article.
Meanwhile, IT management had concerns about the SAN backup. How
reliable was it? Could it match the previous traditional backup's
track record? What assurance was there that the new way of copying
data was foolproof? These are all perfectly reasonable questions,
and the extra time taken to ensure management's concerns were
addressed was time well spent.
We set out to answer the age-old question that Unix administrators
have always faced -- how do you know your backup worked? We had
the standard tools and vendor-supplied solutions, such as multi-streaming
I/O to two tape drives and write verification. Although these tools
are useful in verifying the final resting place of backups, they
all fail to execute the one acid test that answers the above question:
proving that the backup can recover the database for each and every
backup instance. Our solution was exactly that -- an automated recovery
of the production data files. In fact, the solution solved three
important problems for the business:
- It relieved IT management's concerns by demonstrating that the
SAN-based backup was at least as reliable as the traditional
file-based backup.
- It provided an additional fail-safe point for the backup process.
Auto-recovery is run before disk images are written to tape. If
auto-recovery fails, there is an excellent chance that the disk
image is somehow corrupt and will fail to recover the production
database. In this case, the DBAs can be notified and correct the
problem before it becomes much more serious.
- It allowed the Unix systems administrators to get a decent
night's sleep.
These are all excellent points (especially the last one).
Figure 1 shows the mirror LUN continuously syncing with the production
database during the daily online window. Figure 2 shows the beginning
of the backup window when the database is shut down and the mirror
LUN is fractured, thus providing an exact copy of the production
data files. Figure 3 shows that auto-recovery begins by creating
a snapshot LUN of the mirror LUN. This third copy of the data files
is mounted on an alternate server, and a temporary database is brought
online, thus validating the copy of data files.
For the auto-recovery mechanism to work, we needed a convenient
and reliable way to present copies of LUNs to a host that could
initialize a database instance. Our idea for auto-recovery was inspired
by the impressive features of a typical network storage device and
its cloning capabilities. The copy produced by this technique is
often referred to as a Business Continuance Volume (BCV).
SAN-Based BCV Framework
In a conventional SAN configuration, a server is attached to a
SAN, either directly or via Fibre Channel switches. Host Bus Adapters
(HBAs) connect the host to the Fibre Channel network, or SAN. The
SAN contains many logical devices that share a single Fibre Channel
bus. LUNs are used to differentiate each unique device.
In a SAN, disks are organized into RAID groups, which are simply
collections of disks. Logical disks are then carved from the total
disk space allocated to a RAID group. The RAID type and other
specific configuration items are defined when the RAID group and
LUN are created.
Once a LUN has been assigned to a host, and the appropriate host
configuration is complete, the LUN is used in the same manner as
a local SCSI disk. The LUNs are set up as typical Unix mount points
that can be used for the Oracle database.
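For illustration only, mounting such LUNs might look something like
the lines below on a Solaris host; the device names are hypothetical
and the file system type depends on your platform, but the
/u03-style mount points are the ones our sed scripts reference later:
# hypothetical device names; adjust for your platform and multipathing layer
mount -F ufs /dev/dsk/c4t1d0s6 /u03
mount -F ufs /dev/dsk/c4t1d1s6 /u05
mount -F ufs /dev/dsk/c4t1d2s6 /u06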
Creating a cold backup of a several-hundred-gigabyte database
can be time-consuming and cause extended database downtime, resulting
in lost productivity. LUN cloning is a feature of a SAN that creates
a block-for-block copy of a LUN. A clone requires the creation of
a mirror LUN to hold the copy image; the mirror must be exactly the
same size, in blocks, as the original LUN (known as the source LUN).
The BCV Process
In the first step of the BCV process, the source LUN is synchronized
with the mirror LUN. To speed up synchronization time, only the
changed blocks since the last synchronization are copied to the
mirror. These changes are applied in the background without end-user
impact. By the time the database is shut down, the source and mirror
LUNs contain exactly the same data.
The second step of the BCV process is to perform an orderly shutdown
of the database, ensuring it is in a consistent state.
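As a minimal sketch, assuming the Oracle environment variables for
the production instance are already set, the shutdown can be driven
non-interactively with sqlplus:
# orderly shutdown of the production instance before the fracture
sqlplus -s "/ as sysdba" <<EOF
shutdown immediate
exit
EOF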
The third step is to disconnect or fracture the source LUN from
the mirror LUN. Fracturing allows each LUN to operate independently.
Any blocks written to the source LUN will not be written to the
mirror at this point. However, all updated blocks on the source
have their block location recorded in the Clone Private LUN (CPL),
which is the feature used to speed up the synchronization in step
one. Once the fracture of the LUNs is complete, the production database
is free to restart. However, it is prudent to check the backup before
restarting the database, because the backup now resides on the
mirror LUNs. In our environment, the total backup time is 14 minutes.
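The fracture itself is issued through the SAN vendor's CLI, so the
exact syntax is site specific. The sketch below uses a hypothetical
fracture_mirror.sh wrapper to stand in for that call, then restarts
production once the fracture succeeds:
# fracture_mirror.sh is a hypothetical wrapper around the SAN CLI
fracture_mirror.sh PROD1_MIRROR || { echo "fracture failed" >&2; exit 1; }

# the mirror now holds the backup image; production can restart
sqlplus -s "/ as sysdba" <<EOF
startup
exit
EOF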
The fourth step in the BCV process is to make the mirror LUNs
visible on a host so they can be manipulated for the auto-recovery
test and copied to tape. The SAN offers another feature called snapshotting.
A snapshot is a representation of the mirror LUNs constructed from
a private area in the SAN. This third copy will be used for the
recovery operation, and it is safe to write to the snapshots; the
mirror LUNs will remain in a pristine state.
The snapshot LUNs are assigned, configured, imported, checked,
and mounted on an alternate host. This is achieved with a Command
Line Interpreter (CLI) in the SAN service processor and a series
of custom scripts. The alternate host now contains a copy of the
database as it was when the fracture occurred. From this point forward,
the mirror LUNs are mounted on the backup server in a read-only
file system, awaiting the go-ahead to write to tape.
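On the alternate host, the tail end of that sequence might look
roughly like the sketch below; the /dev/snap device paths are
hypothetical, but the /u03prod-style mount points are the ones our
sed scripts reference later:
# check and mount the snapshot LUNs on the alternate host
# /dev/snap/* are hypothetical device paths for the snapshot LUNs
for fs in u03 u05 u06 u07 u08
do
    fsck -y /dev/snap/$fs || exit 1
    mount /dev/snap/$fs /${fs}prod || exit 1
done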
We extended our BCV snapshot scripts so they detect and report
errors and accept parameters. Now they can create a snapshot of
any LUN and present it to any host. Unix SSH plays an important
role here, and what was once a modest series of BCV scripts has
become a distributed BCV framework.
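A stripped-down sketch of such a parameterized wrapper follows;
create_snapshot.sh and mount_snap.sh are hypothetical stand-ins for
the vendor-specific SAN CLI call and the host-side mount script:
#!/bin/ksh
# usage: snap_lun.sh <source-lun> <target-host>
LUN=$1; HOST=$2
[ -z "$LUN" -o -z "$HOST" ] && { echo "usage: $0 lun host" >&2; exit 1; }

# create_snapshot.sh is a hypothetical wrapper around the SAN CLI
create_snapshot.sh "$LUN" || { echo "snapshot of $LUN failed" >&2; exit 1; }

# mount_snap.sh (also hypothetical) imports, checks, and mounts it
ssh "$HOST" /usr/local/bin/mount_snap.sh "$LUN" ||
    { echo "presenting $LUN snapshot to $HOST failed" >&2; exit 1; }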
Auto-Recovery
To perform the recovery operation, three basic ingredients are
required (a quick check covering all three is sketched after this
list):
1. Capture the Oracle control file. A valid copy of the control
file from the production database is required. It must be obtained
before the system is taken down for backup, because the control
file contains essential information about the database file layout.
It must match the database image you are attempting to recover, or
the database will refuse to mount and open. Oracle is notoriously
picky about the control file when bringing up a database. For
example, the data files are all synchronized through an internal
header field; every data file must carry exactly the same value in
that timestamp field, or the recovery will fail.
2. You need a mounted copy of the source LUNs on the server where
the recovery operation will take place. The Oracle owner must have
read-write access to the data files. Altering our database image
is fine because our image is the disposable "third" copy. Our original
copy (production) is still down, and our mirror copy is waiting
to be written to tape.
3. An installation of the same version of the database software
is required on the alternate server. The Oracle binaries and
associated toolset there are used to perform the recovery operation
(in our case, Oracle 9.2.0.3). Another technique
is to include the Oracle binaries in the production instance snapshot
and use those instead.
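Such a check might look like the following sketch; the paths are
illustrative examples rather than requirements, and the script is
assumed to run as the oracle owner on the alternate host:
# 1. the control file trace copied over from production
[ -r /tmp/controlFile.trc ] ||
    { echo "control file trace missing" >&2; exit 1; }

# 2. snapshot data files mounted and writable
[ -w /u03prod ] ||
    { echo "/u03prod not mounted or not writable" >&2; exit 1; }

# 3. a matching Oracle installation on this host
[ -x "$ORACLE_HOME/bin/sqlplus" ] ||
    { echo "no usable Oracle installation in $ORACLE_HOME" >&2; exit 1; }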
An easily overlooked feature of our solution is that we used some
of the oldest and best-known tools that come with Unix.
If I trawl through the scripts we wrote, I see nothing more than
the use of ssh, scp, ksh, sh, sed, uniq, grep, and sqlplus (the
command-line interface into the Oracle database). The use of these
tools means that future maintainers will have no problem reading
our scripts and modifying as necessary, and no third-party vendors
can render our work obsolete. It also reflects one of the main reasons
why Unix has fared so well in the IT industry over the years --
small scripts are pieced together using small yet powerful operating
system components.
These components (like sed, the Unix stream editor) are themselves
built from the judicious use of operating system calls in a standard
way. This analogy holds true from the outside command interface
right through to the kernel core. Unix is literally built from itself
from the ground up.
Let's walk through the process to illustrate some of the details
of auto-recovery, beginning with the basic ingredients mentioned
above.
To get the control file from the running database, we create a
file containing a single sqlplus command to tell the database to
dump the control file to disk:
alter database backup controlfile to trace;
The command leaves an Oracle trace file in the Oracle-defined dump
destination area. We use the secure shell copy command scp
to copy the file over to our alternate server for later use.
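Pulled together, the capture step looks roughly like this sketch;
the udump path, the trace-file selection, and the host name
recovhost are assumptions for illustration:
# ask Oracle to dump the control file to a trace file
sqlplus -s "/ as sysdba" <<EOF
alter database backup controlfile to trace;
exit
EOF

# copy the newest trace file to the alternate server
# (dump destination and target path are assumptions for illustration)
UDUMP=/u01/app/oracle/admin/PROD1/udump
TRC=$(ls -t $UDUMP/*.trc | head -1)
scp "$TRC" oracle@recovhost:/tmp/controlFile.trc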
Now we can perform the recovery operation. To begin, we manipulate
the control file so we can start a new Oracle instance. We wrote
a file containing some sed commands:
1 s/^#/--/1
2 s/^CREATE CONTROLFILE REUSE DATABASE "PROD1" RESETLOGS \
ARCHIVELOG/CREATE CONTROLFILE REUSE SET DATABASE \"REC1\" \
RESETLOGS NOARCHIVELOG/
3 s/\/u03\//\/u03prod\//g
4 s/\/u05\//\/u05prod\//g
5 s/\/u06\//\/u06prod\//g
6 s/\/u07\//\/u07prod\//g
7 s/\/u08\//\/u08prod\//g
8 /^VARIABLE RECNO NUMBER;/d
9 /^EXECUTE :RECNO :=SYS.DBMS_BACKUP_RESTORE.SETCONFIG \
('CONTROLFILE AUTOBACKUP','ON');/d
10 /^EXECUTE :RECNO := \
SYS.DBMS_BACKUP_RESTORE.SETCONFIG('BACKUP OPTIMIZATION','ON');/d
11 /^RECOVER DATABASE USING BACKUP CONTROLFILE/d
Line 1 converts the comment lines (those beginning with #) into
SQL-style comments so sqlplus will ignore them. Line 2 seeks the
line beginning with:
CREATE CONTROLFILE REUSE DATABASE "PROD1" RESETLOGS ARCHIVELOG
and replace it with:
CREATE CONTROLFILE REUSE SET DATABASE "REC1" RESETLOGS NOARCHIVELOG
This effectively renames the new Oracle instance to REC1. Lines 3
to 7 replace references to the production mount points with the
snapshot mount points. Lines 8 to 11 remove references to Oracle
backup utilities that are configured on the production instance.
We apply the sed commands as follows:
sed -f controlFile.sed controlFile.trc > createREC1.sql
Next, we create an Oracle initialization file using the production
initialization file, which was transferred with the copies of production
data files. We can find the file in the $ORACLE_HOME/dbs location.
It is usually named initSID.ora, where "SID" is the Oracle instance
name. We apply another sed script:
1 s/^db_name = PROD1/db_name = REC1/1
2 s/\/u01\//\/u01prod\//g
3 s/\/u03\//\/u03prod\//g
4 s/\/u05\//\/u05prod\//g
5 s/\/u06\//\/u06prod\//g
6 s/\/u07\//\/u07prod\//g
7 s/\/u08\//\/u08prod\//g
8 /^LOG_ARCHIVE_START=TRUE/d
9 /^LOG_ARCHIVE_FORMAT=%t_%s.arc/d
10 /^LOG_ARCHIVE_DEST=\/oraarch\/PROD1\/arch/d
11 /^RECOVER DATABASE USING BACKUP CONTROLFILE/d
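We apply this script the same way as the control-file script; the
names initOra.sed and initPROD1.ora below are our own illustrative
choices:
sed -f initOra.sed initPROD1.ora > $ORACLE_HOME/dbs/initREC1.ora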
These sed commands alter the file so that, when Oracle is started,
it will be content to call itself REC1 instead of PROD1. You can also see
how we are referencing the snapshot mount points as above and removing
some unnecessary lines from the file. Save the new file as initREC1.ora
and leave it in the $ORACLE_HOME/dbs location. Now we are ready to
do the recovery:
sqlplus "/ as sysdba" @createREC1.sql > createREC1.log
If everything works, we will have a brand-new Oracle instance on the
host called REC1; it has the exact contents of the production system
just before it was brought down. The true value, of course, is the
act of recovering the database and the logs and errors that may be
produced. The most important file produced from the recovery is the
Oracle alert log, which can be found in $ORACLE_HOME/bdump/alert_REC1.log.
In this file, you find all kinds of interesting information about
your new instance and, most importantly, why it failed to initialize
if something went wrong. You will also find clues in the file
createREC1.log. Process each of them with grep; for example:
grep ORA createREC1.log | uniq
The use of uniq will take care of repeated errors. The above command
provides telling details about why the recovery might have failed;
otherwise, it provides a crystal-clear sign that the recovery was
successful and the backup has an excellent chance of rescuing the
production system if necessary. The entire recovery operation takes
only 4 to 5 minutes to execute, but by the time it is finished,
there can be no question about the validity of the backup image. We
run the auto-recovery as an integrated component of the full system
backup every single time it runs.
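Tied together, the tail end of an auto-recovery run might look
roughly like the sketch below. The mail address is a placeholder,
ORACLE_SID=REC1 reflects the renamed instance, and we narrow the
grep pattern to ORA- error codes so the result can be tested
automatically:
# start the renamed instance against the snapshot copy
# (assumes initREC1.ora is in place and ORACLE_HOME is set)
export ORACLE_SID=REC1
sqlplus "/ as sysdba" @createREC1.sql > createREC1.log

# look for ORA- errors in both the sqlplus log and the alert log
ERRS=$(grep 'ORA-' createREC1.log \
       $ORACLE_HOME/bdump/alert_REC1.log 2>/dev/null | uniq)

# dba@example.com is a placeholder notification address
if [ -n "$ERRS" ]
then
    echo "$ERRS" | mailx -s "auto-recovery of REC1 FAILED" dba@example.com
    exit 1
fi
echo "backup verified" | mailx -s "auto-recovery of REC1 succeeded" dba@example.com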
Backup solutions have a tendency to work fine for the first few
months or even years, and then fail at seemingly the most inopportune
moment. Auto-recovery is an independent verification run every single
time a backup is taken. This can potentially save your company from
catastrophic failure and data loss. And Unix systems administrators
can get a good night's sleep.
Evan Jenkins, Unix Administrator and IT Consultant, lives in
Edmonton, Alberta, Canada with Dianne, Max, and Jack. He can be
reached at: evan.jenkins@telus.net.
Cameron Fischer, SAN and Unix Administrator, lives in Edmonton,
Alberta, Canada with Heather. He can be reached at: cam@fischer.ca. |