Contingency Planning: Lessons Learned from the 9/11 Tragedy
Lisa M. Jaworski
The terrorist attacks that occurred on September 11, 2001, resulted
in terrible human loss. Additionally, many buildings were destroyed
or damaged to the extent that they had to be condemned. From an
Information Technology (IT) perspective, networks were brought down,
equipment and cabling were obliterated, and on-site and local backup
tapes were destroyed. Because of the lengthy, ensuing chaos in the
local area, it was very difficult for businesses, whose key IT functions
were disabled, to bring disaster recovery personnel into the area.
An unknown number of users lost Internet connectivity because their
Internet Service Providers (ISPs) had points of presence in the
World Trade Center [14].
Most of the information published on the Internet about contingency
planning focuses on the overall process. Plan templates are available
and, although a few articles on lessons learned from the events
of 9/11 appear, they do not focus on practical plan development
and implementation issues. They rehash what is already considered
to be industry best practices. Other related articles focus on the
need for insurance. For example, event insurance for invitation-only
events such as corporate conferences should be considered [5]. Proper
insurance coverage is certainly necessary, but contingency planning
lessons must be much wider in scope.
This article documents as many of the practical, hands-on lessons
learned as possible. Much of this information was made available
through personal conversations with senior and mid-level IT managers
responsible for contingency planning in their respective organizations.
Both government and private perspectives were gathered.
Lessons Learned
The following paragraphs discuss contingency planning concerns
that were not generally considered part of industry best practices
before the 9/11 tragedy. The order in which these concerns are presented
is not indicative of their importance. This article assumes that
existing best practices related to contingency planning are already
in place (e.g., those published by the National Institute of Standards
and Technology (NIST) and others [2, 6, 11, 12]). Some examples
of best practices are the availability of detailed, written procedures
for configuring servers, up-to-date call lists, and an established
order of priority for restoring systems.
Locate Backup Facilities in Different Geographic Regions
It might seem obvious that backup facilities should be located
in a different geographic region than the primary facility, but
this was evidently not considered standard practice before 9/11.
NetworkWorld Fusion reported that although Genuity had a backup facility
at the time of the tragedy, it was located only 12 miles away from
the primary facility [14]. Such close proximity invites vulnerability,
depending on the nature of the disaster scenario. A disaster that
affects the primary site could easily affect the backup site. For
this reason, international service provider Equant maintains Network
Operation Centers (NOCs) in Reston, VA; Paris, France; and Singapore
[14].
Although the federal government now wants a minimum distance of
300 miles between primary and backup facilities, some agencies are
resisting this idea because of the high cost associated with establishing
a geographically removed site. Note that regional data centers require
their own contingency plans and these must be coordinated with overall
enterprise-level plans [6]. To preclude the possibility of a directed
attack, either network- or physically based, some government agencies
and companies (e.g., AT&T) do not publicly reveal the location
of their backup sites [14].
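The separation guideline above can be checked mechanically when candidate sites are evaluated. The following is a minimal sketch, assuming site locations are available as latitude and longitude pairs; the function names and the sample coordinates (approximate city centers, not any organization's actual facilities) are illustrative assumptions. It uses the standard haversine formula for great-circle distance and takes the 300-mile federal figure as the default threshold.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius in statute miles


def great_circle_miles(lat1, lon1, lat2, lon2):
    """Return the haversine great-circle distance between two points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def meets_separation(primary, backup, minimum_miles=300.0):
    """Return True if the backup site is at least minimum_miles from the primary."""
    return great_circle_miles(*primary, *backup) >= minimum_miles


# Hypothetical sites: approximate city-center coordinates, for illustration only.
PRIMARY_SITE = (40.71, -74.01)   # New York, NY
NEARBY_SITE = (40.74, -74.17)    # Newark, NJ
DISTANT_SITE = (41.88, -87.63)   # Chicago, IL

if __name__ == "__main__":
    for name, site in (("nearby", NEARBY_SITE), ("distant", DISTANT_SITE)):
        miles = great_circle_miles(*PRIMARY_SITE, *site)
        print(f"{name}: {miles:.0f} miles, acceptable={meets_separation(PRIMARY_SITE, site)}")
```

Straight-line distance is only a first filter. Two sites that pass this check may still share a power grid, flood plain, or carrier, so the result should be read alongside the other concerns discussed in this article.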
Organize an Alternate, Geographically Separated Response Team
Quite often, organizations will identify candidates for disaster
recovery teams from among the individuals who live in the same immediate
geographic area as the facility or facilities for which the plan
is written and who already work for, or support, that facility.
This certainly makes good common sense; however, if there is widespread
death resulting from a catastrophic event in this location, the
people comprising the response teams may very well be among those
who are killed or disabled.
Disruptions or gaps in the regular chain of command should be
anticipated [8]. Even if this worst-case scenario does not occur,
the on-site workers may not be mentally ready to do their jobs,
as was the case for many people in the Ground Zero area [3, 9].
Organizations should establish two teams of individuals to fill
these roles -- one in the immediate vicinity and the other located
at least 100 miles away from the primary facility. The second team would
be different from one that would be in place at an alternate hot,
warm, or cold site. Its function would be identical
to that of the people who would normally be on-site at the primary
facility.
The idea is for the second team to be located close enough
so they can travel to the facility without undue difficulty, yet
far enough away that they are unlikely to be victims of the catastrophe.
Training should be identical for both teams and both should be involved
in the annual testing of the organization's contingency-related
plans, processes, and procedures. Driving directions to the facility
to be used by the second team should be reviewed as part of the
annual test. This team, as well as the primary on-site team, should
have building and site maps that show utility shut-off points, gas
lines, exits, stairways, designated escape routes, restricted areas,
and high-value items [13]. It is imperative that these maps be tightly
held and protected as company confidential.
Identify Alternate, Geographically Separated Storage and Staging
Areas
For reasons similar to those necessitating a second response team,
organizations should identify a second local off-site storage area
for backup tapes and critical hardware items. It should be located
at least 100 miles away from the site being supported. This storage
facility should be in the same general area as the alternate local
response team. Many establishments keep backup tapes in the same
city or town as the facility at which they would be used.
This practice should be continued because close proximity of the
backups to the facility will allow a faster recovery time. However,
if a catastrophic event is so widespread that several square miles
in the city or town are affected, the local off-site storage facility
might also be affected. This was the case with Verizon after the
9/11 attacks [7]. MasterCard stated after 9/11 that it would locate
backups within three to four hours driving range [9]. If a private
corporation is highly IT dependent and its primary facility and
all of its backups are destroyed, it is highly likely that the firm
will be forced to go out of business.
If the disaster scenario necessitates the activation of the alternate
response team, as discussed above, then it logically follows that
this team will need a local staging area. Assuming that the alternate
team will be driving to the primary site, they will need an area
to assemble equipment and load it into their vehicles. For convenience,
the staging area should be close to the alternate local off-site
storage facility. Depending on the nature of the emergency, the
team may need to stay at the primary facility for an extended time
period, so they will need to bring extra luggage for clothes and
possibly food, water, and temporary shelter-related items.
The team should bring everything to the staging area and check
each item off a list before loading it. AT&T reported that staff
drove to New York from as far away as Jacksonville, Florida [14].
Kemper Casualty Company had computer and telephone technicians drive
from Illinois to New Jersey with tools and equipment such as telephone
switch parts and server components [8]. If the IT budget permits,
organizations should consider employing mobile
backup facilities [13].
Establish a Senior Management Help Desk
Quickly establishing a senior management help desk after a disaster
is important for several reasons. First, it provides a fallback
set of managers that the disaster recovery team can contact in the
event that key local managers or recovery team members are missing
from the recovery management chain, as documented in the contingency
plan, due to injury or death. Second, such a help desk facilitates
quick decision making, which is needed in a crisis. For instance,
being able to get fast management approval on purchasing requests
can be important to meeting established recovery time frames. This
help desk would also serve as the focal point for the media. Finally,
this help desk serves to keep the recovery teams focused on the
established recovery priorities.
Even the very best contingency plan will not address every issue
that can come up during an emergency. This is the nature of Murphy's
Law. The senior management help desk will be able to review proposed
solutions to such issues and ensure that they are consistent with
the other recovery efforts. Note that this is actually a lesson
learned from the United States House of Representatives after the
anthrax attacks, rather than the 9/11 tragedy.
Identify Alternate Vendors and Shipping Routes
Companies should establish "quick ship" contracts with
vendors for product and equipment replacements in the event of an
emergency [3]. They should also anticipate that vendors may be saturated
in an emergency situation, so they should identify alternate vendors
and establish "as needed" contracts with them. Vendor
contact information and contract numbers must be maintained. If
a small local vendor is used, organizations should inquire as to
their contingency preparations. It should be expected that a vendor
would divulge information only if a nondisclosure agreement (NDA)
is in place. It is quite possible that the same disaster scenarios
will affect these companies, just as they affect the organizations
they support.
Organizations also need to prepare a prioritized list of the equipment
types and vendors they use to support each equipment category. According
to NetworkWorld Fusion, some industry analysts say that up to 60%
of an organization's critical data is stored on individual
laptops and desktops [13]. After 9/11, SunGard confirmed that peripherals
such as printers were commonly overlooked [3]. The lack of certain
equipment, however mundane (e.g., copiers), can affect productivity.
Kemper Casualty Company stated that they were able to get immediate
delivery of 100 laptops and 200 monitors from their vendor, along
with printers and fax machines [8]. In the first three weeks after
9/11, Compaq shipped 2,500 PCs to Lehman Brothers [4].
Assess Locations of Carrier Circuits
The potential for carrier failures should be addressed in an organization's
overall network strategy as well as its contingency plan. Equant
lost connectivity to Canada after the 9/11 attacks because the circuit
they had purchased for network diversity reasons, although it connected
to Toronto and Montreal, was routed through Wall Street
[14]. Equant stated that they did not know this was the case before
9/11. Equant has since reexamined local connectivity for all of
its networks [14].
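A review of this kind can be partly automated once carriers supply the physical routing of each circuit. The following is a minimal sketch, assuming each circuit is recorded as an ordered list of the points it traverses; the function name, circuit names, and point names are hypothetical and do not describe Equant's actual topology. It reports any transit point that two supposedly diverse circuits have in common.

```python
from itertools import combinations


def shared_transit_points(circuits):
    """Map each overlapping pair of circuit names to the transit points they share.

    circuits: dict of circuit name -> ordered list of points traversed.
    The first and last entries of each path are treated as endpoints and
    ignored, since diverse circuits between the same two sites must share them.
    """
    overlaps = {}
    for (name_a, path_a), (name_b, path_b) in combinations(circuits.items(), 2):
        common = set(path_a[1:-1]) & set(path_b[1:-1])
        if common:
            overlaps[(name_a, name_b)] = common
    return overlaps


# Hypothetical routing data, for illustration only.
EXAMPLE_CIRCUITS = {
    "primary": ["hq", "pop-wall-street", "pop-toronto", "canada-office"],
    "diverse": ["hq", "pop-midtown", "pop-wall-street", "pop-montreal", "canada-office"],
}

if __name__ == "__main__":
    for pair, points in shared_transit_points(EXAMPLE_CIRCUITS).items():
        print(f"{pair[0]} and {pair[1]} share: {sorted(points)}")
```

The check is only as good as the routing data behind it. Carriers may reroute circuits over time, so the data would need to be refreshed and the comparison repeated as part of regular plan testing.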
Redundancy and Data Mirroring Are Crucial to High-Availability
Systems
Redundancy is key to fast recovery time for high-availability systems,
assuming that an organization's IT budget can accommodate
its high cost. Lehman Brothers was able to recover
its IT capabilities quickly after the 9/11 attacks because they
had redundant networks in both Manhattan and New Jersey [4]. "Every
application that ran in New York also ran in New Jersey. All wide-area
links were completely duplicated." They had lost access to
everything in New York, but were able to access all of their other
branches through New Jersey [4].
CBS Marketwatch.com has said that the 9/11 property loss changed
their views on data replication [3]. "The data for their live
tickers used to flow into one data center. Now it flows into all
three." Southwestern Bell has installed SONET rings and multiple
OC-12 links for redundancy [9]. Upon post-9/11 review of its contingency
plans, MasterCard expects to expand significantly its use of data mirroring
(real-time data duplication) [9]. Their goal is to
reduce recovery time from 24 hours to two hours [9].
Establish an Emergency Communication Plan
Telephone service was down in many areas on the day of the tragedy,
either because wires were down or because service requests saturated
the system, as was the case with cell phones. Many organizations
reported that BlackBerry wireless devices were their only effective
means of communicating for several days following the 9/11 tragedy.
Bob Schwartz, managing director and Chief Technology Officer for
Lehman Brothers during the tragedy, said that as he went down the
stairs in Tower One, all he had to activate the disaster recovery
plan and alert other managers was his BlackBerry pager [4].
Organizations need to consider how key members of the management
staff and the response team could communicate if phones are out
and the email system is unavailable. BlackBerries have proven to
be an effective alternative, but organizations should continually
monitor the marketplace to identify other options, too. The main
point is that organizations cannot rely on phones and email only.
At this time, it is unclear whether BlackBerries would become less
effective as a means of emergency communication as the market saturates.
It should be noted that the BlackBerry uses a store-and-forward
communication method; it keeps trying to send a message even if
the sender is not continually pressing the send button. Whatever
type of device is selected, inter- and intra-team communications
with it should be tested as part of the overall contingency plan
testing process.
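The store-and-forward idea can be illustrated with a short, generic sketch. It is not the BlackBerry protocol; the class name, its methods, and the transmit callback are assumptions made for illustration. Messages are held in a queue and sending is retried in order, so nothing is lost while the link is down and the sender does not have to resubmit anything.

```python
from collections import deque


class StoreAndForwardQueue:
    """Hold outgoing messages until the link accepts them (illustrative only)."""

    def __init__(self, transmit):
        # transmit is a caller-supplied function: message -> True if delivered.
        self._transmit = transmit
        self._pending = deque()

    def submit(self, message):
        """Store a message; the sender's work is done at this point."""
        self._pending.append(message)

    def flush(self):
        """Try to deliver stored messages in order; stop at the first failure.

        Returns the number of messages delivered on this attempt.
        """
        delivered = 0
        while self._pending:
            if not self._transmit(self._pending[0]):
                break  # link still down; keep the message for the next attempt
            self._pending.popleft()
            delivered += 1
        return delivered

    def pending(self):
        """Return the messages still waiting to be delivered."""
        return list(self._pending)
```

In use, flush would be called on a timer or whenever the device regains coverage. A message leaves the queue only after the transmit function reports success, which is what lets delivery survive intermittent outages.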
Minimally, senior managers and all members of the disaster recovery
teams should be assigned some sort of backup communication device.
Senior managers in the Human Resources Department should be included
on the distribution list for these devices so they can begin making
arrangements for counseling and other forms of employee assistance
in the wake of a disaster or catastrophic event. Human Resources
can assist with contingency plan formulation by identifying resources
in the local community, such as public safety and utilities, as
well as a list of mental health professionals who can assist with
post-disaster counseling.
Utilization of local hotels as emergency work areas should not
be overlooked. Lehman Brothers used the Sheraton Manhattan to give their
people a place to work [4]. The hotel's ballroom became an
IT hub with Virtual Private Network (VPN) connectivity to New Jersey.
Human Resources can also distribute wallet-size cards listing emergency phone
numbers for personnel and their families and instruct employees
on the need for personal emergency plans and kits, as described
below.
Do Not Assume Air Transportation Will Be Available
All U.S. planes were grounded for several days after the 9/11
attacks and international planes attempting to enter U.S. airspace
were turned away. Many organizations' contingency plans relied
on the assumption that airlines would be operational to transport
both people and equipment, as needed. MasterCard is one firm that
had assumed this, and they reportedly are now updating contingency
plans to account for a potential lack of air travel [9]. Other means
of transportation such as rail and trucking lines were not identified,
much less contracted.
This issue also pertains to vendor shipping routes. Long-distance
driving directions for key organizational managers and disaster
recovery team members were typically not available or, if they were,
they were not verified as part of the contingency plan testing process.
The lesson here is that multiple means of transportation need to
be identified and contracted for on an as-needed, emergency basis.
Identify a Meeting Place Away from Your Facility
Many businesses that operated in the World Trade Center had established
meeting areas at which personnel would gather after evacuating their
offices so that managers could take a head count of work force members
and subsequently alert authorities if anyone was missing. In many
cases, the meeting spot was in the basement of the building. It
would appear that the idea of the whole building collapsing did
not seem significant but, unfortunately, we now know that this is
a viable scenario. Emergency meeting spots should be at least a
block or two away from the office building to preclude people from
being injured in a structural collapse.
Conduct Monthly Evacuation Drills
Evacuation drills, often referred to as fire drills, should be
conducted periodically, and the feasibility of the assigned meeting
spot(s) should be evaluated after such drills. Drills should be
both announced and unannounced. As an organization grows, additional
meeting spots may be needed. It is suggested that announced drills
be conducted monthly. In his book The Myth of Homeland Security,
Marcus Ranum states that the discipline of constant drills will
stand you in good stead even in a completely different kind of emergency
[15].
Everyone Needs a Personal Emergency Plan
In the confusion after the attacks, many parents could not get
to their children's daycare centers to pick up their kids.
Because phone service was largely unavailable, the parents could
not call each other to see if one of them had the kids. To help
reduce these types of uncertainties, each household should have
a personal emergency plan [1]. Such a plan would identify a meeting
spot outside of the house; the location of the emergency bug-out kit; the location
of will, trust documents, and other important papers; copies of
medical prescriptions; lists of each person's allergies; and
anything else that would be needed in an emergency.
An out-of-state third party should be identified to serve as a
message relay center for the family members. If cell phones are
down, a parent could get on a land line and leave messages with
this third party. Each adult family member should have a will, which
should identify guardians for the children in the event that both
parents are killed, as well as a durable power of attorney for healthcare
and one for financial decisions. Each family member should also have
a wallet-sized card that has all important phone numbers on it as
well as current photographs of the other family members.
All Personnel Need Emergency Kits
Many survivors of the 9/11 tragedy were horribly burned and, at
the time of this writing, some continue to face additional surgeries
and rehabilitation therapy. Organizations must educate employees
on the need to put together a portable emergency kit
that they can keep in their work areas. Such a kit should contain
a personal-sized fire extinguisher, fire blanket, flashlight and
batteries, an escape ladder, safety glasses, bottle of water, mask
to help prevent smoke inhalation injuries, detailed street map of
the city or town, first aid kit, and other appropriate items. These
should all fit into a large tote bag that a person can grab and
run with.
As part of their responsibility in this area, organizations should
hold quarterly seminars that discuss building evacuation routes,
identify primary and alternate meeting spots, warn employees against
using elevators in an emergency, particularly one involving fire,
and provide contact information for employees' loved ones in
the event of an emergency. Identification of evacuation routes is
especially important in skyscrapers.
Conclusion
Perhaps the greatest lesson that the 9/11 tragedy taught us is
not to underestimate the threats to our nation or our people. The
idea of initiating attacks on buildings using airplanes as weapons
was known long before 2001; however, most people thought the notion
so outlandish that they believed it could never happen. It has happened,
and we must learn from this experience. SunGard, a leading business
continuity services firm, has stated that the main problem they
saw after 9/11 with companies' contingency plans was that the
scope had not been completely or accurately defined [3]. This certainly
lends credence to the need to expect the unexpected. We must take
this lesson to heart.
As a final note, the idea of distributing lethal pathogens such
as anthrax by mail, which occurred shortly after 9/11, was once
considered as improbable as using airplanes to perpetrate physical
attacks. Thus, in addition to the common sense advice presented
in this article, organizations should also identify and address
scenarios targeting key personnel as part of the contingency planning
process. Experts say that contingency plans should include remote
management in case of biological attack [3].
Dedication
This article is dedicated to the many heroic men and women who
died on September 11, 2001.
References
[1] Department of Homeland Security. 2004. Emergencies and Disasters.
Washington, DC: Department of Homeland Security. Published on the
Internet at: http://www.dhs.gov/dhspublic/.
[2] Department of Justice. August 21, 2001. Department of Justice
Contingency Planning Template Instructions. Gaithersburg, MD: NIST.
Published on the Internet at: http://csrc.nist.gov/fasp/FASPDocs/contingency-plan/contingencyplan-template-instructions.doc.
[3] Fontana, John & Connor, Deni. November 26, 2001. Disaster
Recovery Then and Now. NetworkWorld Fusion. Published on the Internet
at: http://www.nwfusion.com/research/2001/1126featside1.html.
[4] Gaudin, Sharon. November 26, 2001. Lehman Brothers' Network
Survives. NetworkWorld Fusion. Published on the Internet at: http://www.nwfusion.com/research/2001/1126feat.html.
[5] Houston, Carey. 2004. Lessons Learned. Calgary, Canada: PRIMEDIA
Business Magazines & Media, Inc. Published on the Internet at:
http://technologymeetings.com.
[6] Legato Systems, Inc. February 1, 2002. The New Art of Business
Continuance Planning: Lessons Learned in a Changed World. Framingham,
MA: CXO Media, Inc. Published on the Internet at: http://www.cio.com/sponsors/020102legato/.
[7] Leung, Linda. May 13, 2002. Prepare For Emergency. NetworkWorld
Fusion. Published on the Internet at: http://www.nwfusion.com/research/2002/0513man.html.
[8] MacSweeney, Greg. September 11, 2002. One Year Later, 9/11
Disaster Recovery Memories Still Fresh. Insurance & Technology
Online. Published on the Internet at: http://www.insurancetech.com.
[9] Messmer, Ellen. December 2, 2002. MasterCard Factors 9/11
into Disaster-Recovery Plan. NetworkWorld Fusion. Published on the
Internet at: http://www.nwfusion.com/news/2002/1202mastercard.html.
[10] NetworkWorld, Inc. Undated. Disaster Recovery. NetworkWorld
Fusion. Published on the Internet at: http://www.nwfusion.com/research/disasterrecov.html.
[11] NIST. June 2002. Contingency Planning Guide for Information
Technology Systems. NIST Special Publication 800-34. Gaithersburg,
MD: NIST.
[12] NIST. June 2002. Contingency Planning Guide For Information
Technology Systems. Elizabeth B. Lennon (Editor). Gaithersburg,
MD: NIST Information Technology Laboratory. Published on the Internet
at: http://csrc.nist.gov/publications/nistbul/itl06-02.txt.
[13] Ohlson, Kathleen. November 26, 2001. Planning for the Worst:
Bring in the Best. NetworkWorld Fusion. Published on the Internet
at: http://www.nwfusion.com/research/2001/1126featside5.html.
[14] Pappalardo, Denise & Marsan, Carolyn Duffy. November
26, 2001. How Ready Are the Nation's Networks? NetworkWorld
Fusion. Published on the Internet at: http://www.nwfusion.com/research/2001/1126featside3.html.
[15] Ranum, Marcus J. 2004. The Myth of Homeland Security.
Indianapolis, IN: Wiley Publishing, Inc.
An employee of Science Applications International Corporation
(SAIC), Lisa Jaworski has more than 20 years of security engineering
experience on commercial and government projects. She is a key player
in the development of SAIC's standardized approach to Critical Infrastructure
Protection (CIP). She is also an expert in Health Insurance Portability
and Accountability Act (HIPAA) security and privacy requirements.
She is one of the authors of NIST's Computer Security Handbook and
she was part of the team that connected the White House to the
Internet. Per Government invitation, she has spoken on information
warfare at FedCIRC. She can be contacted at: jaworskil@saic.com.