Enter the Storage Administrator
Greg (shoe) Schuweiler
The tasks of a systems administrator's job are changing.
In small and large environments, systems administrators used to
spend their days (and nights) managing any number of servers with
locally attached disks with an occasional timeout for a game of
X-Pilot, Doom, etc. We tuned the servers, managed the disks and
the data, and backed up the data to locally attached tape drives.
The failure of a disk usually meant an outage while the failed
disk was replaced and formatted, file systems were built, and the data was
restored from tape. The outage was a fact of life that both the
users and administrators accepted. These were not always the good
old days. I do not miss trying to justify to management a 540-MB
hard drive that cost $3500. I do miss running scripts via cron that
automatically sent an email stating the top five disk hogs to everyone
with an account on the system and letting peer pressure do the rest.
Alas, one must be a gentler systems administrator nowadays. Sigh.
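For the curious, the disk-hog report can be sketched in a few lines of Python. This is a minimal illustration, not the original script: it assumes one directory per user under a common root, and the function names are made up for this example. Scheduling it from cron and mailing the result are left out.

```python
import os


def usage_by_directory(root):
    """Return {name: total bytes} for each directory directly under root."""
    totals = {}
    for entry in os.scandir(root):
        if not entry.is_dir(follow_symlinks=False):
            continue
        totals[entry.name] = 0
        for dirpath, _dirnames, filenames in os.walk(entry.path):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    totals[entry.name] += os.lstat(path).st_size
                except OSError:
                    # The file vanished or is unreadable; skip it.
                    continue
    return totals


def top_disk_hogs(usage, count=5):
    """Return the `count` largest (name, bytes) pairs, biggest first."""
    return sorted(usage.items(), key=lambda item: (-item[1], item[0]))[:count]


def format_report(hogs):
    """Build the plain-text body of the report."""
    lines = ["Top disk hogs:"]
    for rank, (name, size) in enumerate(hogs, start=1):
        lines.append(f"{rank}. {name}: {size / 1_000_000:.1f} MB")
    return "\n".join(lines)
```

Calling format_report(top_disk_hogs(usage_by_directory(root))) on the directory that holds the home directories yields the text that would go out to everyone.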
Now I have a filing cabinet drawer full of 4.5-GB, 9-GB, 18-GB,
and some 36-GB SCSI drives that are perfectly serviceable but too
small to be used in production. I just cannot bear to part with
them. We now purchase wads of storage that are racked into datacenters
in 0.5 and 1-TB clumps (a techie term) or disk arrays that are monolithic
blobs (another techie term) that snap together like Lego building
blocks and add storage by the tens of terabytes.
Our businesses' quest for acquiring data makes a salmon's
run upstream to its spawning place look like a Sunday stroll around
a pond. Because of regulatory compliance reasons, business policies,
or both, data is being stored longer and more copies are being kept.
We are slicing, dicing, reassembling, and stuffing the results into
databases of sizes unimaginable just 10 years ago. Queries and processing
against this data take place 24x7, shrinking our backup windows
from all night to nothing.
A good part of a sys admin's time is now spent, if not directly
managing storage, answering requests for more space. As server hardware
and operating systems improve in reliability, the management of
these servers will soon be superseded by the management of storage
and the data that is on that storage.
Systems administrators are trying to keep up with servers and network-attached
storage; equipment such as switches, GBICs, and arrays; operating system
changes, firmware levels, boot PROMs, virtualization, consolidation,
user demands, and scores of other things that are involved in the
data acquisition, processing, and storage environment. While we
are overwhelmed with requests from customers and managers, our management
and executives are inundated with ads and sales people and are making
decisions that will ultimately affect us and the systems for which
we're responsible.
The time has come for us to look seriously at a whole new career
path that is opening up in the world of systems administration.
This new path is the management of storage and the data contained
on it. I propose we name this position right up front. Let's
call it "System Storage Administrator" or maybe "System
Storage Architect". Either will do, but I am betting that the
latter will get you at least $6K more a year and possibly an office
(it might be next to the furnace room but, hey, won't Mom be
proud).
So, one of the first things on your agenda should be to convince
management, all the way up to the CIO, that a dedicated storage expert
or team is necessary to ensure business continuity. Depending on
the size of your organization and the number of levels between you
and the CIO, this idea may become someone else's along the
way. Don't fret, because in this case the end justifies the
means. It may be poor judgment on my part, but I am going to assume
here that various levels of management have already heard from vendors
or elsewhere of the concept of storage virtualization, of the savings
from storage consolidation, and a host of reasons why they (the
vendor) can build your business a Stepford Storage Area Network.
If they haven't, then this will be another task on your list
of things to do.
Most of the discussions you will have about forming a storage
administration position or team with your management should be centered
around two things:
1. Saving the business money in the long run
2. Providing a high level of data reliability (i.e., ensuring
that the data will be there when it needs to be)
To do this, start by running a bit of history past them. For much
of the history of computing, storage has been seen as an intrinsic
part of a computer system. It was regarded as a "peripheral".
In more recent years, it has become thought of as a storage subsystem
but is still uniquely associated with a computer. Exceptions are
the mainframe and some computer clusters where a modest number of
cooperating computer systems share a common set of storage devices.
As companies have become more dependent upon computing, they have
also become more dependent upon data, the life-blood of computing.
While a failed processor can readily be replaced and operations
continued almost immediately after the replacement, a failed storage
resource requires replacement followed by typically time-consuming
restoration of data. This restoration all too often involves some
loss of recent changes to that data that requires recovery action
before operations can continue. As a result, storage and the disciplines
of caring for data and the storage on which it resides have grown
in visibility and importance.
Additionally, the fraction of the purchase price of a computer
system that is represented by the storage component has grown to
the point that now the storage component of a computer system is
often in the vicinity of half or more of the total price. Beyond
the purchase price of storage, the total cost of owning storage
has become a significant part of the cost of maintaining the computing
environment. In other words, the acquisition cost is a small portion
of the total cost of ownership of storage over the lifetime of the
storage.
Computing environments necessarily have grown as we have become
increasingly reliant on computing. Thus, the computer systems
we manage have grown in size and in number. Because the traditional
computing model associates storage uniquely with a computer system,
a computing environment with many computer systems has many storage
and storage management environments to maintain and operate --
one per system.
Responding to these trends, the IT community has begun to view
storage as a resource that should be purchased and managed independently
from the computer systems that it serves. The IT community has also
increasingly come to view storage as a resource that should be shared
among computer systems. These changes allow more focused attention
on storage, which leads to reduced costs, higher levels of service,
and more flexibility through the sharing of the storage resource.
This, in turn, allows a storage administrator or storage team to
provide improved quality and response time as business needs change.
As the storage system for a computing environment becomes a shared,
independent resource, additional requirements emerge:
- Reliability -- As required of any large, shared critical
resource.
- Scalability -- To match the size, performance, and physical
and geographic placement of computing environments.
- Manageability -- To provide high levels of service and
achieve the expected reduction in operational expenses.
- Standards-based interoperation -- To avoid excessive vendor
dependence in a large, critical component of data centers.
When all these aspects are considered, a structure emerges that
achieves the goals of reliability, scalability, and manageability.
That structure is a storage system comprising many computer systems
that are the consumers of the storage system, many storage devices,
and extensive management capabilities. These systems are richly
interconnected and demand high performance. These are the characteristics
of a shared storage environment, and these are the benefits that
a storage administrator or team can bring to the business.
If I have convinced you, and you in turn have convinced management
that a storage administrator or team is needed, the next step is
to determine that storage administrator's responsibilities.
This administrator should have three primary goals that are intertwined
with each other:
- Ensuring data protection (including security)
- Reducing total cost of ownership (TCO) through optimization
- Ensuring data availability
Loss of data in any business should be considered intolerable.
Depending on the importance and the content of the data, the storage
team will need to work with the application owners, legal departments,
and even, dare I say it, marketing to determine the level of protection
that different types of data will require. Data critical to the
operation of the business may be on RAID 5 with replication to a remote
datacenter, whereas views from databases that can be rebuilt may
be on RAID 0 for quick access. Additionally, the data must be protected
from accidental or malicious access by host systems or individuals
not permitted to view it.
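One way to keep those agreements from living only in meeting notes is to write them down as a small lookup table. The Python sketch below is only an illustration: the data-class names and the ProtectionPolicy type are hypothetical, and the two entries simply mirror the RAID 5 and RAID 0 examples above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectionPolicy:
    raid_level: int
    remote_replication: bool


# Hypothetical data classes; a real table would come out of the
# discussions with application owners and the legal department.
POLICIES = {
    "business_critical": ProtectionPolicy(raid_level=5, remote_replication=True),
    "rebuildable_view": ProtectionPolicy(raid_level=0, remote_replication=False),
}


def policy_for(data_class):
    """Look up the agreed protection level, refusing to guess for unknown classes."""
    try:
        return POLICIES[data_class]
    except KeyError:
        raise ValueError(
            f"no protection policy agreed for data class {data_class!r}"
        ) from None
```

Raising an error for an unlisted class is a deliberate choice in this sketch: data nobody has classified should prompt a conversation, not silently land on a default.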
The storage team will reduce the total cost of ownership by optimizing
hardware utilization, backup software, and the management and management
software of the storage network components. Because
the storage team's concern is storage only, they can remain
neutral toward computer system vendors. This will allow the OS and
storage administrators to concentrate on their core areas of responsibility.
The availability of the data must meet the needs of the business.
Storage administrators will need to work with systems administrators
to provide the required data availability for the applications and
departments as needed to provide business continuity.
So, now that we have a new title, what exactly should be the job
description? Where should the duties between systems administrator
and storage architect be divided? What should the storage administrator
be in charge of? Depending on the makeup of your organization, there
will be some grey areas of responsibility among the systems, storage,
and network administrator teams or individuals. Here is a suggested
checklist for the storage administrators:
- Backup, recovery, and disaster recovery (DR) schemes, schedules,
and test plans
- Backup servers, backup managers, APIs resident on slave servers
- Enterprise storage arrays, Gigabit Interface Converters (GBICs)
- Enterprise tape drives, libraries, Automated Tape Library
(ATL), Vtape, optical libraries
- External, SAN management software (Logical Unit Number (LUN)
mapping, management) for both in-band and out-of-band implementations
- Fibre Channel (FC) extenders, dense wavelength division multiplexing
(DWDM), and other native FC multiplexing or amplification devices
used for long-distance extension
- Fibre Channel hubs, FC switches, FC directors
- Fibre Channel-asynchronous transfer mode (FC-ATM) or other
gateway devices that use the WAN to tunnel FC or otherwise extend
FC through WAN access
- Fibre Channel cables, plant management, patch panels
- Fibre Channel/Internet Protocol (FC/IP) or Internet Small Computer
Systems Interface (iSCSI) gateway device
- Fibre Channel-Small Computer Systems Interface (FC-SCSI) bridges
- HBA utilities and software
- Host or storage data replication, snapshot, redundant arrays
of independent disks (RAID), or data mover software
- Host volume management software
- Just A Bunch Of Disks (JBOD) connected to pooled fabric architecture
- NAS appliance, NAS filer, or other general-purpose server used
for Common Internet File System (CIFS) or Network File System
(NFS) file serving
- SAN management software embedded within switches or directors
- Server Host Bus Adapter (HBA) selection
- Storage virtualization software
- Switch and director OS software
Additionally, storage administrators would be responsible for
selecting product standards, defining standard configurations, collaborating
on component architecture design principles with other teams as
they architect storage systems for new requirements, and planning
and executing projects for data storage.
After reading the above list, I can picture some administrators
climbing over their piles of antiquated hardware and manuals for
non-existent software to get out the door and start shouting that
I am a heretic. As I said previously, there will definitely be some
grey areas where administrators from different areas will need to
work together. Additionally, backups may not be part of storage
administration in your company. Some companies already have people
dedicated to backups and restores.
Where do you go from here? You might look at the Association of
Storage Network Professionals (ASNP) at http://www.asnp.org.
It's a user organization put together by users; membership
is free, and there are some great discussion forums on the Web site.
Be wary of vendors who believe that they can provide you with a
Stepford Storage Area Network. Nothing is perfect. I recommend reading
Designing Storage Area Networks by Tom Clark (Addison-Wesley).
It provides a very good introduction to this subject. White papers
from vendors are very helpful, but remember that they are almost
always vendor-specific.
Greg (shoe) Schuweiler has worked in the friendly Midwest for
the past 20 years as a consultant, an embedded software designer,
Oracle DBA, and a host of other strange titles. He has been in the
Unix systems administration area for the past 8 years. He is one
of the early joiners of the Association of Storage Network Professionals
and can be reached at: gregs@asnp.org.