Cover V13, i12

Change Control

Bob Ess

What do you mean the DNS server was being patched -- didn't you discuss that with the server team? How could the network guys be working on the core when you were updating the app server? Don't you guys talk?

Does this scenario sound familiar? As a company grows, managing change can be a challenging task. Change is necessary for a company to respond to the needs of the business and its customers. You must be able to adapt to change in order to thrive. The IT game itself has changed significantly over the past ten years -- even in just the past five years. Change can come quickly and in abundance. To manage change, you must have a formal process in place.

Why Do You Need Change Control?

Most IT shops have multiple teams involved in the support and administration of their systems. These can include teams for network, security, servers, data center, applications, Web, and telecom. Rarely does a change impact only a single group. With the continued emphasis on server consolidation and the constant pressure to reduce costs, teams are required to do more with less. As such, a single system may now host multiple applications that used to be individually hosted. The server team may think that a simple patch and reboot won't have any impact on the business, but the EDI folks will be quick to let you know that any reboot impacts the revenue stream. It is important that all changes be reviewed by members of all the teams to ensure that the perceived impact is correct and the change is appropriately planned.

Lack of team communication can also lead to too many changes happening simultaneously and, perhaps, unnecessarily. By subjecting planned outages or changes to a peer-review process, it becomes much easier to see potential conflicts and pitfalls. This also requires more complete planning from the change requestor as the change must be documented and spelled out completely for a thorough review. Proper planning has obvious benefits, including ensuring that the change requestor has done his homework and has confidence in the change outcome and ensuring that the change is thoroughly understood by other teams as well.

Symptoms of Unmanaged Change

Many shops may think there is no need for change control, especially smaller shops that tend to "roll their own" solutions. There is some merit to that argument since the smaller the shop is, the better the communication tends to be. But, if you are a small company, consider the following symptoms.

  • No history or audit trail of changes to a system -- How many times have you wished you could go back and know exactly when a particular patch was applied, when the disk was last changed out, when the operating system was upgraded? Without proper change control, it can be difficult to keep accurate asset history with regard to change events. Good troubleshooting practice dictates that you always first ask "What changed?" when trying to determine the solution to a problem.
  • Excessive or unnecessary downtime -- Systems can be subjected to unnecessary downtime if changes are not controlled. In the introduction to this article, there was a hypothetical situation where an admin was attempting to upgrade an application server while the network folks were doing work in the core. When teams use formal communication methods integrated into the change control process, everyone is aware of what is changing and when the change is planned. This allows all teams involved to better plan their changes, thus avoiding conflicts and possible unplanned downtime.
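
The conflict in the opening scenario is the kind of thing a shared change calendar can catch mechanically. The following Python sketch is only an illustration: the `PlannedChange` record, the team names, and the dates are all hypothetical, and it flags any two changes whose time windows overlap so that they can be raised at review.

```python
from dataclasses import dataclass
from datetime import datetime
from itertools import combinations


@dataclass(frozen=True)
class PlannedChange:
    """A hypothetical record for one planned change."""
    team: str
    description: str
    start: datetime
    end: datetime


def find_conflicts(changes):
    """Return every pair of changes whose time windows overlap."""
    conflicts = []
    for a, b in combinations(changes, 2):
        # Two windows overlap when each one starts before the other ends.
        if a.start < b.end and b.start < a.end:
            conflicts.append((a, b))
    return conflicts


# Illustrative data only; none of these entries come from a real schedule.
changes = [
    PlannedChange("server", "Patch DNS server",
                  datetime(2004, 12, 4, 22, 0), datetime(2004, 12, 4, 23, 0)),
    PlannedChange("network", "Work on core switches",
                  datetime(2004, 12, 4, 22, 30), datetime(2004, 12, 5, 0, 30)),
    PlannedChange("apps", "Upgrade app server",
                  datetime(2004, 12, 5, 1, 0), datetime(2004, 12, 5, 2, 0)),
]

for a, b in find_conflicts(changes):
    print(f"CONFLICT: {a.team} ({a.description}) overlaps "
          f"{b.team} ({b.description})")
```

An overlap in time is not always a real conflict, so a check like this only produces candidates for the review meeting; it does not replace the discussion.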

How to Get There

Most of the shops I have worked in have had little or no formal change control. The changes, if discussed at all, were usually communicated in a very informal manner: by email, during roundtable discussions in staff meetings, or by phone. To effectively manage change, you must map out a workable process. Once you have the process mapped, you can then automate and test it.

You can begin with something as simple as a block diagram or flow chart. When mapping internal processes, focus on the process and not the individual or team involved in the process. The process must stand alone if it is going to be automated. Tying the process to a person locks that process, and it becomes inflexible.

Think about how changes currently happen in your environment. Who needs to know about a change? How often should you make changes? What kinds of changes do you make regularly? How often do you need a major outage window? Who needs to sign off on changes (Security, Executive Management, Facilities...)? How do you notify the user community of your planned changes?

The main consideration in the change management process is establishing scheduled outage windows. The overriding factor in establishing these windows is impact to the business. You obviously want to schedule your changes during a time that presents the least amount of impact to your company's revenue stream. Negotiating these times with department heads can be a challenge, however. You will hear from some departments that no day and time is acceptable for an outage. But these arguments typically come from the same folks who won't pony up for clustered or HA systems either. Regular, predictable outage windows are necessary for system upgrades, system patching, application upgrades, network upgrades, and storage expansion. Without these windows, the infrastructure cannot expand and grow to meet the business needs.
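
Once the windows are agreed on, they can be written down in a form that a request form or script can check. The Python sketch below assumes a hypothetical weekly schedule (late Saturday evening and early Sunday morning); the `OUTAGE_WINDOWS` table and the `in_outage_window` function are illustrative names, and your own windows will come out of the negotiation described above.

```python
from datetime import datetime, time

# Hypothetical weekly windows: weekday (Monday = 0) -> (earliest start, latest end).
OUTAGE_WINDOWS = {
    5: (time(22, 0), time(23, 59)),  # Saturday, late evening
    6: (time(0, 0), time(4, 0)),     # Sunday, early morning
}


def in_outage_window(start, end):
    """Return True if a change from start to end fits inside one window.

    For simplicity, a change must begin and end on the same calendar day.
    """
    if start > end or start.date() != end.date():
        return False
    window = OUTAGE_WINDOWS.get(start.weekday())
    if window is None:
        return False
    opens, closes = window
    return opens <= start.time() and end.time() <= closes


# December 4, 2004 was a Saturday; December 7, 2004 was a Tuesday.
saturday_patch = in_outage_window(datetime(2004, 12, 4, 22, 0),
                                  datetime(2004, 12, 4, 23, 0))
tuesday_patch = in_outage_window(datetime(2004, 12, 7, 14, 0),
                                 datetime(2004, 12, 7, 15, 0))
print(saturday_patch, tuesday_patch)
```

A request that falls outside every window need not be rejected outright; it can simply be routed for the extra approvals an off-schedule outage deserves.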

Once you have established the outage windows, you must create a process for managing regular change. Consider the following outline as a simple template with which to begin:

1. Admin submits a change request through a Web interface.

2. The change description is emailed to a change coordinator.

3. All IT-related teams meet once a week with the change coordinator to discuss the requested changes.

4. The change coordinator approves or denies the request.

5. Secondary and tertiary approvals can be added as needed (security, management, etc.).

6. Approved change requests are communicated to the user community via a weekly email that summarizes the dates, times, hosts, and applications affected.

This simple template can be greatly expanded to include much more detail. See Figure 1 for a flow chart mapping an established change control process.
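
As a starting point for automation, the six steps can be modeled as a small state machine. The Python sketch below is one possible shape, not a finished tool: the `ChangeRequest` class, its field names, and the role names ("coordinator", "security") are assumptions made for illustration, and the Web form and email delivery are left out.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    SUBMITTED = "submitted"
    APPROVED = "approved"
    DENIED = "denied"


@dataclass
class ChangeRequest:
    """A hypothetical change request as it moves through the process."""
    requester: str
    host: str
    application: str
    window: str          # date and time, kept as free text for brevity
    description: str
    required_approvals: set = field(default_factory=lambda: {"coordinator"})
    approvals: set = field(default_factory=set)
    status: Status = Status.SUBMITTED

    def _check(self, role):
        if self.status is not Status.SUBMITTED:
            raise ValueError(f"request is already {self.status.value}")
        if role not in self.required_approvals:
            raise ValueError(f"{role} is not an approver for this request")

    def approve(self, role):
        """Record one approval; the request is approved once all are in."""
        self._check(role)
        self.approvals.add(role)
        if self.approvals == self.required_approvals:
            self.status = Status.APPROVED

    def deny(self, role):
        """Any single required approver can deny the request."""
        self._check(role)
        self.status = Status.DENIED


def weekly_summary(requests):
    """Build the body of the weekly notice from approved requests only."""
    lines = ["Approved changes for this week:"]
    for r in requests:
        if r.status is Status.APPROVED:
            lines.append(f"  {r.window}  {r.host}  {r.application}: {r.description}")
    return "\n".join(lines)


# Illustrative requests; hosts and applications are made up.
patch = ChangeRequest("admin1", "edi01", "EDI gateway", "Sat 22:00-23:00",
                      "Apply OS patches and reboot",
                      required_approvals={"coordinator", "security"})
patch.approve("coordinator")
patch.approve("security")

resize = ChangeRequest("admin2", "db01", "Order database", "Tue 14:00-15:00",
                       "Resize filesystem")
resize.deny("coordinator")

print(weekly_summary([patch, resize]))
```

Note that the approvers are roles rather than named people, in keeping with the advice to keep the process independent of any individual.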

Benefits

Once you put a change control process into place, you will immediately begin to see benefits to your organization and to your infrastructure. For example, a certain large software company releases its software patches on a monthly basis. By having regularly scheduled outage windows for your servers, you can schedule the patches for all of your machines throughout the month. This brings all of your servers up-to-date from a security standpoint, and users and business owners know when to expect the outages. Your planning becomes just that, planning, instead of just throwing patches on when you get a chance.

If you include a security checklist as part of the approval process, your changes get the benefit of being reviewed by members of the security team, who are focused on one thing and one thing only. Many times in our rush to get a project moving, we focus only on the event itself and don't take security concerns into account. In today's world, security must be involved in just about every change event imaginable.

With an established and enforced scheduled outage window, the users can better plan their work schedules. It is much easier to have regularly scheduled outages than it is to pick a date and wait for each and every department to tell you why you can't have that date and time.

By integrating your change control processes into a trouble-ticket system or an asset-management database, you instantly have an audit and history trail of changes to all of your assets. Asset change data is invaluable for establishing and predicting trends in your infrastructure -- from failure trends to upgrade release schedules. This short- and long-term data is of great benefit when putting together two- and five-year plans for your operations.
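
To make the audit trail concrete, here is a minimal Python sketch that uses an in-memory SQLite database as a stand-in for a real trouble-ticket or asset-management system. The `change_log` table, its columns, and the sample entries are hypothetical. It records one row per change and answers the troubleshooting question "What changed?" for a given asset.

```python
import sqlite3

# In-memory database for illustration; a real system would persist this.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE change_log (
           asset      TEXT NOT NULL,
           changed_on TEXT NOT NULL,  -- ISO 8601 date, so text order is date order
           summary    TEXT NOT NULL
       )"""
)


def record_change(asset, changed_on, summary):
    """Append one change event to the history of an asset."""
    conn.execute("INSERT INTO change_log VALUES (?, ?, ?)",
                 (asset, changed_on, summary))


def what_changed(asset, since):
    """Return (date, summary) rows for an asset on or after a date, oldest first."""
    cursor = conn.execute(
        "SELECT changed_on, summary FROM change_log "
        "WHERE asset = ? AND changed_on >= ? ORDER BY changed_on",
        (asset, since),
    )
    return cursor.fetchall()


# Illustrative entries only.
record_change("db01", "2004-10-02", "Replaced failed disk")
record_change("db01", "2004-11-06", "Applied kernel patch")
record_change("web01", "2004-11-06", "Upgraded Web server")

history = what_changed("db01", "2004-11-01")
for changed_on, summary in history:
    print(changed_on, summary)
```

The same table can be queried across all assets to look for the failure and upgrade trends mentioned above.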

Summary

The only constant is change. And in the world of IT, you must learn how to adapt to and how to manage change. I have presented a brief overview of a simple change control procedure that can be used as a template to help get your change management under control. Peer review, predictable outage windows, management signoff, and setting user expectations are all essential ingredients to an effective change control process.

Bob Ess is a Senior Manager responsible for Data Center Operations for Fujitsu Network Communications in Richardson, Texas. He is the author of several articles related to Unix systems administration. He has 25 years of experience in the IT field and 12 years as a Unix systems administrator. He can be reached at: unixroot@computer.org.