Linux
Server Monitoring with IPMI
Philip J. Hollenback
If you have expensive computer systems running in your data center,
you want to make sure they keep running smoothly. Server vendors
have addressed this by adding system monitoring devices to motherboards
to report on temperatures, fan speeds, and voltages.
The standard way to monitor these parameters has traditionally
been with tools such as lm_sensors on systems running Linux. However,
this mechanism is far from perfect. For starters, it can be incredibly
difficult to configure lm_sensors correctly because of poor documentation.
Vendors often supply monitoring software that works perfectly and
requires no tweaking, but that only runs on one motherboard and
under Windows. Then, there's no guarantee that the temperature probes
or fan sensors are properly calibrated. Also, it would be nice if
you could reach out and reboot a hung system over the network without
the use of additional equipment.
One solution is the Intelligent Platform Management Interface
(IPMI). IPMI is a specification for monitoring and controlling server
hardware. Specifically, it is a standardized way to do things such
as:
- Monitor system temperatures
- Remotely force a hung machine to reboot
- Read hardware event logs
- Redirect the serial console over a network connection
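Each of these tasks maps onto a subcommand of the open source
ipmitool utility discussed later in this article. As a rough sketch
(exact subcommands vary by ipmitool version, and sol requires IPMI 2.0):

# ipmitool sensor list           (monitor temperatures, fans, voltages)
# ipmitool chassis power cycle   (force a hung machine to reboot)
# ipmitool sel list              (read the hardware event log)
# ipmitool sol activate          (redirect the serial console over LAN)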
IPMI covers several different instrumentation and reporting mechanisms.
However, in this article, I'll focus on the main use of IPMI in
monitoring and remotely controlling Intel-based servers running
Linux and containing a baseboard management controller (BMC). Many
higher-end server systems currently ship with this hardware pre-installed.
Additionally, BMC cards are available as an add-on for many server
systems.
What IPMI Can Do for You
If you are managing Intel-based servers of any sort, you are probably
going to encounter IPMI sooner rather than later. In my case, the
company I work for received a new shipment of Dell 1650 and Gateway
955 servers. We run a Linux shop, so the first order of business
with new hardware is getting Linux installed with all of the necessary
drivers.
Part of getting Linux operational on new systems is system monitoring.
Modern servers with dual Pentium 4 processors and big SCSI hard
drives run extremely hot. If a system fan fails, the server will
fail badly in a matter of minutes. In the best case, the system
will lock up. In the worst case, CPUs will burn up.
An even more insidious failure mode is the gradually slowing CPU
fan. The system will run hotter and hotter and start to suffer from
random lockups. In all these cases, it's critical to know that the
fans are running and the system is cool at all times. If you don't
detect and respond to a failure immediately, the system will go
down and possibly suffer physical damage.
Thus, extracting temperature, fan, and voltage readings from servers
is an absolute requirement at my company. Our systems are trading
stocks day and night and we can't have them fail without warning.
Ideally they don't fail at all, but realistically the best we can
hope for is warning that a problem is approaching. That way, we
hope to have time to move critical services to another system before
a catastrophic failure occurs. On these new servers, I discovered
the only way to obtain monitoring information is by using IPMI.
Our server room rack space is tight, so we prefer to run headless
servers as much as possible. This also has the desirable effect
of reducing electrical and heating loads in our server room. Additionally,
we maintain a disaster recovery site some 40 miles away. These two
facts make me extremely interested in remote control and management
tools such as IPMI. Anything that saves me an emergency train trip
to the DR site is an especially welcome addition to the system administration
toolbox.
All of this and more is available via IPMI. Even though I was
not initially interested in the remote management aspects of IPMI,
I had to use it to get that critical monitoring data, which
led me to investigate what else IPMI has to offer.
How IPMI Works
As I mentioned previously, IPMI is a standardized interface to
system monitoring and management. This is accomplished with a separate,
almost totally independent, small computer (the Baseboard Management
Controller or BMC) running inside your server. This is not a new
concept -- expensive servers have been shipping with management
controllers for years. The important difference is that IPMI is
standardized. With the proper tools, you should be able to use IPMI
on any system and not have to worry about installing proprietary
software.
The BMC is directly connected to the power supply in the system
and should be operational at all times, even if the main system
hangs or crashes. It controls the power connection to the server
and can cycle it as needed to restart the main server. The BMC is
also connected to your network. Some systems have a dedicated network
interface for the BMC (typically labeled something like "Management
Interface").
A more popular (and presumably cheaper) alternative is for the
BMC to intercept traffic on a regular Ethernet interface. Packets
to UDP port 623 are redirected to the BMC instead of to the motherboard.
The Ethernet interface is always powered up when the system is connected
to mains, so in theory you should always be able to use IPMI over
the LAN to access the server. Note that all servers I am aware of
come with LAN IPMI access disabled by default, because this is an
obvious security hole. You probably don't want to enable IPMI over
LAN until you understand the security implications and have at least
configured password protection. Remember that at the very least,
someone who can gain access to IPMI on your server can reboot the
system at will.
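To illustrate, here is roughly how you would assign a static address
and password to the BMC from the local system, using the ipmitool
utility covered later in this article (the channel number 1 is an
assumption; check the output of "ipmitool lan print" on your hardware):

# ipmitool -I open lan set 1 ipsrc static
# ipmitool -I open lan set 1 ipaddr 192.168.1.50
# ipmitool -I open lan set 1 netmask 255.255.255.0
# ipmitool -I open lan set 1 password MySecretPass
# ipmitool -I open lan set 1 access on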
IPMI supports a number of other hardware interfaces. Many rackmount
servers come with a blue identification LED on the front and back
of the system. You can use this to quickly locate a server in a
rack of identical systems. Luckily, at my company we buy servers
in ones and twos so everything looks different anyway. But if you
have many identical systems, being able to control the ID light
remotely could be very useful. You could use IPMI over the network
or on a system to turn on the light, and then tell your tech to
go find the system with the blue light on it. This would be particularly
useful in remote server setups.
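With ipmitool (covered below), turning on the identification light
is a one-liner; the interval argument, in seconds, is optional on
most systems:

# ipmitool -I open chassis identify 60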
One IPMI feature that I was initially very excited about was Serial
over LAN. One of the other hardware interfaces to which the BMC
connects is the serial port. At my company, we have always configured
servers with serial consoles. We set the BIOS, bootloader, and Linux
all to redirect to the serial port. Then, we connect the serial
port to a terminal server (typically a Cisco 2620 with a multi-port
serial card). This solution works well; however, it creates extra
cabling. The ideal server would have just a power cord and a network
connection. This is exactly what Serial over LAN with IPMI promises.
The BMC intercepts all traffic on the serial port and redirects
it onto the network. The additional serial cable is eliminated.
This sounds wonderful, except for one small problem -- it doesn't
work, for several reasons:
1. Serial over LAN is only standardized in the 2.0 version of
IPMI. All of our servers have version 1.0 or 1.5.
2. The only way to get it to work on systems supporting IPMI versions
prior to 2.0 is to use proprietary Intel software (see number 1).
3. There's a Linux kernel bug in RTS/CTS handling on serial ports
that results in either a hung serial port (with RTS/CTS on) or dropped
characters (with it off).
Luckily, all this is documented in the Debian IPMI HOWTO listed
in the references at the end of this article. Although Serial over
LAN sounds tempting, it's not very practical with current hardware
and software.
IPMI does a lot more than I can cover in this article, including
things like a true hardware watchdog and system inventory reporting.
Read the references at the end of this article for more details.
Installing and Configuring IPMI
Because I manage Linux servers and thus already have good remote
access, I first decided to get IPMI working through the operating
system and not worry about IPMI over LAN. This was also motivated
by my primary goal of obtaining sensor readings from my servers. If
you are more interested in Serial over LAN or remote power-cycling
of systems, you may want to explore that mechanism first.
Vendors do offer some Linux support for IPMI. Dell's standard
system monitoring tool is OMSA, the Open Manage Server Administrator.
Similarly, Intel offers some Linux tools for their SE7501WV2 motherboard.
However, as is often the case with vendor-supplied tools, the packages
are large, intrusive, and contain proprietary software. Because
I was initially interested in just basic system monitoring, it seemed
appropriate to investigate completely open source solutions first.
The first step in enabling IPMI support in Linux is to install
the OpenIPMI kernel drivers. These are included with the stock 2.4.21
and later kernels but may not be enabled in your configuration.
You may want to check the OpenIPMI homepage (see References) and
download the drivers from there as they might be more recent than
what comes with your kernel. In my case, the drivers that came with
my kernel (2.4.22) have been adequate.
You need the following driver modules:
- ipmi_msghandler
- ipmi_devintf
- ipmi_kcs_drv (2.4 kernels) or ipmi_si (2.6 kernels)
If you don't already have these on your system, follow the usual
Linux kernel build instructions and enable the IPMI drivers as modules.
These kernel modules are necessary, but by themselves they don't
let you do anything useful. For that, you need the ipmitool userland
utility. This doesn't appear to ship with any current distributions
(up to Fedora Core 2, anyway). Because of this and the fact that
these tools are still relatively young, your best bet is to download
ipmitool from the project Web page (see References).
Make sure you have at least version 1.6.0 of ipmitool as it contains
a number of fixes and improvements. For example, earlier versions
of ipmitool only partially work on my Dell 1650 servers because
those servers only support IPMI version 1.0 and ipmitool didn't
fully understand that version. This has been fixed in version 1.6.0.
Build and install ipmitool from source or install the package
if you have it. Building your own rpm package is fairly easy with
the spec file included in the source tarball (that's the approach
I used). Then, load the modules and verify they loaded correctly
by checking dmesg.
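On a 2.4 kernel, loading and verifying the modules boils down to
something like this (substitute ipmi_si for ipmi_kcs_drv on 2.6):

# modprobe ipmi_msghandler
# modprobe ipmi_kcs_drv
# modprobe ipmi_devintf
# dmesg | grep -i ipmi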
Now you are almost ready to test IPMI on your system. The last
step before doing that is configuring the device node that ipmitool
uses to communicate with the driver. You can create this node manually,
in modules.conf, or in your init script. See my init script
(Listing 1) for how to create the device there. Or, if you are a
modules.conf master like one of my co-workers, use this modules.conf
entry he wrote (all one line):
post-install ipmi_devintf /bin/awk '/ ipmidev$/{print $1}' \
/proc/devices | /usr/bin/xargs -r -imajor /bin/sh -c "rm -f \
/dev/ipmi0 && /bin/mknod -m 0600 /dev/ipmi0 c major 0" \
>/dev/null 2>&1 || :
which automatically determines the device number by checking in /proc/devices.
The device number will almost always be 254.
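If you would rather test by hand before committing to an init script
or modules.conf entry, the equivalent interactive commands are (the
0600 mode restricts the node to root, which ipmitool requires anyway):

# major=$(awk '/ ipmidev$/{print $1}' /proc/devices)
# rm -f /dev/ipmi0
# mknod -m 0600 /dev/ipmi0 c $major 0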
Testing IPMI
IPMI is now installed and configured on your system. To test it,
run ipmitool with appropriate arguments. Try this to query the BMC:
# ipmitool -I open bmc info
And you will see something like this:
Device ID : 0
Device Revision : 0
Firmware Revision : 1.71
IPMI Version : 1.0
Manufacturer ID : 674
Product ID : 1 (0x0001)
Device Available : yes
Provides Device SDRs : yes
Additional Device Support :
Sensor Device
SDR Repository Device
SEL Device
FRU Inventory Device
IPMB Event Receiver
Aux Firmware Rev Info :
0x00
0x00
0x00
0x00
First, note that you have to run ipmitool as root to get access to
the IPMI devices. Second, select the interface to use with the -I
switch. "open" is the OpenIPMI driver on the local system. The other
major interface is "lan" for communicating with the BMC on a remote
system over Ethernet.
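For example, once LAN access is enabled and a password is set, the
same query can be run against a remote BMC with something along these
lines (the address and password are placeholders):

# ipmitool -I lan -H 192.168.1.50 -U root -P MySecretPass bmc info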
Finally, the IPMI version supported on a particular system is
important. The sample output above is from a Dell 1650, which only
supports IPMI v1.0. As I mentioned, ipmitool versions prior to 1.6.0
don't really work on this system without a patch. Some features
(such as Serial over LAN) don't work on systems that only support
IPMI version 1.0.
The current IPMI version is 2.0, which adds standardized Serial
over LAN and better security compared to v1.5. Systems supporting
version 2.0 of the specification should be appearing shortly on
the market.
Making IPMI Work with (or at Least Like) lm_sensors
Because my initial goal in setting up IPMI was to obtain sensor
data from my systems, the next step was to get at that data. Remember
that IPMI-enabled systems only support retrieving sensor data via
IPMI. In other words, you can't use the normal lm_sensors drivers
to obtain readings as you would on other servers.
There is preliminary support in lm_sensors for IPMI, so you may
want to investigate system monitoring only with that package. However,
because I hoped to eventually use the other features of IPMI (particularly
remote reboot over LAN), I decided to work only with ipmitool. Furthermore,
my company's existing monitoring tools all expect data in the format
of lm_sensor's "sensors" program. This meant that for the easiest
integration I would have to write a script that called ipmitool
appropriately, parsed that program's output, and displayed it in
this format:
CPU1 Temp: 40 C
CPU2 Temp: 44 C
CPU1 Fan: 5311 RPM
CPU2 Fan: 5201 RPM
However, the ipmitool sensor data looks like this:
Sensor ID : CPU 1 (0x1)
Sensor Type (Analog) : Temperature
Sensor Reading : 40 ( -124) degrees C
Status : ok
Lower Non-Recoverable : -128.000
Lower Critical : 5.000
Lower Non-Critical : 10.000
Upper Non-Critical : 70.000
Upper Critical : 75.000
Upper Non-Recoverable : 127.000
Obviously, some munging is necessary. See Listing 1 for how I dealt
with this.
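As a rough illustration of the munging (a simplified sketch, not
the actual Listing 1), a little shell and awk will do it. The sensor
names here are the Dell ones; on the Gateways you would substitute
"Processor1 Temp" and "Processor2 Temp":

#!/bin/sh
# Print lm_sensors-style temperature lines from IPMI sensor data.
for s in "CPU 1" "CPU 2"; do
    ipmitool -I open sensor get "$s" | awk -v name="$s" '
        /Sensor Reading/ {
            gsub(/ /, "", name)                 # "CPU 1" -> "CPU1"
            printf "%s Temp: %s C\n", name, $4  # field 4 is the reading
        }'
done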
While IPMI provides many other sensor values, I was concerned
with just CPU temperature for the first version of my script. IPMI
sensor readings are available from both the "sdr" and "sensor"
commands:
# ipmitool -I open sdr list
and
# ipmitool -I open sensor get "CPU 1"
Though the sdr (Sensor Data Repository) seems to contain most of the
sensor data, it's not in a particularly useful format. On our Gateway
servers, the CPU temperatures did not appear in the sdr output at
all. The sdr output also just spits out all the sensor readings, which
results in a lot of waiting for sensors you don't care about. On the
other hand, the sensor command to the ipmitool open interface allows
you to query a particular sensor, so that seemed the more efficient
route to follow.
An annoyance quickly appeared when I tested ipmitool on systems
running both the 2.4 and 2.6 Linux kernels. Under the 2.6 kernel
I could obtain both CPU temperatures in 1-10 seconds, but under
the 2.4 kernel the wait was much longer. The worst case was under
the 2.4 kernel on the Gateway servers -- ipmitool took 1 minute
44 seconds to return both CPU temperatures! System load during this
wait was minimal, so ipmitool was just sitting around and waiting
for data to return from the OpenIPMI driver. This comment in the
ipmi_kcs driver documentation was illuminating:
If you have high-res timers compiled into the kernel, the driver
will use them to provide much better performance. Note that if you
do not have high-res timers enabled in the kernel and you don't
have interrupts enabled, the driver will run VERY slowly. Don't
blame me, these interfaces suck.
I found that in general, the performance on 2.6 systems was acceptable.
The Dell 1650s were much quicker to return values than the Gateways,
which seemed to be related to the Dells only supporting the 1.0
IPMI specification and thus having less data to read from the BMC.
Ultimately, the only optimization I was able to make was to ask
ipmitool for all the sensor values at once:
# ipmitool -I open sensor get "CPU 1" "CPU 2"
which brought the time for the Gateways running the 2.4 kernel down
to 53 seconds. This isn't great, but it's adequate for my needs since I
wanted to query for CPU temperatures every 5 minutes. Also, note that
the OpenIPMI driver is fairly resistant to becoming wedged in an unstable
or unresponsive state -- when ipmitool is waiting for data, I can
interrupt it with <ctrl-c> with no ill effects, other than a
warning message in the kernel log. I never experienced system lockups
or a hung driver while testing the OpenIPMI driver and ipmitool utility.
I also discovered that some sensor names were different between
the Dell and Gateway servers. On the Dells, the first CPU temperature
can be found in "CPU 1". On the Gateways, this same sensor is called
"Processor1 Temp". I chalked this up to a difference between the
1.0 and 1.5 IPMI specification, but I did not verify this.
The fact that ipmitool must be run as root is an annoying limitation.
I understand the security concerns that require this; however, it
would be handy if there were some sort of read-only access to IPMI
on Linux so that regular users could retrieve system sensor values.
This may be a feature in the lm_sensors IPMI support and could be
a reason to investigate that solution. My fix was to wrap the ipmitool
invocation in a script that was accessible to certain users via
sudo, which is not entirely secure but adequate for my needs. I
hope to get my init script accepted into the ipmitool distribution
soon as it should be useful for others.
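The sudo arrangement is simple. Assuming the wrapper is installed
as /usr/local/bin/cputemps (a hypothetical name), an entry like this
in /etc/sudoers lets a monitoring user run it without a password:

monitor  ALL = NOPASSWD: /usr/local/bin/cputemps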
After much tweaking, I finally arrived at a script that extracts
CPU temperatures from IPMI. I included this in an init script, as
seen in Listing 1. The init script handles loading the driver and
creating the necessary device node, as well as displaying CPU temperatures.
Here are my final run times for "ipmi status" on both the Dell and
Gateway test systems:
System         Kernel 2.4    Kernel 2.6
Gateway 955    55s           12s
Dell 1650      9s            1s
As I noted, this is time spent just waiting for a response from
the BMC -- system load is minimal. You can verify this with the
time command -- user and sys time for the command is 0 or
very close to 0, while the real time is 55 seconds. The processor
is not working at all.
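For example, on the Gateway under the 2.4 kernel, the timing looks
roughly like this (illustrative output, assuming the init script is
installed as /etc/init.d/ipmi):

# time /etc/init.d/ipmi status
...
real    0m55.04s
user    0m0.02s
sys     0m0.01s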
Future Developments and Closing Thoughts
Now that I have basic monitoring working, I plan to implement
the more exciting features of IPMI such as Serial over LAN and remote
system power control. The ability to always connect to a system
even if it is down will obviously make IPMI very useful in a data
center situation, and I plan to integrate that into our environment
next. Later, it would be useful to get Serial over LAN operational
to reduce cabling requirements in our data center.
Similarly, watchdog support should prove useful for unattended
systems (as long as it works perfectly, of course). My company will
probably wait to explore these other IPMI features until we move
all systems to the 2.6 Linux kernel to minimize problems like the
long sensor read times on some systems.
I have only scratched the surface of IPMI. This is a large and
complex mechanism with many features. I attacked the issue from
one perspective -- how do I obtain CPU temperatures on a system
equipped with IPMI? The answer is through a combination of Linux
kernel drivers, a userland IPMI tool, and some custom scripting.
Along the way, I discovered that IPMI under Linux has some interesting
quirks, such as different sensor names on different systems and
wildly varying sensor read times. Luckily, all of these issues were
resolved, and my tool is now in production at my company, providing
temperature data on many of our servers.
References
The IPMI on Debian Howto: http://buttersideup.com/docs/howto/IPMI_on_Debian.html
Ipmitool (Linux userland tools) home: http://ipmitool.sourceforge.net/
OpenIPMI (Linux kernel driver) home: http://openipmi.sourceforge.net/
Phil Hollenback is a Linux Systems Administrator at Telemetry
Investments in New York City. He holds a BS in Computer Science
with an emphasis in AI. Outside of work he tries to avoid getting
run over while skateboarding the streets of Manhattan. He can be
reached via his Web site, http://www.hollenback.net. |