Monitoring with Nagios
System and network monitoring is essential for systems administrators.
There are many SNMP-based management tools -- both commercial and
free -- that can be used to manage and monitor nodes on a network.
There are also non-SNMP-based tools that do a good job of monitoring
network nodes. In this article, I will explain how to install and
use Nagios, a GPL tool written by Ethan Galstad that you can use
for host and service monitoring.
Nagios is developed primarily for Linux, but will run under most Unix
variants. Nagios has many useful features that make it practical to
monitor a large environment from a single machine. For example,
if you have a large number of Web servers running in your environment,
you can monitor the HTTP service on all of them using Nagios. The
best part is that you do not have to install any program on your
Web servers to do the monitoring because Nagios can monitor the
HTTP service remotely. Similarly, you can monitor almost any TCP-
or UDP-based service that is running on any platform without having
to install Nagios on any of the monitored hosts. I use Nagios to
monitor services such as HTTP, DHCP, DNS, FTP, SMTP, and LDAP.
Nagios can also monitor host resources such as disk space, CPU
load, log file size, running processes, and memory usage. Unlike
service monitoring, host resource monitoring requires installation
of the Nagios monitoring agent on the host that is being monitored.
Nagios supports notification of problems and resolutions via email
or pager. With the help of a plugin, Nagios can also be used to
monitor SNMP-based events. Information about monitored hosts and
services can be displayed in a 3-D VRML map.
You can download Nagios at: http://www.nagios.org. CVS-enlightened
users can download Nagios source code by typing:
with a blank password. For example, hit the enter key when prompted
for a password and then type:
cvs -z3 -d:pserver:firstname.lastname@example.org:/cvsroot/nagios
email@example.com:/cvsroot/nagios co nagios
You should also check:
before attempting to download via CVS in case of any updates.
Nagios consists of a core program and some plugins. The core program
calls the plugins to do the monitoring. check_smtp, for example,
is a plugin that checks to see whether SMTP is running on the specified
host. The core program is written in C; some of the plugins are
written in C and others in Perl. As of this writing, the core program
is at v1.0 and can be downloaded as nagios-1.0.tar.gz.
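To make the division of labor between the core program and a plugin
more concrete, here is a minimal sketch of a plugin-style check written
in Python. It is not one of the shipped plugins. It assumes the usual
plugin convention of exiting with 0 for OK, 1 for WARNING, 2 for
CRITICAL, and 3 for UNKNOWN, and of printing one line of status text.
The names check_tcp, main, and the connect parameter are hypothetical
and exist only for this illustration.

```python
import socket

# Exit codes conventionally used by Nagios plugins.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_tcp(host, port, timeout=5.0, connect=socket.create_connection):
    """Try to open a TCP connection and return (exit_code, status_line).

    `connect` is injectable (an assumption of this sketch) so the check
    can be exercised without a network; by default it is
    socket.create_connection.
    """
    try:
        conn = connect((host, port), timeout)
    except OSError as exc:
        return CRITICAL, "TCP CRITICAL - %s:%d unreachable (%s)" % (host, port, exc)
    conn.close()
    return OK, "TCP OK - %s:%d accepting connections" % (host, port)


def main(argv):
    """Command-line wrapper: print the status line, return the exit code."""
    if len(argv) != 3:
        print("TCP UNKNOWN - usage: %s HOST PORT" % argv[0])
        return UNKNOWN
    code, line = check_tcp(argv[1], int(argv[2]))
    print(line)
    return code

# A real plugin script would end with:
#     import sys; sys.exit(main(sys.argv))
```

The core program only needs the exit code and the status line, which is
why a check like this can run on the Nagios server and probe a remote
host without anything being installed on that host.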
For the purposes of this article, I will show how to compile and install
Nagios on a Red Hat Linux v7.2 box with kernel v2.4.7-10smp.
Nagios does not require a fast machine to run, although I happen
to have a dual processor 2.4-GHz Pentium IV with 4 GB of RAM on
which I am running Nagios. Once you have downloaded Nagios, download
the basic set of plugins from: http://www.nagios.org/download/.
You can start with the base set, "nagios-plugins-1.3.0.tar.gz".
A list of the plugins is available at: http://sourceforge.net/project/showfiles.php?group_id=29880.
I performed the installation of Nagios as root and Nagios was
smart enough to set the appropriate permissions as well as ownership
on files. Nagios is not required to run as root, nor would I recommend
it. The default user and group named in the Nagios configuration files
are both called "nagios".
It's best if you create this user and group as follows. The useradd
command will automatically create a user and a group called "nagios",
and will set the primary group of user "nagios" as group "nagios".
The home directory will also automatically be set as /home/nagios,
but you can modify that using the -d option:
root:/usr/local/src:#useradd nagios -d
Next, uncompress and untar the Nagios source:
root:/usr/local/src:#tar xvfz nagios-1.0.tar.gz
cd to the nagios-1.0 directory and run the configure command:
The configure command assumes the following defaults, which you can
change as you see fit. I recommend you leave the default installation
directory as is:
To install the binaries, run:
Three sample configuration files are automatically created by Nagios
-- nagios.cfg-sample, cgi.cfg-sample, and resource.cfg-sample. If
you want these files installed in the appropriate directory (i.e.,
/usr/local/nagios/etc), assuming you choose the default installation
path of "/usr/local/nagios", run the following command:
To automatically start Nagios at the next reboot, run:
which will install the Nagios startup file as "/etc/init.d/nagios". I recommend
creating start and stop links for the appropriate run level, for example in /etc/rc3.d:
root:/etc/rc3.d:#ln -s ../init.d/nagios S100nagios
root:/etc/rc3.d:#ln -s ../init.d/nagios K01nagios
Installing and Configuring Plugins
Your next step is to install the plugins Nagios uses to monitor
hosts and services. Downloading "nagios-plugins-1.3.0.tar.gz" (http://sourceforge.net/project/showfiles.php?group_id=29880)
will get the latest set of plugins. You do not have to configure
the plugins in the same source directory as Nagios. For example,
I downloaded the plugins tarball to /usr/local/src and then
uncompressed and untarred it as follows:
root:/usr/local/src:#ls -l nagios-plugins-1.3.0.tar.gz
-rw-r--r-- 1 root root 491510 Mar 2 00:09 nagios-plugins-1.3.0.tar.gz
root:/usr/local/src:#gzip -d nagios-plugins-1.3.0.tar.gz
root:/usr/local/src:#tar xvf nagios-plugins-1.3.0.tar
The above procedure should create a nagios-plugins-1.3.0 directory
under /usr/local/src. Once you cd to the nagios-plugins-1.3.0
directory, you must configure the plugins and install them, which
can be done as follows:
Because we chose the default install path for Nagios (i.e., /usr/local/nagios),
we do not have to specify the path for Nagios when configuring the
plugins. However, if you chose another path, use the following syntax
when running the configure script for the plugins:
The next step is to configure Nagios, which is the most involved
part of the project. Nagios has a number of configuration files
that can be seen by running ls in /usr/local/nagios/etc:
Notice that all of these have -sample appended because they
are sample configuration files. Before using any of these files, remove
the -sample suffix from the file name.
We can effectively break the configuration files of Nagios into
the following types:
1. Main configuration file (nagios.cfg)
2. Resource file(s) (resource.cfg)
3. Object configuration files
4. CGI configuration file (cgi.cfg)
5. Extended information configuration files
Nagios is highly customizable, so I will only cover some of the configuration
directives in this article. Since nagios.cfg is the main configuration
file for Nagios, start configuration of nagios.cfg as follows:
root:/usr/local/nagios/etc:su - nagios
There are more than 65 configuration directives in nagios.cfg. Because
we chose the default configuration, we do not have to modify any of
the configuration directives in this file to get Nagios up and running.
A few of the directives are listed below.
Specify the location of the nagios log file:
Specify which user nagios should run as:
Specify which group nagios should run as:
If you want Nagios to use syslog, leave the option below at 1:
Specify how often you want Nagios to rotate its log file (d
stands for daily):
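The directives described above take a simple name=value form. As a
hedged illustration, the following Python sketch parses lines of that
form. The sample values are what a default installation under
/usr/local/nagios would typically contain; they are assumptions made
for this example, not a copy of the author's file. The helper name
parse_main_config is hypothetical.

```python
SAMPLE = """\
# Assumed values for a default install; compare with your own nagios.cfg.
log_file=/usr/local/nagios/var/nagios.log
nagios_user=nagios
nagios_group=nagios
use_syslog=1
log_rotation_method=d
"""


def parse_main_config(text):
    """Return a dict mapping each directive name to a list of its values.

    A list is used because a directive may appear on more than one line.
    Blank lines and lines starting with '#' are skipped.
    """
    directives = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError("not a name=value line: %r" % raw)
        directives.setdefault(name.strip(), []).append(value.strip())
    return directives


config = parse_main_config(SAMPLE)
print(config["log_rotation_method"])  # ['d'], i.e. rotate the log daily
```

A quick check like this is a convenient way to confirm which user,
group, and log file a given nagios.cfg actually names before starting
the daemon.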
Next, modify the object configuration files. According to Ethan
Galstad (the main developer of Nagios), an object is "simply a generic
term I use to describe various data definitions you need in order
to monitor anything." The files that contain object definitions
include the following:
The contacts.cfg file contains information on contacts. A contact
is a systems administrator or other person who will be notified
in the event of an emergency. An example entry is as follows:
alias System Admins
alias System Admins
Next, edit the contactgroups.cfg file and make an entry as follows:
alias System Admins
Contact groups are very useful because you can create groups such
as HttpAdmins or SmtpAdmins and use these groups for notification
of respective services. (See sidebar "contact.cfg Parameters".) In
the above example, I have created a group called "administrators"
that contains the users "joe" and "mary". I use this group for notification
of all hosts and services because we do not have separate HTTP or
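
The relationship between contacts and contact groups can be sketched in
a few lines of Python. This is only a model of the lookup, not Nagios
code: the dictionaries mirror the "administrators" group with members
"joe" and "mary" described above, while the email addresses and the
function name notification_addresses are hypothetical.

```python
# Hypothetical data modelled on the contact and contact group entries
# described in the text; the email addresses are placeholders.
CONTACTS = {
    "joe": {"alias": "System Admins", "email": "joe@example.com"},
    "mary": {"alias": "System Admins", "email": "mary@example.com"},
}
CONTACT_GROUPS = {
    "administrators": {"alias": "System Admins", "members": ["joe", "mary"]},
}


def notification_addresses(group_name):
    """Return the email address of every member of a contact group."""
    group = CONTACT_GROUPS[group_name]
    return [CONTACTS[member]["email"] for member in group["members"]]


print(notification_addresses("administrators"))
```

Because hosts and services refer to the group rather than to individual
people, adding a new administrator means editing one members list.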
To define the hosts that Nagios will monitor for us, we need a
host definition file. The Nagios host definition file is called hosts.cfg
and, in a default installation, resides at /usr/local/nagios/etc/hosts.cfg.
Defining a host in the host definition file means providing a name
for the host, an IP address, and defining other parameters that
Nagios will use to monitor the host.
Configuration of object files can be done either with the "old"
method or the template-based "new" default method. The "old" method
was file-based, but did not use templates. Using a template for
host and service definition simplifies adding new hosts and services
for monitoring. For example, consider adding a host monitor by editing
the hosts.cfg file. A host monitor is an instance of a host that defines
attributes about the host (such as name, IP address, how many checks
Nagios should run against the host) that can be seen in the following
This shows a template called "generic-host" and two hosts, "www1"
and "www2", which use the generic template. The generic-host template
defines the configuration directives that any host using this
template will inherit. You can override an inherited value by restating
the directive in the host definition itself.
For example, if you want notifications disabled for host www2 while
you perform maintenance on it, you would modify the host entry as
In the above case, www2 will inherit all properties of generic-host,
except that the notifications_enabled property of the generic-host template
will be overridden with the new value of 0 (i.e., notifications disabled
for host www2).
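The inherit-then-override behavior can be modelled with a short Python
sketch. It is a simplified illustration, not the Nagios parser: the
sample definitions follow the template syntax but are written for this
example, the address of www2 is hypothetical, and the function names
parse_hosts and resolve are assumptions.

```python
import re

SAMPLE = """
define host{
    name                    generic-host   ; template name
    notifications_enabled   1
    register                0              ; template only, not a real host
    }

define host{
    use                     generic-host
    host_name               www1
    address                 10.10.10.1
    }

define host{
    use                     generic-host
    host_name               www2
    address                 10.10.10.2     ; hypothetical address
    notifications_enabled   0              ; overrides the template value
    }
"""


def parse_hosts(text):
    """Parse 'define host{ ... }' blocks into a list of directive dicts."""
    hosts = []
    for body in re.findall(r"define\s+host\s*\{(.*?)\}", text, re.S):
        entry = {}
        for raw in body.splitlines():
            line = raw.split(";", 1)[0].strip()  # drop trailing comments
            if line:
                parts = line.split(None, 1)
                entry[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
        hosts.append(entry)
    return hosts


def resolve(entry, templates):
    """Merge a host with its template; the host's own values win."""
    merged = {}
    parent = templates.get(entry.get("use"))
    if parent is not None:
        merged.update(resolve(parent, templates))
    merged.update(entry)
    # 'name', 'use', and 'register' describe the definition, not the host.
    for key in ("name", "use", "register"):
        merged.pop(key, None)
    return merged


entries = parse_hosts(SAMPLE)
templates = {e["name"]: e for e in entries if "name" in e}
hosts = {e["host_name"]: resolve(e, templates) for e in entries if "host_name" in e}
print(hosts["www1"]["notifications_enabled"])  # 1, inherited from generic-host
print(hosts["www2"]["notifications_enabled"])  # 0, overridden in the host entry
```

The order of the two update calls is the whole mechanism: template
values go in first, and anything the host states itself replaces them.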
The required configuration directives are shown in the sidebar
(of the same name). For the optional configuration directives, visit:
The hostgroups.cfg file lets you create a logical group of hosts.
You could modify the hostgroups.cfg file to create a group for HTTP
servers as follows:
alias HTTP Servers
members www1, www2
Logically grouping servers based on the service to monitor is one
possible methodology. A host should belong to at least one group and
may belong to multiple groups. For example, if one of your HTTP servers
is also your email server, you can have another hostgroup such as:
alias SMTP Servers
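
As a rough sketch of the grouping rule above (every host in at least
one group, possibly several), the following Python fragment inverts a
group-to-members mapping and reports hosts that belong to no group. The
group names, the assignment of www1 to the SMTP group, and the extra
host "db1" are all hypothetical and chosen only for illustration.

```python
# Hypothetical host groups; which host also serves SMTP is assumed.
HOSTGROUPS = {
    "http-servers": ["www1", "www2"],
    "smtp-servers": ["www1"],
}
HOSTS = ["www1", "www2", "db1"]  # "db1" is a made-up host with no group


def groups_by_host(hostgroups):
    """Invert group -> members into host -> sorted list of groups."""
    index = {}
    for group, members in hostgroups.items():
        for host in members:
            index.setdefault(host, []).append(group)
    return {host: sorted(groups) for host, groups in index.items()}


def ungrouped(hosts, hostgroups):
    """Return the hosts that belong to no host group at all."""
    index = groups_by_host(hostgroups)
    return [host for host in hosts if host not in index]


print(groups_by_host(HOSTGROUPS)["www1"])
print(ungrouped(HOSTS, HOSTGROUPS))
```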
We now need to define a service to monitor on our previously defined
host. You can use templates in service definitions just as you can
in host definitions. I first define a generic service
(called "generic-service") that has a large number of configuration
I can then create a service definition for an http-servers group that
inherits the generic-service properties and adds a few of its own:
(See the sidebar for Nagios required service parameters.) For additional
directives, visit: http://nagios.sourceforge.net/docs/1_0/xodtemplate.html#service.
More on Plugins
Plugins do the actual monitoring for Nagios. The core Nagios engine
calls the plugins to check on hosts and services. Nagios provides
a number of plugins, and you can write your own plugins
to monitor almost any host or service. The plugins that come with
Nagios have help available when you execute the plugins with the
When you specify the check_command option in services.cfg for
a particular service, Nagios looks up the check_command in the file
checkcommands.cfg and then runs the command based on the specified
options. Look at one of the entries in checkcommands.cfg:
command_line $USER1$/check_http -H $HOSTADDRESS$
$HOSTADDRESS$ is a macro that Nagios expands to the IP address
of the host as defined in hosts.cfg. Nagios also provides 32 user-defined
macros, $USER1$ through $USER32$, which are defined in the
resource.cfg file. In the resource.cfg file, $USER1$ has already been
defined as "$USER1$=/usr/local/nagios/libexec". Therefore, Nagios runs:
/usr/local/nagios/libexec/check_http -H 10.10.10.1
for the following entry in the services.cfg file:
Remember from the previous example that www1 has IP address 10.10.10.1.
When the service check executes, Nagios will look up the command check_http
in checkcommands.cfg and will run the command with the options specified
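
The macro substitution described here can be imitated in a few lines of
Python. This is a simplified sketch of the idea, not the Nagios
implementation: the function name expand_macros is hypothetical, and
leaving unknown macros untouched is a choice made for this example.

```python
import re

# Values taken from the example in the text: $USER1$ as set in
# resource.cfg, $HOSTADDRESS$ as the address of host www1.
MACROS = {
    "USER1": "/usr/local/nagios/libexec",
    "HOSTADDRESS": "10.10.10.1",
}


def expand_macros(command_line, macros):
    """Replace each $NAME$ with its value; unknown macros are left as-is."""
    def substitute(match):
        return macros.get(match.group(1), match.group(0))
    return re.sub(r"\$([A-Z0-9_]+)\$", substitute, command_line)


print(expand_macros("$USER1$/check_http -H $HOSTADDRESS$", MACROS))
# /usr/local/nagios/libexec/check_http -H 10.10.10.1
```

Keeping the plugin directory in $USER1$ means that moving the plugins
only requires a change in resource.cfg, not in every command definition.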
Nagios provides an excellent, optional Web-based user interface.
If you want to use the Nagios Web interface, you must edit cgi.cfg
and also your Web server configuration file. In my case, I am using
Apache and have modified the httpd.conf as follows:
ScriptAlias /nagios/cgi-bin /usr/local/nagios/sbin/
Allow from all
Alias /nagios/ /usr/local/nagios/share/
Allow from all
The above configuration will let me access the Nagios GUI by typing:
The ExecCGI option allows execution of CGI scripts, and AllowOverride
AuthConfig allows me to use the directives below in my .htaccess file:
AuthName "Nagios Access"
Setting up authentication to access the Nagios Web interface is a
good idea. To set up authentication, create a .htaccess file (as shown
above) and set up a new user as follows:
htpasswd -c /usr/local/nagios/etc/htpasswd.users admin
For the Nagios cgi configuration file cgi.cfg, I have modified the
The refresh_rate is the rate at which Nagios will refresh the Web
page for the status, statusmap, and extinfo CGIs. The hostextinfo.cfg
file is explained in the "Beautification" section later in this article.
The Web interface lets you view a status "map" of the hosts being
monitored. The map can be drawn using one of the following layout methods:
0 = User-defined coordinates
1 = Depth layers
2 = Collapsed tree
3 = Balanced tree
4 = Circular
5 = Circular (Marked Up)
The user-defined coordinates option lets you choose where
hosts are displayed on the status map. The coordinates of a host
on the status map should be defined in the extended information
file, which is hostextinfo.cfg by default. Coordinates can be defined
as 2-D or 3-D; the 2-D form takes positive integer x and y values.
The coordinates you specify are for the upper left-hand corner of
the host icon that is drawn. For example, in the hostextinfo.cfg
The depth layers option displays the parent nodes and not the child
nodes. Child nodes are visible by clicking on a parent node. Depth
layer is useful for a large network where there are many parent/child
The collapsed tree option displays all hosts in a layered tree-like
manner, giving you the option of clicking on a host and zooming
in to see its child nodes, if any are defined. Child nodes are not
displayed by default.
The balanced tree option displays all the hosts that are being
monitored as nodes of a tree, with the root of the tree being the
Nagios server process. All nodes are considered equal distance from
The circular option shows all the hosts around the central Nagios
server, arranged in a circular manner. This gives a cluttered view
if you have many hosts being monitored.
The sound parameters define sounds that are played on the client
Web browser. I think the sounds are useful because I do not have
to constantly watch the Nagios GUI or check my email to be notified
of a system-down status.
If you want to use image icons to represent the hosts and services
you are monitoring, Nagios comes with a decent number of icons in
jpg, gif, gd2, and png formats. You can download six additional logo
Each logo pack consists of a number of logos that you can use to represent
your hosts and services.
Nagios can be a useful tool in a network for monitoring hosts
and services. The ability of Nagios to remotely monitor services
without the installation of software on the monitored hosts is a
great plus. Nagios can also be used without SNMP or along with SNMP
to effectively monitor a network. The professional GUI that Nagios
offers rivals those of many commercial tools. If you are on a tight
budget or favor free software over commercial products, then give Nagios
a try. You will not be disappointed.
Nagios html and pdf documentation -- http://www.nagios.org/docs/
Nagios FAQ -- http://www.nagios.org/faqs/
Nagios email list -- http://www.nagios.org/mailinglist.php
There are seven mailing lists: nagios-[users,announce,devel,checkins]
and nagiosplug-[help,devel,checkins]. The one most helpful to new
users is probably nagios-announce (for new announcements). If you
want extra help after reading the documentation and the FAQ, then
Syed Ali has a Master's in Computer Science from Stevens Institute
of Technology and is an MCP, MCSE, MCT, CCNA, CCAI, RHCE, and SCSA.
He currently works for a research laboratory in Princeton, New Jersey,
as a supervisor for systems administration. Syed can be contacted