Network Operations Center On-line
Ron McCarty
Despite the tremendous advances in network management systems, many of us find ourselves in need of network monitoring tools that are not as complex as commercial network management systems, or tools that are not licensed based upon the number of nodes monitored. Sometimes we just need a monitoring system that can be put together in less than a day. Whatever the reason, if you need a network monitoring tool that supports both a CLI and Web interface, you should consider the Network Operations Center On-line (nocol). (Major future versions of nocol will be called System and Network Integrated Polling Software [SNIPS].)
nocol is a freely available monitoring tool that will compile and run on most versions of UNIX. The examples here are based upon Red Hat Linux 6.1, but nocol installs without problems on Solaris and BSDi systems.
The nocol home page is located at:
http://www.netplex-tech.com/software/nocol/
and the nocol gzipped tarball can be downloaded from:
ftp://ftp.navya.com/pub/nocol.tar.gz
The latest version of nocol at the time of writing is 4.3.1, but the tar file just referenced is always linked to the latest version.
nocol Monitoring
nocol will likely meet the monitoring needs of most small and mid-sized networks. It supports multiple monitoring methods, including ICMP (ping), TCP port monitoring, syslog messages, and SNMP variables. A complete list of the directly supported monitors is available at:
http://www.netplex-tech.com/software/nocol/
Should the monitors included with nocol not be sufficient, additional monitors can be programmed. With the included monitors, it is easy to monitor hosts for connectivity issues and to monitor network services such as Web and mail servers. Although nocol is not a distributed system, multiple nocol stations can easily be configured to monitor separate or overlapping networks.
Installation
As mentioned, the installation shown here is based upon Red Hat 6.1, but nocol installs easily on other systems. None of the configuration options shown depend on the version of UNIX used. To avoid possible security problems associated with running applications as root, the installation below uses the user nocol.
Note that the nocol programs etherload, multiping, and trapmon require root access during installation, but root access is not necessary if these programs are not to be installed. These programs are installed with the SUID bit set so they will run as user root.
To add the user nocol use:
adduser nocol
Be sure to change the nocol user's password. Now either log in as nocol or use the su command to change to that user.
Download the nocol distribution from:
ftp://ftp.navya.com/pub/nocol.tar.gz
and place it in an appropriate source directory. On Linux systems, I prefer to use /usr/local/src/.
Once the nocol distribution is downloaded, unzip the file (the commands below assume it was saved under its versioned name, nocol-4.3.1.tar.gz):
cd /usr/local/src/
gunzip nocol-4.3.1.tar.gz
Then untar the file using:
tar xvf nocol-4.3.1.tar
This will create a source directory structure within the /usr/local/src/nocol-4.3.1/ directory. Now go into the nocol-4.3.1 directory:
cd nocol-4.3.1
Read through the INSTALL file for a quick rundown on the installation steps and to see whether there are any last minute changes for the installation. Also, take a look at the README file to see if there is any late-breaking news.
At this point, the source can be configured for the local installation. Run:
./Configure
The Configure script will prompt you for the top-level directory where the nocol distribution will be installed. As the script mentions, placing all the software in a central directory makes it easy to upgrade. The binaries can also be easily tarred for backup or further distribution.
If the suggested /usr/local/nocol/ is appropriate, then hit return at the prompt; otherwise, type in the directory to which you wish to install. The script will also prompt for the man page location and defaults to /usr/local/nocol/man/, which you can change if you wish.
The script will then ask for the loghost. It will use the localhost by default or a configurable host. After entering the loghost, you will be prompted for email addresses to send lower-level messages (operation messages) and urgent messages. If the nocol-ops and nocol-crit are used, be sure to either create the accounts or create a mail alias to have the email distributed to the appropriate administrators.
Finally, the script will ask for the locations of several programs needed by nocol. The Configure script will usually guess these correctly. It checks both the defaults and any answers you supply and prints a warning if a file does not exist, but it will still use the incorrect entry during installation. If you receive such a warning, use:
find / -name filename -print
to find the file and re-run Configure. Upon completion, the Configure script creates a Makefile that can be further edited. Keep in mind that the Makefile will be overwritten by running the Configure script again. If the email addresses that you chose during Configure do not exist, create them or email aliases to ensure notifications are delivered properly.
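Because Configure only warns about missing programs, it can help to confirm the paths before answering its prompts. The following is a minimal sketch, not part of nocol; the function name check_programs is hypothetical, and the paths you pass should be the ones Configure proposes on your system.

```shell
# Hypothetical helper (not part of the nocol distribution).
# Prints "MISSING: <path>" for every argument that is not an
# executable file, and returns non-zero if any path failed.
check_programs() {
    status=0
    for path in "$@"; do
        if [ ! -x "$path" ]; then
            echo "MISSING: $path"
            status=1
        fi
    done
    return $status
}
```

For example, check_programs /bin/mail /bin/sh prints nothing when both files exist and are executable.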
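To compile the program and install it to the appropriate directory, run make and make install as the user nocol. The make root step must be run as root; because su - starts in root's home directory, change back to the source directory first:
make
make install
su -
cd /usr/local/src/nocol-4.3.1
make root
Log out of the root shell to return to the user nocol.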
Configuration
Assuming all goes well with the compile and installation, you can test and configure nocol. The easiest nocol monitor to configure is the ippingmon monitor, which (as its name implies) pings hosts to test network connectivity. So, let's use ippingmon to test the nocol installation:
cd /usr/local/nocol/
cp etc/samples/ippingmon-confg etc/
Now, edit the file for testing:
vi etc/ippingmon-confg
The polling interval (POLLINTERVAL) can be left at 300 seconds, but remove all of the host entries and add two to simplify testing. One of the host entries should be a host that will reply to a ping, and the second host should be a non-existing host to ensure that a notification is made:
realhost 192.168.1.1
fakehost 10.1.1.1
The format for the ippingmon-confg file host list is simply hostname and IP address. The hostname does not have to be the true hostname of the machine; it is simply the name reported by nocol. In larger environments, however, it is wise to ensure that the two correspond.
The ippingmon-confg file should resemble the following:
## Config file for ippingmon.
##
# Can also mark a site as TEST
# POLLINTERVAL is number of seconds between
# each poll cycle.
##
POLLINTERVAL 300
# ALL Unix hosts should be monitored using 'rpcpingmon'. Put them
# in the 'rpcpingmon-confg' file and not here.
realhost 192.168.1.1
fakehost 10.1.1.1
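Before starting the monitor, the host list can be sanity-checked against the two-field format described above. This is a rough sketch only: the function name check_hostlist is hypothetical, and it assumes the file contains nothing but comments, blank lines, the POLLINTERVAL directive, and "name address" entries, as in the listing above. A real ippingmon-confg may accept other directives that this check would flag.

```shell
# Hypothetical helper (not part of nocol): reads a config on stdin
# and reports host entries that are not "hostname dotted-quad".
check_hostlist() {
    awk '
        /^[ \t]*#/ || /^[ \t]*$/ { next }   # skip comments and blank lines
        $1 == "POLLINTERVAL"     { next }   # directive, not a host entry
        NF != 2 || $2 !~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ {
            print "bad entry (line " NR "): " $0
            bad = 1
        }
        END { exit bad + 0 }                # non-zero when any entry failed
    '
}
```

It would be run as check_hostlist < etc/ippingmon-confg from the /usr/local/nocol/ directory.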
Now, the ippingmon can be started:
bin/ippingmon &
The netconsole CLI utility can be used to view the output of the monitors in real time. To start the utility, type:
bin/netconsole
netconsole begins by displaying only critical errors; however, the display level can be changed by pressing the letter l followed by a level number from 1 through 4. The four levels are 1-critical, 2-error, 3-warning, and 4-informational. Level 4 shows all hosts being monitored, as well as the monitor used for each host. Figure 1 shows that realhost and fakehost are being monitored. After the next iteration of ippingmon, the warning for the unreachable host would turn into an error and, after one more failure, into a critical. Operation centers will typically display warning or higher, depending on local needs. The informational level is typically used by the administrator to check which hosts are being monitored.
portmon is one of the most useful tools for monitoring network services, but it is a bit more complicated to configure, because either knowledge of the application protocol is required or a configuration example must be used. There is an example configuration included, so it will be used for the initial configuration.
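The level numbering and the escalation just described can be summarized in a few lines of shell. This is only an illustration of the behavior as described in this column, assuming a host moves up exactly one level per failed poll; the function names are hypothetical and nothing here is taken from the nocol source.

```shell
# Map a netconsole level number to its name (the four levels listed above).
level_name() {
    case "$1" in
        1) echo critical ;;
        2) echo error ;;
        3) echo warning ;;
        4) echo informational ;;
        *) echo "unknown level: $1" >&2; return 1 ;;
    esac
}

# One escalation step: a further failed poll raises the severity by one
# level (a lower number), stopping at 1 (critical).
escalate() {
    if [ "$1" -gt 1 ]; then
        echo $(( $1 - 1 ))
    else
        echo 1
    fi
}
```

Under that assumption, a host at level 3 (warning) would reach level 2 (error) after one more failed poll and level 1 (critical) after two.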
cd /usr/local/nocol/
cp etc/samples/portmon-confg etc/
Edit the file:
vi etc/portmon-confg
The portmon-confg file can seem overwhelming at first, because so many examples are included. To get a quick understanding of the configuration, remove all text below the SIMULCONNECTS=64 line and enter the following after the SIMULCONNECTS=64:
HOST test 193.141.226.1 SMTPport 25 \
Critical HELO portmon.test
info 250
QUIT quit
Here's how to interpret the entry:
HOST test with the IP address 193.141.226.1 is being monitored on port 25. When the TCP connection is established, nocol will issue the command HELO portmon.test. nocol then expects to receive a reply (info) containing the text 250, at which point nocol will send the quit command to the SMTP port. The host name (test) and SMTPport are descriptive fields used by nocol for reporting. The HELO and quit commands are SMTP commands supported by mail transport agents. Once the syntax is understood, other entries can be copied from the original configuration file or created for any TCP applications not included.
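When writing an entry for a new service, it helps to run the same dialogue by hand and note which reply the server gives. The sketch below shows only the reply check, which is the part the entry above encodes with "info 250". The function name reply_has_code is hypothetical and this is not how the monitor itself is implemented.

```shell
# Hypothetical helper: succeeds when a server reply line begins with
# the expected code, e.g. "250" for a successful SMTP HELO.
reply_has_code() {
    case "$1" in
        "$2"*) return 0 ;;
        *)     return 1 ;;
    esac
}
```

For example, reply_has_code "250 mail.example.com Hello" 250 succeeds, while a "554" reply fails. To see the real replies, connect to the port with a client such as telnet (telnet 193.141.226.1 25) and type the HELO and quit commands yourself.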
Start the portmon monitor:
bin/portmon &
If a mail transport agent is not running on the host specified above, the error will be logged and reported.
nocol Reporting
The netconsole utility is very useful within an operations area. Often, however, an error report should automatically send an alarm or message to someone to take action. The noclogd daemon is used for this purpose. The daemon uses a configuration file like the other utilities included with nocol. An example noclogd-confg is included in the nocol/etc/samples/ directory, so we will use it to get started:
cd /usr/local/nocol/
cp etc/samples/noclogd-confg etc/
Edit the file:
vi etc/noclogd-confg
and change the log entries to point to where log files should go. They can be placed wherever there is enough room, and the files should be rotated as well. The program log-maint in nocol/bin can be used to rotate the files.
The noclogd is also a network application that listens on UDP port 5534. It logs entries for any hosts that are configured to do so and that are specified in the noclogd-confg file. If you add any additional nocol hosts to your network and use the noclogd on another station, be sure to add the entry on the permithosts line.
Besides the logging just configured, noclogd can also run external programs for notification. The syntax, which resembles syslogd configuration, is:
monitor level | /path/to/program
For example, remove the two notification entries in the noclogd-confg and enter:
ippingmon critical | /bin/mail^netadmin
This entry will send netadmin an email whenever ippingmon records a critical event, which should be the case if the ippingmon started earlier is still running. Notice that spaces are represented with the ^ character. Start the noclogd daemon:
bin/noclogd &
noclogd runs a notification only when an event changes to the specified level, so the recipient will not receive repeated messages for the same event. A notification can, however, be configured at each level so that a message is sent whenever the situation worsens or improves.
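Since spaces in the program field are written as ^, a longer command line is easy to mistype. The following small sketch converts an ordinary command line into that form; the function name caret_encode is hypothetical, and it assumes, based only on the example entry above, that every space in the command is replaced.

```shell
# Hypothetical helper: rewrite a command line with '^' in place of
# spaces, the form used in the noclogd-confg example entry.
caret_encode() {
    printf '%s\n' "$1" | tr ' ' '^'
}
```

For example, caret_encode '/bin/mail netadmin' prints /bin/mail^netadmin, matching the entry shown earlier.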
Final Steps
The basic steps required to configure nocol and set up some monitors for nocol deployment have been covered. Other monitors can also be used, but they are outside the scope of this column.
The only step left to fully automate nocol is to integrate it into the system for automatic startup. The file nocol/bin/crontab.nocol gives example entries for keepalive_monitor, notification.pl, log-maint, and genweb.pl. The keepalive_monitor script checks whether the nocol programs are running and restarts any that are not. The notification.pl script, as the name implies, automates notifications; log-maint rotates logs; and genweb.pl creates Web pages based upon nocol warnings and errors. For the current configuration, only the following entry is needed:
15 * * * * /usr/local/nocol/bin/keepalive_monitor
Before making the change, edit the script:
vi bin/keepalive_monitor
and change the PROGRAMS= entry to reflect the nocol monitors that are running. For this example use:
PROGRAMS="noclogd ippingmon portmon"
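Save the keepalive_monitor file and, as root, add the following entry to the /etc/crontab file. Unlike a per-user crontab, the system crontab takes a user field (here nocol) before the command:
15 * * * * nocol /usr/local/nocol/bin/keepalive_monitor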
Restart cron:
killall -HUP crond
To test the crontab entry, kill the noclogd, ippingmon, and portmon processes started earlier. Because the entry runs at 15 minutes past each hour, they should be restarted within an hour.
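While waiting for cron, you can check by hand which of the monitors are currently down. The sketch below is not the keepalive_monitor script and does not restart anything. The function name missing_programs is hypothetical, and it assumes a process listing with one command name per line, such as the output of ps -e -o comm=.

```shell
# Hypothetical helper: reads a process listing (one command name per
# line) on stdin and prints each named program that is not in it.
missing_programs() {
    listing=$(cat)
    for prog in "$@"; do
        # -x matches whole lines only, -F treats the name as a fixed string
        if ! printf '%s\n' "$listing" | grep -qxF -- "$prog"; then
            echo "$prog"
        fi
    done
}
```

A typical call would be ps -e -o comm= | missing_programs noclogd ippingmon portmon, which prints nothing when all three are running.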
After completion of these steps, nocol should be up and running and mostly automated. We did not fully cover log file rotation, but log-maint will get you on the right track. Several monitors were not discussed for space reasons, but the ones mentioned will cover most monitoring needs. nocol is definitely worth having in your toolbox.
About the Author
Ronald McCarty received his bachelor's degree in Computer and Information Systems at the University of Maryland's international campus at Schwaebisch Gmuend, Germany. After completing his degree, Ronald McCarty started his network career as network administrator at the Schwaebisch Gmuend campus. Ronald McCarty works for Lucent Technologies as a senior systems engineer on a customer team responsible for a major telecommunications carrier. He spends his free time with his two best friends in the world: his daughter, Janice, and his wife, Claudia. Ron can be reached at: ronald.mccarty@gte.net.