System V Init Staged on an RS/6000 SP Platform
Bill McLean
Every administrator has to manage user applications, and on an
SP complex with a large number of nodes that can be quite an ordeal.
To help manage them, it's a good idea to consider using System V
init. Briefly, under System V init, the runlevel state of the init
daemon causes the rc command to be called from the inittab with a
runlevel parameter, which in turn sends commands to all the scripts
needed for that init state. This article describes a project that
uses System V init and a tool set to provide easy management of all
your user applications across an SP complex. It simplifies application
control links, secured application control, application listings,
and status checks.
Installation
You will need to create a globally shared directory on all the
nodes. On an SP complex this is easily done on the home filesystem,
but you may use any NFS filesystem that is mounted on all the nodes.
You will need to create a local directory on each node to house
the local "rc" scripts. The local directory is necessary to avoid
the need for any NFS filesystems at system boot (Figure 1). The
next step is the creation of an "rc.script" standard template and
naming scheme. Additionally, I will provide a set of Perl management
tools with sample configuration files that you can adapt to your
environment.
The "definition of standards" is critical to any System V init
process. You have to decide content and local location of the application
"rc" scripts. The first step is to create an "rc" script template.
It's a simple case statement in ksh skeleton designed to accept
the start and stop commands from the rc command. You may
expand the template to include whatever you desire, but it must
have start and stop functionality to be compatible with the "/etc/rc.d/rc"
(Listing 1).
For my example, I added "status" to the template; rc will
not send "status", but my management tools will. It is a good idea
to include status functionality because it provides an easy way
to verify whether an application is running. The "good" and "bad"
status messages of your application scripts will also have to be
standardized. I used "Not running" and "Running" for my case-sensitive
status, but any binary convention will work. The standardization
of the application scripts is the basis of the entire System V init
implementation; this is where you will need to impose restrictions
and guidelines.
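Listing 1 is not reproduced here, but a minimal sketch of such a
template might look like the following. The application commands and
the "myapp" process name are placeholders; substitute your own start
and stop logic:

#!/bin/ksh
# Sketch of an application rc.script template (placeholder commands).
case "$1" in
start)
        /opt/myapp/bin/start_myapp          # placeholder start command
        ;;
stop)
        /opt/myapp/bin/stop_myapp           # placeholder stop command
        ;;
status)
        # Report the case-sensitive status strings the management tools look for.
        if ps -ef | grep myappd | grep -v grep > /dev/null ; then
                echo "myapp: Running"
        else
                echo "myapp: Not running"
        fi
        ;;
*)
        echo "Usage: $0 {start|stop|status}"
        exit 1
        ;;
esac
exit 0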
The next step is to decide on a naming standard for all the application
scripts. My naming standard consisted of two script types -- a global
script and a local script. The global application name was used
in the global shared directory, and the local application resided
locally on the node with an added field for startup ordering (Listing
2a).
Basically, the global application name and the local application name
are the same, except that locally I add the "start order" field at the
beginning (Listing 2a and Listing 2b). The field delimiter for both
types of scripts is a ".". Each field may contain any character other
than the field delimiter and may be of any length. The heart of
each script name is the application name, which must be unique so that
every script can be contained in the global script directory. This
allows all node scripts to reside in the global shared directory,
which eases management. Although you may decide on any naming standard,
all my scripts are designed to use this name and field structure and
will need to be adjusted if you deviate from it.
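For example (purely hypothetical names, since Listings 2a and 2b are
not reproduced here), an application called "tsm" with a two-field
global name and a start order of 20 might appear as:

Global script name:  tsm.rc
Local script name:   S20.tsm.rc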
After the script standard and the naming standard have been chosen,
you can build the application scripts. For each application, insert
its start, stop, and status code into the case skeleton, then establish
your global script directory. In my case, I used "/u/shared/appscripts"
and created the local script directory "/appscripts" on each node
(Figure 1). I recommend not placing the local scripts under /etc;
keeping the application scripts outside of /etc lets you grant
application or operations people the rights to read and debug them.
Of course, we will be linking to these scripts from the "runlevel"
directories under "/etc/rc.d/rc#.d", so their actual location is not
critical to System V init.
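As a quick sketch of that layout, assuming the PSSP dsh command and
the example paths above:

mkdir -p /u/shared/appscripts     # global script directory on the shared filesystem
dsh -a 'mkdir -p /appscripts'     # local script directory on every node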
Implementation
To begin, create the necessary directories under "/etc/rc.d/rc#.d"
to support the runlevels you will be using; normally AIX uses runlevel
2. If you want to separate the kill and start links, you must
edit the "rccfg.config" tools file. It's important to note that
the "shutdown" command will not call the kill scripts, so put a
hook into rc.shutdown to call "rccfg stop All" before the
node goes down. Splitting the links is nice here because you can
just call "rc #" against the runlevel directory where only the stop
links reside. You must also create the required inittab entries.
Depending on which version of AIX you use, they may already be there as:
l2:2:wait:/etc/rc.d/rc 2
Of course, I would add a log so that you can see what messages the
scripts reported when called on to start and stop. For example:
l2:2:wait:/etc/rc.d/rc 2 >/tmp/rc2.log 2>&1
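A sketch of that one-time node setup, assuming runlevel 2, the stock
AIX mkitab command, and the paths used in this article:

mkdir -p /etc/rc.d/rc2.d                                # runlevel 2 link directory
mkitab "l2:2:wait:/etc/rc.d/rc 2 >/tmp/rc2.log 2>&1"    # add the rc entry to /etc/inittab
# In /etc/rc.shutdown, add a call such as the following so the kill
# scripts run before the node goes down:
#   /etc/rc.d/rccfg stop All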
Next, create all the necessary directories. You will need a directory
on the control workstation as the global script directory. This directory
will house all the site's scripts in global name form (Listing 2b).
You will also need a local directory on all the nodes. This directory
will house all the local application rc.scripts; these need to be
local, not on an NFS mount. After directory creation, you must prepare
the "rccfg.config" file with the correct directory parameters (Listing
5). This configuration file will be used by "rccfg" and all the provided
tools. Here is a list of the parameters needed in the configuration
file:
gscripts /appscripts/global -- Global script location
lscripts /appscripts/local -- Local script location
stlink /etc/rc.d/rc2.d -- Start link location
stplink /etc/rc.d/rc2.d -- Stop link location
rclog /tmp/rclog -- rccfg log location
secaccess /etc -- Option parm for secured version
Once the parameters have been set, copy the file to all nodes that
will be running "rccfg"; the config file must reside in "/etc".
Move all the script tools locally to the nodes as well. I typically
put the tools in /etc/rc.d; the configuration file will be expected
in the /etc directory. Once "rccfg" is on the node, I like to make
a link to it from /usr/local/bin, but that's not necessary.
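One way to push the tools and the prepared config file out is from a
staging area on the shared filesystem (the /u/shared/tools path is
hypothetical):

dsh -a 'cp /u/shared/tools/rccfg.config /etc/rccfg.config'
dsh -a 'cp /u/shared/tools/rccfg /etc/rc.d/rccfg ; chmod 750 /etc/rc.d/rccfg'   # repeat for any other tools
dsh -a 'ln -sf /etc/rc.d/rccfg /usr/local/bin/rccfg'    # optional convenience link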
Last, copy your scripts to the global application script directory,
then to the local nodes' application directories. When placing the
local scripts on a node, make sure to prepend the start order field
"SXX" to each filename. This field controls the order in which the
links are called by rc, and thereby the startup and shutdown order
of the applications. You should now be able to run the "rccfg" tool
to create the links on that node:
</etc/rc.d/rccfg build All>
If it works, it will have created your links in the correct rc
directories based on the "rccfg.config" file. If not, a common failure
is a missing or inaccurate parameter in the config file; also check
that the scripts can run on their own. If you still can't get "rccfg
build All" to work, review all the steps above.
Management Tools
Now let's review the tools that help you use these scripts. The
main tool is "rccfg" (Listing 3). It is used to call all scripts
with start, stop, or status, and it also manages link creation
and removal based on the application names of the local scripts. Keep
in mind that "rccfg" creates the start links in the order of the
"SXX" field and the "KXX" kill links in the reverse order (see the
sketch after the examples below):
<rccfg>:
rccfg <command> All -- Send command to all start scripts.
rccfg <app name> <command> -- Send command to the script with the matching app name.
rccfg <command> <app name> -- Send command to the script with the matching app name.
rccfg list [All | appname] -- List all local application script names.

rccfg valid commands:
start -- Send "start" argument to the script.
stop -- Send "stop" argument to the script.
status -- Send "status" argument to the script.
noauto -- Remove the start link for the application; the local script name will still carry its "S##" field.
build -- Remove and recreate the start and stop links; restricted to "All".
list -- List all local application script names.

Examples:
rccfg start tsm -- Send start to the tsm application script.
rccfg stop tsm -- Send stop to the tsm application.
rccfg noauto tsm -- Remove the start link for tsm from the start link runlevel directory.
rccfg stop All -- Send stop to all applications.
rccfg noauto All -- Remove all application start links.
rccfg tsm noauto -- Remove the start link for the tsm application only.
rccfg list -- List all local application script names.
rccfg build All -- Remove and recreate all stop and start links.
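As a rough illustration only (the actual link names come from the
Perl code in Listing 3 and your "rccfg.config" settings), a "build"
pass amounts to symlinking each local script into the runlevel
directory, something like:

ln -s /appscripts/S20.tsm.rc /etc/rc.d/rc2.d/S20.tsm.rc    # start link, run in SXX order
ln -s /appscripts/S20.tsm.rc /etc/rc.d/rc2.d/K80.tsm.rc    # kill link, numbered so the kill
                                                           # order reverses the start order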
The main benefit of this tool is that you may "dsh" it out from
the control workstation to many nodes. You can easily stop all
applications from starting at the next boot without editing an rc
file or the inittab. Additionally, if you want to check the status
of the entire complex, you can run:
<dsh 'rccfg status All'>
and grep the output for your status message. Another feature of
"rccfg" is that
it can also be farmed out as a secured tool. I have included all the
security code in the original script. Just change the name from "rccfg"
to any other name ("opscfg" for example) and it will trigger its security
protocols. To use a secured version of the script, you must create
a /etc/<scriptname>.access file according to your "rccfg.config"
parameter. I have included a sample of the file needed for secure
operation (Listing 4). The main limitation of the secured version
is that any start or stop command for an application name that is
not in the access config file will fail and be logged; if someone
tries to use a secured version to control an application for which
it is not "configured", it logs the attempt and exits with a message.
This feature enabled me to add one command to the sudoers file and
allow an entire account team to control all of their applications.
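For example, a single sudoers entry along these lines (the group
name, path, and "opscfg" rename are hypothetical) is enough to hand
the secured copy to an account team:

%appteam ALL = (root) NOPASSWD: /usr/local/bin/opscfg *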
The next tool I include updates the local scripts from the global
directory (Listing 6). This command uses the same config file as
"rccfg", /etc/rccfg.config. The tool is run locally on the node or
"dsh'd" out to the nodes. If the modification time stamp of the
global script matching a local script's application name is newer
than the local copy, it will push the global copy down to the local
node. It will also create a "/tmp/<scriptname>" backup of the
overwritten script, just in case:
<rcndupdt>
<dsh '/etc/rc.d/rcndupdt'>
The rcndupdt tool will update all the local scripts on a node
from the global script directory.
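Conceptually (the real tool is the Perl code in Listing 6), the
per-script check amounts to something like the following, using the
hypothetical tsm names from earlier:

if [ /u/shared/appscripts/tsm.rc -nt /appscripts/S20.tsm.rc ] ; then
        cp /appscripts/S20.tsm.rc /tmp/S20.tsm.rc                # back up the old local copy
        cp /u/shared/appscripts/tsm.rc /appscripts/S20.tsm.rc    # pull down the newer global copy
fi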
Script Promotion
Promoting a script to a node is quite easy. If an
application script already exists, you just update the script in
the global directory and run rcndupdt on the node to pull
it down. If the application is new, you must create the new application
script, copy it into the global application directory, then copy it
into the local application directory with the start order field added
as necessary, and use "rccfg build All" to recreate the local node's
links.
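A sketch of promoting a new application, again using hypothetical
names and a hypothetical node (node05):

cp newapp.rc /u/shared/appscripts/newapp.rc     # add it to the global directory
dsh -w node05 'cp /u/shared/appscripts/newapp.rc /appscripts/S30.newapp.rc'
dsh -w node05 '/etc/rc.d/rccfg build All'       # recreate the links on that node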
Conclusion
Once the standards are created and the scripts put in place, the
System V init approach is very beneficial. You will no longer have
to edit the inittab or an rc.local script to stop an application
from starting on reboot. Also, if you need to check an application,
you can just use rccfg status <appname>, which really
helps when checking application status across multiple nodes after
an outage.
It has helped standardize the way people start and stop applications
on my nodes. The feature that management likes best is the logging
of all application control calls by rccfg and the secured
versions. The example here was presented for an SP complex, but it
can be easily applied to standalone systems as well. Standalones
will not have dsh -- it's a PSSP command -- but rexec
will work, and the rccfg script could be easily converted
if you do not have Perl on your systems.
Bill McLean is currently a systems administrator for IBM's
Scaleable Processor systems on the Viewpointe Online Archive Services
account. Bill has been in the computing field for 11 years, since
his graduation from the University of Arkansas in 1992. For the past
seven years he has been specializing in AIX and Unix server administration.