Using a Distributed Shell to Manage Deployments of Linux and Unix Servers

Jeff McElroy

Linux and Unix servers have proliferated across many corporate networks. Some deployments have been well organized, but many have happened slowly and haphazardly. Unsurprisingly, many of these deployments lack organization and standardization. In smaller deployments, an administrator can manage each server individually, but as the number of servers increases this approach becomes awkward and additional tools must be utilized.

There is no standard way to maintain middle- to large-sized deployments. Instead, an administrator puts together a hodgepodge of solutions including SNMP, syslog, shell scripts, webmin, or more recently, the Red Hat Network. This article introduces distributed shell commands to this collection. A distributed shell is a good first step in standardizing your deployment because it provides a mechanism for you to view and manage your deployment as a single system.

Distributed Shell Description

A distributed shell allows a user to have a shell command run simultaneously on a large number of servers. The output from these commands is returned to the user's shell with the name of his source login and server pre-pended to each line. For instance, the command dssh hostname would give the following output:

root@japan.radux.com   japan
root@italy.radux.com   italy
root@sweden.radux.com  sweden
root@germany.radux.com germany

Other, more useful commands can be issued just as readily. For instance, if you want to locate any server with a user currently logged in, you could issue the command dssh users. This is the same in function as the command rusers, although the output is a bit different. The commands executed by dssh do not need to return immediately. For instance, the command dssh tail -f /opt/apache.instance/logs/access_logs would never exit, but its output could be used to monitor the logs of many Web servers in a simple way.

The best use of a distributed shell is often as a building block for more powerful scripts. For example, you could quickly write a script that would check and restart any frozen or malfunctioning mail server process. If you added that script to your crontab, you would have a centralized form of automated recovery. A side benefit to using a distributed shell through other scripts is that you are somewhat protected from typos that occur when performing ad hoc commands.

The DCMD Package

There are many implementations of a distributed shell. The implementation used for this article can be found at http://dcmd.sourceforge.net. This is a simple (less than 200 lines of scripts and no binaries) implementation based on OpenSSH. In addition to the dssh command referred to above, dcmd provides several other commands and utilities. For the transfer of files or the maintaining of identical directory structures across your servers, dcmd provides the commands dscp and drsync. The command dscp is to scp what dssh is to ssh; it allows you to easily transfer a file to or from all of your servers at once. For example, if you want to transfer the subdirectory /opt/newApplication to all of your servers, you could do so by issuing the command dscp -rp /opt/newApplication []:/opt/newApplication. The [] characters will expand to the login@hostname used to contact each server. The above example might expand into the following scp commands:

scp -rp /opt/newApplication root@japan.radux.com:/opt/newApplication
scp -rp /opt/newApplication root@italy.radux.com:/opt/newApplication
scp -rp /opt/newApplication root@sweden.radux.com:/opt/newApplication
scp -rp /opt/newApplication root@germany.radux.com:/opt/newApplication

The dscp command can also transfer files from a group of servers back to your workstation. This can be done with the command dscp -rp []:/var/log []. After running this command, a copy of the /var/log directory from each server will be located in your current working directory. Each retrieved /var/log directory would be named after the login and hostname used to contact the server (i.e., login@hostname).

The drsync command is similar in function to dscp, but unlike scp, rsync attempts to transfer only the differences in the source and destination file sets. This makes it a better command for maintaining large or remote file sets. See the Web site http://rsync.samba.org for more details.

The dcmd package also provides several other commands. They are dps, which runs the ps command on each server; dfind, which runs the find command on each server; and dusers, which is a synonym for the rusers example above. These commands are all one-line wrappers for the command dssh. The commands dssh, dscp, and drsync are in turn one-line wrappers for the command dcmd, after which this package is named.

Setup of dcmd

Download the dcmd tarball from http://dcmd.sourceforge.net and extract it into a convenient destination directory on your workstation. For this article, I will put it into "/opt/dcmd". Add the directory "/opt/dcmd/bin" to your path. If you execute the command which dssh, it should report a path of /opt/dcmd/bin/dssh. Do not run any of the dcmd commands until after you have set up OpenSSH.

Next, edit the /opt/dcmd/etc/dcmd.hosts file. Put the login and hostname for each server you want to manage into this file. The format for this file is "login@hostname". Comments are allowed in this file and should begin with a pound sign (#).

The name and location of the dssh.hosts file is configurable by setting the environment variable DCMD_HOSTS. Examine the /opt/dcmd/bin/dcmd_getHostsFile script for details. By creating several host files and setting this environment variable appropriately, you can easily manage different groups of servers.

Setup of OpenSSH

OpenSSH should be installed and running on all the servers you want to manage. If you already have OpenSSH configured to allow you to log into each server without requesting a password, you can safely skip this section. If you are unfamiliar with OpenSSH, visit the Web site (http://www.OpenSSH.org) for background information and installation instructions. You will need to generate and distribute encryption keys.

There are two sets of keys with which to be concerned: the host keys and the user keys. Both sets of keys are asymmetric, which means that they consist of a private key and public key. The public keys are the ones that will be distributed.

Host keys are used by OpenSSH to validate that you are connecting to the proper machine. They provide protection from man-in-the-middle attacks. These keys are usually generated when OpenSSH is first installed (see the OpenSSH documentation). Whenever your OpenSSH client connects to an OpenSSH server, it verifies this key. If you are connecting to the server for the first time, your client will tell you that this is a new key and ask you whether it is valid. You will not be able to respond to this query while dcmd is running because dcmd runs OpenSSH in the background, and OpenSSH is not in control of your terminal. Therefore, these host keys must be verified and saved on your workstation prior to running any dcmd commands.

User keys are used by OpenSSH to authenticate your client to the remote server. Unlike the host keys, you must generate these keys manually, which can be done with the command ssh-keygen -t dsa. This command will first prompt you for a filename. Use the default ~/.ssh/id_dsa. The next two prompts will ask you to enter and verify a password. You have two options. The first is to just press enter twice and leave your private key un-encrypted. Use this option only if you are sure your home directory is safe from prying eyes. If you choose to encrypt your private keys, you will need to use ssh-agent (see the ssh-agent man page for details). The easiest way to do this is to define the environment variable DCMD_USE_SSH_AGENT. When this is defined, the dcmd script will re-launch itself under ssh-agent and execute the command ssh-add. ssh-add will ask for your user key's password and manage your keys for you. If you already use ssh-agent and don't want dcmd to request a password every time you run a dcmd command, leave the variable undefined and OpenSSH should operate fine.

The dcmd package contains a script that can simplify the distribution of public keys. This script appends your user public key to the end of the "authorized_keys" file on your servers, which allows the server to authenticate who you are, without requesting a password. This script will also collect each server's host key for you. To use this command, type "dcmd_distributeKeys ~/.ssh/id_dsa.pub /root/.ssh/authorized_keys" where "~/.ssh/id_dsa.pub" is the location of your public key and "/root/.ssh/authorized_keys" is the location of the authorized_keys file on your servers. You will be prompted for the password to each server listed in your dcmd.hosts file as dcmd_distributeKeys works.

Caveats

There are a few possible issues to be aware of. As the number of servers listed in dcmd.hosts increases, so does the utilization of system resources on your workstation. You will have many run-able OpenSSH processes waiting for CPU resources and will probably see a load average in the double digits. dcmd does not mind this, but the performance of other applications will suffer. If dcmd has 30 run-able processes and your Web server only has 3, dcmd will take up 10 times more of your CPU resources than your Web server. It is best not to run a dcmd command from a machine on which you run other production applications, especially ones that are response-time sensitive. If you are managing several hundred servers at once, you may also run into other system resource issues, such as the number of file descriptors available to the kernel. These limits can be easily increased, but you may need to do so manually.

Conclusion

Implementing a distributed shell can simplify an environment and save time for an administrator who is tasked with maintaining many servers. It is easy to set up, and builds upon the scripting skills that most administrators already possess.

Jeff McElroy is a consultant based in Sioux Falls, South Dakota. He specializes in Java, C/C++, Perl, AIX, and Linux and has more than 10 years working experience. Jeff has a bachelor's degree with majors in Chemistry, Math, and Psychology. He recently started focusing on providing single sign-on and unified directory services for heterogeneous networks. Jeff can be contacted at: jmcelroy11@sio.midco.net.