Cover V13, i01

Current: An Open Source Update Server for Red Hat Linux

John Berninger

Beginning with Red Hat Linux 6.2, Red Hat distributed a system updater tool called "up2date", which interoperates with the Red Hat Network (RHN) to intelligently download and install system updates and new packages. The client code, per Red Hat's usual business practices, was made publicly available. The server side of this code, however, was not released.

The usefulness of the up2date program was immediately apparent, but the inability to establish a local server for departmental or organizational use hindered some deployment models. Therefore, a project was begun to reverse-engineer the protocol and re-create a server. This effort was made possible by two factors: first, the client code was released as open source, and second, the underlying protocol was an open standard -- XML-RPC.

The result of this development effort is a project called "Current", an open source update server for Red Hat Linux. In this article, I will show how to deploy and maintain a Current server. I will also cover some of the prerequisites for successful installation and use of a Current server, and I'll describe how you can contribute to the Current project.

Background

The Current project started as an unnamed effort by a single sys admin at Duke University -- Hunter Matthews -- and remained so for quite some time. I became involved in the project as the second developer around the time version 0.9.5 of the server code was released, and there was suddenly a lot of work regarding communication, coordination, and code management. Once we established a source code repository, development resumed and moved right along.

The Current development tree is hosted now in a CVS repository at Tigris.org. The CVSROOT for this repository is :pserver:anonCVS@current.tigris.org:/cvs, with no password. This provides anonymous CVS access. Once you've logged in, you need to check out the "current" module.

The present development model is relatively straightforward. We have adopted the same model for release numbers that is used by the Linux kernel; odd-numbered minor versions are development versions, and even-numbered minor versions are production/stable versions.
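
The numbering rule above can be expressed in a few lines. This is a minimal sketch, not code from Current; the function name is hypothetical, and it assumes version strings of the form "major.minor..." such as the ones mentioned in this article.

```python
import re


def is_stable_release(version: str) -> bool:
    """Return True when the minor version number is even (a stable series)."""
    match = re.match(r"^(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    minor = int(match.group(2))
    return minor % 2 == 0


for version in ("1.4.4", "1.5.5rc1"):
    kind = "stable" if is_stable_release(version) else "development"
    print(version, kind)
```

Only the minor number matters here, so a release candidate suffix such as "rc1" does not change the classification.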

We also split the code out into functional sections that include an API layer, a database layer, a support layer, and a "main" server layer. The layers are relatively orthogonal as far as behavior. This was done by design to decrease the amount of interdependency in the code. This model allows us to change the API layer relatively easily in case of up2date API changes without worrying about how these changes will impact the database code, and vice versa. The database layer is further subdivided into a generic back-end and database implementation specific back-ends, allowing us to trivially change between database implementations (e.g., MySQL vs. PostgreSQL). We can also take advantage of features of some database implementations that aren't available in others, by isolating the actual database interaction from the general database layer.
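
To illustrate the idea of a generic back-end with implementation-specific back-ends behind it, here is a small sketch in Python. None of these class or function names come from the Current source; they are hypothetical, and SQLite (from the Python standard library) stands in for MySQL or PostgreSQL so that the example is self-contained.

```python
import sqlite3
from abc import ABC, abstractmethod
from typing import Dict, List


class ChannelStore(ABC):
    """Generic back-end: the only interface the rest of the server sees."""

    @abstractmethod
    def add_package(self, channel: str, package: str) -> None:
        """Record that a package belongs to a channel."""

    @abstractmethod
    def list_packages(self, channel: str) -> List[str]:
        """Return the channel's packages in sorted order."""


class MemoryStore(ChannelStore):
    """Implementation-specific back-end that keeps everything in a dict."""

    def __init__(self) -> None:
        self._data: Dict[str, List[str]] = {}

    def add_package(self, channel: str, package: str) -> None:
        self._data.setdefault(channel, []).append(package)

    def list_packages(self, channel: str) -> List[str]:
        return sorted(self._data.get(channel, []))


class SqliteStore(ChannelStore):
    """Implementation-specific back-end that uses an in-memory SQL database."""

    def __init__(self) -> None:
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute("CREATE TABLE packages (channel TEXT, package TEXT)")

    def add_package(self, channel: str, package: str) -> None:
        self._conn.execute("INSERT INTO packages VALUES (?, ?)", (channel, package))

    def list_packages(self, channel: str) -> List[str]:
        rows = self._conn.execute(
            "SELECT package FROM packages WHERE channel = ? ORDER BY package",
            (channel,),
        )
        return [row[0] for row in rows]


def packages_for(store: ChannelStore, channel: str) -> List[str]:
    """Stand-in for API-layer code: it depends only on the generic interface."""
    return store.list_packages(channel)


# The channel label and package file names below are made up for illustration.
for store in (MemoryStore(), SqliteStore()):
    store.add_package("rhl80-i386", "zsh-4.0.4-8.i386.rpm")
    store.add_package("rhl80-i386", "bash-2.05b-5.i386.rpm")
    print(type(store).__name__, packages_for(store, "rhl80-i386"))
```

Because packages_for only knows about ChannelStore, swapping one store for the other requires no change to the calling code, which is the property the layered design is after.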

General

So far, we only know that Current is the server side of an update tool. Let's take a closer look at what it is and what it does. The main purpose of Current is to provide the missing portions of the up2date tool for systems administrators to keep their systems updated without the necessity of maintaining hundreds or possibly thousands of registrations with RHN. It also allows administrators to customize the package set available to clients without lengthy contract negotiations with Red Hat for a custom channel on RHN.

The main audience for Current was initially thought to be academic departments and colleges, where the need for a centralized update scheme was critical and where the costs in both capital and time expenditures prohibited the use of RHN. Another ideal environment for Current deployment was thought to be corporate research labs and environments that required either a much more distributed management of channels or a more flexible package management scheme for the channel packages than RHN offered. Current has been adopted by administrators in more environments than we anticipated, however, and we encourage administrators seeking a system update management solution to consider whether Current can meet their needs.

Deployment

The majority of the design, coding, and documentation decisions were made with the goal of easing deployment of the server for organizations wanting to use it to update Red Hat Linux-based systems. Our first goal was to make the installation process understandable to users. This task, of course, involved a great deal of documentation. Our second objective was to make the installation and potential removal of the package a simple matter. Our third, and most important, objective was to not interfere with any pre-existing configurations or applications.

With that said, I'll describe what it takes to actually install Current on a given system. At the time of this writing, the latest stable release was 1.4.4; the latest development release was 1.5.5rc1 (1.5.5 never actually made it out). I'll cover the installation and later configuration and management of the 1.4.4 release. Then, I'll describe changes introduced in the 1.5.x series and explain how those changes affect the installation, configuration, maintenance, and behavior of the server.

The easiest way to install Current is to simply download the binary RPM provided for each release from:

http://current.tigris.org/
and issue an rpm -ihv against it. We've built packages for installation on Red Hat Linux versions 7.x and 8.0, and our users have successfully installed the 8.0 packages on Red Hat Linux 9 systems as well.

We've tested the binary RPMs against many different systems, so we believe that the actual package installation will be error-free. If you have a problem installing from our RPMs, please let us know, and we'll fix it. Using the binary RPM package is by far the easiest install method available.

There is, however, a second install method available if you prefer to have a greater degree of control over your system or are installing on a system that does not have the RPM tool suite available. On the download page, in addition to binary and source RPM files, we provide tarballs of the released packages. These are the tarballs used to build the RPM packages, so when you download the tarball you're getting the same source code that you get through the RPM package. If you want to rebuild the RPMs on your system, you can simply issue:

rpmbuild -ta <tarball>
and RPM will build a Current binary and source package for you. This is the same command we use to generate the packages available for download.

Unpacking the tarball and installing without resorting to the RPM tool suite is also relatively simple: issue make; make install in the "current/" directory that's created when you unpack. By default, the Makefile shipped specifies a destination directory of "/usr", so if you want to install somewhere else, you'll have to modify the Makefile or override the PREFIX variable on the "make" command line:

make ; make PREFIX=/opt install

Configuration

Once the program has been installed, you must configure it before you can start it up and use it. This is another area that we have tried to simplify as much as possible, but there are still several important steps that must be completed.

As of version 1.4.0, Current makes use of Apache's mod_python module, which allows us to ignore a significant amount of gory detail regarding accepting connections, session management, multithreading, and, as I'll describe later, file transfer. This setup allows us to concentrate more closely on getting the protocol behavior correct and making sure that the portions that Apache can't handle are as correct and streamlined as possible. Not only do we not have to worry about the "heavy lifting" data transfers, we can take comfort knowing that people who've been doing such work for years are assisting us.

So, we have to configure our application from two points of view. First, we have to make it work with the system on which it's just been installed, and second, we have to tell Apache where to direct requests that should be handled by Current. Let's look at the first item now.

For reference, Listing 1 shows the sample configuration file that is distributed with Current v1.4.4. I will use this sample file in the next few sections as I examine how to configure Current. Note this file is an example only; it does not contain actual valid information being used on a server.

The configuration file for Current is fairly well documented; as shipped, it contains explanations of all the fields you must specify and what those fields represent to the actual Current server program. This configuration file is used by both the actual Current server when running, and by the "cadmin" program to assist with Apache configuration, which I'll examine later.

There are eleven (11) configuration directives in the main section of the config file, which must be labeled [current]. The first directive is "valid_channels", which is simply a whitespace-separated list of valid channel definitions for which the server should handle requests. Each item in this list should correspond to a single [label] designation; there can be additional [label] sections not listed in this directive, but all items in this directive must have a corresponding [label] section.
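
The rule that every entry in valid_channels needs a matching [label] section is easy to check mechanically. The following sketch uses Python's standard configparser module; it is not how Current itself reads the file, and the channel labels and section contents are invented for illustration (Listing 1 remains the authoritative sample).

```python
import configparser

# Hypothetical, heavily trimmed configuration; only the parts needed here.
SAMPLE = """
[current]
valid_channels = rhl80-i386 rhl9-i386

[rhl80-i386]
name = Red Hat Linux 8.0 (i386)

[rhl9-i386]
name = Red Hat Linux 9 (i386)

[scratch]
name = Extra section that is not listed in valid_channels
"""


def missing_channel_sections(text: str) -> list:
    """Return the valid_channels entries that have no [label] section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    labels = parser.get("current", "valid_channels").split()
    return [label for label in labels if not parser.has_section(label)]


print(missing_channel_sections(SAMPLE))
```

The extra [scratch] section is not reported, since sections that are not listed in valid_channels are allowed; only listed labels without a section count as problems.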

The next two directives, "log_file" and "log_level", determine how much of the server's operation should be logged and where it should be logged. The log_file directive is a filename on the server's file system, and the log_level directive is an integer value from 0 to 10 inclusive, where a higher value produces more verbose logging. Be warned that setting logging to level 10 will generate a substantial amount of information in the file; levels above 4 are recommended only for diagnosing server operation problems.

The next directive, "apache_config_file", is the location of the configuration file that "cadmin" will create to configure Apache to properly handle up2date connections. In RHL 7.x, the value should be "/etc/httpd/conf/current.httpd.conf"; for RHL 8.0 and 9, it should be "/etc/httpd/conf.d/current.conf".

The next two directives are used to perform a check to ensure that permissions on the database and channel directories are set with sufficient read permissions to allow normal server operation. "access_check_type" determines whether the user, group, or other permission field will be checked and "access_check_arg" determines which user or group the permissions will be checked against. The values "all" and "none" for the access_check_type cause the access_check_arg to be ignored.

The next directive is "server_secret", which is a string used to authenticate client sysid's against the server to ensure that the systems attempting to use this server are registered with this server. This string should never be revealed or shared. It will become more central to permissions and access checking in future releases and, consequently, it will become more important that its value is protected.

The "server_id" is another string specific to this server; it is used only cosmetically in the 1.4.x series, but still needs to be specified.

The "current_dir" is the top-level directory where Current will store its database and channel information. It is very important that this directory tree be managed only by Current and not changed by other programs. This directory tree is the single most critical piece of Current operation in versions 1.4.x.

The final two directives -- "welcome_message" and "privacy_statement" -- are text messages that will be displayed to client machines when they are registered with this server.

Channels

The next portion of the configuration file deals with the individual channels the server will be handling. Before I get too far, though, I should define channels more clearly. The concept of a channel, as far as up2date and Current are concerned, is a unique combination of an OS release and base architecture. For example, the channel I use to perform testing is defined by the base architecture of my client system, which is i386 (i386 is the base arch for the i586, i686, and Athlon architectures), and by the release of the Red Hat Linux I'm using on the client, which is RHL 8.0. Note that, in versions up to 1.6.x, Current does not support duplicate channel creation, so you can only create a single channel for each arch/release combination. You can't create one channel for 8.0/i386 and a second for 8.0/i686; the i686 and i386 arches are identical as far as channel creation and management are concerned.

Now that I've explained how channels work, I'll look at the configuration of a channel within Current. As I mentioned previously, the name of the section should correspond with, at most, one entry from the "valid_channels" list in the [current] section. In each channel stanza, there are eight (8) configuration directives. The first is "name", which is simply a human-readable designation of the channel's name. Again, all configuration directives refer to the sample configuration file in Listing 1.

The second directive, "parent_channel", is not used in versions through 1.4.x; it is in place for future expansion when we begin to support channel parenting.

The next two directives are the two items that define the channel, the "arch" and "os_release" directives. The arch directive is not the real architecture of the machine, rather it is the base-compatible architecture. For example, if a client has an i686 processor, the arch of the channel it uses would be i386, not i686, as we've seen previously.
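
As a small illustration of how a client's real architecture collapses to a channel's base architecture, consider the sketch below. The mapping contains only the architectures named in this article, and the function name is hypothetical rather than taken from Current.

```python
# Real machine architecture -> base-compatible architecture used by channels.
# Only the architectures mentioned in this article are listed.
BASE_ARCH = {
    "i386": "i386",
    "i586": "i386",
    "i686": "i386",
    "athlon": "i386",
}


def channel_key(machine_arch: str, os_release: str) -> tuple:
    """Return the (base arch, OS release) pair that identifies a channel."""
    base = BASE_ARCH.get(machine_arch.lower())
    if base is None:
        raise ValueError(f"no base architecture known for {machine_arch!r}")
    return (base, os_release)


# An i686 client and an Athlon client on RHL 8.0 resolve to the same channel.
print(channel_key("i686", "8.0"))
print(channel_key("Athlon", "8.0"))
```

Since both clients produce the same pair, there is nothing to distinguish an 8.0/i686 channel from an 8.0/i386 channel, which is why only one channel per arch/release combination can exist.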

The next directive is "description", which is a human-readable description of what the channel serves; it can be as short as a blank string, or several paragraphs long and is free-form within the restrictions of printable ASCII characters.

The next two directives, "rpm_dirs" and "src_dirs", specify the directories that contain the binary and source RPM files for this channel, respectively. The rpm_dirs directive is the more critical of the two, as these are the actual RPMs that will be applied to the system. The src_dirs directive can be empty, but it must be present in the channel definition. The final directive, "srpm_check", tells the program what to do if there is no source RPM available for a given binary RPM. This is either 0, 1, or 2 -- 0 does not check for a source RPM; 1 causes a warning to be issued for any binary RPM without a corresponding source RPM; and 2 causes an error condition when a binary RPM is found without a corresponding source RPM.
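
The three srpm_check levels can be sketched as follows. This is an illustration of the described behavior only, not Current's code: the function name is hypothetical, the file names are invented, and the mapping from each binary RPM to its expected source RPM is assumed to be supplied by the caller (a real implementation would have to read it from the RPM headers).

```python
import warnings


def check_source_rpms(binary_to_source: dict, available_sources: set,
                      srpm_check: int) -> list:
    """Apply the 0/1/2 policy and return the binary RPMs lacking a source RPM."""
    if srpm_check not in (0, 1, 2):
        raise ValueError("srpm_check must be 0, 1, or 2")
    if srpm_check == 0:
        return []  # level 0: no check at all
    missing = sorted(
        binary for binary, source in binary_to_source.items()
        if source not in available_sources
    )
    if missing and srpm_check == 2:
        # level 2: a missing source RPM is an error condition
        raise RuntimeError("no source RPM for: " + ", ".join(missing))
    for binary in missing:
        # level 1: warn and carry on
        warnings.warn(f"no source RPM found for {binary}")
    return missing


# Made-up file names for illustration.
rpms = {
    "bash-2.05b-5.i386.rpm": "bash-2.05b-5.src.rpm",
    "zsh-4.0.4-8.i386.rpm": "zsh-4.0.4-8.src.rpm",
}
sources = {"bash-2.05b-5.src.rpm"}
print(check_source_rpms(rpms, sources, 0))
```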

After creating the current.conf file, we need to tell Apache to use Current when it gets an up2date request. We already specified the location for the apache_config_file, so now we need to run "cadmin create_apache_config". This will create a configuration file, in the designated location, that will be read by Apache at startup and tell Apache how to handle up2date connections. For Red Hat Linux 7.x servers, you must modify the normal Apache configuration file (/etc/httpd/conf/httpd.conf) and add a directive "Include current.httpd.conf" to this file; in RHL 8.0 and beyond, the file is included by default when it is created in the /etc/httpd/conf.d directory.

Once the server has been configured, we need to actually create the channel databases that were specified in the "valid_channels" directive. To do this, simply run the command cadmin create <chan>, substituting each item from valid_channels for <chan>. Note that when you are using cadmin to manage the server, either through channel creation or updating, the Apache server cannot be running. Channel creation takes a fair amount of time to complete -- on the order of 15 minutes for a fully populated RHL 8.0 channel with all updates. Once the command completes, you should be able to start the Apache server through "service httpd start" and to register and update clients against the new Current server.

Before you begin registering or updating clients, however, you need to ensure that you have the proper SSL certificate in use on both the server and client sides. To create a new SSL certificate for the server, issue the command cadmin create_certificate, which will create a file in /etc/current/ called RHNS-CA-CERT. This is the SSL certificate that you must use with Apache and that you must place in the /etc/sysconfig/rhn/ directory on the clients. This is a common troubleshooting issue we've seen on the mailing lists. Another common issue is for the clients' system clocks to be significantly skewed from the server system clock. There can only be about a four-minute difference (maximum) between system times for the SSL negotiation to succeed, so you may want to enable NTP on the server and on all clients.
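
When diagnosing the clock problem, it can help to compare two timestamps against the roughly four-minute tolerance mentioned above. The sketch below does only that arithmetic; the function name and the exact limit are assumptions for illustration, and the timestamps are made up.

```python
from datetime import datetime, timedelta, timezone

# Approximate tolerance taken from the text; treat it as a rule of thumb.
MAX_SKEW = timedelta(minutes=4)


def skew_ok(client_time: datetime, server_time: datetime,
            limit: timedelta = MAX_SKEW) -> bool:
    """Return True when the two clocks differ by no more than the limit."""
    return abs(client_time - server_time) <= limit


server = datetime(2004, 1, 15, 12, 0, 0, tzinfo=timezone.utc)
client = server + timedelta(minutes=6)  # a client running six minutes fast
print(skew_ok(client, server))
```

A client that fails this check is a good candidate for an NTP fix before you spend time on certificate troubleshooting.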

Server Operation

The actual operation of the Current server is relatively invisible to the administrator or the users of client systems; even on the server, there's not too much to see. What's actually happening behind the scenes, however, is a completely different story. I mentioned previously that Current operates in a mod_python environment, called by Apache when certain requests are received. Here's exactly what happens when a client attempts to use a Current server to update itself.

To begin, the client invokes (or the root user on the client invokes) the up2date command. The interesting stuff starts happening almost immediately. The up2date client logs into the server with an XML-RPC call and presents the client credentials, which include the system ID, or sysid. This is done by a unique HTTP POST request; Apache knows to send the POST request to the Current server because of the entries in the Apache configuration file generated by the "cadmin create_apache_config" command. Once the request is received, the Current server authenticates the sysid as having been registered with this server and presents the client with a list of valid channels it can access. This list is returned to the client as an XML-RPC response to the HTTP POST.
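
To make the exchange more concrete, the sketch below uses Python's standard xmlrpc.client module to build and decode the kind of payloads that travel in such a POST and its reply. No network connection is made. The method name "up2date.login", the shape of the reply, the sysid string, and the channel label are all assumptions for illustration; the article does not spell out the wire format.

```python
import xmlrpc.client

# Placeholder for the client's system ID; the real sysid is not shown here.
sysid = "hypothetical-system-id"

# Body of the HTTP POST: an XML-RPC method call carrying the credentials.
request_body = xmlrpc.client.dumps((sysid,), methodname="up2date.login")
print(request_body)

# What the server side sees after decoding the POST body.
params, method = xmlrpc.client.loads(request_body)
print(method, params)

# Body of the HTTP reply: an XML-RPC response carrying a list of channels.
response_body = xmlrpc.client.dumps((["rhl80-i386"],), methodresponse=True)
reply, _ = xmlrpc.client.loads(response_body)
print(reply[0])
```

Both directions are plain XML documents, which is what made it practical to work out the protocol from the open source client.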

After the client has logged in, it sends the next request to the server; the different requests that could be sent at this point are too numerous to list here. Ultimately, the request will be either an HTTP POST or an HTTP GET; the POST requests are processed by the Current server code and usually involve tasks such as answering dependency resolution queries and channel information requests. The GET requests are redirected to the file system by the Apache configuration file that cadmin generates and are simply handled by Apache's usual file retrieval process. These are usually requests for package lists, RPM header files, or actual RPM files.

With the exception of file transfers, all data passed from the client to the server and back is encoded using XML-RPC, and all transfers of sensitive data are encrypted using SSL.

Future Directions

Work on version 1.5.x, which is the development series for the 1.6.x production releases, is ongoing and includes a number of major changes to the way the server is managed. The most significant change is the integration of a full, formal SQL database back-end; we've chosen PostgreSQL as the model database on which development is done. As mentioned before, we carefully considered how to provide an easy method for changing database back-ends as desired, and there is also separate work being done on a MySQL back-end in parallel with our development.

This change enables us to modify the behavior with respect to rebuilding channels. In versions 1.5.x and above, this is done through Apache just like a normal up2date request, so there's no longer a need to shut down Apache to modify the channels. Other changes include a reduction in size of the configuration file, moving the channel configuration from the config file into the database, and a better logging system.

Features scheduled for later development include support for channel parenting, tracking of client systems, multiple administrator authentication, and support for pushing packages from the server as opposed to the current pull-only model.

Conclusion

Current was created to fill a relatively specific need for what was thought to be a niche audience. During the development process, there were many problems to be solved, unexpected delays in startup, and a lot of effort spent setting up a multi-developer environment. There were also unexpected bonuses and many timesavers noticed as a result. Now development is continuing at a comfortable pace, and I hope future releases will see additional developer participation.

Current is not the only answer to the problem of an automated system updater; it is just one of many possible solutions. For those who want to use the client tools provided by Red Hat for Red Hat Linux, though, we think it is one of the better answers available. We invite you to send ideas for improvement, bug fixes, or patches to the development mailing list.

John is an RHCE working in Red Hat's Global Support Services group. He and his coworkers can be reached by phone at: 888-REDHAT-1.