Current: An Open Source Update Server for Red Hat Linux
John Berninger
Beginning with version 6.2, Red Hat distributed a system updater
tool called "up2date", which interoperates with the Red Hat Network
(RHN) to intelligently download and install system updates and new
packages. The client code, per Red Hat's usual business practices,
was made publicly available. The server side of this code, however,
was not released.
The usefulness of the up2date program was immediately apparent,
but the inability to establish a local server for departmental or
organizational use hindered some deployment models. Therefore, a
project was begun to reverse-engineer the protocol to re-create
a server. This effort was made possible by two factors: first, by
the fact that the client code was released as open source, and second,
that the underlying protocol was an open standard -- XML-RPC.
The result of this development effort is a project called "Current",
an open source update server for Red Hat Linux. In this article,
I will show how to deploy and maintain a Current server. I will
also cover some of the prerequisites for successful installation
and use of a Current server, and I'll describe how you can contribute
to the Current project.
Background
The Current project started as an unnamed effort by a single sys
admin at Duke University -- Hunter Matthews -- and remained so for
quite some time. I became involved in the project as the second
developer around the time version 0.9.5 of the server code was released,
and there was suddenly a lot of work regarding communication, coordination,
and code management. Once we established a source code repository,
actual development was started again and things moved right along.
The Current development tree is hosted now in a CVS repository
at Tigris.org. The CVSROOT for this repository is :pserver:anonCVS@current.tigris.org:/cvs,
with no password. This provides anonymous CVS access. Once you've
logged in, you need to check out the "current" module.
The present development model is relatively straightforward. We
have adopted the same model for release numbers that is used by
the Linux kernel; odd-numbered minor versions are development versions,
and even-numbered minor versions are production/stable versions.
We also split the code out into functional sections that include
an API layer, a database layer, a support layer, and a "main" server
layer. The layers are relatively orthogonal in their behavior.
This was done by design to decrease the amount of interdependency
in the code. This model allows us to change the API layer relatively
easily in case of up2date API changes without worrying about how
these changes will impact the database code, and vice versa. The
database layer is further subdivided into a generic back-end and
database implementation specific back-ends, allowing us to trivially
change between database implementations (e.g., MySQL vs. PostgreSQL).
We can also take advantage of features of some database implementations
that aren't available in others, by isolating the actual database
interaction from the general database layer.
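The split between a generic database layer and implementation-specific back-ends can be sketched roughly as follows. This is an illustration only, not code from the Current tree: the names ChannelBackend, InMemoryBackend, and list_packages_for_client are hypothetical, and an in-memory dictionary stands in for a real MySQL or PostgreSQL back-end.

```python
from abc import ABC, abstractmethod


class ChannelBackend(ABC):
    """Generic database layer: the only interface the API layer sees.

    Hypothetical names; not taken from the Current source tree.
    """

    @abstractmethod
    def add_package(self, channel: str, package: str) -> None:
        ...

    @abstractmethod
    def list_packages(self, channel: str) -> list:
        ...


class InMemoryBackend(ChannelBackend):
    """Stand-in for an implementation-specific back-end."""

    def __init__(self) -> None:
        self._packages = {}

    def add_package(self, channel, package):
        self._packages.setdefault(channel, []).append(package)

    def list_packages(self, channel):
        return sorted(self._packages.get(channel, []))


def list_packages_for_client(backend: ChannelBackend, channel: str) -> list:
    """API-layer function: depends only on the generic interface."""
    return backend.list_packages(channel)
```

Because the API-layer function only touches the abstract interface, swapping one back-end class for another does not require changes above the database layer.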
General
So far, we only know that Current is the server side of an update
tool. Let's take a closer look at what it is and what it does. The
main purpose of Current is to provide the missing portions of the
up2date tool for systems administrators to keep their systems updated
without the necessity of maintaining hundreds or possibly thousands
of registrations with RHN. It also allows administrators to customize
the package set available to clients without lengthy contract negotiations
with Red Hat for a custom channel on RHN.
The main audience for Current was initially thought to be academic
departments and colleges, where the need for a centralized update
scheme was critical and where the costs in both capital and time
expenditures prohibited the use of RHN. Another ideal environment
for Current deployment was thought to be corporate research labs
and environments that required either a much more distributed management
of channels or a more flexible package management scheme for the
channel packages than RHN offered. Current has been adopted by administrators
in more environments than we anticipated, however, and we encourage
administrators seeking a system update management solution to consider
whether Current can meet their needs.
Deployment
The majority of the design, coding, and documentation decisions
were made with the goal of easing deployment of the server for organizations
wanting to use it to update Red Hat Linux-based systems. Our first
goal was to make the installation process understandable to users.
This task, of course, involved a great deal of documentation. Our
second objective was to make the installation and potential removal
of the package a simple matter. Our third, and most important, objective
was to not interfere with any pre-existing configurations or applications.
With that said, I'll describe what it takes to actually install
Current on a given system. At the time of this writing, the latest
stable release was 1.4.4; the latest development release was 1.5.5rc1
(1.5.5 never actually made it out). I'll cover the installation
and later configuration and management of the 1.4.4 release. Then,
I'll describe changes introduced in the 1.5.x series and explain
how those changes affect the installation, configuration, maintenance,
and behavior of the server.
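The odd/even release numbering convention described earlier can be checked mechanically. The sketch below is a small helper written for this article, not part of Current; the function name is_development_release is hypothetical, and it assumes version strings begin with "major.minor".

```python
import re


def is_development_release(version: str) -> bool:
    """Return True when the minor version number is odd."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    minor = int(match.group(2))
    return minor % 2 == 1


for release in ("1.4.4", "1.5.5rc1"):
    kind = "development" if is_development_release(release) else "stable"
    print(release, kind)
```

Run against the two releases named above, this classifies 1.4.4 as stable and 1.5.5rc1 as development.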
The easiest way to install Current is to simply download the binary
RPM provided for each release from:
http://current.tigris.org/
and issue an rpm -ihv against it. We've built packages for
installation on Red Hat Linux versions 7.x and 8.0, and our users
have successfully installed the 8.0 packages on Red Hat Linux 9 systems
as well.
We've tested the binary RPMs against many different systems,
so we believe that the actual package installation will be error-free.
If you have a problem installing from our RPMs, please let us know,
and we'll fix it. Using the binary RPM package is by far the easiest
install method available.
There is, however, a second install method available if you prefer
to have a greater degree of control over your system or are installing
on a system that does not have the RPM tool suite available. On
the download page, in addition to binary and source RPM files, we
provide tarballs of the released packages. These are the tarballs
used to build the RPM packages, so when you download the tarball
you're getting the same source code that you get through the RPM
package. If you want to rebuild the RPMs on your system, you can
simply issue:
rpmbuild -ta <tarball>
and RPM will build a Current binary and source package for you. This
is the same command we use to generate the packages available for
download.
Unpacking the tarball and installing without resorting to the
RPM tool suite is also relatively simple -- just issue make;
make install in the "current/" directory that's created when
you unpack. By default, the Makefile shipped specifies a destination
directory of "/usr" so if you want to install somewhere else, you'll
have to modify the Makefile or override the PREFIX variable on the
"make" command line:
make ; make PREFIX=/opt install
Configuration
Once the program has been installed, you must configure it before
you can start it up and use it. This is another area that we have
tried to simplify as much as possible, but there are still several
important steps that must be completed.
As of version 1.4.0, Current makes use of Apache's mod_python
module, which allows us to ignore a significant amount of gory detail
regarding accepting connections, session management, multi-threadedness,
and as I'll describe later, file transfer. This setup allows us
to concentrate more closely on getting the protocol behavior correct
and making sure that the portions that Apache can't handle are as
correct and streamlined as possible. Not only do we not have to
worry about the "heavy lifting" data transfers, we can take comfort
knowing that people who've been doing such work for years are assisting
us.
So, we have to configure our application from two points of view.
First, we have to make it work with the system on which it's just
been installed, and second, we have to tell Apache where to direct
requests that should be handled by Current. Let's look at the first
item now.
For reference, Listing 1 shows the sample configuration file that
is distributed with Current v1.4.4. I will use this sample file
in the next few sections as I examine how to configure Current.
Note this file is an example only; it does not contain actual valid
information being used on a server.
The configuration file for Current is fairly well documented;
as shipped, it contains explanations of all the fields you must
specify and what those fields represent to the actual Current server
program. This configuration file is used by both the actual Current
server when running, and by the "cadmin" program to assist with
Apache configuration, which I'll examine later.
There are eleven (11) configuration directives in the main section
of the config file, which must be labeled [current]. The first directive
is "valid_channels", which is simply a whitespace-separated list
of valid channel definitions for which the server should handle
requests. Each item in this list should correspond to a single [label]
designation; there can be additional [label] sections not listed
in this directive, but all items in this directive must have a corresponding
[label] section.
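The rule that every entry in valid_channels needs a matching [label] section can be checked with a few lines of Python. This is a sketch under assumptions: it treats the configuration file as a standard INI file readable by Python's configparser, and the channel labels and names in SAMPLE are made up for illustration rather than copied from the shipped sample file.

```python
import configparser

# Hypothetical configuration fragment; labels and names are invented.
SAMPLE = """
[current]
valid_channels = rh80-i386 rh9-i386

[rh80-i386]
name = Red Hat Linux 8.0 (i386)

[rh73-i386]
name = Red Hat Linux 7.3 (i386)
"""


def missing_channel_sections(config_text: str) -> list:
    """Return channels named in valid_channels that lack a [label] section."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    channels = parser.get("current", "valid_channels").split()
    return [label for label in channels if not parser.has_section(label)]


print(missing_channel_sections(SAMPLE))
```

Here rh9-i386 is reported because it is listed without a section, while the extra [rh73-i386] section is accepted because unlisted sections are allowed.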
The next two directives, "log_file" and "log_level", determine
how much of the server's operation should be logged and to where
it should be logged. The log_file directive is a filename on the
server's file system, and the log_level directive is an integer
value from 0 to 10 inclusive, where a greater value produces more
verbose logging. Be warned that setting logging to level 10 will generate
a substantial amount of information in the file; levels above 4
are recommended only for diagnosing server operation problems.
The next directive, "apache_config_file", is the location of the
configuration file that "cadmin" will create to configure Apache
to properly handle up2date connections. In RHL 7.x, the value should
be "/etc/httpd/conf/current.httpd.conf"; for RHL 8.0 and 9, it should
be "/etc/httpd/conf.d/current.conf".
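The choice of apache_config_file value by release can be written down as a small lookup. The helper name apache_config_path is hypothetical; the two paths are the ones given above, and any other release is rejected rather than guessed.

```python
def apache_config_path(rhl_release: str) -> str:
    """Pick the apache_config_file value for a Red Hat Linux release string."""
    major = int(rhl_release.split(".")[0])
    if major == 7:
        return "/etc/httpd/conf/current.httpd.conf"
    if major in (8, 9):
        return "/etc/httpd/conf.d/current.conf"
    raise ValueError(f"no documented location for release {rhl_release!r}")
```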
The next two directives are used to perform a check to ensure
that permissions on the database and channel directories are set
with sufficient read permissions to allow normal server operation.
"access_check_type" determines whether the user, group, or other
permission field will be checked and "access_check_arg" determines
which user or group the permissions will be checked against. The
values "all" and "none" for the access_check_type cause the access_check_arg
to be ignored.
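The kind of test that access_check_type selects can be illustrated with Python's stat module. This is a sketch, not Current's implementation: the function name read_bit_set is hypothetical, it looks only at the read bit of a mode value, and it does not compare the file's owner or group against access_check_arg. How "all" and "none" are interpreted is not spelled out above; the sketch assumes "none" disables the check and "all" requires the read bit in all three fields.

```python
import stat


def read_bit_set(mode: int, check_type: str) -> bool:
    """Report whether the read bit is set in the chosen permission field.

    The handling of "all" and "none" is an assumption made for this sketch.
    """
    bits = {
        "user": stat.S_IRUSR,
        "group": stat.S_IRGRP,
        "other": stat.S_IROTH,
    }
    if check_type == "none":
        return True  # assumed: no check is performed
    if check_type == "all":
        return all(mode & bit for bit in bits.values())  # assumed meaning
    return bool(mode & bits[check_type])
```

For a directory with mode 0640, for example, the "group" check passes and the "other" check fails.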
The next directive is "server_secret", which is a string used
to authenticate client sysids against the server to ensure that
the systems attempting to use this server are registered with this
server. This string should never be revealed or shared. It will
become more central to permissions and access checking in future
releases and, consequently, it will become more important that its
value is protected.
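To see why the secret must stay private, consider one common way a server-side secret can be used to recognize identifiers it issued: a keyed hash. The sketch below is an assumption made for illustration and does not describe how Current actually computes or verifies sysids. The function names and the choice of HMAC-SHA256 are hypothetical.

```python
import hashlib
import hmac


def sign_sysid(server_secret: str, system_name: str) -> str:
    """Derive a checksum that only a holder of server_secret can produce."""
    return hmac.new(
        server_secret.encode("utf-8"),
        system_name.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()


def sysid_is_valid(server_secret: str, system_name: str, checksum: str) -> bool:
    """Recompute the checksum and compare it in constant time."""
    expected = sign_sysid(server_secret, system_name)
    return hmac.compare_digest(expected, checksum)
```

In a scheme like this, anyone who learns the secret can mint identifiers the server will accept, which is the reason for keeping it private.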
The "server_id" is another string specific to this server; it
is used only cosmetically in the 1.4.x series, but still needs to
be specified.
The "current_dir" is the top-level directory where Current will
store its database and channel information. It is very important
that this directory tree be managed only by Current and not changed
by other programs. This directory tree is the single most critical
piece of Current operation in versions 1.4.x.
The final two directives -- "welcome_message" and "privacy_statement"
-- are text messages that will be displayed to client machines when
they are registered with this server.
Channels
The next portion of the configuration file deals with the individual
channels the server will be handling. Before I get too far, though,
I should define channels more clearly. The concept of a channel,
as far as up2date and Current are concerned, is a unique combination
of an OS release and base architecture. For example, the channel
I use to perform testing is defined by the base architecture of
my client system, which is i386 (i386 is the base arch for the i586,
i686, and Athlon architectures), and by the release of the Red Hat
Linux I'm using on the client, which is RHL 8.0. Note that versions
up to 1.6.x of Current do not support duplicate channel creation, so
you can only create a single channel for each arch/release combination.
You can't create one channel for 8.0/i386 and a second for 8.0/i686;
the i686 and i386 arches are identical as far as channel creation and
management are concerned.
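The idea that a channel is identified by a release plus a base architecture can be expressed as a simple mapping. The table below covers only the architectures named above, and the names BASE_ARCH and channel_key are hypothetical.

```python
# Base architecture for each client architecture mentioned in the text.
BASE_ARCH = {
    "i386": "i386",
    "i586": "i386",
    "i686": "i386",
    "athlon": "i386",
}


def channel_key(os_release: str, client_arch: str) -> tuple:
    """Map a client's release and CPU arch to its (release, base arch) channel."""
    return (os_release, BASE_ARCH[client_arch.lower()])
```

An i686 client and an i386 client on RHL 8.0 both map to the same key, ("8.0", "i386"), which is why a separate 8.0/i686 channel cannot exist alongside 8.0/i386.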
Now that I've explained how channels work, I'll look at the
configuration of a channel within Current. As I mentioned previously,
the name of the section should correspond with, at most, one entry
from the "valid_channels" list in the [current] section. In each
channel stanza, there are eight (8) configuration directives. The
first is "name", which is simply a human-readable designation of
the channel's name. Again, all configuration directives refer to
the sample configuration file in Listing 1.
The second directive, "parent_channel", is not used in versions
through 1.4.x; it is in place for future expansion when we begin
to support channel parenting.
The next two directives are the two items that define the channel,
the "arch" and "os_release" directives. The arch directive is not
the real architecture of the machine, rather it is the base-compatible
architecture. For example, if a client has an i686 processor, the
arch of the channel it uses would be i386, not i686, as we've seen
previously.
The next directive is "description", which is a human-readable
description of what the channel serves; it can be as short as a
blank string, or several paragraphs long and is free-form within
the restrictions of printable ASCII characters.
The next two directives, "rpm_dirs" and "src_dirs", specify the
directories that contain the binary and source RPM files for this
channel, respectively. The rpm_dirs is the more critical of the
two, as these are the actual RPMs that will be applied to the system.
The src_dirs can be empty, but it must be present in the channel
definition. The final directive, "srpm_check", tells the program
what to do if there is no source RPM available for a given binary
RPM. This is either 0, 1, or 2 -- 0 does not check for a source
RPM; 1 causes a warning to be issued for any binary RPM without
a corresponding source RPM; and 2 causes an error condition when
a binary RPM is found without a corresponding source RPM.
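The three srpm_check levels can be sketched as a small policy function. This is an illustration, not Current's code: the function name and the input shape (a dictionary from binary RPM file name to the source RPM it was built from) are hypothetical, and LookupError is used here simply as a stand-in for whatever error condition the server raises.

```python
def check_source_rpms(binary_to_source: dict, available_sources: set,
                      srpm_check: int) -> list:
    """Apply the srpm_check policy and return any warning messages.

    binary_to_source maps each binary RPM file name to the source RPM it
    was built from (a hypothetical input shape chosen for this sketch).
    """
    if srpm_check not in (0, 1, 2):
        raise ValueError("srpm_check must be 0, 1, or 2")
    if srpm_check == 0:
        return []  # level 0: no check at all
    warnings = []
    for binary, source in sorted(binary_to_source.items()):
        if source in available_sources:
            continue
        message = f"{binary}: missing source RPM {source}"
        if srpm_check == 2:
            raise LookupError(message)  # level 2: treat as an error
        warnings.append(message)  # level 1: warn and carry on
    return warnings
```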
After creating the current.conf file, we need to tell Apache to
use Current when it gets an up2date request. We already specified
the location for the apache_config_file, so now we need to run "cadmin
create_apache_config". This will create a configuration file, in
the designated location, that will be read by Apache at startup
and tell Apache how to handle up2date connections. For Red Hat Linux
7.x servers, you must modify the normal Apache configuration file
(/etc/httpd/conf/httpd.conf) and add a directive "Include current.httpd.conf"
to this file; in RHL 8.0 and beyond, the file is included by default
when it is created in the /etc/httpd/conf.d directory.
Once the server has been configured, we need to actually create
the channel databases that were specified in the "valid_channels"
directive. To do this, simply run the command cadmin create <chan>,
substituting each item from valid_channels for <chan>. Note
that when you are using cadmin to manage the server, either through
channel creation or updating, the Apache server cannot be running.
Channel creation takes a fair amount of time to complete
-- on the order of 15 minutes for a fully populated RHL 8.0 channel
with all updates. Once the command completes, you should be able
to start the Apache server through "service httpd start" and to
register and update clients against the new Current server.
Before you begin registering or updating clients, however, you
need to ensure that you have the proper SSL certificate in use on
both the server and client sides. To create a new SSL certificate
for the server, issue the command cadmin create_certificate,
which will create a file in /etc/current/ called RHNS-CA-CERT. This
is the SSL certificate that you must use with Apache and that you
must place in the /etc/sysconfig/rhn/ directory on the clients.
This is a common troubleshooting issue we've seen on the mailing
lists. Another common issue is for the clients' system clocks to
be significantly skewed from the server system clock. There can
only be about a four-minute difference (maximum) between system
times for the SSL negotiation to succeed, so you may want to
enable NTP on the server and on all clients.
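When diagnosing a failed negotiation, the clock-skew limit is easy to test for. The sketch below uses the approximate four-minute figure quoted above; the names MAX_SKEW and clocks_close_enough are hypothetical, and how the two timestamps are collected is left to the administrator.

```python
from datetime import datetime, timedelta

MAX_SKEW = timedelta(minutes=4)  # approximate limit quoted in the text


def clocks_close_enough(server_time: datetime, client_time: datetime) -> bool:
    """Return True when the two clocks differ by no more than MAX_SKEW."""
    return abs(server_time - client_time) <= MAX_SKEW
```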
Server Operation
The actual operation of the Current server is relatively invisible
to the administrator or the users of client systems; even on the
server, there's not too much to see. What's actually happening behind
the scenes, however, is a completely different story. I mentioned
previously that Current operates in a mod_python environment, called
by Apache when certain requests are received. Here's exactly what
happens when a client attempts to use a Current server to update
itself.
To begin, the client invokes (or the root user on the client invokes)
the up2date command. The interesting stuff starts happening almost
immediately. The up2date client logs into the server with an XML-RPC
call and presents the client credentials, which include the system
ID, or sysid. This is done by a unique HTTP POST request; Apache
knows to send the POST request to the Current server because of
the entries in the Apache configuration file generated by the "cadmin
create_apache_config" command. Once the request is received, the
Current server authenticates the sysid as having been registered
with this server and presents the client with a list of valid channels
it can access. This list is returned to the client as an XML-RPC
response to the HTTP POST.
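To make the login exchange more concrete, the sketch below builds and decodes the kind of XML-RPC payload that travels in the HTTP POST body, using Python's standard xmlrpc.client module. The method name up2date.login and the placeholder sysid are assumptions used for illustration; the real sysid is a structured document issued at registration time, and nothing here contacts a server.

```python
import xmlrpc.client

# Hypothetical sysid payload; a real one is a structured document
# issued at registration time.
sysid = "<sysid placeholder>"

# Encode the call the way an XML-RPC client would before POSTing it.
request_body = xmlrpc.client.dumps((sysid,), methodname="up2date.login")

# Decode it again, as the server side would after reading the POST body.
params, method = xmlrpc.client.loads(request_body)
print(method, params)
```

The server's reply, such as the list of valid channels, is encoded the same way and returned as the body of the HTTP response.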
After the client has logged in, it sends the next request to the
server; the different requests that could be sent at this point
are too numerous to list here. Ultimately, the request will be either
an HTTP POST or an HTTP GET; the POST requests are processed by
the Current server code and usually involve tasks such as answering
dependency resolution queries and channel information requests.
The GET requests are redirected to the file system by the generated Apache configuration
file and are simply handled by Apache's usual file retrieval process.
These are usually requests for package lists, RPM header files,
or actual RPM files.
With the exception of file transfers, all data passed from the
client to the server and back is encoded using XML-RPC, and all
transfers of sensitive data are encrypted using SSL.
Future Directions
Work on version 1.5.x, which is the development series for the
1.6.x production releases, is ongoing, and includes a number of
major changes to the way the server is managed. The most significant
change is the integration of a full, formal SQL-backed database
backend; we've chosen PostgreSQL as the model database on which
development is done. As mentioned before, we carefully considered
how to provide an easy method for changing database back-ends as
desired, and there is also separate work being done on a MySQL backend
in parallel with our development.
This change enables us to modify the behavior with respect to
rebuilding channels. In versions 1.5.x and above, this is done through
Apache just like a normal up2date request, so there's no longer
a need to shut down Apache to modify the channels. Other changes
include a reduction in size of the configuration file, moving the
channel configuration from the config file into the database, and
a better logging system.
Features scheduled for later development include support for channel
parenting, tracking of client systems, multiple administrator authentication,
and support for pushing packages from the server as opposed to the
current pull-only model.
Conclusion
Current was created to fill a relatively specific need for what
was thought to be a niche audience. During the development process,
there were many problems to be solved, unexpected delays in startup,
and a lot of effort spent setting up a multi-developer environment.
There were also unexpected bonuses and many timesavers noticed as
a result. Now development is continuing at a comfortable pace, and
I hope future releases will see additional developer participation.
Current is not the only answer to the problem of an automated
system updater; it is just one of many possible solutions. For those
who want to use the client tools provided by Red Hat for Red Hat
Linux, though, we think it is one of the better answers available.
We invite you to send ideas for improvement, bug fixes, or patches
to the development mailing list.
John is an RHCE working in Red Hat's Global Support Services
group. He and his coworkers can be reached by phone at: 888-REDHAT-1.