v14, i10: System Automation with PXE, Kickstart, and Cfengine

oct2005.tar

System Automation with PXE, Kickstart, and Cfengine

John Borwick

Manually installing operating systems and software is hard, and inevitably results in mistakes that make each server subtly different. Many administrators have surmounted the problems of manual installation by implementing a fully automated installation (FAI) process. With FAI, you can tell a server to install itself and return later to find a fully configured machine. The problem with FAI is that, while it guarantees a certain initial state, it does not necessarily configure each application or maintain the server's state over time -- you're on your own scheduling cron jobs and installing configuration files.

Our site implemented an FAI process with Red Hat Enterprise Linux 2.1 and 3. We use this FAI process to install cfengine, a tool that will maintain the server's state over time. Cfengine also finishes configuring our servers so that they are production-ready. We can install a functional incoming MX relay or a Usenet news server by updating one configuration file and network-booting the machine.

Requirements and Setup

The FAI we use requires many components: ISC's dhcpd, a TFTP server, PXELINUX, the appropriate initrd and kernel images for our distributions, Red Hat's Satellite Server, a cfengine RPM, a bootstrap site-specific cfengine RPM, a cfengine server, and CVS. To duplicate our FAI process, you could trivially substitute our Satellite Server for the open source "current" server (http://current.tigris.org/) or a yum RPM repository, and Subversion or arch for CVS.

The first step in setting up our FAI is to network-boot a server and to have it load a Kickstart configuration. Network-booting a machine to a Kickstart-parsing Linux kernel is not straightforward.

When a machine PXE boots, it sends out a DHCP request. The DHCP server must respond to that request with a DHCP packet that includes a few options. The packet must tell the machine which TFTP server to use, and where on that TFTP server to find a PXE boot file. These two options are named "next-server" and "filename" in ISC's dhcpd.

After the machine receives the DHCP packet, it goes to the TFTP server and loads the specified filename. Our PXE boot file is PXELINUX (http://syslinux.zytor.com/pxe.php). The server loads PXELINUX, which tells the machine to convert its IPv4 address into a hexadecimal filename, such as "7F000001" for "127.0.0.1". The machine then looks for this address in a TFTP directory called "/pxelinux.cfg". If it can't find that file, it starts stripping characters off the filename ("7F00000", ... "7"), and if it still can't find any file, it looks for a file called "default".

Once PXELINUX finds a configuration file, PXELINUX reads it to find how to boot. The two types of commands we use in the configuration file are for booting from the local disk and for booting from an initrd image and kernel on the TFTP server. If the machine is going to boot from the TFTP server, it finds the kernel images we use from Red Hat and appends a specified "ks" parameter to find our Kickstart configuration.

So, to install a new server, we have to configure lots of services: DHCP and TFTP at a minimum. Our cfengine configuration requires DNS lookups, so we have to update DNS, too. The TFTP files are the most annoying to configure, because each machine needs a kernel, initrd image, and a PXE configuration file. However, since we're installing the same image over and over, we can minimize this work by using one of a few standard PXE configuration files that point to the same kernels and initrd images.

autoconfig.pl

Such a complex process begs for automation. We have a Perl program, autoconfig.pl (Listing 1), which parses a space-delimited file to configure the TFTP, DHCP, and DNS services on our FAI server. The autoconfig.pl configuration file contains entries for each host to be installed. Each line of the file contains a machine's hostname, MAC address, IP address, TFTP server address, and PXELINUX prototype file. The script uses this information to populate DHCP, DNS, and the TFTP server. The script also purges any stored cfengine keys -- a detail needed only when you reinstall a cfengine client and want it to continue talking to the cfengine server.

We name most of our configuration files autoconfig.pl, it will overwrite the DHCP and DNS configurations and restart the two services. If the "--install" option is set to the name of a host defined in the script's configuration file, then the appropriate PXELINUX hexadecimal configuration file will be copied from a template. This PXELINUX configuration file is set to be world-writable, so the machine to be installed can overwrite the configuration via TFTP at a certain stage in the bootstrap process. We then purge all world-writable files on a regular basis.

Configuring Kickstart

The second step in our FAI setup is the Kickstart configuration. We only use one Kickstart file per distribution; we have one for RHAS2.1 and one for RHEL3. The server-specific configuration doesn't come until the third step, when cfengine runs. For the Kickstart, we use Red Hat's Satellite Server, but you could use an FTP or Web server with the appropriate RPMs.

The only unique thing about our Kickstart environment is the %post section. When the install finishes, the machine does two things. The %post section first tells it to go to its world-writable PXELINUX configuration file on the TFTP server and overwrite it with the contents of our "default" PXELINUX configuration file. This allows the machine to reboot to its local disk rather than being installed all over again. The second command in the %post section moves /etc/rc.d/rc.local so that it can create a "first-boot" rc.local file. This temporary rc.local file downloads some cfengine RPMs, runs cfengine, and then replaces itself with the vendor-delivered "rc.local".

The RPMs that we have the machine install on reboot are the cfengine RPM and our own "site-cfengine" RPM. The site-cfengine RPM installs our site's essential cfengine configuration files: /var/cfengine/inputs/update.conf and the cfengine server's public key. The rc.local file then tells the machine to run cfagent -Kq -DInit. I will describe exactly what this shell command means in a moment.

Running Cfengine

Enter cfengine, the third and final step in our FAI process. Cfengine is a suite of systems administration utilities designed to configure and maintain machines. Cfengine can configure a machine's time-zone, DNS servers, and NFS filesystems. It can copy files from a cfengine server, edit local files, and run arbitrary shell commands. Cfengine can also report on installed RPMs, verify processes are running, and check file permissions and attributes. The two resources I've found most helpful in writing cfengine configuration files are the reference guide at:

http://www.cfengine.org/docs/cfengine-Reference.html

and the cfengine wiki:

http://www.cfwiki.org/

Cfengine can be run via the command-line program "cfagent" or in daemon form via "cfexecd". Either way, the file /var/cfengine/inputs/update.conf is read. update.conf is intended to be a baseline configuration file that can keep the rest of your cfengine configuration up to date. If you whack your main configuration, update.conf can copy a new version from your cfengine server. In our case, update.conf sets the time and downloads all our main cfengine configuration files. After update.conf is finished, the main configuration file, /var/cfengine/inputs/cfagent.conf, is read.

The configuration files we push out via update.conf are listed below. We name our configuration files after cfengine "action sequences" -- the various kinds of actions cfengine can take. All our cfengine configuration files are controlled via CVS, so we can audit their changes and protect them from accidental deletion:

cf.copy -- Copies configuration files from our cfengine server.

cf.daily and cf.weekly -- Replaces what we would otherwise store in /etc/cron.daily and /etc/cron.weekly.

cf.dirs -- Creates directories and checks their permissions.

cf.disable -- Renames dangerous files like "/etc/hosts.equiv".

cf.disks -- Checks the available disk space.

cf.editfiles -- Adds lines to files like "/etc/hosts".

cf.files -- Verifies file permissions, ownership, and checksums.

cf.groups -- Defines which services should be offered by each machine.

cf.iptables -- Initializes a machine's firewall.

cf.links -- Creates symbolic links.

cf.netinit -- Converts a machine's DHCP address to a static one, by rewriting /etc/sysconfig/network-scripts/ifcfg-eth0.

cf.packages -- Queries a machine's RPMs and sets classes if RPMs are missing.

cf.processes -- Examines which processes are running and restarts programs as needed.

cf.resolve -- Sets up /etc/resolv.conf.

cf.shellcommands -- Runs arbitrary commands; ours contains a lot of "cvs checkout" and "up2date" commands.

cf.tidy -- Deletes files.

cfagent.conf -- Imports all these files and sets a few global cfengine variables.

Our configuration is structured around the groups defined in our cf.groups file. Administrators with no knowledge of cfengine can just edit cf.groups to have a set of cfengine actions run on a machine. Our groups define which server-specific packages need to be installed, which directories need to be emptied, and which jobs need to be run on a daily basis. A typical line looks like:

mx_server = ( examplehost1 examplehost2 )

Other configuration files then do something useful if the machine is in the "mx_server" group, such as check out or update /etc/mail from an "mx/etc/mail" directory in CVS.

So, when one of our machines first boots, its rc.local file calls cfagent -Kq -DInit. The "-K" means to ignore lock files. The "-q" means not to wait any time before running. (Cfengine normally waits a certain random amount of time before running, to lessen the chance that two clients demand a resource at the same time.) The "-DInit" defines a special cfengine class, "Init", which in our environment means "you can do things that might break services". The "Init" class tells the machine to turn off and on its network interfaces, set up its firewall, and delete old configuration information.

The FAI Process

Let's walk through an example of our FAI process. Say we have a machine, "deacon1", with MAC address 00:00:00:00:00:00 and IP address 10.1.1.1, which needs to be installed as a virus filtering machine. Our TFTP server IP address will be 10.2.2.2.

The first step is to tell the DHCP/DNS/TFTP server, which controls our network-boot process, that deacon1 exists. We do that by adding "deacon1.example.com 00:00:00:00:00:00 10.1.1.1 10.2.2.2 rhel3-x86-latest" to our hosts.conf file. We verify that there is a PXELINUX configuration file called "install/rhel3-x86-latest" in our TFTP root. We verify that the initrd and kernel files referenced in this configuration file also exist in the TFTP root in the proper places. After everything looks okay for the network-boot, we run the autoconfig.pl script:

perl autoconfig.pl --install=deacon1 hosts.conf

The script prints that it will install the machine with hexadecimal ID "0A010101".

The second step is to ensure that "deacon1" is in the proper cfengine groups. We do that by going to the "cf.groups" file and adding "deacon1" to the "virus_filter" group. The "virus_filter" group is in turn a member of the "mail_server" group, so mail services will also be installed. After deacon1 looks like it is in all the appropriate groups, we save the cf.groups file.

The third step is to actually reboot the server. You can make most servers boot to the network either by changing a BIOS setting, so the machine always boots from the network before its hard disk, or by following the POST directions and hitting an indicated function key on boot.

The server will print that it is PXE booting. When it receives an address, the machine will print that it's booting from the network. Some familiar Red Hat Linux diagnostic boot messages should appear. The PXELINUX-loaded kernel will then try to get a DHCP address, perhaps configure some hardware, and boot to the Kickstart server.

After deacon1 hits the Kickstart server, it will follow the Kickstart rules. For us, that means installing LVM partitions and a few universal base packages. After the installation, the "post-install" phase will write over the TFTP configuration file "0A010101" created earlier, so deacon1 will reboot to its local disk.

When deacon1 reboots, it will look suspiciously like a fully installed machine. Indeed, it will be as installed as some FAI processes take you. However, after deacon1 boots up to the designated run-level, it will run its rc.local file. Deacon1 will go to our Red Hat Network Satellite Server and download cfengine and the site-cfengine bootstrap RPM. The rc.local file will then run cfagent -Kq -DInit in the background and replace rc.local with the factory-installed version.

Cfengine reads "update.conf" and realizes it must copy many files from the cfengine server. Deacon1 updates its time, because, as with Kerberos, cfengine's authentication does not work correctly when clocks are out of sync. Deacon1 then contacts the cfengine server and downloads the site configuration files.

Cfengine then reads "cfagent.conf". It realizes that the server on which it is running, deacon1, is in the "virus_filter" group. Cf.packages says that "virus_filter" machines must have the "MailScanner" RPM. If they don't, the "rpm_MailScanner" cfengine class is set. Cf.shellcommand file notices the "rpm_MailScanner" class is set and runs up2date --solvedeps=MailScanner. Cf.processes notices that "MailScanner" isn't running; it executes "/usr/sbin/service MailScanner start" and sets the "chkconfig_mailscanner" class. Cf.shellcommands.2 sees that "chkconfig_mailscanner" is set and runs "/sbin/chkconfig MailScanner on".

Conclusion

Setting up this fully automated installation process was not simple, but having it in place has been well worth the investment. We impressed our manager by installing a machine into the rack and onto the network in 30 minutes. We can now plan recovery strategies around machine re-installations rather than restores from tape. We simply have to rebuild a few RPMs to support the same configuration on a new vendor distribution.

John Borwick is a SAGE Level III systems administrator at Wake Forest University, with a background in Computer Science and English. His goal is to make systems administrators' lives easier by optimizing common processes.