Persistent Linux with User-Mode Linux

Edward L. Haletky

Have you ever wished for an operating system that disregards all changes the next time there is a login? For security reasons, a persistent operating system maybe exactly what you need. And, given the security complexities of using Linux in critical positions, a persistent Linux may solve your concerns because customizations will be strictly controlled. Setting up a persistent User-Mode Linux (UML) session has many uses, from running a kiosk-style computer to creating a virtual jail for hackers. But let's define persistent. For the purposes of this article, persistent does not mean a completely static system, but a system where all customizations are strictly controlled and only those controlled customizations will last between uses. Most of the strictly controlled customizations will be those that allow the system to boot onto a network, set up specific applications for use, and other customization that administrators feel is important.

UML Uses

How is this different from a standard UML session? It is different in that a standard UML session allows you to save its state and disk changes. A persistent disk will be unchangeable allowing for a consistent session whenever it is launched. This gives an administrator an unprecedented level of control but provides the user with a consistent system to use. It will always revert to its previous state no matter how much someone may change it. All that is required to go back to this state is to reboot the UML session. This has applications in such areas as Internet kiosks, production application servers, and computer-aided manufacturing.

For Internet kiosks, a persistent UML session guarantees a standard way for all users to access the network. Because access to the network saves crucial data in cookies and passwords for email and network tools in volatile memory, users can feel safe using a persistent UML session. All personal data will be lost with the termination of the session and will thereby be inaccessible to the next user. Because each session starts fresh, the previous use data is just not there.
Production application servers can use a persistent UML to guarantee that a known good configuration does not change. In the world of production servers, changing an application could happen accidentally, or even maliciously, but with a persistent UML session, a quick reboot will revert the system to the last known good configuration. Also, using multiple copies of the same UML disks can help replicate these servers easily and conveniently.
In computer-aided manufacturing, a machine may be in a critical area and again persistent UML can help keep manufacturing consistent and ultimately controlled. Mistakes made at a console can quickly be repaired.

In the above examples, use of a highly controlled, consistent session protects users and administrators from themselves and others. In any case where a local customization could adversely affect the next user, or where you need an extremely reproducible environment, a persistent UML can help. Ultimately, such an environment can be used in the most secure locations where you might use a bootable CD-ROM that loads just enough of an operating system to launch a UML session that is copied from a CD-ROM to a local disk each time. By using a local disk, you gain the speed and use of all tools available per launch of the system. If you have a large enough system, you can even have multiple persistent sessions.

UML is not the only tool that can grant persistent disks. The commercial products available from VMware can also be used to create persistent disk and machine instances. UML is an open source tool with similar characteristics. User-Mode Linux is a growing open source project and, useful and advanced though it is, commercial level support is not available from the authors.

Configuring UML Sessions

In this section, I will show a method to automatically start, stop, and update a persistent UML session without changing the base file system image once it is set up. I will also present some tips for configuring the persistent disk images used. There are two goals, besides persistent disk images, with this project -- the first is to have a fast startup of UML, and the second is to have a system that is easy to update.

The project was completed using a staged approach. The first stage was to get UML to run. While setting up UML has been discussed elsewhere ("Testing an iptables Firewall from within the Host", Sys Admin, July 2003, http://www.samag.com/documents/s=8284/sam0307e/), here is a brief synopsis and some tips for execution. Start by downloading the following files:

uml-utilities
RH7.2 file system

from:

http://user-mode-linux.sourceforge.net

and installing them into a local "fw" directory.

By installing these files in a local directory, I do not have to worry about overriding the system files. Some versions of Linux cannot run UML yet, because the kernel does not have the support. For example, Red Hat 8.0 has this support, but Red Hat 9 does not. Because Red Hat Enterprise Linux 3 (RHEL 3) is built on Red Hat 9, it does not support UML without kernel modifications.

The example I present in this article uses Red Hat 8.0 as the base operating system, which has its own uml-utilities installed. The RH7.2 file system is a pre-built file system ready for running within UML. Once it is unpacked, you can immediately start up UML and go. However, that is not what I did; I split the file system into two where /usr became its own file system partition file and added a swap file system partition file. The /usr partition is needed to install other utilities like mozilla, and a swap partition alleviates startup errors and enables swapping within UML. Although UML applications can be seen from the host OS, they will not be able to access the host OS file systems directly. To create these other file systems, I used the mkswapfs and mkemptyfs UML utilities, which are available from:

http://www.stearns.org/mkrootfs/

Once I booted UML, I created the /usr partition with mkfs and moved the data from the existing /usr to the new partition. To do this, start up UML and specify a second and third partition for the swap and usr file systems with the ubd1= and ubd2= options to the UML Linux command.

To make a golden copy of these three file systems (root_fs, usr_fs, and swap_fs), shut down the UML session cleanly using the shutdown command. If UML does not shut down cleanly, it will not reboot cleanly and you will have to repeat this process until you can safely shut down UML. Once you have your first golden master, make a copy of it! Then you can seed the UML file systems with the applications you need for your persistent image. Make a compressed image of each file system for quick restore if necessary.

In the past, I have added tools like Nessus, nmap, Mozilla and all its plugins, along with other applications. Your list will probably vary, but remember if you add something that requires local disk access to hold user data, then you will have to either configure it every time you boot UML or provide access to a local share.

The script that follows is the tool to start UML specifying the appropriate arguments for file system, network, and memory constraints. This tool requires the uml_net program that comes as a part of uml-utilities to be marked setuid by root. Or, if you are security conscious, you will need to pre-load the ethertap device:

% cat dolinux
#!/bin/sh
cd /home/user/fw
xhost +
PATH=/home/user/fw/usr/bin:${PATH}
export PATH
linux mem=128M ubd0=root_fs_1 ubd1=swap_fs_1 ubd2=usr_fs_1 \
  umid=rh eth0=ethertap,tap0,fe:fd:0:0:0:1,192.168.0.254

This script will start UML, allocating 128 MB of memory to the UML copy of RH7.2. It will assign the disk partitions to the given root_fs, swap_fs, and usr_fs partitions, provide a UML ID of "rh", and create an ethertap device with an appropriate IP address. Be sure to use a non-routable private IP address or you might end up with serious problems. Given this network setup, you will then need to enable NAT using iptables to allow the UML system to access the Internet. Your host OS will become the router for the UML session to your network or the internal network.

Speeding Up the Process

Now, let's look at how to start, as well as end, the UML session quickly. Starting a Linux system takes time, and most of that time is spent verifying that things are as they should be. To speed up this process, I had to overcome two obstacles.

The first obstacle is the speed at which copies are made of the golden master file systems. It simply takes a long time to copy 2 GB of data. Given that I had plenty of disk space, I opted to have copies of the persistent file systems available for a quick rename to the name necessary for the first script. Out of this, came a daemon process that would run to maintain the copies of the UML disk files:

% cat ub
#!/bin/sh
cd /home/user/fw
PATH=/home/user/fw/usr/bin:${PATH}
export PATH

while [ 1 ]
do
    mis=88
    if [ ! -f swap_fs.2 ]
    then
        mis=2
    fi
    if [ ! -f swap_fs.1 ]
    then
        mis=1
    fi
    if [ ! -f swap_fs.0 ]
    then
        mis=0
    fi

    while [ $mis -ne 88 ]
    do
        # replace with the missing element.
        cp usr_fs usr_fs.$mis
        cp root_fs root_fs.$mis
        cp swap_fs swap_fs.$mis
        mis=88
        if [ ! -f swap_fs.2 ]
        then
            mis=2
        fi
        if [ ! -f swap_fs.1 ]
        then
            mis=1
        fi
        if [ ! -f swap_fs.0 ]
        then
            mis=0
        fi
    done
    sleep 30
done

This script runs as a daemon at startup and checks which of the three copies of the file systems needs to be replaced. I'll use the swap_fs file system as the test because it's the last one copied and takes the least amount of time. To make this daemon more robust, we can ensure the copy has finished before proceeding by comparing the md5 checksum of the swap_fs partition. For my purposes, three copies and the golden master are sufficient for a quick bootup of the UML system. These files will then be moved to a new name. The copy is expensive yet the move is quick. By doing the copies in the background as the UML session runs, we will have a copy ready for the next use.

The second obstacle is the Linux boot process; it takes a very long time if everything is enabled. To address this, we'll disable all but the critical services. In this example, we disable everything but syslog, network, random, crond, keytable, and linuxconf, because we are not running this as a server. However, you may want to retain a few services, like ntpd.

The next step is to go through /etc/rc.d/rc.sysinit and find the truly time-consuming items. This means we make some assumptions and hope they remain true. First, we disable all runs of fsck by symbolically linking /sbin/fsck to /bin/echo, because fsck takes the most time. This action implies that our file system images are safe and pristine. This is why it is critical to have a clean shutdown when making the golden master images. Second, we do the same for the depmod command, another time-consuming process. With these obstacles overcome, we can boot a UML session in less than a minute.

Because we removed the fsck and depmod calls and because we are using copies of the golden master file systems, we really do not care how the UML session ends. A standard shutdown can take up to five minutes to complete. No one wants to wait that long to use the system, so we'll use a "hard kill" of the UML session. This "hard kill" is destructive and non-recoverable. But, because we don't care how the session ends, we can use the following script to perform the rapid death of the UML session:

% cat hk
#!/bin/sh

# Force a kill! We may not always catch it the first time around.
/bin/ps ax | /bin/grep -v dolinux | /bin/grep linux > /dev/null
status=$?
while [ $status -eq 0 ]
do
    /bin/ps ax | /bin/grep -v dolinux | /bin/grep linux | \
      /usr/bin/awk '{printf "/usr/bin/kill -9 %s\n",$1}' > t
    sh ./t >& /dev/null
    /bin/ps ax | /bin/grep -v dolinux | /bin/grep linux > /dev/null
    status=$?
done

This script takes advantage of how UML is presented to the host OS, which is that all the UML processes are seen by and can be killed by the host OS. It is a mean approach to killing a session, but it is also effective. The shutdown times have decreased from five minutes to less than ten seconds. But, there is one problem: the UML session must somehow notify the host OS that it's ready to be killed. This is done by monitoring the UML console for a message that says "Hey Host OS, I am finished."

Monitoring with Expect

Because we need to monitor the console of the UML session, we can also use the console of the UML session to configure various aspects on the fly. We use Expect to do this. With Expect, I can spawn the UML startup script and monitor the output of the boot process for a console login, then I can login and download updated configuration information to the UML session. Also I can monitor for a UML shutdown request. Here is an example script to start UML via Expect:

% cat linux.exp
#!/usr/bin/expect --
set timeout -1
stty -echo
set res [open /etc/resolv.conf]
set resolv [read $res]
close $res
set app ./app
set linux "/home/user/fw/dolinux"
set inet 192.168.0.254

spawn $linux
# Wait for Login and send over
# our network configuration
expect "*login*" {
    send "root\r"
}
expect "*Password*" {
    send "xxxxxxx\r"
}
expect "*bash-2.05*" {
    send "echo $resolv > /etc/resolv.conf\r"
}
expect "*bash-2.05*" {
    send "exit\r"
}
expect "*login*" {
    send "user\r"
}
expect "*Password*" {
    send "userp\r"
}
# Here is the fun stuff
# update app
expect "*user*" {
    send "/usr/bin/lynx -dump -dont_wrap_pre http://$inet/app_cls \
      > $app; chmod +x $app; /usr/bin/lynx -dump -dont_wrap_pre \
      http://$inet/app.conf > app.conf\r"
}
# Now run app and wait for it to finish
expect "*user*" {
    send "DISPLAY=$inet:0; export DISPLAY; $app -f ./app.conf; \
      echo 'SHUTDOWN NOW'\r"
}
while {1} {
    expect "SHUTDOWN NOW*user*" {
        exec $hk
        exit 0
    }
}

This script has four parts. The first part spawns the "dolinux" script to start the UML session with the proper arguments. The second part waits for a login prompt and submits the UML session root user and password in order to invoke a command on the UML session that will download the updated configuration files from a local Web server. The third part starts the application (e.g., Mozilla) within the UML session and then waits for that application to end and issues the SHUTDOWN NOW request to the console. The last part of the script waits for the SHUTDOWN NOW message to appear and then performs the "hard kill" of the UML session. Again, we are taking advantage of the UML session's ability to display the console output to the host operating system's controlling terminal. Expect is a powerful tool for interacting with the controlling terminal.

The script covers two of the possible ways to download a predetermined set of configuration and application files to the UML session. Because we want to maintain tight control over the UML session configuration, we have to download them to UML. Using Expect, we use echo to write the contents of a host OS file to a file on the UML session. We can do this because Expect is the glue between the host OS and the UML session. It has access to both systems. In this case, we read a file from the host OS using TCL commands, then copy the contents of the file via the echo command run in the UML session after logging in. This action was performed for the /etc/resolv.conf file.

It is impossible to use the echo method to write a shell script file because Expect and echo interpret various shell variables and quotes differently. Thus, we can never get a complete or non-interpreted copy of the script. To solve this problem, we can set up a Web server within the host OS to serve up configuration files and scripts for the UML session. With Expect, I can issue a character cell Web browser to get the files from the host OS via Web protocols. This method requires the httpd Web server RPMs be installed and also requires a modification to the default Red Hat firewall to allow access for the ethertap device to the Web server. This is accomplished by adding the following lines to /etc/rc.d/rc.local:

# Setup NAT to go out eth0
echo "Starting NAT"
/sbin/iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

echo "Turn on IP Forwarding"
# Make sure that IP forwarding is turned on
/bin/echo "1" > /proc/sys/net/ipv4/ip_forward

/sbin/iptables -I INPUT 1 -s 192.168.0.144 -j ACCEPT

The first line of this script enables NAT to allow the host OS to act as the router and gateway for network access to the UML session. The second line allows the ethertap device full access to the host OS. This is specified via IP address and inserted as the first iptables rule; otherwise, it would be blocked. You cannot use the "tap0" device name for the ethertap device unless you have created it before running UML. Because this is a part of a startup script for the system, I chose to create the network connection between the UML session and the host OS on the fly. You may want to allow access to only part of the host operating system, like the Web server.

Other mechanisms can be enabled to transfer your tightly controlled configuration files from the host OS to the UML session. For example, you could use ssh to copy the files using access keys that do not require a password, you could use SOAP, you could mount a host OS directory as read-only and copy the files, or you could create symbolic links within your golden master to your mounted file system. Choose the mechanism that works best to meet your startup time requirements. A lot of copying will slow down the boot process.

There is still one script required to glue all this together. This script will rename one of the golden masters to the appropriate name for the "dolinux" script. The script will also call the Expect script:

% cat ft
#!/bin/sh
cd /home/user/fw
PATH=/home/user/fw/usr/bin:${PATH}
export PATH
ext=88
if [ -f swap_fs.2 ]
then
    ext=.2
else
    mis=2
fi
if [ -f swap_fs.1 ]
then
    ext=.1
else
    mis=1
fi
if [ -f swap_fs.0 ]
then
    ext=.0
else
    mis=0
fi
if [ $ext != "88" ]
then
    mv swap_fs$ext swap_fs_1
    mv usr_fs$ext usr_fs_1
    mv root_fs$ext root_fs_1
    ./linux.exp
fi

This glue script could also be improved. It could, for example, wait for an image to be made available by the UML copy daemon instead of exiting. To do so, it would also need to compare checksums to verify that any copy was indeed complete. By exiting, I've created a method to allow systems administrator or user access. It all depends on how the UML session and host will be used. You could also keep the golden masters on CD-ROM with a CD-based host operating system and copy to a large ramdisk or local scratch disk as necessary.

Implementing a persistent UML session requires several assumptions that are safe to make as long as the golden masters remain in perfect condition. To allow easy and fast recovery, I recommend making backups to tape and other systems, as well as multiple images of the original for critical applications. The use of golden masters facilitates a fast startup and shutdown of a UML session. This, in turn, allows UML to be used to create persistent systems in real-world environments.

Edward L. Haletky graduated from Purdue University in 1988 with a degree in Aeronautical and Astronautical Engineering. Since then, he has worked programming graphics and other lower-level libraries on various Unix platforms. He currently works for Hewlett-Packard in the High Performance Technical Computing Expert Team, dealing with Tru64 and Linux Clustering technologies, as well as general Linux and VMware environments.