apr2005.tar

Makeinstall -- Capture File System Changes with Snapshot and Rsync

Julie Wang and Michael Wang

Generally speaking, a software application is just a set of files on the file system. Some applications will automatically start one or more processes -- such as sendmail, Apache, and MySQL -- upon startup. Other applications -- such as Perl, PHP, Python, and shell scripts -- wait to be called.

A big part of Unix systems administration is managing sets of files (adding, deleting, and modifying them), either directly -- as in "vi httpd.conf" -- or indirectly by running front-end programs (such as make install, pkgadd, pkgrm, rpm, and useradd).

The first step in understanding and managing an application is to find its fileset. Traditional Unix makes this difficult because files belonging to different packages are intermixed. Once you have installed software with make install, it is difficult to know where the files went and how to uninstall the software cleanly. A package manager excels in this regard, but it cannot help when no pre-built package exists; in that case, you need to create one yourself.

This article describes how to use Solaris UFS snapshots and rsync to capture file system changes and, optionally, to reverse them. This technique can be used to make software packages, to run a command in test mode, and to find out what is happening behind the scenes.

The makeinstall script was originally written as a wrapper for make install, identifying installed files for packaging (hence the name). For maximum flexibility and capability, the script is divided into five steps that can be called to catch the changes made to the file systems by make install or any other command.

How makeinstall Works

In a simple case of replacing an existing file, a prudent sys admin would use the following steps:

cp   foo foo.old # save the old copy
cp   foo.new foo # install the new copy
diff foo.old foo # find what has changed
cp   foo.old foo # optionally reverse the change
rm   foo.old     # remove the old copy

The five steps in the makeinstall script are similar to the above steps, but are on the file system level. The steps are:

1. Save the old copies of file systems with snapshots.

2. Make the changes.

3. Compare the old and new file systems with rsync.

4. Optionally reverse the changes.

5. Delete the snapshots.

Alternative methods to meet the same goals include gaining intimate knowledge of the software package, using time stamps to find the newly installed or changed files, and working in a chroot environment. Compared with these alternatives, makeinstall uses brute force, and is therefore simpler and more reliable -- as reliable as snapshot and rsync. The alternative methods can still be used to validate the results.
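As a point of comparison, the time stamp method mentioned above can be sketched with find -newer. This is only an illustration and not part of makeinstall; the directory layout, the marker file, and the newer_files function are all made up for the demo. Note what this method cannot see: deleted files, and files installed with their original (older) time stamps preserved, for example by cp -p or tar.

```shell
#!/bin/sh
# Hypothetical demo tree; every name below is invented for illustration.
demo=$(mktemp -d)
mkdir -p "$demo/usr/bin"

# A file that existed long before the "installation".
echo "old" > "$demo/usr/bin/existing"
touch -t 200001010000 "$demo/usr/bin/existing"

# Reference point taken before the change. In real use this would simply be
# "touch $marker"; a fixed date keeps the demo reproducible.
marker="$demo/.marker"
touch -t 200101010000 "$marker"

# The "installation": one new file appears.
echo "new" > "$demo/usr/bin/installed"

newer_files() {
  # List regular files under $1 that were modified after the marker file $2.
  find "$1" -type f -newer "$2" | sort
}

newer_files "$demo" "$marker"
```

Only the newly created file is listed; the marker itself and the older file are not.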

Basic Script Usage

The makeinstall script is called with four named arguments processed with my_getopts (see Michael Wang's January 2003 article on UnixReview.com: http://www.unixreview.com/documents/s=7781/uni1042138723500/):

fs:=[+-]<list of filesystems>        Default: /:/usr:/var:/opt
snaproot:=<snapshot top mount point> Default: /snapshot
bs:=<backing-store>                  Default: $(< <snaproot>/.backing-store)
step=<step number>                   Default: 0 (do nothing)

The fs option specifies which file systems to make snapshots of. The default is the existing subset of "/", "/usr", "/var", and "/opt". For example, if "/opt" is not a separate file system, it will not be in the list. You can use fs:=+<file system list> to specify additional file systems, use fs:=-<file system list> to remove the file systems from the default list, and fs:=<file system list> to specify alternative file systems. The <file system list> is a list of file systems separated by colons.
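The three forms of the fs option can be sketched as follows. This is a minimal illustration of the semantics described above, not the code from the makeinstall script; the function name resolve_fs_list is hypothetical, and the default list is passed in as an argument rather than detected from the system.

```shell
#!/bin/sh
resolve_fs_list() {
  # $1 = value given to fs:= (may be empty)
  # $2 = default colon-separated list of file systems
  spec=$1
  default=$2
  case $spec in
    "")  echo "$default" ;;                           # nothing given: use the default
    +*)  echo "${default:+$default:}${spec#+}" ;;     # add to the default list
    -*)  result=
         remove=":${spec#-}:"
         old_ifs=$IFS; IFS=:
         for fs in $default; do
           case $remove in
             *":$fs:"*) ;;                            # listed for removal: skip it
             *) result=${result:+$result:}$fs ;;
           esac
         done
         IFS=$old_ifs
         echo "$result" ;;
    *)   echo "$spec" ;;                              # replace the default list
  esac
}

resolve_fs_list "+/apps4"    "/:/usr:/var:/opt"
resolve_fs_list "-/var:/opt" "/:/usr:/var:/opt"
resolve_fs_list "/export"    "/:/usr:/var:/opt"
```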

The snaproot option specifies the snapshot root directory under which the snapshot file systems are mounted. The default is /snapshot. The snapshot of a file system is mounted under its original name below this snaproot, except that "/" is replaced by "/root" for clarity.

The bs option specifies the common backing store for all the snapshot file systems. Backing store is used to back up the old data when there is a change in the original file system. The backing store should not be in any of the file systems of which you are making snapshots.

The step option specifies which steps to run: 1 runs step 1, 2 runs step 2, and so on; 34 runs steps 3 and 4; 9 runs all the steps; and 0 (the default) runs none. For example:

makeinstall fs:=+/apps4 snaproot:=/snapshot bs:=/apps2 step=1
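
How a step value maps to individual steps can be sketched like this. The function name expand_steps is hypothetical, and the handling of values the article does not mention (such as 6, or 9 combined with other digits) is an assumption of this sketch, not a description of what makeinstall does.

```shell
#!/bin/sh
expand_steps() {
  # $1 = value of step=, for example 1, 34, 9, or 0
  case $1 in
    ""|*[!0-9]*) echo "invalid step: $1" >&2; return 1 ;;
    *9*)         echo "1 2 3 4 5"; return 0 ;;        # 9 means all five steps
  esac
  steps=
  rest=$1
  while [ -n "$rest" ]; do
    digit=${rest%"${rest#?}"}                         # first character of $rest
    rest=${rest#?}                                    # remaining characters
    case $digit in
      0)     ;;                                       # 0 means "do nothing"
      [1-5]) steps="${steps:+$steps }$digit" ;;
      *)     echo "invalid step: $digit" >&2; return 1 ;;   # assumption: reject 6-8
    esac
  done
  echo "$steps"
}

expand_steps 34
expand_steps 9
```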

Code Review

Step 1 is to make a backup of all file systems that might be involved. This is like cp foo foo.old, but the temporary backup is made with a snapshot. This part is done by the do_snapshot function; the fssnap command hidden inside this function is Solaris-specific:

  function do_snapshot {
    typeset fs=${1} sr=${2} bs=${3} mp dv ix
    for ix in ${fs//:/ }; do
      mp=${sr}${ix/%"/"/"/root"}
      mkdir -p $mp

      mount -F ufs -o ro $(fssnap -F ufs -o bs=${bs},unlink $ix) $mp
      fssnap -i -o blockdevname $ix | awk "{print \$NF}" | read dv &&
      df -Pk $mp | grep "^$dv" || return 1
    done
  }

This function takes three arguments: a list of file systems for which to make snapshots, separated by colons; the snapshot root directory for the snapshot mount points; and the backing store directory.

For each file system in the list, a snapshot is created and mounted under the snapshot root directory under its original name (except that the "/" file system is mounted as "root"). For example:

do_snapshot /:/usr /snapshot /backing-store

will create snapshots for "/" and "/usr", and mount them as "/snapshot/root" and "/snapshot/usr". The line:

mp=${sr}${ix/%"/"/"/root"}

is a shorter but cryptic way to say:

if [[ $ix == "/" ]]
then mp=${sr}/root
else mp=${sr}${ix}
fi
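For shells that lack the ${var/%pattern/replacement} form, the same mapping can be written as a small function with a case statement. This is a sketch only; the name snap_mount_point is hypothetical. One subtle point: because % anchors the pattern at the end of the string, the short form should also rewrite the final slash of a path such as /export/ (giving /export/root), whereas the if form and the sketch below treat only "/" itself specially.

```shell
#!/bin/sh
snap_mount_point() {
  # $1 = snapshot root directory, $2 = mount point of the original file system
  case $2 in
    /) echo "$1/root" ;;    # the root file system gets a readable name
    *) echo "$1$2" ;;       # everything else keeps its original name
  esac
}

snap_mount_point /snapshot /
snap_mount_point /snapshot /usr
```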

The last two lines in the for loop check whether the snapshot was successfully created and the file system mounted. If the check fails, the function immediately returns with a non-zero status. This causes the program to terminate because "errexit" (set -e) is set in the makeinstall script. Because of the nature of this program, we do not want to proceed when there is an error; in particular, step 2 must not run if step 1 fails.

Although this is a standalone function, the function itself is not exposed to the user. The user passes arguments to the makeinstall script; the script processes the arguments and calls the function.

Step 2 is a placeholder. It simply reminds you to run make install (or whatever other command you want to examine) so that its changes to the file systems can be captured.

Step 3 is to find what changes have been made. This is done using rsync in "dry-run" mode (-n or --dry-run option), without doing the actual synchronization:

rsync -vanx --delete <source>/ <target>/

As of this writing, there are two issues with the current version of rsync (2.6.3) in dry-run mode related to this application:

1. It does not produce a list of files with changes in owner/group, permission, or time stamp ("Rsync Bug 1764").

2. It does not report empty directories to be transferred to target ("Rsync Bug 1433").

However, this does not affect the majority of the applications of the makeinstall script, as changes in most cases do not fall into these categories.
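The dry-run comparison can be tried safely on two scratch directories. In the sketch below, "old" stands in for the mounted snapshot and "live" for the file system after the change; the directory names, file names, and the list_changes function are made up for the demo, and the direction shown (live tree as source, saved tree as target) is an assumption about how step 3 lists installed files. With a current rsync, new and changed files are listed by name and removed files appear on "deleting" lines.

```shell
#!/bin/sh
# Hypothetical demo trees; nothing here touches real file systems.
demo=$(mktemp -d)
mkdir -p "$demo/old/etc" "$demo/live/etc"
echo "v1"        > "$demo/old/etc/app.conf"
echo "version 2" > "$demo/live/etc/app.conf"     # modified by the change
echo "new"       > "$demo/live/etc/added.conf"   # created by the change
echo "gone"      > "$demo/old/etc/removed.conf"  # deleted by the change

list_changes() {
  # $1 = live tree, $2 = saved tree.
  # -n makes this a dry run: rsync reports what it would transfer or
  # delete, but leaves both trees untouched.
  rsync -vanx --delete "$1/" "$2/"
}

# Only run the demo when rsync is installed.
if command -v rsync >/dev/null 2>&1; then
  list_changes "$demo/live" "$demo/old"
fi
```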

Step 4 provides the option to reverse the changes. Rsync is run first in dry-run mode from the snapshot file system to the live file system. It then asks for your confirmation for the actual run.

We can think of two situations for which this step is needed. One is if the changes you just made are not desired (e.g., it overwrites your existing files). The other situation is for packaging (e.g., if you just want to find the list of files installed by make install, but do not actually install it).
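The preview-then-confirm pattern of step 4 might look like the following. This is a sketch under stated assumptions rather than the code in makeinstall: the function name reverse_changes, the prompt text, and the accepted answers are all hypothetical. The -x option keeps rsync from crossing file system boundaries.

```shell
#!/bin/sh
reverse_changes() {
  # $1 = saved tree (source), $2 = live tree (target)
  rsync -vanx --delete "$1/" "$2/" || return 1     # preview only, changes nothing
  printf 'Apply these changes to %s? [y/N] ' "$2"
  read -r answer || answer=n                       # treat end of input as "no"
  case $answer in
    [yY]) rsync -vax --delete "$1/" "$2/" ;;       # the actual run
    *)    echo "Nothing changed." ;;
  esac
}

# Hypothetical invocation, left commented out because it modifies /usr:
# reverse_changes /snapshot/usr /usr
```

Anything other than y or Y leaves the target untouched, which seems the safer default for a step that overwrites live files.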

Step 5 deletes the snapshots. This is a critical step, as the previously saved state of the file systems will be gone. Unless you invoke this step explicitly, you are prompted for confirmation. The undo_snapshot function used in this step is the reverse of the do_snapshot function:

function undo_snapshot {
  typeset fs=${1} sr=${2} mp ix
  for ix in ${fs//:/ }; do
    mp=${sr}${ix/%"/"/"/root"}
    umount $mp
    fssnap -F ufs -d $ix
    ! fssnap -i -o blockdevname $ix | awk "{print \$NF}" | read &&
    ! df -Pk $mp | grep "^/dev/fssnap/[0-9]\{1,\}" || return 1
  done
}

The last two lines in the for loop check whether the snapshot was actually deleted and the snapshot file system unmounted. The function fails if either condition is not satisfied.
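The logic of that two-line check is compact enough to deserve a closer look: each "!" negates a whole pipeline, and "&&" and "||" are evaluated left to right. The sketch below reproduces the same structure with stand-ins for the fssnap and df output so it can run anywhere; the function names and the sample strings are made up for the demo.

```shell
#!/bin/sh
# Stand-in for "fssnap -i ... | awk ... | read": succeeds only if there is
# a line to read, that is, only if a snapshot is still listed.
snapshot_exists() { printf '%s' "$1" | awk '{print $NF}' | read -r dv; }

# Stand-in for "df -Pk $mp | grep ...": succeeds only if the mount point
# is still backed by a snapshot device.
still_mounted() { printf '%s\n' "$1" | grep '^/dev/fssnap/[0-9]\{1,\}' >/dev/null; }

cleanup_ok() {
  # $1 = simulated fssnap output, $2 = simulated df line
  # Succeeds only if the snapshot is gone AND nothing is mounted from it.
  ! snapshot_exists "$1" && ! still_mounted "$2" || return 1
}

cleanup_ok "" ""                                && echo "clean"
cleanup_ok "/dev/fssnap/0" ""                   || echo "snapshot still exists"
cleanup_ok "" "/dev/fssnap/0 ... /snapshot/opt" || echo "still mounted"
```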

Case Studies

Case 1: Install Perl module

Frank is a sys admin for a major bank in the United States. He received a request from a developer to install the Perl module "Date::Manip". Frank was not familiar with this Perl module (or Perl modules in general), but he found the command to install the module was:

perl -MCPAN -e 'install Date::Manip'

To be safe, he tried the installation on his own workstation running Solaris with UFS file systems. First, he ran step 1 of makeinstall to create snapshots for: /, /usr, /var, and /opt:

# makeinstall bs:=/apps4 step=1

/dev/fssnap/3   493688     77196    367124   18%   /snapshot/root
/dev/fssnap/2  2055705   1181127    812907   60%   /snapshot/usr
/dev/fssnap/1  2055705     48364   1945670    3%   /snapshot/var
/dev/fssnap/0  1915416     14166   1843788    1%   /snapshot/opt

Then he ran:

perl -MCPAN -e 'install Date::Manip'

to install the module. After this, he ran step 3 to find what files were installed or changed on his workstation:

# makeinstall bs:=/apps4 step=3

This produced a list of files:

/opt/csw/lib/perl/5.8.2/perllocal.pod
/opt/csw/lib/perl/site_perl/auto/Date/Manip/.packlist
/opt/csw/share/man/man3/Date::Manip.3perl
/opt/csw/share/perl/site_perl/Date/Manip.pm
/opt/csw/share/perl/site_perl/Date/Manip.pod

He looked at the files and realized that perllocal.pod is a shared file with entries for all locally installed modules, and that .packlist contains the list of files in this module. He then distributed the last four files to the developer's workstation -- which was running VxFS and was not configured for CPAN access -- and manually modified perllocal.pod there.

Since the installation worked fine, he skipped step 4, which would allow him to reverse the changes, but proceeded to step 5 to delete the snapshots:

# makeinstall bs:=/apps4 step=5
Deleted snapshot 3.
Deleted snapshot 2.
Deleted snapshot 1.
Deleted snapshot 0.

Case 2: Making sendmail packages

Andrew is a sendmail administrator and has Sun's sendmail packages SUNWsndmr and SUNWsndmu installed, but he wanted to build a sendmail package from the current source. Andrew knew about makeinstall, so he ran the first step to create snapshots of all the operating system file systems. Then he ran make install to install sendmail. He went on to run step 3 to produce a list of newly installed files. He was glad he used makeinstall, as the files were installed all over the place, mixed with other operating system files:

/etc/mail/helpfile
/etc/mail/statistics
/usr/bin/hoststat -> /usr/lib/sendmail
/usr/bin/mailq -> /usr/lib/sendmail
/usr/bin/newaliases -> /usr/lib/sendmail
/usr/bin/purgestat -> /usr/lib/sendmail
/usr/bin/vacation
/usr/lib/sendmail
/usr/lib/smrsh
/usr/sbin/editmap
/usr/sbin/mailstats
/usr/sbin/makemap
/usr/sbin/praliases
/usr/share/man/cat1/mailq.1
/usr/share/man/cat1/newaliases.1
/usr/share/man/cat1/vacation.1
/usr/share/man/cat5/aliases.5
/usr/share/man/cat8/editmap.8
/usr/share/man/cat8/mailstats.8
/usr/share/man/cat8/makemap.8
/usr/share/man/cat8/praliases.8
/usr/share/man/cat8/sendmail.8
/usr/share/man/cat8/smrsh.8

He saved the files. He wanted to create a new package to replace Sun's packages, so he reversed the changes by running step 4. He verified the integrity of Sun's original packages with pkgchk, and then deleted the snapshots.

Case 3: Install CSW openssh package

Janet wants to replace Sun's ssh package with a newer version. She knows and likes the Blastwave distribution of Solaris software packages (http://www.blastwave.org/). Janet downloaded the openssh package and examined the file list contained in this package using pkgchk -l. She confirmed that the package does not conflict with Sun's package.

Janet knew about makeinstall and wanted to compare the filesets obtained with both methods. Comparing the two lists, she found these differences:

1. Makeinstall indicated that the files /etc/passwd and /etc/shadow were updated. She looked at the files and found that the user sshd ("sshd nonpriv userid") had been created. Upon further investigation, she found a useradd command in the preinstall script. Files created or modified by the preinstall and postinstall scripts are not in the package's file list, but are captured by the catch-all makeinstall script with brute force. Janet then researched this userid and learned about privilege separation.

2. Makeinstall also reported that /opt/csw/etc/sshd_config was created, although it was not in the package's file list. Janet found that this file was generated in the postinstall script:

cp -p $CONFDIR/$CONF_FILE.CSW $CONFDIR/$CONF_FILE

3. Makeinstall indicated that the following files were installed or modified:

/var/sadm/install/contents
/var/sadm/pkg/CSWossh/install/copyright
/var/sadm/pkg/CSWossh/install/depend
/var/sadm/pkg/CSWossh/install/preremove
/var/sadm/pkg/CSWossh/pkginfo

Again, this is because makeinstall reports what actually happened. Thus, Janet learned how Solaris software packaging works.

Makeinstall also revealed that after the package was removed, the userid "sshd" created during installation remained. Package removal should restore the system to its state prior to package installation whenever possible.
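A comparison like Janet's, between the file list a package declares and the files that actually changed, can be sketched with sort and comm. The function name only_in_second is hypothetical, and the two /opt/csw binaries in the sample package list are invented for the demo; the other three paths are the ones mentioned above.

```shell
#!/bin/sh
only_in_second() {
  # Print lines that appear in file $2 but not in file $1.
  tmp_a=$(mktemp)
  tmp_b=$(mktemp)
  sort -u "$1" > "$tmp_a"          # comm requires sorted input
  sort -u "$2" > "$tmp_b"
  comm -13 "$tmp_a" "$tmp_b"       # suppress columns 1 and 3: keep lines unique to $2
  rm -f "$tmp_a" "$tmp_b"
}

demo=$(mktemp -d)
# Hypothetical package file list (these two paths are made up).
printf '%s\n' /opt/csw/bin/ssh /opt/csw/sbin/sshd > "$demo/pkg.list"
# Hypothetical list of changed files, as step 3 might report them.
printf '%s\n' /etc/passwd /etc/shadow /opt/csw/bin/ssh \
  /opt/csw/etc/sshd_config /opt/csw/sbin/sshd > "$demo/observed.list"

# Files that changed on disk but are not declared by the package.
only_in_second "$demo/pkg.list" "$demo/observed.list"
```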

Case 4: Run Oracle's root.sh

Ben is a sys admin and was asked to run "root.sh" as superuser as part of an Oracle software installation. What does root.sh do? Is it safe to run? He looked at root.sh and had an idea what it would do, but decided to use makeinstall to find the actual changes.

Because Oracle software is installed on /myapp-ds1, which is not part of the operating system file systems, he specified an additional file system on the makeinstall command line:

# makeinstall step=1 fs:=+/myapp-ds1 bs:=/apps2 snaproot:=/snapshot

Then he proceeded to step 2 to run root.sh, shown here:

  # ./root.sh
  Running Oracle9 root.sh script...
  
  The following environment variables are set as:
    ORACLE_OWNER=oracle
    ORACLE_HOME=/myapp-ds1/ora01/app/oracle/product/9.2.0
  
  Enter the full pathname of the local bin directory: [/usr/local/bin]:
    Copying dbhome to /usr/local/bin ...
    Copying oraenv to /usr/local/bin ...
    Copying coraenv to /usr/local/bin ...

  Adding entry to /var/opt/oracle/oratab file...
  Entries will be added to the /var/opt/oracle/oratab file as needed by
  Database Configuration Assistant when a database is created
  Finished running generic part of root.sh script.
  Now product-specific root actions will be performed.

Next, he ran step 3 to find the changes made to the OS file systems and on Oracle's own file system. He summarized the changes as follows:

1. Three Oracle-provided utilities were copied to a local binary directory:

/usr/local/bin/coraenv
/usr/local/bin/dbhome
/usr/local/bin/oraenv

2. "oratab" was created:

/var/opt/oracle/oratab

3. Three files were installed in /opt area:

/opt/ORCLfmap/bin/fmputl
/opt/ORCLfmap/bin/fmputlhp
/opt/ORCLfmap/etc/filemap.ora

4. File permissions were changed for many Oracle files, for example:

/myapp-ds1/ora01/app/oracle/product/9.2.0/bin/*

5. setuid root for the following two files:

/myapp-ds1/ora01/app/oracle/product/9.2.0/bin/dbsnmp
/myapp-ds1/ora01/app/oracle/product/9.2.0/bin/oradism

Changes 4 and 5 were found only with a patched version of rsync, because of Rsync Bug 1764 mentioned earlier: an unpatched rsync in dry-run mode does not list files whose only change is in ownership or permissions.
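Where a patched rsync is not available, one way to cross-check for new setuid files is to list them with find on both the snapshot and the live file system and compare the two lists. This is an alternative technique offered as a sketch, not something makeinstall does; the function name list_setuid and the demo files are hypothetical. Adding -user root to the find command would narrow the result to setuid-root files.

```shell
#!/bin/sh
list_setuid() {
  # Print regular files under $1 that have the setuid bit set.
  find "$1" -type f -perm -4000 | sort
}

# Hypothetical demo directory with one ordinary and one setuid file.
demo=$(mktemp -d)
echo "x" > "$demo/plain"
echo "x" > "$demo/helper"
chmod 755  "$demo/plain"
chmod 4755 "$demo/helper"

list_setuid "$demo"
```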

Ben supports a failover cluster with the Oracle software and database installed on storage shared among the nodes. When a node fails, the Oracle file systems are unmounted from the failed node and mounted on another node within the cluster. The Oracle instance and Sqlnet listener are then started on the new node.

Will the locally installed files affect the failover capability? Will the setuid files present a security issue? Ben presented these issues to the database group.

Conclusion

Makeinstall uses the "when in doubt, use brute force" technique to find the file system changes made by running a command. The program can be used for software packaging, test runs, and educational purposes. Makeinstall is written for UFS on Solaris. However, the idea can be extended to other systems.

The entire program is shown in Listing 1. Future versions of this program, if any, can be found at: http://www.unixlabplus.com/unix-prog/makeinstall/.

References

"Processing Command-line Arguments with my_getopts" by Michael Wang -- http://www.unixreview.com/documents/s=7781/uni1042138723500/

"Rsync Bug 1433." Samba Team. Retrieved 13 Sep 2004 -- https://bugzilla.samba.org/show_bug.cgi?id=1433

"Rsync Bug 1764." Samba Team. Retrieved 12 Jan 2005. -- https://bugzilla.samba.org/show_bug.cgi?id=1764

"Rsync Web Pages." The rsync team. Retrieved 9 Sep 2004 -- http://rsync.samba.org/

Julie Wang works for Independence Air (http://www.flyi.com/). She manages Oracle databases, Unix operating systems, and Lawson enterprise systems, among other systems. She can be reached at: Julie.Wang@flyi.com.

Michael Wang is Julie's husband. He can be reached at: xw73@columbia.edu.