Everyone
Should Have a PUP
Richard Hellier and Alistair Gardiner
This article presents a way of organizing storage by using per-user
partitions (PUPs). With this layout, every account is kept in its
own partition (and file system). The next section describes the
operational background that gave rise to PUPs.
The "Old World"
In our main UK office, we have several hundred accounts (a mix
of "real" users and "project" accounts). Together, these accounts
occupy about a terabyte of disk storage. For several years, this
storage was supplied by two Auspex file servers, whose individual
disks were presented either as RAID strings (the majority), concatenations/stripes
of disks, or just as "single-disk" file systems.
When replacing both of these Auspex file servers with an LSI Logic
Storage Systems Fibre Channel-based storage array, fronted by a
Sun E420R, we had the opportunity to start from scratch and lay
out the storage in an optimal fashion rather than just preserving
the status quo.
To provide the software abstraction from the RAID volumes (LUNs)
on the storage array, the Veritas Foundation Suite (Veritas Volume
Manager plus Veritas Filesystem) was chosen. The ability of this
package to shrink file systems as well as grow them (on the fly)
was an important factor in this decision. In principle, though,
the same approach could be taken using Sun's Online Disk Suite or
HP's Logical Volume Manager. The next section describes the way
in which the storage was laid out on the new system.
The "New World"
While attending presentations and reading through the manuals
for Foundation Suite, we saw many examples of ganging together small
disk areas into a larger unit to be presented as a single file system.
Although we have a need for some accounts to have, say, 50 GB of
disk space, many accounts need much less, so we played around with
the idea of using Foundation Suite to "go small" rather than "go
large". This led to the idea of PUPs, that is, creating a volume,
and with it a file system, for every account. Several benefits of
this scheme (listed in the next section) immediately sprang to mind,
but we were sure there had to be a downside, too. Discussions with
colleagues in our own and other European offices failed to find
any such downside. If you can think of any, please let us know.
We identified the benefits as follows.
The Benefits
1. Damage Limitation -- Gone forever is the ability of a single
"rogue" user to stop work by all other users in the same partition
by filling up that partition. Of course, a user's own partition
can still fill up, but now only that user is affected. This "insulation"
of one account from another is very significant in a high availability
environment since "damage" is so tightly controlled.
2. No More Moves -- When a user changes departments, the previous
layout entailed the physical copying of the account's data from
one partition to another (assuming space could be found in the target
partition) followed by a change of group membership. In the new
world, no movement is necessary -- just a quick chgrp!
3. Hiding Places -- The PUPs provide a per-user "hiding place"
as well as space for the account itself. This hiding place has many
possible uses, some outlined below. A technicality of the layout
is that, on the file server, account fred's home directory is:
/export/users/fred/fred
though this would appear to users, via the automounter, as /users/fred
on client machines. The repetition arises because the first fred is
the mount point for the file system and the second is a directory
within that file system (at the same level as lost+found). Only the
innermost fred belongs to user fred so the enclosing file system can
be used for "marker" files for reporting, disk space monitoring, etc.
Normally, such files must be kept in a user's home directory and,
although such files are owned by (say) root, they can be removed by
the owner of the account (because users normally own and have write
permission to their home directories). Hiding these marker files and
the lost+found directory in "..", so to speak, keeps them out of reach.
4. Quotas without Quotas -- Running quotas to keep disk space
usage in check is often unpopular as all users are penalized to
police the few (because every file system write must be checked).
An alternative is monitoring schemes that use, for example, du
to gather statistics. The problem with du is speed. If the
accounts of hundreds of users need to be checked, the process is
lengthy.
With the new layout, there are "hard" quotas (since a user cannot
write outside a partition), which cost nothing to enforce. Additionally,
sizes/usages can be checked quickly using df. df is
fast because it operates at partition level (i.e., the level at
which we're working).
5. Fine-Grained Access Control -- Having each account in a separate
file system means that one could mount each with individual options.
Equally, each file system could have its own permissions for NFS
export.
6. Simple Freespace Monitoring -- Rather than maintain elaborate
records of pockets of free disk space, one only needs to keep a
certain amount of headroom for the server as a whole.
When a new account is added, the required amount of disk space
is taken from the free pool maintained by Volume Manager. At first,
this seems to be a worrisome loss of control (in that you have no
idea where the account is "physically"), but keep in mind that when
you create a file within a file system, it rarely matters exactly
which blocks on the disk are occupied by the file. The principle
is the same when creating an account, except now we are working
with entire file systems rather than individual files.
7. Simple Backup Administration -- The fact that PUPs are sized
to fit makes for better backup granularity. In a non-PUP environment,
a backup of a partition containing, say, five 50-GB accounts must
either be done as a single backup job or five separate jobs, one
for each account. In the first case, the risk of having to re-start
the entire 250-GB backup after a "hiccup" is increased and in the
second case, the separate backup jobs are more work to set up and
administer.
However, with PUPs we get the best of both worlds as individual
backup jobs are smaller (and run more quickly). Also, the backup
configuration can be done with wildcards (e.g., with Veritas NetBackup,
the file list can be given as ALL_LOCAL_DRIVES).
Summary
This article has presented a way of organizing large numbers of
accounts on a file server with a minimum of initial and ongoing
administration. At first, we worried that having hundreds of mounts
(rather than a few dozen) and an equal number of NFS exports would
overload either Volume Manager or the Sun E420R. Initial tests and
subsequent monitoring have shown that the file server handles the
whole shebang with ease.
We would certainly recommend this arrangement to anyone and would
be very interested to hear of any downside you experience. In the
authors' view, there may be such a thing as a free lunch after all.
Richard Hellier has been around UNIX systems since 1976, first
as a user, then a software developer, and most recently as an administrator.
He is part of a team that manages the UNIX machines in the European
offices of LSI Logic. He can be contacted at: rlh@lsil.com.
Alistair Gardiner received a BsC in Computer Science from the
University of Strathclyde, Glasgow, in 1992. He has worked as a
Unix/Network administrator at two UK universities, and has worked
at LSI Logic Europe for the past six years. |