High Performance Content-Filtering Mail System
Thomas H. Jones II
Email-borne spam, viruses, worms, and other nastiness make daily
life a hassle for users and a hazard for unprotected systems. A
number of commercial solutions exist to help manage these problems.
However, their costs, speed penalties, and the effort needed to
install and administer them can leave security-conscious admins
looking for a better answer.
Fortunately, open source comes to the rescue. With a very basic
system running a free operating system -- such as Linux or the BSDs
-- it is possible to create a fast, powerful, and very low-cost
email content filtering system. Such a system can be set up either
as an ultimate email delivery destination or as an inline network
device for the main Internet mail host. In this article, I will
describe how to create a content-filtering email host and how to
test its functionality.
Requirements
This article is based around software packages listed in Table
1. Each one of these packages (with the exception of the C compiler)
will need to be downloaded, compiled, installed, and configured
to create the desired integrated mail system. At the time of
writing, the following software versions were used:
- Postfix 2.0.18 (Feb 5, 2004 build)
- Postfix TLS patch for Postfix 2.0.18 (Feb 5, 2004 build)
- Cyrus SASL 2.1.17
- OpenSSL 0.9.7c
- Berkeley DB 4.2.52
- Clam A/V 0.67-1
- SpamAssassin 2.63
- AMaViSd-New 6/16/03 build patch-level 7
- GCC 3.3.2 C Compiler
- GNU Patch 2.5.4
- GNU Make 3.80
- Perl 5.8.3
A few of the packages require additional Perl modules to be downloaded
and installed from CPAN. Each package comes with a manifest that
details its Perl module dependencies.
The GNU make requirement is noted primarily for commercial Unix
systems. Some systems' developer packages include a make program
that will fail on some of the above packages. In those instances,
GNU's make may be the only functional alternative.
Note that a source of entropy (randomness) needs to be available
on the mail system. Ideally, this would be in the form of a random
device driver such as "/dev/random" or "/dev/urandom". Alternatively,
entropy may be provided by an entropy service such as EGD (Entropy
Gathering Daemon) or PRNGD (Pseudo Random Number Generator Daemon).
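As a rough sketch, the presence of a kernel random device can be probed from the shell before any of the packages are built. The function name first_char_device is invented for this illustration, and the check only confirms that a character device exists, not that it delivers good randomness.

```shell
#!/bin/sh
# Print the first argument that is a character device; return 1 if none is.
first_char_device() {
    for dev in "$@"; do
        if [ -c "$dev" ]; then
            echo "$dev"
            return 0
        fi
    done
    return 1
}

# Probe the two usual kernel random devices.
if src=$(first_char_device /dev/random /dev/urandom); then
    echo "entropy device found: $src"
else
    echo "no random device found; consider EGD or PRNGD" >&2
fi
```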
I chose each of these software packages for one overriding
reason -- cost. When I originally undertook this project, it was
as a personal/professional enrichment exercise. Money was not available
for things like commercial anti-virus (A/V) or content-filtering
software, mail transport agents (MTAs), or compilers. Therefore,
open source software was the only real choice. The software packages
represent a combination of free, highly capable, and fairly security-conscious
software choices.
Create Required userids and groupids
The various components of this mail system run as different, non-privileged
(i.e., not root) users and groups. This is part of the design that
helps make this mail system more secure. This user and group creation
should be done first, as the configure and install scripts for some
of the applications will fail if required users are missing.
The Postfix software requires the creation of its own user under
which to run. This userid should be created with the user name and
group name "postfix". Postfix also requires a secondary group under
which to run certain setgid processes. This group should be named
"postdrop".
The anti-spam and anti-virus software all run under the control
of the same parent process. Thus, although each might normally be
built with its own user and group assignments as specified in each
application's README or INSTALL files, it is necessary to give them
a common user and group assignment. I recommend creating a user
"amavis" and assigning it to the group "amavis".
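The accounts described above might be created along the following lines. This is a dry-run sketch that only prints the commands; the home directories and the /bin/false login shell are assumptions for illustration, and the exact groupadd/useradd options should be checked against the local man pages.

```shell
#!/bin/sh
# Dry run: print the account-creation commands instead of executing them.
# Home directories and the /bin/false shell are assumptions for illustration.
print_account_cmds() {
    echo "groupadd postfix"
    echo "groupadd postdrop"
    echo "useradd -g postfix -d /var/spool/postfix -s /bin/false postfix"
    echo "groupadd amavis"
    echo "useradd -g amavis -d /var/amavis -s /bin/false amavis"
}

print_account_cmds
```

After reviewing the printed commands, they can be run by hand as root.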
Create the Build Environment
This process involves making your system capable of creating the
mail-related programs from their source code distributions. You'll
need a C compiler and a Perl interpreter. Because it is also helpful
to have your Berkeley DB libraries available when the Perl interpreter
is built, building the Berkeley DB libraries is covered in this
section.
To begin, install a C compiler:
- If the system in question is a Linux host built with "developer
support" packages installed, the necessary C compiler (a GCC variant),
basic support libraries, and headers will already be installed.
If not already installed, most Linux distributions have some form
of package management tool that can be used to fetch and install
the needed files.
- If the system is a commercial Unix variant (e.g., Solaris),
either a vendor-supplied compiler or a free compiler can be used.
Given previous notes about the genesis of this project, a free
compiler like GCC is assumed. Typically, there are sites geared
towards commercial Unix variants that allow you to download a
binary distribution. Find the appropriate site (e.g., for Solaris,
SunFreeware.Com), download the binary distribution package, and
install it on your system.
The next task is to add Berkeley-style DBM support to the system's
build environment. Sleepycat Software makes a Berkeley DBM package.
Before building this software package, it is a good idea to set
up the build environment to hardcode library search paths into compiled
object code (binaries, dynamic libraries, etc.). For the combination
of Solaris and GCC 3.x, the LD_RUN_PATH environment variable may be used
to hardcode the run-time linker search paths into the binaries.
(Other operating systems may use a different mechanism, such as a
linker run-path option -- check the compiler and linker man pages for
your operating environment.)
The Berkeley DBM installs into "/usr/local/BerkeleyDB.4.2" by default.
For systems like Solaris, there may also be some file dependencies
in "/usr/local/lib". Therefore, it is a good idea to set the LD_RUN_PATH
environment variable to "/usr/local/lib:/usr/local/BerkeleyDB.4.2/lib".
Once this variable is set, follow the build directions and
installation instructions outlined in the READMEs and INSTALL files.
Typically, when building SleepyCat's Berkeley DBM package, the configure
options --enable-cryptography --enable-hash --enable-queue --enable-replication
--enable-verify --enable-cxx --enable-java --enable-rpc --enable-shared
--enable-static produce a good Berkeley DBM package with the
features required by the programs in this project, as well as those
required by some programs not directly related to this project.
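Put together, the run-path setting and the configure options might look like the sketch below. It assumes a Bourne-style shell and the build_unix/dist layout of the Berkeley DB 4.x source tree; the final line only prints the build commands so that they can be reviewed first.

```shell
#!/bin/sh
# Run-time library search path to be recorded in binaries built from this shell.
LD_RUN_PATH="/usr/local/lib:/usr/local/BerkeleyDB.4.2/lib"
export LD_RUN_PATH

# Configure options as listed in the text above.
BDB_OPTS="--enable-cryptography --enable-hash --enable-queue \
--enable-replication --enable-verify --enable-cxx --enable-java \
--enable-rpc --enable-shared --enable-static"

# Dry run: print the build commands; run the printed line from the top of
# the unpacked source tree once it looks right.
echo "cd build_unix && ../dist/configure $BDB_OPTS && make && make install"
```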
Building and Installing Perl
Some of the Perl modules required to build the content filters
for this project will need a very up-to-date version of the Perl
package. While most Unix operating systems now come with a Perl
package, the installed version may not be current and may not have
some features recommended for this project. Therefore, installation
of a new version of Perl is recommended, and will be required in
most cases. The Perl source code comes with a configure script to
set up the Perl build. Unless other specific deviations are required,
use the defaults with the following two exceptions: answer "yes"
to building a threaded Perl and to building libperl. It
might also be a good idea to use the PerlIO option rather than the
STDIO default option. This will create the necessary makefiles and,
with luck, result in a good Perl package. With the presence of the
previously compiled Berkeley DBM package, the appropriate Perl DBM
modules should automatically get included in the Perl build. Follow
the included READMEs and INSTALL files for final installation instructions.
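For a repeatable build, the same answers can be passed to Perl's Configure script on the command line instead of interactively. The mapping below is an interpretation of the answers described above (in particular, -Duseshrplib is assumed to be what "building libperl" refers to), so treat it as a starting point rather than a definitive recipe.

```shell
#!/bin/sh
# Non-interactive Configure flags corresponding to the answers above:
#   -des          accept the defaults without prompting
#   -Dusethreads  build a threaded Perl
#   -Duseshrplib  build libperl as a shared library (assumed meaning)
#   -Duseperlio   use PerlIO instead of stdio
PERL_CONFIG_FLAGS="-des -Dusethreads -Duseshrplib -Duseperlio"

# Dry run: print the build commands rather than running them.
echo "sh Configure $PERL_CONFIG_FLAGS && make && make test && make install"
```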
Install the Encryption/SSL Routines
Several of the remaining software packages rely on routines found
in the OpenSSL package to provide session encryption services. Therefore,
OpenSSL should be built after installation of the C compiler and
Perl interpreter.
As with most open source software packages, OpenSSL includes a
configure script to make code portability easier. Good choices for
build options include shared zlib-dynamic no-krb5 --prefix=/usr/local
--openssldir=/usr/local/openssl <OSARCH> (where
OSARCH equates to something like "solaris-sparcv9-gcc" or "linux-pentium").
This will result in:
- Building of shared libraries (recommended for use by later
software builds in this project).
- Linking against the zlib compression/decompression libraries.
Note that if these are not already installed or are installed
only as static libraries, they should be installed as dynamic
libraries before proceeding further. Otherwise, remove the zlib-dynamic
option.
- Disabling Kerberos v5 support.
- Installing all libraries, binaries, and man pages in "/usr/local".
- Setting "/usr/local/openssl" as the default certificate root.
By default, OpenSSL will root itself in "/usr/local/ssl". Many
software packages that work with OpenSSL expect this location and
will require overrides if the default is changed. However, using
the "/usr/local" root obviates the need for adding ":/usr/local/ssl/lib"
into the LD_RUN_PATH or equivalent variable. Keeping the path that
the run-time linker searches short will result in marginally quicker
initial application startups. It also means that, if a process must
be traced, there are fewer ENOENT entries in the trace output.
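Once OpenSSL is installed, the certificate root that was compiled in can be confirmed with "openssl version -d". The helper below is a small sketch (the function name parse_openssldir is invented here) that extracts the directory from that command's output.

```shell
#!/bin/sh
# Extract the directory from a line such as: OPENSSLDIR: "/usr/local/openssl"
parse_openssldir() {
    sed -n 's/^OPENSSLDIR: "\(.*\)"$/\1/p'
}

# Report the certificate root of whichever openssl binary is first in PATH.
if command -v openssl >/dev/null 2>&1; then
    openssl version -d | parse_openssldir
fi
```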
OpenSSL can take a fair amount of time to compile. There are a
number of modules to compile and some are computationally intensive
to create. It also comes with a very large number of manual pages,
which makes the package installation take longer than one might expect.
Install Authentication Libraries
The Cyrus SASL packages are responsible for providing authentication
services to other applications. They do so through a combination
of linked authentication libraries and optional authentication
daemons. Various plug-ins can be created to provide authentication
mechanisms, such as plain and login, CRAM and Digest MD5, One-Time
Password (OTP), NTLM, and others. It can even provide access to
other backend user credential stores such as LDAP or a MySQL database
(useful for distributed management of large user spaces that don't
require Unix shell accounts). These authentication routines can
be used by Postfix to allow third-party relaying only from authenticated
users.
Most modern email clients support SMTP authentication. Common
authentication methods used by email clients include login, plain,
CRAM-MD5, and Digest-MD5. Therefore, it is important that the Cyrus
SASL package be compiled to support these login mechanisms. To ensure
that these four modules get built, specify --enable-cram --enable-digest
--enable-plain --enable-login in your list of options passed
to the configure script.
Cyrus SASL typically also makes use of Berkeley DBM files. However,
unless the configure script is instructed where to find the Berkeley
DBM install location, DBM support will be absent. To tell the configure
script where to find Berkeley DBM, specify --with-bdb-libdir=/usr/local/BerkeleyDB.4.2/lib
--with-bdb-incdir=/usr/local/BerkeleyDB.4.2/include.
Miscellaneous options found to be useful:
- --enable-sample -- Build sample client and server; this
can be useful for diagnostic tasks.
- --enable-static -- Build static libraries.
- --enable-shared --enable-staticdlopen -- Include shared
library support and define usage methods.
- --enable-java --with-javabase=/usr/java/include -- Include
Java hooks.
- --with-devrandom=/dev/random -- Where SASL should look
for entropy sources.
- --with-pam -- Recommended, if your system supports Pluggable
Authentication Modules.
- --with-saslauthd=/usr/local/var/saslauth -- Where the
Cyrus SASL Authentication daemon will store state information.
- --with-openssl --with-des --with-rc4 -- To enable OpenSSL
support and use OpenSSL's DES and RC4 cryptographic routines.
These options are specified more as a matter of habit than of
true necessity. Most of them either have default values or are automatically
found on their own.
To make the whole configuration easier (and more easily repeatable
in case of configuration errors), it may be best to simply dump
all of the chosen options to a file, one option per line. This file
can then be fed to the configure script by issuing ./configure
`cat OptionsFile` in the top-level source directory. Once the
configuration successfully completes, simply follow the installation
and configuration instructions in the READMEs and INSTALL files.
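The options-file technique relies on the shell splitting the file's contents into separate arguments. The following sketch demonstrates that splitting with a temporary file and a handful of the options discussed above; it counts the arguments rather than running configure.

```shell
#!/bin/sh
# Write the chosen options to a file, one per line (temporary file for the demo).
opts_file=$(mktemp)
cat > "$opts_file" <<'EOF'
--enable-cram
--enable-digest
--enable-plain
--enable-login
--with-bdb-libdir=/usr/local/BerkeleyDB.4.2/lib
--with-bdb-incdir=/usr/local/BerkeleyDB.4.2/include
EOF

# Command substitution splits the file contents into separate arguments,
# exactly as it would on a configure command line.
set -- $(cat "$opts_file")
opt_count=$#
first_opt=$1
echo "configure would receive $opt_count options, starting with $first_opt"
rm -f "$opts_file"
```

Because the split happens on whitespace, an option value containing spaces would not survive this approach.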
Install TLS-Enabled Postfix
The stock Postfix software does not yet natively support TLS.
To add this support to Postfix, a patch must be applied to the source
code. This patch comes in the form of a unified diff.
Warning: The patch author writes the patch against specific versions
and builds of Postfix. The patch is also written against a specific
version of OpenSSL. Attempting to use this patch against different
versions/builds of Postfix and the wrong version of OpenSSL may
result in broken code.
Also, note that if the software is being built on a system like
Solaris, the vendor-supplied patch program will not work with this
diff file. Please download, build, and install GNU patch from the
link specified in the requirements section.
Actual application of the patch is fairly trivial. Unroll the
Postfix source code distribution and the patch distribution into
the same parent directory (e.g., "/usr/local/src"). From this common
directory, issue the command patch -p0 < PatchDir/pfixtls.diff
(where PatchDir is the directory created by unrolling the patch
tarball). A screenful (or so) of "patching file" messages will scroll
by. At this point, Postfix is ready to be configured for build,
including the TLS extensions.
Building Postfix with TLS support can be done by following the
READMEs and INSTALL documents. Minimal options must be set at configuration
time, as the install scripts will ask where to put the various components.
There is no "configure" script. Build options get passed directly
to make. Specifying extra options should be done while running
the make makefiles command. This should be done as follows:
make makefiles CCARGS="-DUSE_SASL_AUTH \
    -I/usr/local/include/sasl -DUSE_SSL \
    -I/usr/local/ssl/include" \
    AUXLIBS="-L/usr/local/lib -lsasl2 -lssl -lcrypto"
The installation documents provide other options, but those are only
for providing default answers to the install scripts. Once the creation
of the makefiles is done, simply run make, then make install.
You will be prompted for where to install the various components.
Choose something that makes sense for your system [1].
Install SpamAssassin
SpamAssassin is a collection of Perl modules used for scanning,
categorizing, and tagging emails. This package can be obtained as
a source code distribution or via Perl's CPAN feature. I prefer
the CPAN method and will outline the steps necessary to create a
working SpamAssassin configuration via the CPAN tools. The following
should get you a nicely functional SpamAssassin package:
# perl -MCPAN -e shell
cpan shell -- CPAN exploration and modules installation (v1.76)
ReadLine support enabled
cpan> install Pod::Usage
cpan> install HTML::Parser
cpan> install Sys::Syslog
cpan> install Net::DNS
cpan> install Mail::Audit
cpan> install Mail::Internet
cpan> install Net::SMTP
cpan> install Digest::SHA1
cpan> install Net::Ident
cpan> install IO::Socket::SSL
cpan> install ExtUtils::MakeMaker
cpan> install File::Spec
cpan> install BerkeleyDB
cpan> install DB_File
cpan> install Mail::SpamAssassin
This will also take care of installing SpamAssassin and putting a
default configuration file into place. SpamAssassin writes its default,
system-wide configuration file to /etc/mail/spamassassin/local.cf.
Note that, sometimes, CPAN modules may not build correctly within
the CPAN construct. In those instances, you must build the source
code by hand. If a CPAN module fails to compile, try the following:
exit CPAN; cd $HOME/.cpan/build; cd into the module's
source directory; issue the commands make distclean, perl
Makefile.PL, make, make test, then make install.
If necessary, restart CPAN. If this still results in failure (other
than due to missing dependencies), retry your efforts using the
GNU version of make.
If you prefer to manually compile the above modules, CPAN can
still be used to grab the module list. Simply download the modules
by issuing perl -MCPAN -e "get Module::Name" (where Module::Name
is the name of one of the above-listed modules). Once all of the
above have been downloaded, cd to the $HOME/.cpan/build
directory and compile each set of modules. Note that this method
will not automatically account for dependencies. Only building within
the CPAN environment offers that functionality.
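Whichever installation method is used, it helps to confirm afterward which modules Perl can actually load. The loop below is a minimal sketch (the function name missing_modules is invented here); it prints the name of every listed module that fails to load with the perl found first in PATH.

```shell
#!/bin/sh
# Print each named Perl module that the current perl cannot load.
missing_modules() {
    for mod in "$@"; do
        perl -M"$mod" -e 1 >/dev/null 2>&1 || echo "$mod"
    done
}

missing_modules Pod::Usage HTML::Parser Sys::Syslog Net::DNS \
    Mail::Audit Mail::Internet Net::SMTP Digest::SHA1 Net::Ident \
    IO::Socket::SSL ExtUtils::MakeMaker File::Spec BerkeleyDB \
    DB_File Mail::SpamAssassin
```

No output means every module loaded. If the newly built Perl is not first in PATH, adjust the perl command accordingly.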
Install Anti-Virus Software
Clam A/V was chosen because it's a free, simple-to-use open source
solution. Its installation is fairly trivial and requires little
in the way of configuration and maintenance. Pre-compile configuration
is as simple as setting:
--enable-shared --enable-static --enable-id-check \
--enable-bigstack --with-user=USER --with-group=GROUP \
--with-dbdir=/path/to/definitions/dir.
The first two options are fairly self-explanatory. If you are using
user namespace management other than /etc/passwd (e.g.,
LDAP) and your Clam A/V user exists only in the alternate namespace,
you must enable the --enable-id-check option. Enabling "bigstack"
support will increase the amount of memory consumed by the application
and prevent larger messages from exhausting pre-allocated memory space.
Because Clam A/V will be running in concert with the AMaViSd-new
process, you must set Clam A/V to function with the same user and
group IDs that AMaViSd-new uses. The --with-user=USER --with-group=GROUP
options do this. Typically, this is the shared user and group created
earlier (e.g., "amavis") for use by both daemons. The --with-dbdir=
option tells the Clam A/V daemon where to look for its virus definitions
and the freshclam definition update process where to write new virus
definitions to. Once the configuration script completes, simply
use make to build and make install to finish the installation
of the Clam A/V software.
Install AMaViSd-new Software
AMaViSd is a master filter process. It listens as a daemon/service
awaiting inbound traffic. It then takes that traffic and passes
it off to the other filters for which you have configured it to
act as a front-end. By default, if you already have SpamAssassin
and Clam A/V installed, it will pass off to them.
Like SpamAssassin, it is written in Perl. It also has a list of
modules on which it depends. As with SpamAssassin, you will need
to install these via CPAN:
cpan> install Archive::Tar
cpan> install Archive::Zip
cpan> install Compress::Zlib
cpan> install Convert::TNEF
cpan> install Convert::UUlib
cpan> install MIME::Base64
cpan> install Mail::Internet
cpan> install Net::Server
cpan> install Net::SMTP
cpan> install Digest::MD5
cpan> install IO::Stringy
cpan> install Time::HiRes
cpan> install Unix::Syslog
Several of these should already be installed and up to date -- especially
since some were previously required by the SpamAssassin install.
On some Unix systems, the Unix::Syslog module can sometimes fail
to build correctly from within the CPAN context. If the compile
fails, exit the CPAN environment and cd to ${HOME}/.cpan/build/Unix-Syslog-X.XXX.
Clean up the environment from the broken CPAN build by issuing make
distclean. Recreate the make files by issuing perl Makefile.PL.
Once Perl creates the new makefiles, you should be able to issue
a make followed by a make install and have the Unix::Syslog
module correctly built and installed. If this bombs, try GNU make,
instead.
The MIME::Parser module is also required; however, the CPAN version
lacks a patch. You can get the correct MIME::Parser module from
http://search.cpan.org/dist/MIME-tools. Download the latest
6.2xxx version and build it similarly to Unix::Syslog.
Once these modules are installed, AMaViSd should work (when activated).
This is a manual process that is detailed in the INSTALL file. Be
sure that you run AMaViSd under the same userid and groupid under
which you installed Clam A/V. Otherwise, all sorts of permissions/ownership
errors will crop up when attempting to run the software.
Configure Clam A/V
The Clam anti-virus software is configured via the file /usr/local/etc/clamav.conf.
This file is fairly well commented. Pick configuration values that
make sense for your environment. Listing 1 shows an example configuration.
There are a number of other options that can be specified and these
values are just for example purposes. The options are laid out in
the default configuration file as well as the Clam A/V man pages.
Like any virus scanner, Clam A/V is only as good as its virus
definitions are current. Clam A/V comes with an application called
freshclam, which can either be periodically run from cron or set
up as a daemon that polls for virus definition updates from the
Clam A/V project's definition server. To set it up to run as a daemon
that polls hourly for new definitions, start the freshclam
process as "freshclam -d -c 24" (-c sets the number of checks per day). The freshclam process will
also require proper configuration. The freshclam configuration file
is /usr/local/etc/freshclam.conf. If this file does not exist, create
it by copying the example configuration file included in the source
build directory. Modify it to suit your environment.
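Because freshclam's -c option counts checks per day rather than minutes, a little arithmetic helps when choosing a different schedule. The helper below is purely illustrative (check_interval_minutes is an invented name) and converts a checks-per-day value into the interval between polls.

```shell
#!/bin/sh
# Minutes between checks for a given number of checks per day.
check_interval_minutes() {
    echo $(( 1440 / $1 ))
}

check_interval_minutes 24   # the hourly schedule used in this article
check_interval_minutes 12   # a more relaxed schedule
```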
Configure AMaViSd
The AMaViSd software is configured via the file /usr/local/etc/amavisd.conf.
This file is fairly well commented. Pick configuration values that
make sense for your environment. Only a handful of changes are needed
to make the AMaViSd software work for your site. Listing 2 shows
an example configuration.
This example assumes that you are not going to run the AMaViSd
software as a centralized virus scanner for a number of networked
hosts. If you want AMaViSd to support multiple hosts, take special
care to modify the various "acl" directives. Also, AMaViSd will,
by default, scan for a number of packages at startup. If this behavior
is not desired, simply comment out the scans from the configuration
block. Of course, looking over all of these various scanner options
can give you ideas for other AMaViSd plug-ins with which to experiment.
Finally, if you will be doing extensive whitelisting or blacklisting,
it is recommended that you configure AMaViSd to use
external hash files. This is done by way of the read_hash() directive:
read_hash(\%whitelist_sender, '/path/to/whitelist.txt');
read_hash(\%blacklist_sender, '/path/to/blacklist.txt');
When the files are updated, the AMaViSd process will need to be bounced
to cause the changes to be read in.
Configure Postfix
Configuring Postfix is done in two main parts -- the baseline
configuration, and the anti-UCE controls. There is also one optional
part, the AUTH/TLS components. The baseline configuration ensures
a minimally functional Postfix configuration. The anti-UCE controls
help make the Postfix MTA a bit more resistant to being used to
deliver, send, or relay spam and other garbage. The optional AUTH/TLS
components allow Postfix to provide SMTP-authentication functions
to SMTP clients and to do automated, point-to-point encryption of
message contents between TLS-enabled SMTP servers. All of these
are configured within the Postfix configuration file, main.cf.
A baseline configuration will include such things as how the SMTP
server identifies itself, whom it trusts, where it finds helper
programs, etc. Listing 3 shows an example of such a configuration.
Note that the setgid_group value in this file should be set
to something different from the Postfix user's main group assignment.
Anti-UCE controls can be as simple as strictly enforcing RFC-specified
SMTP behavior and as complex as doing lookups against DNS-based
blacklists or using custom filters. The next example (see Listing
4) does a bit of all of this, the last of which is instructing Postfix
to pass off (more advanced) content-filtering functions to the AMaViSd
process.
Note that any or all of these rules may not be present in the
default Postfix configuration file because it's possible to have
a working Postfix configuration without these directives. These
optional directives simply improve the spam-killing ability of Postfix
by enabling certain built-in checking routines.
The last line in the configuration shown in Listing 4 also requires
a change to the Postfix master.cf file. Two services will need to
be added to the master.cf file, as follows:
smtp-amavis unix - - n - 2 smtp
    -o smtp_data_done_timeout=1200
127.0.0.1:10025 inet n - n - - smtpd
    -o content_filter=
    -o local_recipient_maps=
    -o relay_recipient_maps=
    -o smtpd_restriction_classes=
    -o smtpd_client_restrictions=
    -o smtpd_helo_restrictions=
    -o smtpd_sender_restrictions=
    -o smtpd_recipient_restrictions=permit_mynetworks,reject
    -o mynetworks=127.0.0.0/8
    -o strict_rfc821_envelopes=yes
    -o smtpd_error_sleep_time=0
    -o smtpd_soft_error_limit=1001
    -o smtpd_hard_error_limit=1000
The last section of the Postfix configuration is to enable SMTP-authentication
routines and/or host-to-host SMTP data encryption. If there is a need
to provide SMTP relay access to roaming clients or to a list of clients
that would be unwieldy to manage via explicit access controls, authenticated
access is the best way to go. It allows you to provide relay access
to those customers without turning the SMTP server into an abusable
"open mail relay". Creating an "open mail relay" is irresponsible
Internet behavior and will almost certainly result in the server getting
banned by the Internet at large.
By setting up TLS functions, two things can be accomplished --
SMTP relay authentication credentials can be made highly difficult
to compromise, and SMTP-to-SMTP email data can be encrypted. Given
the number of hops that an email may go through, the unknown nature
of who might be snooping on said emails, and the potentially sensitive
nature of the data sent, it makes sense to enable encryption even
if authentication functions are unneeded. The configuration shown
in Listing 5 accomplishes just that -- encryption without authentication.
It should be noted that, given this Postfix configuration file,
turning on authentication is as simple as changing smtpd_sasl_auth_enable
from "no" to "yes".
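One way to confirm which of these features the running server offers is to capture its EHLO response (for example, from a manual telnet session to port 25) and look for the STARTTLS and AUTH keywords. The sketch below assumes the response has been saved as text; the function name advertises and the sample response are hypothetical.

```shell
#!/bin/sh
# Read an EHLO response on stdin and succeed if the named capability appears.
# Carriage returns are stripped so that raw session captures also work.
advertises() {
    tr -d '\r' | grep -Eiq "^250[- ]$1( |=|\$)"
}

# Hypothetical response from a server with TLS enabled and AUTH disabled.
sample='250-smtp-gate.mail.domain
250-PIPELINING
250-STARTTLS
250 8BITMIME'

for cap in STARTTLS AUTH; do
    if printf '%s\n' "$sample" | advertises "$cap"; then
        echo "$cap offered"
    else
        echo "$cap not offered"
    fi
done
```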
The different map files listed in the example configuration segments
must at least exist and be placed into the proper format. For regular
access map files (such as the blacklist file), simply touch'ing
the file and running it through postmap will suffice. The
aliases file(s) is slightly different, in that it must be run through
the postalias command instead [2]. Attempting to run the
aliases file(s) through postmap will produce a flurry of
error messages.
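The distinction between the two indexing commands can be captured in a small helper. This is a dry-run sketch: the file names and the hash: table type are assumptions, and the function (map_command, an invented name) only prints the command that would be run for each file.

```shell
#!/bin/sh
# Print the indexing command appropriate for a Postfix lookup-table source file.
# Alias files need postalias; other map files use postmap.
map_command() {
    case $(basename "$1") in
        aliases*) echo "postalias hash:$1" ;;
        *)        echo "postmap hash:$1" ;;
    esac
}

# File names below are illustrative; substitute the maps named in main.cf.
for f in /etc/postfix/blacklist /etc/postfix/aliases; do
    echo "touch $f && $(map_command "$f")"
done
```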
Starting It All Up
Assuming no typos in any of the various daemons' configuration
files, the mail system should be about ready to go. The mail system
should be started in the following order: anti-virus (Clam A/V),
content-filter engine (AMaViSd), then MTA (Postfix). The Postfix
and AMaViSd packages both come with init scripts. Ensure that these
are installed into the root run control script directory (on System
V Unix systems, this is /etc/init.d). The Clam A/V software
will require the creation of an init script. On Solaris systems,
the following should do:
#!/bin/sh
#
# Manage the clamav service
CLAMDAEMON="/usr/local/sbin/clamd"
FRESHCLAM="/usr/local/bin/freshclam"
CLAMREFRESH="-d -c 24"
case "$1" in
    start)
        $CLAMDAEMON
        $FRESHCLAM $CLAMREFRESH
        ;;
    stop)
        pkill clamd
        pkill freshclam
        ;;
    *)
        echo "Usage: $0 [start|stop]"
        ;;
esac
The init scripts should then be linked into the appropriate run-level
directories. On a Solaris system, see Table 2 for recommended init
script locations. Adapt this setup as necessary for the appropriate
Unix flavor, taking special care to preserve the relative start orders.
Testing the Setup
Once all of the daemons are configured and running, a few tests
are needed to ensure correct functioning of the configuration. The
tests are: ability to send mail to remote systems, ability to receive
mail from remote systems, whether email is correctly flagged as
spam, and whether email is correctly flagged as containing viruses.
Testing Ability to Send Mail to Remote Systems
Depending on your operating system, you may not have tools to
correctly send email from the command line via Postfix. If this
is the case, the following test, using Postfix's Sendmail API, will
confirm outbound functionality:
# sendmail -t
From: testsender@smtp-gate.mail.domain
To: testrecept@mail.domain
Subject: TEST

TEST
.
If outbound email is functioning correctly, the test message
will arrive in the remote recipient's inbox.
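For repeated testing, the message can be generated rather than typed. The sketch below builds a minimal message with an empty line separating the headers from the body; the function name make_test_message is invented, and the addresses are the placeholders used above.

```shell
#!/bin/sh
# Print a minimal mail message: headers, one empty line, then the body.
make_test_message() {
    printf 'From: %s\nTo: %s\nSubject: TEST\n\nTEST\n' "$1" "$2"
}

make_test_message testsender@smtp-gate.mail.domain testrecept@mail.domain
```

Piping the output into "sendmail -t" submits it, with the recipient taken from the To: header.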
Testing Ability to Receive Mail from Remote Systems
Testing the ability to receive email is as simple as taking any
given mail client and sending a test message to an address on the
Postfix server. An address that is usually safe to send to is root@mail.server.FQDN.
If the message shows up in the root user's mailbox, inbound SMTP
is functioning correctly. If the message fails to show up, check
the system logs for answers.
Testing Virus Stopping
If the AMaViSd processes are properly configured, 'netstat'
should show something like the following:
127.0.0.1.10024 *.* 0 0 49152 0 LISTEN
127.0.0.1.10025 *.* 0 0 49152 0 LISTEN
This indicates that the AMaViSd listener (port 10024) and the Postfix
re-injection listener defined in master.cf (port 10025) are bound and listening.
Furthermore, the above indicates that they are listening only for
locally initiated traffic (preferred from a security perspective).
Listing 6 will test whether they are functioning correctly.
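The netstat check can also be scripted. The sketch below assumes that "netstat -an" is available and handles both the Solaris address format (127.0.0.1.10024) and the Linux format (127.0.0.1:10024); the function name has_local_listener is invented for this example.

```shell
#!/bin/sh
# Succeed if netstat -an style output on stdin shows a LISTEN socket bound
# to 127.0.0.1 at the given port.
has_local_listener() {
    grep -E "(^|[[:space:]])127\.0\.0\.1[.:]$1[[:space:]].*LISTEN" >/dev/null
}

for port in 10024 10025; do
    if netstat -an 2>/dev/null | has_local_listener "$port"; then
        echo "port $port: listening"
    else
        echo "port $port: not listening"
    fi
done
```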
The "X5O..." line (the EICAR test string, which begins with X5O, using
the letter O rather than a zero) is a special string of text used for testing
the virus scanning function. It should always result in at least
one SMTP "250" response statement that contains the token "BOUNCE".
An email warning should also have been sent to the user configured
to receive virus alerts (usually postmaster). It should look something
like:
From: virusalert@smtp-gate.mail.domain
To: virusalert@smtp-gate.mail.domain
Subject: VIRUS (Eicar-Test-Signature) FROM LOCAL <root@mail.domain>
If not, the virus scanner is malfunctioning. Check the system logs
for failure indications.
Testing Spam Stopping
This test also can be accomplished using any mail client desired.
It is critical that the message body contain the following text:
XJS*C4JDBQADN1.NSBN3*2IDNEN*GTUBE-STANDARD-ANTI-UBE-TEST-EMAIL*C.34X
Like the virus test, the above should result in an email alert being
sent to your spam alert account (usually postmaster). This alert email
should be addressed from "spam.police@mail.server.domain". Again,
if the alert email does not show up, check the system logs for failure
indication.
Lastly, confirm overall functionality of the mail system by sending
an email message through it and looking at the message headers.
If the message sent contains known spam indicators, output will
be similar to the below (truncated) headers:
Date: Wed, 3 Mar 2004 16:00:12 -0500 (EST)
From: Super-User <root@smtp.mail.domain>
Message-Id: <200403032100.i23L0CjP019360@smtp.mail.domain>
To: testuser@wsl.mail.domain
Subject: Header Test
X-Virus-Scanned: by amavisd-new at mail.domain
X-Spam-Status: No, hits=4.6 tagged_above=1.0 required=5.0
tests=BAYES_40, CLICK_BELOW, EXCUSE_REMOVE, FRONTPAGE,
HTML_70_80, HTML_LINK_CLICK_HERE, HTML_MESSAGE,
HTML_TITLE_EMPTY, MIME_HTML_ONLY, OFFERS_ETC
X-Spam-Level: ****
Pay special attention that an "X-Virus-Scanned:" line is present.
That proves that the message passed through the content filtering
agent.
If the test message was populated with spam-ish data, an "X-Spam-Status"
line should be present. This indicates that the message was scanned
for spam content, which spam tests it triggered, and
the cumulative score of those tests. If enough spam-ish
content was populated into the message, the "Subject:" line will
also be modified to indicate high spam content.
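To make the relationship between the hits and required values concrete, the following sketch reads an X-Spam-Status header and reports which side of the threshold the message falls on. It assumes the header has been unfolded onto a single line in the format shown above; the function name classify_spam_status is invented.

```shell
#!/bin/sh
# Read an X-Spam-Status header on stdin and print "spam" or "ham" by
# comparing the hits= value with the required= threshold.
classify_spam_status() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^hits=/)     hits = substr($i, 6)
            if ($i ~ /^required=/) req  = substr($i, 10)
        }
    }
    END { print ((hits + 0 >= req + 0) ? "spam" : "ham") }'
}

echo "X-Spam-Status: No, hits=4.6 tagged_above=1.0 required=5.0" | classify_spam_status
```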
Miscellaneous Content Blocks
Postfix also offers an in-built mechanism for blocking mails based
on header, body, and MIME content-type checks. These were mentioned
in passing by way of the Postfix configuration file example. The
specific configuration parameters are header_checks, body_checks,
and mime_header_checks. To quickly and easily block unwanted
attachments, place the following in your mime_header_checks
map file:
/name=[^>]*\.(ade|adp|app|asd|asf|asx|bas|bat|chm|cmd|com|cpl|crt|
dll|exe|fxp|hlp|hta|hto|inf|ini|ins|isp|jse?|lib|lnk|mdb|mde|msc|
msi|msp|mst|ocx|pcd|pif|prg|reg|scr|sct|sh|shb|shs|sys|vb|vbe|vbs|
vcs|vxd|wmd|wms|wmz|wsc|wsf|wsh|\{[^\}]+\}) /
REJECT E-mails with Unsafe Attachment Types Are Rejected by Rule.
The above is a regular expression that checks the "name=" field
of a MIME block. If that field names an offending attachment type
(e.g., .pif, .exe, or .dll), the email will be rejected and the
sending system given the error message "E-mails with Unsafe Attachment
Types Are Rejected by Rule". If you find that desired emails are being
discarded or rejected, try breaking the above regular expression
up into discrete rules, and assign a distinct error number to each
rule's message. This will help you locate which rule caused the
discard or bounce.
These three check mechanisms allow you to accept or reject email
based on a virtually limitless set of criteria. It's mostly a matter
of figuring out the regular expression needed to catch the items
that require action.
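The suggestion to split the attachment rule into discrete, numbered rules can be sketched as a small generator. This is a hedged illustration, not the article's configuration. The extension list is shortened, the "Rule NNN" message text is a made-up numbering scheme, and Python's re module is only an approximation of the regular expression engine Postfix uses, so any generated rule should be verified on the filter host before use.

```python
import re

# Shortened extension list for illustration; the article's rule covers many more.
EXTENSIONS = ["bat", "exe", "pif", "scr", "vbs"]


def build_rules(extensions):
    """Return (pattern, action) pairs, one numbered rule per extension."""
    rules = []
    for number, ext in enumerate(extensions, start=1):
        pattern = r"name=[^>]*\.%s" % re.escape(ext)
        # The numbering scheme in the message text is hypothetical.
        action = "REJECT Rule %03d: unsafe attachment type .%s" % (number, ext)
        rules.append((pattern, action))
    return rules


def render_map(rules):
    """Render the rules as one "/pattern/ action" line per rule."""
    return "\n".join("/%s/ %s" % (pattern, action) for pattern, action in rules)


def first_match(rules, header_line):
    """Return the action of the first rule matching header_line, else None."""
    for pattern, action in rules:
        if re.search(pattern, header_line, re.IGNORECASE):
            return action
    return None


if __name__ == "__main__":
    rules = build_rules(EXTENSIONS)
    print(render_map(rules))
    print(first_match(rules, 'Content-Type: application/octet-stream; name="invoice.PIF"'))
```

Because each rule carries its own number, a rejected message's error text identifies the rule that fired. Like the combined rule above, these patterns are unanchored, so an extension also matches when it is only the start of a longer suffix.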
Deployment Options
Given all of the above, the next question is how to deploy it.
There are two main ways to fit the filtering server into a mail
solution: as a standalone destination host (i.e., a system that
hosts user mail), or as an inline filtering device for a downstream
mail or mailbox server. There are also a few variations on the
inline content-filtering mail server: standard SMTP via MX precedence,
SMTP via transport mapping, and direct delivery to downstream mail
hosts using either SMTP or LMTP.
Use as an inline content-filtering device has two advantages. It
gives the downstream mail or mailbox servers protection that might
not otherwise be available to them, and it offloads the
content-filtering overhead to another host. Both advantages come
while leaving the downstream mail hosts essentially unmodified.
Filter Host as Destination
This is perhaps the simplest way to deploy this configuration.
In this scenario, mail that comes into the filter server is either
delivered locally to a valid recipient, or bounced (rejected). Postfix
determines whether a recipient address is a valid local recipient
by either consulting the system's user database (e.g., /etc/passwd),
or by using defined local user lookup maps. These alternate namespace
maps are configured using the local_recipient_maps directive.
There are additional parameters to configure when using alternate
name spaces, but detailing those options falls outside of the scope
of this article.
Mail delivery can be configured to deliver to users' home directories
by way of the home_mailbox directive, or to a common mail
spool directory by way of the mail_spool_directory directive.
Postfix can natively handle either the traditional Unix "mbox"
format or the newer "maildir" format for local message delivery.
For maildir format, simply add a "/" to the end of the name
specified through either the home_mailbox or mail_spool_directory
directive.
Local mail delivery will also require a way for users to collect
their mail. Although this is also outside of the scope of this article,
I have a few recommendations. For smallish implementations, the
Washington IMAP server is a fairly standard choice. For larger implementations,
Cyrus IMAP is probably a better choice. Both of these servers offer
email collection via POP3 and IMAP protocols and also support TLS-protected
sessions.
Lastly, local delivery will likely imply use of the server as
an SMTP relay for the POP/IMAP clients. If SMTP relay service is
to be provided to the POP/IMAP users, the relay service should make
use of SMTP authentication. Furthermore, to protect the login credentials,
TLS should be used to encrypt the SMTP authentication sessions.
Inline Filter Device -- SMTP via MX
The SMTP specifications provide a means by which mail servers
can be set up with delivery preference levels. For example, a given
email destination would have the highest preference, whereas hosts
that are intended to act as a backup for that server's deliveries
have lower preference. SMTP's behavior is to contact the highest
preference host that it can reach. Failing that, it will fall back
to the next highest preference host until it exhausts the list of
MX hosts.
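The fallback behavior described above can be modeled in a few lines. This is a simplified sketch of how a sending host might order its delivery attempts, not the logic of any particular MTA. It assumes the DNS convention that the most preferred MX host carries the lowest preference number, and it shuffles hosts that share a number. The host names in the example are hypothetical.

```python
import random


def delivery_order(mx_records, rng=random):
    """Order MX hosts for delivery attempts.

    mx_records is a list of (preference_number, hostname) pairs. Lower
    numbers are more preferred; hosts sharing a number are shuffled.
    """
    by_pref = {}
    for pref, host in mx_records:
        by_pref.setdefault(pref, []).append(host)
    ordered = []
    for pref in sorted(by_pref):
        hosts = by_pref[pref][:]
        rng.shuffle(hosts)
        ordered.extend(hosts)
    return ordered


def deliver(mx_records, is_reachable, rng=random):
    """Return the first reachable host in preference order, or None."""
    for host in delivery_order(mx_records, rng):
        if is_reachable(host):
            return host
    return None


if __name__ == "__main__":
    # Hypothetical MX set: two equal-preference filter hosts and one backup.
    records = [
        (20, "backup.mail.domain"),
        (10, "filter1.mail.domain"),
        (10, "filter2.mail.domain"),
    ]
    print(delivery_order(records))
    print(deliver(records, lambda host: host != "filter1.mail.domain"))
```

The shuffle among equal-preference hosts is what spreads load across parallel filter hosts, and the loop in deliver is what provides fallback when one of them is down.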
Given some minor "abuses" (uses that might not have been at the
core of the design), this can be used to enforce a mail flow. This
flow can be set up to require inbound email from the Internet to
pass through the content-filtering host before delivery to the destination
SMTP host. The "abuse" is to set up the highest preference
MX host so that it refuses inbound SMTP connections from any host other
than designated upstream SMTP servers. In this case, those would
be the content-filtering hosts. This can be accomplished via ACLs
within the destination SMTP application (e.g., TCP Wrappers), on the
destination SMTP host (e.g., IP Filter), or by way of firewall devices.
The MX preference list would then cause inbound traffic to flow
to the first host it could reach -- the content-filter host -- and
then on to the destination MX host. This flow is created via DNS.
Inline Filter Device -- SMTP Transport Mapping
Sometimes, it may seem advantageous to "hide" the last-hop MX
host in DNS. That is to say, do not advertise in public DNS that
your final destination SMTP host even exists. To do this, your upstream
mail hosts -- in this case, the content-filter host(s) -- will be
designated the last-hop MX host in DNS.
As last-hop MX hosts, emails to these systems would normally bounce
back to sender with a "mail loops back to myself" or similar error.
Normally, this error would be prevented by configuring the last-hop
MX for local delivery. To avoid this error without performing local
delivery, the Postfix process must be told what else to do with
the email. This explicit routing, which makes Postfix shunt email
on to the next-hop system, is set up with the transport_maps directive.
Typically, it will be defined as something like transport_maps
= dbm:/etc/postfix/transport_map. This tells Postfix to consult
a Berkeley DBM formatted lookup file (more efficient if lots of
explicit routes are required). Its entries will look similar to the
following:
.mail.domain smtp:[last-hop.FQDN]
mail.domain smtp:[last-hop.FQDN]
smtp-gate.mail.domain :
This map table instructs Postfix to deliver anything destined explicitly
for "mail.domain" or to any subdomain of "mail.domain" to the host
"last-hop.FQDN" (e.g., imap.mail.domain). The square brackets around
the "last-hop.FQDN" are critical, in that they instruct Postfix not
to attempt an MX lookup for that hostname, but to perform the delivery
directly to the named host. Records with a bare ":" right-hand
argument instruct Postfix to perform normal, MX-based delivery for
the hosts explicitly named on the left-hand side. In this case, those
are the names by which the content-filter host is known, which were
previously set in the Postfix configuration file with the myhostname
directives.
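To make the matching rules concrete, here is a simplified model of how a domain is resolved against the table above. It is a sketch under stated assumptions, not Postfix's actual lookup code. It tries the exact domain first and then each parent domain written with a leading dot, and it ignores other lookup forms such as full-address keys. The function names are illustrative.

```python
# The three table entries from the article, keyed by left-hand side.
TRANSPORT = {
    ".mail.domain": "smtp:[last-hop.FQDN]",
    "mail.domain": "smtp:[last-hop.FQDN]",
    "smtp-gate.mail.domain": ":",
}


def lookup(table, domain):
    """Return the table result for domain, or None if nothing matches."""
    domain = domain.lower()
    if domain in table:
        return table[domain]
    labels = domain.split(".")
    # Try ".parent" keys, from the closest parent outward.
    for i in range(1, len(labels)):
        key = "." + ".".join(labels[i:])
        if key in table:
            return table[key]
    return None


def split_result(result):
    """Split "transport:nexthop" into its two parts; either may be empty."""
    transport, _, nexthop = result.partition(":")
    return transport, nexthop


if __name__ == "__main__":
    for name in ("mail.domain", "wsl.mail.domain", "smtp-gate.mail.domain"):
        print(name, split_result(lookup(TRANSPORT, name)))
```

In this model the exact entry for smtp-gate.mail.domain wins over the ".mail.domain" wildcard, which is why the filter host's own name is not shunted on to the last-hop host.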
Final Thoughts
I'm a long-time Sendmail administrator, and I've found that fighting
spam with Sendmail, although possible, imposed significant overhead
on the systems on which it was used. It also proved to be annoyingly
slow and sometimes buggy, and it was not holding up well to the
increasing volumes of spam.
Sendmail can be configured to use helper programs called Milters.
However, the Milter I used, MIMEDefang, did not behave or perform
well in a Sun Solaris 9 Sendmail environment. This caused me to
search for a better solution. After searching mail-related forums,
I tried several different MTAs. I chose Postfix because of its good
combination of performance, extensibility, stability, and administrative
ease.
Postfix proved to be such a good solution that I relegated my
Sendmail server to be purely an SMTP relay for POP and IMAP customers.
Even with the virus and spam-filtering functions removed from the
Sendmail server, the Sendmail server was considerably slower than
either of the eventual Postfix content-filter servers. Note, however,
that the Sendmail server in question is a Sun Enterprise 250 with
2GB of memory and 2x400MHz CPUs; one Postfix server is a Sun Ultra
I with 512MB of memory and 1x167MHz CPU; and the other Postfix server
is a Sun Ultra II with 512MB of memory and 2x296MHz CPUs. A queue
flush of only a few hundred messages sent from the slowest Postfix
server to the Sendmail server would put a serious hurt on the Sendmail
server.
I also tested Postfix on a secondary MX (the Ultra I). When I
saw how well the secondary MX was functioning, it made sense to
offload the filtering work from the Sendmail host completely to
the secondary MX. Furthermore, it made sense not to make the filter
host a single point of failure. Therefore, an old Ultra II was allocated
to be a parallel filter host. Then MX rules were changed for all
of the domains trafficking the mail system to cause them to MX terminate
at the filter hosts. The filter hosts, in turn, implemented transport
maps to ensure delivery to the Sendmail server.
All of this experimentation proved that a reliable and fast mail
architecture could be created with just three hosts. The fact that
two hosts could act as a front-end for a back-end mail store indicates
that the solution was fairly scalable [3] in addition to being reliable
[4].
About two months after initially writing this article, I was able
to convince my employer to allow me to install the previously described
filtering system. These systems were deployed as in-line content
filters, in front of the corporate Exchange servers. After full
deployment of two filter hosts, a daily average of 72% of spam traffic
and nearly 100% of virus traffic was stopped at the filter hosts.
This greatly reduced the administrative load on the Exchange administrators
and significantly reduced the network, CPU, and disk usage
of the Exchange systems. Finally, it also greatly cut down on the
amount of spam received by the more spam-afflicted Exchange users.
Thanks to Robert Bastille for proofing and testing the original
manuscript. Thanks to members of the All Things Unix forum at DSL
Reports for reforming a Sendmail bigot.
Thomas H. Jones II currently works for a small consulting company
based in Virginia. Previously, he spent seven years in the ISP industry
and three years working for Enterprise Hardware Vendors.
1 When the installation script asks for "install_root:", all files
will be installed relative to this root. For example, if you set
install_root = /usr/local, everything will be installed under
/usr/local. This is useful if you want to run components of Postfix
chroot()ed.
2 If you already have a Sendmail aliases file, Postfix can be
made to understand it via the alias configuration directives, above,
and using postalias against the Sendmail alias file.
3 Scalability is provided by means of DNS. If the content-filtering
configuration is to include more than one filtering server, each
server should be configured with the same MX preference level. SMTP
traffic will tend to be distributed in a round-robin fashion across
the available filter servers. If/when the number of content-filter
servers is insufficient to support the mail flow to the last-hop
MX host, simply add more identically configured content-filter hosts
to the mail flow architecture. As more servers are added, traffic
will be spread across the new systems as the updated DNS information
propagates.
4 Availability comes through the fallback nature of MX preferences.
As previously noted, SMTP includes an availability fallback feature.
Therefore, if one of the content-filter hosts is offline for repair,
upgrade, etc., the calling SMTP host will attempt to connect to
the next available MX host. That will be whichever of the remaining
content-filter hosts it is able to contact first. It follows that,
ideally, a minimum of two content-filter hosts should be configured
in front of the downstream mail hosts.