Procmail
Hints and Hacks
Kevin Shortt
This article covers the popular mail delivery agent procmail,
written by Stephen R. van den Berg. If you are familiar with procmail
and its uses, then this article will suit you well. In it, I'll
describe some procmail filtering techniques that I have used as
an email administrator over the years. These recipes illustrate
how I filled some real-world needs with procmail.
I will begin with a simple backup recipe for all incoming mail.
No recipe gets simpler than this. It begins with its declaration
(:0) and the flag c. Essentially, this means to use
a copy (c) of the incoming message and place it in the folder
called "backup". Placing this recipe first in your .procmailrc file
will ensure that a copy of all incoming mail will be put aside before
any other recipe erroneously destroys it.
The recipe:
:0 c
backup
This can be useful if you are in the early stages of learning the
power of procmail. Additionally, as an administrator, it is useful
to place this into users' .procmailrc files as an easy way to restore
inboxes for users who remove their email from the server with a PC
client and then watch their PC die with lots of unread mail. I implement
this for some select cases in such a way that a backup copy of each
message is placed into a separate file with the date used as part
of the filename. Then, every night I delete any backup inboxes that
are older than three or five days. Each case is different. I will
expand further on the files named by date later in this article.
Auto-Responding
Next, let's look at a simple auto-responder. This auto-responder
recipe uses a copy of the incoming message, checks whether the message
is for a particular alias, extracts the sender, and then generates
a reply that includes a new subject and a pre-written message body
taken from a text file. It then emails a reply to the original sender.
The recipe:
:0 hc
* ^TOalias@sysadminmag.com
* !^FROM_DAEMON
* !^FROM_MAILER
* !^X-Loop: reader@sysadminmag.com
| (/usr/bin/formail -r \
-I"From: Firstname Lastname <reader@sysadminmag.com>" \
-I"Subject: *** OUT OF OFFICE ***" \
-A"X-Loop: reader@sysadminmag.com"; \
/bin/cat /home/reader/outofoffice.msg) \
| /usr/sbin/sendmail -oi -t
This recipe begins with its declaration (:0) and the flags
h and c. This line essentially means to use a copy (c)
of the incoming message and process only its headers (h). The
filter conditions follow next. The conditions are one or more lines
that begin with a "*". The first condition in this recipe verifies
that the incoming email message is actually intended for the address
you've chosen. This is not a necessary condition, but it is helpful
when you have alias email spooling into a shared mailbox and you only
want to send auto-replies for emails that were sent to the alias address.
The two lines "* !^FROM_DAEMON" and "* !^FROM_MAILER" tell procmail
to skip any incoming message that comes from a daemon or a mailer,
such as mailer-daemon or postmaster. This also helps prevent you from
sending an auto-reply when the incoming message is from a mailing list.
These two tokens are not literal text; procmail expands each of them
to a much larger regular expression. Please read the procmailrc(5)
man page (in the MISCELLANEOUS section) for further details.
There are many dangers associated with allowing your auto-responder
to post back to a mailing list. It is clearly not a good idea. The
line "!^X-Loop: reader@sysadminmag.com" will prevent a reply from
being generated if the email address (reader@sysadminmag.com) is
listed in an X-Loop: header of the incoming message. If a message
contains this X-Loop: header with our email address, it is likely
that this recipe has already replied to a message whose sending
email address also has an auto-responder set up. This prevents the
endless looping of replies.
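The loop check can be tried outside procmail. The following shell sketch approximates the X-Loop condition with standard tools. The function name has_xloop is hypothetical, and grep's fixed-string match is only a stand-in for procmail's own matching.

```shell
#!/bin/sh
# has_xloop ADDRESS: read one mail message on stdin and succeed (exit 0)
# when its header already carries "X-Loop: ADDRESS".
# Assumption: this only approximates procmail's "^X-Loop: ..." condition.
has_xloop() {
    # sed stops at the first empty line, so only the header is searched.
    # -i ignores case (as procmail does by default); -F turns off regex
    # metacharacters such as the dots in the address.
    sed '/^$/q' | grep -qiF "X-Loop: $1"
}

# Example: decide whether a reply should be suppressed.
if printf 'From: a@example.com\nX-Loop: reader@sysadminmag.com\n\nbody\n' |
    has_xloop reader@sysadminmag.com
then
    echo "X-Loop present, no reply"
fi
```

A message that only mentions the X-Loop text in its body does not count, because the search stops at the empty line that ends the header.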
The next line of our recipe is the action line. Procmail only
allows one action line. So for readability, the end of each line
is escaped with a backslash to indicate the line is to be continued
onto the next line. Our action uses formail, cat, and sendmail to
generate the recipe's response.
Formail is a component of the procmail package that can be used
as a filter to format or reformat an email message. The action line
in this recipe consists of two main components that each begin with
a pipe "|". The first component pipes the message to a single block
of parentheses that executes formail and /bin/cat to generate the
reply. The formail call prepares the headers for the reply message.
This step is crucial, because at this point in the recipe, we do
not know who to reply to; formail handles that for us here.
The "-r" flag instructs formail to generate an auto-reply header
addressed to the original sender. The "-I" flag inserts a field into
the header of the outgoing message, replacing any existing field of
the same name. The From field is set this way; leave that argument
out if you do not need it. The Subject header is likewise
replaced with our new one. If you would like to keep the existing
Subject header of the incoming message, simply omit the
-I"Subject: ..." argument. The "-A" flag is for appending custom headers
to the message.
This recipe adds the message header "X-Loop: reader@sysadminmag.com".
You should replace that with the address of the account sending the reply. This
header helps prevent mail loops with the given mail account. It
essentially prevents another auto-responder from responding to our
auto-reply. As discussed above, the recipe will not reply again
once this header is in the message. Once all of the arguments to
formail are complete, it is important to note that a semicolon is
necessary to be able to execute a second command inside the first
set of parentheses.
Before the close of the parentheses, the recipe shells out to
execute /bin/cat to grab the message body from a predefined file
in the user's home directory. Inside the file should appear the
text you would like the reader of the auto-reply to see as the body
of the message. The parentheses are typically used when multiple
shell commands are required to generate your reply. In this case,
the recipe uses formail and cat to generate a message that gets
piped to the second component of our recipe.
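The effect of the parentheses and the semicolon can be seen in isolation with a short shell sketch. Here reply_headers is a hypothetical stand-in for the formail call, and build_reply is an illustrative name. Neither is part of procmail.

```shell
#!/bin/sh
# Stand-in for "formail -r ...": prints reply headers followed by the
# empty line that ends the header. The function names are hypothetical.
reply_headers() {
    printf 'To: %s\n' "$1"
    printf 'Subject: *** OUT OF OFFICE ***\n'
    printf 'X-Loop: reader@sysadminmag.com\n'
    printf '\n'
}

# build_reply RECIPIENT BODYFILE: the parentheses run both commands in
# one subshell, so their output reaches the next pipe stage as a single
# stream, header first and body second. The semicolon separates the two
# commands inside the parentheses.
build_reply() {
    (reply_headers "$1"; cat "$2")
}

# Example: inspect the result with cat instead of piping it to sendmail.
# build_reply someone@example.com /home/reader/outofoffice.msg | cat
```

Piping the result to cat or a pager first is a cheap way to confirm the message looks right before sendmail is put at the end of the pipe.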
The second component is how the response is sent. The recipe now
shells out to sendmail to send the auto-reply. The argument "-oi"
tells sendmail not to treat a line containing a single dot as the end
of the message. The "-t" tells sendmail to read the recipients from
the headers of the message.
Adding an Attachment
Now let's expand this recipe to include adding an attachment with
the reply. I have had a few requests over the years that have asked
me to set up an auto-reply for an incoming email alias that would
reply with a PDF or an MSWord document with marketing information
regarding a client's products. For example, business Web site A
sells widgets. They offer free information for their widgets by
stating in their advertisements that the customer can simply send
email to "freeinfo@websiteA.com" to receive a free brochure via
email. A potential customer emails "freeinfo@websiteA.com" and a
reply is immediately sent out with a MIME attachment of a predefined
electronic brochure.
The following recipe expands on the previous one by changing the
layout of the message into a MIME-formatted message. (MIME stands
for Multipurpose Internet Mail Extensions and is defined in RFC
2045.) Essentially, the defined format allows each message and all
of its components, including body and attachments, to exist and
be printed as plain ASCII text. Consequently, each reply that our
recipe builds is plain ASCII text.
The recipe:
:0 hc
* ^TOalias@sysadminmag.com
* !^FROM_DAEMON
* !^FROM_MAILER
* !^X-Loop: reader@sysadminmag.com
| (/usr/bin/formail -r \
-I"From: Firstname Lastname <reader@sysadminmag.com>" \
-I"Subject: *** Free Info from WebSite A ***" \
-A"X-Loop: reader@sysadminmag.com" \
-I"MIME-Version: 1.0" \
-I"Content-Type: multipart/mixed; \
boundary=\"8323328-1766922534-967595046=:27818\"";\
echo "--8323328-1766922534-967595046=:27818";\
echo "Content-type: TEXT/PLAIN; charset=US-ASCII";\
echo "";\
/bin/cat /home/reader/FreeInfoGreeting.txt; \
echo "--8323328-1766922534-967595046=:27818";\
echo "Content-Type: APPLICATION/MSWord; \
NAME=\"brochure.doc\"";\
echo "Content-Transfer-Encoding: base64";\
echo "Content-Disposition: \
attachment; filename=\"brochure.doc\"";\
echo "Content-Description: MSWord Document";\
echo "";\
/usr/bin/mimencode -b /full/path/to/brochure/brochure.doc;\
echo "--8323328-1766922534-967595046=:27818--"\
) | /usr/sbin/sendmail -oi -t
Since I've explained most of this recipe previously, I will only elaborate
on the additions. I did not read the RFC to learn how to do this.
I essentially studied my inbox using vi. Then I wrote this recipe
to mimic a MIME-formatted message, which is a message created with
sections that are differentiated with unique boundary tags.
The additions to this recipe include adding a "MIME-Version: 1.0"
header that communicates to the mail client software (Mail User
Agent -- MUA) that the message is in MIME format. A Content-type
header is inserted to indicate to the MUA that the message is a
multipart/mixed message with a boundary tag of "8323328-1766922534-967595046=:27818".
This boundary tag is only a unique marker that separates the sections
of the MIME format. It's important to note that this string
can be anything reasonably unique. In this recipe, it means that
each section in the outgoing message will begin with "--8323328-1766922534-967595046=:27818"
and that the same line, followed by "--", will close the last section.
Note that each actual boundary must begin with a "--".
Inside each section, a Content-type header must also be inserted.
This communicates to the MUA the type of data that exists in the
current section. This allows the MUA to determine the appropriate
software to display the content to the user or to open it. In this
new recipe, we create a MIME-formatted message with only two
sections. More than two sections can be used; each one is introduced
by the boundary that was declared inside our first Content-type
header. One of our sections is "Content-type: TEXT/PLAIN" and the
other is "Content-Type: APPLICATION/MSWord". This means that we
have a MIME-formatted message with a body in simple plain text with
an attachment in MSWord format.
The body of the message (TEXT/PLAIN) was placed into a file called
"/home/reader/FreeInfoGreeting.txt". This was placed into the message
using the cat command. The MSWord document needed more work
in order to be included in the MIME format. A header "Content-Transfer-Encoding:
base64" was added to let the MUA know it needs to decode the attachment
using the Base64 method.
Before the MUA can decode the message, however, we must first
encode it. So, once the "Content-Disposition" and "Content-Description"
headers are added along with an empty line, the recipe creates the
MIME-encoded attachment by using the command-line executable /usr/bin/mimencode.
This creates a flat-text version of the MSWord document that is
easily decoded by almost every MUA. Finally, we wrap the attachment
with a final boundary. Please note the trailing two dashes "--",
which indicate to the MUA that this is the final boundary of our
message.
The easiest way to learn how this is done would be to send yourself
an email with an attachment. Then, view the flat email file on your
Unix server to see how it was put together. Be sure that you experiment
with different attachment types and MUAs. Also note the boundary
strings and how they vary. If you have a lot of time, you could
also read RFC 2045.
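The MIME layout described in this section can also be produced by a small shell function, which makes it easy to inspect the result before wiring it into a recipe. This is only a sketch under stated assumptions: the function name mime_body is hypothetical, and base64(1) from GNU coreutils is assumed as a stand-in for mimencode -b.

```shell
#!/bin/sh
# Boundary string taken from the recipe; any reasonably unique string works.
BOUNDARY="8323328-1766922534-967595046=:27818"

# mime_body GREETING ATTACHMENT: print a two-part multipart/mixed body.
# Assumptions: the function name is hypothetical and base64(1) replaces
# "mimencode -b" for the encoding step.
mime_body() {
    attach_name=$(basename "$2")
    echo "--$BOUNDARY"
    echo "Content-Type: TEXT/PLAIN; charset=US-ASCII"
    echo ""                 # empty line ends the part header
    cat "$1"
    echo ""                 # make sure the next boundary starts a new line
    echo "--$BOUNDARY"
    echo "Content-Type: APPLICATION/MSWord; NAME=\"$attach_name\""
    echo "Content-Transfer-Encoding: base64"
    echo "Content-Disposition: attachment; filename=\"$attach_name\""
    echo ""
    base64 "$2"
    echo "--$BOUNDARY--"    # trailing "--" marks the final boundary
}

# Example: mime_body /home/reader/FreeInfoGreeting.txt brochure.doc | less
```

Comparing this output with a real message saved from your inbox is a quick way to spot a missing empty line or a mistyped boundary.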
Halting Viruses
The next recipe I will present is a trick I use to stop viruses
from reaching my users' inboxes. Anti-virus software packages can
typically protect your computer better than the average procmail
recipe. However, sometimes analyzing the message before the vendor's
update is available can allow you to create a procmail recipe that
will work in the interim. I have used this method numerous times.
There are also many good procmail sites that offer free recipes
for virus defense. I definitely recommend that users implement them
if it makes sense for their system. The best way to stop an incoming
virus is to get hold of a few copies of it. Using vi or your favorite
text editor on your Unix machine, examine the message in plain text
with the encoded MIME format. Typically, there is a common text
string, a signature if you will, that can be gleaned from its attachment.
Once you determine that common signature, a simple recipe can be
written to filter out the incoming message. If I don't have the
time to gather the information myself, I can usually find it by
searching the Internet.
The recipe:
:0 B
* > 20000
* < 35000
* (8HcggH2gd2Bl7w1Admz|di\+Nn/wLVnu82DbWBlON\
fE1TBxRhYxk7Oot8JGkDgznr\+FtGdb2FwHSjjMmFGMe2a7iApeie)
* filename=".*\.(pif|exe|scr|zip|bat|cmd)"
* ^ *Content-Disposition: attachment;
/tmp/netsky
This recipe illustrates an example of how to stop an inbound virus.
This particular recipe will only stop two forms of a variant of the
NETSKY virus that was launched in January 2004. This recipe should
be placed into your server's global procmailrc file, which is typically
/etc/procmailrc. The recipe reads as follows: Match any message whose
body is larger than 20K yet smaller than 35K AND contains either the
string "8HcggH2gd2Bl7w1Admz" OR "di+Nn/wLVnu82DbWBlONfE1TBxRhYxk7Oot8JGkDgznr+FtGdb2FwHSjjMmFGMe2a7iApeie"
in the message body (indicated by the "B" after the declaration) AND
has a MIME part header with "Content-Disposition: attachment" AND an attachment
with a file extension of "pif" OR "exe" OR "scr" OR "zip" OR "bat" OR "cmd".
The matched strings were taken from the MIME-encoded attachment
of the actual virus. If an incoming message matches all of these
attributes, it will be delivered to the file /tmp/netsky. This will
keep the incoming matched messages out of your users' inboxes. The
message can also be thrown away immediately by saving it to /dev/null.
I always prefer, however, to save the messages to a file in /tmp
to ensure I have the ability to extract a message in the event of
a false capture.
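Before deploying a signature, it can be checked against a few captured samples with grep -E, whose extended regular expression syntax is close to procmail's. The sketch below uses the two strings quoted in this article; the function name matches_signature and the sample file are hypothetical. Note that "+" is a regular expression operator, so it is escaped here to match a literal plus sign. Unlike procmail, which ignores case unless the D flag is given, grep is case-sensitive by default.

```shell
#!/bin/sh
# The two signature strings quoted in the article, joined with "|".
# Each "+" is escaped so that it matches a literal plus sign.
SIG='(8HcggH2gd2Bl7w1Admz|di\+Nn/wLVnu82DbWBlONfE1TBxRhYxk7Oot8JGkDgznr\+FtGdb2FwHSjjMmFGMe2a7iApeie)'

# matches_signature FILE: succeed (exit 0) when FILE, holding one raw
# message, contains either signature string.
# Assumption: the function name is hypothetical.
matches_signature() {
    grep -Eq "$SIG" "$1"
}

# Example (hypothetical sample file):
# matches_signature /tmp/sample.msg && echo "signature found"
```

Running the same check over a folder of ordinary mail gives a rough idea of how likely a false capture is.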
This file can be purged as often as you like. This recipe is only
a very small illustration of the many ways procmail can be used
to filter incoming viruses. I still recommend that you deploy
professional-grade anti-virus software to protect against incoming
email viruses.
Handling Spam
In the world of filtering email, filtering by date can be very
useful. There are many great uses for this concept. One of the ways
I have implemented this concept is with SpamAssassin. I use SpamAssassin
for filtering spam on an email server. The following recipe takes
all incoming messages that have been processed by SpamAssassin and
checks for the "X-Spam-Status: Yes" header. It files the message away
upon a match. In my implementation, I file it to a dedicated filesystem
and into a mail file with the current date as part of the filename.
Then, each evening via crond, I run a script that parses the spam
directory structure that I devised and deletes any spam file older
than two weeks. This gives each user two weeks to verify that the
captured spam has no false positives. A symbolic link named "spam"
inside a user's mail directory gives that user access to their spam
folder on the separate filesystem. It also reduces the effort I spend
on managing the growth of the filesystem.
For this solution, I have used variables to set the folder name.
It is important that the folder name be in a directory that is already
created. This way, spooling into filenames of a pre-defined format
is easy. A pre-defined format enables you to write a script that
uses the find command to delete spam files older than a given
period. Place the recipe variables near the top of your .procmailrc
file and place the recipe just below your SpamAssassin recipe.
The variables:
USERNAME=reader
SPAMFOLDER=/spam/reader
DATE=`/usr/bin/date +%Y-%m-%d`
The recipe:
:0:
* ^X-Spam-Status: Yes
$SPAMFOLDER/$USERNAME-$DATE
The variables are straightforward. Set the USERNAME variable to your username.
Set the SPAMFOLDER variable to the directory in which you will be
placing daily spam files and ensure that this path exists before deploying
the recipe. Set the DATE variable to the output of the date command
by enclosing the command in backticks. Change
the arguments to /usr/bin/date to suit your needs. I set this one
to create a string of YYYY-MM-DD.
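The nightly cleanup mentioned earlier can be sketched in a few lines of shell. The function names and the 14-day default are illustrative assumptions, not the script I actually run; adjust the path and the age to match your own layout.

```shell
#!/bin/sh
# spam_file_name FOLDER USER: print today's mail file name in the same
# FOLDER/USER-YYYY-MM-DD form that the recipe writes to.
spam_file_name() {
    printf '%s/%s-%s\n' "$1" "$2" "$(date +%Y-%m-%d)"
}

# purge_old_spam FOLDER [DAYS]: delete regular files under FOLDER that
# were last modified more than DAYS days ago (14 when DAYS is omitted).
# Assumption: the function names and the default are illustrative.
purge_old_spam() {
    find "$1" -type f -mtime +"${2:-14}" -exec rm -f {} +
}

# Example, suitable for a nightly cron job:
# purge_old_spam /spam/reader 14
```

Because each daily file stops changing once the day is over, its modification time is a reasonable proxy for the date in its name.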
This concept is also great for role accounts. Email that arrives
for postmaster could be filtered and archived in the same manner.
It is often mail of this type that gets deleted without being read.
Archiving for a set number of days, weeks, or months could be applied
and help put an end to wasting resources. It can also be distributed
with INCLUDERC commands in the global procmailrc file if you are
using SpamAssassin for all users on your server.
In conclusion, procmail is a great tool for systems administrators.
If you are creative enough, you can solve many problems using procmail.
I have only scratched the surface of some of the things that can
be done with it. There are many sites on the Internet that offer
free recipes to filter incoming viruses and spam. The procmail Web
site has many links at the bottom to get you started. Other resources
for information include the procmailex and procmailrc
man pages, as well as your favorite search engine. I have been using
these resources and procmail for years and will continue to do so
for years to come.
References
IETF RFC 2045 (MIME) -- http://www.ietf.org/rfc/rfc2045.txt
Procmail -- http://www.procmail.org
SpamAssassin -- http://www.spamassassin.org
Sendmail -- http://www.sendmail.org
Man pages -- procmail(1), procmailex(5), procmailrc(5)
Kevin Shortt has been working in a Unix Systems Engineering
and Administration role with occasional programming for almost seven
years. He is a graduate of the University at Buffalo with two Bachelor
of Science degrees in Computer Science and Business Administration.
He enjoys computers, playing soccer, and spending time with his
wife and three children. He can be reached at: shortt@cgicafe.com.