
Implementing an Effective Abuse Management Process -- Part I

Luis Muñoz

As a systems administrator, you likely have an important role in keeping your network secure. This is a complex and often resource-intensive task in today's networks. Two activities that are basic to ensuring security are enforcing compliance with established site policies and maintaining an incident response process.

Many security incidents are in essence a transgression of site policies, commonly referred to as "abuse" in Internet parlance. Those instances of abuse must be dealt with, which is where an incident response process becomes relevant.

A good Abuse Management Process (AMP) combines compliance enforcement with incident response in a way that is scalable, reliable, cheap, and swift. In this series of three articles on abuse management, I will outline some of the lessons I learned about building an AMP from scratch, including some tips and tricks that fellow sys admins might find useful.

You should be aware, however, that there isn't a universal AMP. Different organizations have different resources, needs, skills, and objectives. If we break down the AMP, though, we'll find a few sub-processes that apply to a wide range of operations. Figure 1 shows the four sub-processes that I will highlight in this series.

One of the challenges often faced by sys admins and security professionals is the justification of proper resources to implement the company's AMP. The resources needed vary greatly from site to site, based on factors such as the size and purpose of your network, the number and expertise of your end users, and how long you have been operating without a proper AMP.

I will show you how to build the basic software infrastructure to support a reasonably automated AMP, which can easily be the basis for much more complex response processes. It can also serve to estimate the workload your staff would have to deal with. You're likely to have the required hardware resources in your network already, anyway.

Throughout this series, I will be using Mail::Abuse extensively. Mail::Abuse is a Perl package that I wrote to help with the AMP for a large ISP where, along with a few scripts and a mostly automatic process, it keeps the abuse under control. In this first article, I'll describe the Mail::Abuse package and focus on receiving and analyzing abuse reports.

Receiving Abuse Reports

This is the first and most important part of the AMP. After all, you cannot manage your abuse reports if you're unable to get them on time. This process, however, is not as simple as setting up an email account and reading your reports a couple of times a week.

To begin, there are the issues of consistency. Fellow sys admins and security staff will use two sources of information to report abuse originating from your network -- WHOIS information, and the well-known abuse@domain address. You do want to make it easy to talk to your anti-abuse team. If you're not convinced, go look at http://www.rfc-ignorant.org/ for an example of the prestige you can bring to your organization by having a contact address that bounces with a permanent error.

The WHOIS information for both your domain and your address space is relevant. Make sure that all the information is as current and accurate as possible. Have your RIR or ISP list proper WHOIS information for all the IP space you've been assigned. Make sure your domain has a working abuse@domain address as instructed by RFC 2142.

Using aliases to route the abuse reports to various sys admins is a very common practice. Note, however, that this setup increases the chances that email sent to the abuse@domain address will bounce. For instance, if you have 10 recipients in the abuse@domain alias, then the likelihood of getting a bounce from a full mailbox is roughly 10 times higher, because a single full mailbox is enough to generate a bounce. Bounced mail tends to be interpreted as a sign that the site's AMP needs improvement.
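The "10 times higher" estimate is a rule of thumb. As a rough sketch, assume each recipient's mailbox is full independently and with the same probability p (both are simplifying assumptions, and the 1% figure below is purely illustrative). The chance that at least one of n recipients bounces is then 1 - (1 - p)^n, which stays close to n times p while p is small. The example is written in Python for brevity; it is not part of Mail::Abuse.

```python
def bounce_probability(p_full: float, recipients: int) -> float:
    """Chance that at least one of `recipients` mailboxes is full and bounces."""
    return 1.0 - (1.0 - p_full) ** recipients

# Illustrative assumption: each mailbox is full 1% of the time.
single = bounce_probability(0.01, 1)
alias_of_ten = bounce_probability(0.01, 10)

print(f"one recipient:  {single:.4f}")
print(f"ten recipients: {alias_of_ten:.4f}")
print(f"ratio:          {alias_of_ten / single:.2f}")
```

The ratio comes out slightly below 10, and it drops further as individual mailboxes become more likely to be full.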

Also, you must think about exempting mail sent to your abuse@domain address from anti-virus software and other restrictions. This will require increased security measures for the team members who have to handle the reports, but it is something that many abuse teams expect.

Remember to configure reverses (IN PTR DNS Resource Records) for all of your IP space. Your domain name must be used in those reverses so people wanting to talk to you about abuse can do so. Note also that, to many sites, operating your network without proper reverses configured is a sign of a severe clue deficit on your part.

I recommend setting up abuse@domain and the rest of your published contact addresses as aliases pointing to a real email account and setting a mail quota as high as you reasonably can. It might also be a good idea to have different email addresses for different sources (e.g., one for domain WHOIS information, another for IP space WHOIS information, etc.). However, always configure your abuse@domain address.

The email should be delivered to a mail folder without file locking for increased performance. Using procmail as the delivery agent (or through a .forward file) eases this task. Here is a .procmailrc file that I use to accomplish this:

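SHELL=/bin/bash
# Backticks run the command: one archive folder per month, e.g. 2005-06
DATEDIR=$HOME/archive/`date +%Y-%m`
# Create the folder on first use; DUMMY exists only to trigger the command
DUMMY=`test -d $DATEDIR || mkdir -p $DATEDIR`
UMASK=022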

# Refuse mail from myself
:0
* ? formail -x"From:" -x"Reply-To:" -x"Errors-To:" \
         | egrep -i 'thisaccount@[^, ]*domain'
/dev/null

# Refuse bounces
:0
* ? formail -x"From:" | egrep -i MAILER-DAEMON
/dev/null

# I want to see cron mail
:0
* ? formail -x"From:" | egrep -i Cron
!my_real_email_address@domain

# Keep a backup of the message safely stored.
# The trailing slash requests Maildir delivery, so no lock file is needed.
:0
$DATEDIR/

This creates a dated sub-directory in Maildir format within ~/archive, where messages can be easily processed by our friend, the find command. So far, this has all been very easy, so let's move on to more complex tasks.
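To make the resulting layout concrete, here is a minimal Python sketch that walks the archive much as find would. It assumes the standard Maildir sub-folders (new, cur, and tmp) inside each dated directory; the function name archived_reports is made up for this illustration.

```python
from pathlib import Path

def archived_reports(archive_root: Path) -> list[Path]:
    """Return every delivered message file under the dated Maildir folders."""
    reports = []
    # Each dated folder (e.g. 2005-06) is a Maildir with new/, cur/ and tmp/.
    for maildir in sorted(p for p in archive_root.iterdir() if p.is_dir()):
        for sub in ("new", "cur"):      # tmp/ only holds partial deliveries
            folder = maildir / sub
            if folder.is_dir():
                reports.extend(sorted(f for f in folder.iterdir() if f.is_file()))
    return reports
```

Calling archived_reports(Path.home() / "archive") would list one path per stored report, which is the unit of work for the rest of the process.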

Installing Mail::Abuse

Mail::Abuse is written in Perl. Any Perl release after 5.6.0 will do, although a recent version is preferable. You can download Mail::Abuse from the Comprehensive Perl Archive Network (CPAN) using the CPAN shell:

$ perl -MCPAN -e shell
cpan> install Mail::Abuse

Installing modules with this procedure is usually easier because all dependencies are taken care of. If all goes well, you should have the module and its utilities installed on your machine after a few minutes. Then you will be ready to move on to the rest of this article.

Analyzing the Abuse Reports

Abuse reports come in a wide variety of flavors. Some include lengthy descriptions, vast fragments of the logs, disclaimers, and text in various languages. Some are tailor-made from scratch by a fellow sys admin. Others are produced by automated processes. However, even with this complexity, many interesting things can be accomplished with a rather simple methodology.

To begin, let's define some terms and concepts. The first one will be the "report", which is the name we will assign to each and every message sent to our abuse mailbox. But let's face it; you will also get spam, clueless abuse reports, and other niceties in that mailbox. We need a bit more definition.

We will classify reports as "valid" or "invalid". A "valid report" contains, in essence, a set of tuples (attack source, type, timestamp, victim). The victim is really not that important, as we're concerned with identifying the source, which is the part within our network. By discarding the victim from the tuple, we're left with (attack source, type, timestamp). We will call each of these an "incident".

An "invalid report" will be any message from which we're unable to extract incidents. How to deal with these messages is up to you, according to your AMP.
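These definitions map naturally onto a small data structure. The following Python sketch is only an illustration of the terminology; the class names, field names, and type labels are assumptions and do not reflect how Mail::Abuse represents reports internally.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class Incident:
    source: str          # attack source: an IP address inside our network
    kind: str            # type of activity, e.g. "spam" (labels are illustrative)
    timestamp: datetime  # when the activity was observed

@dataclass
class Report:
    raw_text: str
    incidents: list[Incident]

    @property
    def is_valid(self) -> bool:
        # A report is valid only if at least one incident could be extracted.
        return len(self.incidents) > 0

good = Report("log excerpt", [Incident("192.0.2.10", "spam", datetime(2005, 6, 1, 12, 30))])
bad = Report("Buy cheap watches!", [])
print(good.is_valid, bad.is_valid)
```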

Incidents can be further classified according to the type of activity to which they refer. This may become a very useful tool for investigations, mitigation, and trend reporting.

With all these taxonomy concepts now behind us, defining the analysis stage is much simpler. In this stage, we simply extract all possible incidents from the valid reports. To do this, we start with the abuse reports that were placed in a Maildir folder. Each report is stored in an individual file under the ~/archive directory, in a sub-directory named after the date on which the report was received.

Before actually dealing with these reports, there are a few steps we need to take. I'll describe them while we allow more reports to accumulate in our archive.

Preparing the Data Sources

One of the challenges of automating the analysis of messages has to do with figuring out when a report really makes sense. In my experience, an automated tool can achieve impressive success rates if fed with adequate information about your operation.

The Mail::Abuse configuration file includes rules about which IP space you want to work with and within which time window. For example, this allows you to ignore reports for networks outside of yours (yes, sometimes people report abuse to the wrong party) or reports that refer to an incident so old that there is no way to determine the real source anyway.

The Mail::Abuse distribution comes with two sample configuration files:

./etc/abuso.config
./etc/sample.conf

The abuso.config file is a bare-bones configuration you can use to get going. The sample.conf file, as its name implies, is a sample that shows most of the interesting things that the different modules can do. As the comments in both files tell you, you should read the documentation for the modules you intend to use. This is easy enough to do with a command such as:

$ perldoc Mail::Abuse

For reference, these are the modules we'll use in this article (but feel free to read the documentation for the rest of the modules and components):

Mail::Abuse -- This is the base module, which includes a "start here" type of documentation. It is listed here just as a reminder of how worthwhile reading the documentation is; we won't be using it directly in this article. Note, however, that you should update it frequently, because features will likely be added. Always try to run the latest version available on CPAN.

Mail::Abuse::Incident::Log -- Mail::Abuse includes a number of "incident parsers", that is, specialized parsers that look for specific kinds of incidents. Code is included to check for Received: headers, SpamCop reports, etc. This is mostly interesting from a statistical point of view. But in practice, you want to extract the information from the reports and deal with it as soon as possible. This module is just for that.

This module knows how to recognize an IP address, a timestamp in various formats, and a few keywords that give an idea of what the report is about. It then processes the report a few lines at a time and correlates all the timestamps and IP addresses found. Each correlation is then promoted to an incident and passed to other modules to work their magic.
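The general idea of correlating addresses and timestamps can be sketched in a few lines of Python. This is not the algorithm used by Mail::Abuse::Incident::Log: it understands a single timestamp layout, does not validate octet ranges, and uses a keyword table and a three-line window that are assumptions chosen for the example.

```python
import re
from datetime import datetime

IP_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
# Only one timestamp layout is handled here: "2005-06-01 12:30:45".
TS_RE = re.compile(r"\b\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\b")
KEYWORDS = {"spam": "spam", "scan": "scan", "virus": "virus"}  # illustrative

def extract_incidents(text: str, window: int = 3):
    """Pair each IP address with the most recent timestamp found on the
    same line or within the previous `window` lines."""
    lowered = text.lower()
    kind = next((label for word, label in KEYWORDS.items() if word in lowered),
                "unknown")
    incidents = []
    last_ts = None  # (line number, datetime) of the latest timestamp seen
    for lineno, line in enumerate(text.splitlines()):
        match = TS_RE.search(line)
        if match:
            last_ts = (lineno, datetime.strptime(match.group(), "%Y-%m-%d %H:%M:%S"))
        for ip in IP_RE.findall(line):
            if last_ts and lineno - last_ts[0] <= window:
                incidents.append((ip, kind, last_ts[1]))
    return incidents

sample = """Your host has been scanning our network.
2005-06-01 12:30:45 connection from 192.0.2.10 to port 22
2005-06-01 12:30:47 connection from 192.0.2.10 to port 23
"""
for incident in extract_incidents(sample):
    print(incident)
```

Each tuple produced this way corresponds to one (attack source, type, timestamp) incident as defined earlier.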

Mail::Abuse::Incident::Normalize -- We've all been witnesses to the HTML-email debate, but no matter where our own preferences lie, we will be getting abuse reports in HTML. We'll also get forwarded abuse complaints. And abuse complaints within attachments. And, well, you get the idea.

This module tries to flatten the abuse report. If a report arrives in HTML, it will convert it to plain text. If the text is indented or improperly wrapped, it will correct it most of the time. You should always include this module.
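As a rough idea of what flattening involves, the Python sketch below strips HTML markup with the standard library parser and removes leading indentation and "> " quoting. It is a simplified stand-in and not the Normalize module's logic; it does not re-wrap text or unpack attachments, and the set of tags treated as line breaks is an assumption.

```python
from html.parser import HTMLParser

class _TextExtractor(HTMLParser):
    BLOCK_TAGS = {"p", "br", "div", "tr", "li"}  # tags that start a new line

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip = 0  # depth inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)

def flatten(body: str, is_html: bool) -> str:
    """Return plain text with markup, indentation and quote markers removed."""
    if is_html:
        parser = _TextExtractor()
        parser.feed(body)
        parser.close()
        body = "".join(parser.parts)
    # Strip leading spaces and ">" quote markers, then drop empty lines.
    lines = [line.lstrip("> ").rstrip() for line in body.splitlines()]
    return "\n".join(line for line in lines if line)

print(flatten("<p>Attack from <b>192.0.2.10</b></p><p>at 12:30</p>", True))
```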

Mail::Abuse::Filter::IP and Mail::Abuse::Filter::Time -- Some of the incidents found by the parsers will likely be bogus or impossible to process because they do not belong to our network, are too old, etc. These two modules allow for the removal of incidents that occurred outside of a specific time window or that refer to a source outside of the desired network. Essentially, these modules eliminate some noise from this process.
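The two filtering criteria are simple to express. The Python sketch below shows the idea with the standard ipaddress module; the network (a documentation range), the seven-day window, and the function name are assumptions for the example, and the real modules take their settings from the Mail::Abuse configuration file.

```python
import ipaddress
from datetime import datetime, timedelta

MY_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]  # documentation range
MAX_AGE = timedelta(days=7)                           # illustrative window

def keep_incident(ip: str, when: datetime, now: datetime) -> bool:
    """True if the incident is inside our IP space and inside the time window."""
    address = ipaddress.ip_address(ip)
    in_network = any(address in net for net in MY_NETWORKS)
    in_window = timedelta(0) <= now - when <= MAX_AGE
    return in_network and in_window

now = datetime(2005, 6, 10)
incidents = [
    ("192.0.2.10", datetime(2005, 6, 8)),    # ours and recent: kept
    ("198.51.100.7", datetime(2005, 6, 8)),  # not our network: dropped
    ("192.0.2.10", datetime(2005, 1, 1)),    # too old: dropped
]
kept = [i for i in incidents if keep_incident(i[0], i[1], now)]
print(kept)
```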

Mail::Abuse::Processor::Store -- In many cases, you will want to archive the processed report, along with all the data about it, for later analysis. This is the job of this module.

With this module, you can place the processed reports, along with all the incidents found and some more data, in a single file in the filesystem. This allows for later access to the analysis results and even a peek at some of the relevant data that was used.

Mail::Abuse::Processor::Mailer -- In many cases, abuse departments use a plain auto-responder to acknowledge the message. The message returned is static and does not inform the sender whether the time she took to make the report was well spent or not.

With this module, you will be informing the third party that the message was received and whether it allowed for the identification of an incident. This is a nice place to include some instructions about how to properly submit the report (do not use HTML, do not send attachments, etc.).

You will need two auto-responses. The first is needed to confirm that indeed the report was received and incidents were recognized. The second tells the submitter that you received a report but its data was not recognizable. I will refer to these responses as good-response.txt and bad-response.txt.

Mail::Abuse::Processor::Radius -- Some administrators are responsible for operating networks where RADIUS is used to grant access through VPNs, dial-up modems, DSL routers, etc. This module can read and interpret various versions of access logs and match them with the data from the current incidents to trace the offending activity back to a username.
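The matching step boils down to finding the session that held a given address at a given moment. The Python sketch below assumes the logs have already been parsed into session records; the record layout, usernames, and times are hypothetical and say nothing about the log formats the Radius module actually reads.

```python
from datetime import datetime
from typing import NamedTuple, Optional

class Session(NamedTuple):
    username: str
    ip: str
    start: datetime
    stop: datetime

# Hypothetical sessions, as they might look after parsing the access logs.
SESSIONS = [
    Session("alice", "192.0.2.10", datetime(2005, 6, 1, 8, 0), datetime(2005, 6, 1, 11, 0)),
    Session("bob", "192.0.2.10", datetime(2005, 6, 1, 11, 5), datetime(2005, 6, 1, 14, 0)),
]

def who_had(ip: str, when: datetime) -> Optional[str]:
    """Return the user holding `ip` at `when`, or None if no session matches."""
    for session in SESSIONS:
        if session.ip == ip and session.start <= when <= session.stop:
            return session.username
    return None

print(who_had("192.0.2.10", datetime(2005, 6, 1, 12, 30)))
```

Because dynamically assigned addresses change hands, the timestamp matters as much as the address. Clock skew and time zone differences between the reporter and your logs may call for some tolerance around the session boundaries.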

Mail::Abuse::Processor::Table -- In many cases, a network will be split into different subnets for different purposes, such as servers, dynamic allocations, Web hosting, etc. These allocations can be fed into a table, which will then be used by this module to match incidents against. This is useful to map abuse complaints to networks where you do not manage the IP address assignment and do not have access to more detailed logs.
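A table lookup of this kind can be sketched as a most-specific-prefix match. In the Python example below, the allocations and labels are invented for illustration, and the table format is not the one the Table module expects.

```python
import ipaddress

ALLOCATIONS = {  # hypothetical allocation table
    "192.0.2.0/24": "servers",
    "192.0.2.128/25": "web hosting",
    "198.51.100.0/24": "dynamic allocations",
}

# Sort so that the most specific (longest) prefix is tried first.
TABLE = sorted(
    ((ipaddress.ip_network(cidr), label) for cidr, label in ALLOCATIONS.items()),
    key=lambda entry: entry[0].prefixlen,
    reverse=True,
)

def classify(ip: str) -> str:
    """Return the label of the most specific allocation containing `ip`."""
    address = ipaddress.ip_address(ip)
    for network, label in TABLE:
        if address in network:
            return label
    return "unknown"

for ip in ("192.0.2.5", "192.0.2.200", "203.0.113.1"):
    print(ip, "->", classify(ip))
```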

Mail::Abuse::Processor::Explain -- In our application, we will assume that the flow of abuse complaints is small enough that we can forward the processed reports to an email address. This module places the information about the incidents found, as well as the processed report, conveniently in the process's STDOUT so other parts of our pipeline can grab it and forward it to the proper email address.

Other options include automating other parts of the process via scripting. This is what I've done in networks where no effective AMPs previously existed.

Mail::Abuse::Reader::Stdin -- This module is responsible for bringing the report in. Mail::Abuse ships with a POP3 reader, a Google Groups reader, and this Stdin reader. This allows for the extraction of reports from various sources. In our setup, we will pass the reports using this module because there will be a shell pipeline feeding the reports from the archive into abuso.
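Reading a report from standard input amounts to parsing one email message from a byte stream. The Python sketch below shows that step with the standard email package; it illustrates the concept only and is unrelated to the Stdin reader's own code. The sample message and addresses are made up.

```python
import io
from email import message_from_binary_file, policy

def read_report(stream):
    """Parse one email message from a binary stream such as sys.stdin.buffer."""
    return message_from_binary_file(stream, policy=policy.default)

# A made-up report standing in for a file fed through a pipeline.
raw = (b"From: reporter@example.org\r\n"
       b"Subject: Abuse from 192.0.2.10\r\n"
       b"\r\n"
       b"Log lines follow.\r\n")

report = read_report(io.BytesIO(raw))
print(report["From"], "|", report["Subject"])
body = report.get_body(preferencelist=("plain",))
print(body.get_content())
```

At the end of a shell pipeline, the same function would be called as read_report(sys.stdin.buffer), once per archived file.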

For security reasons, you should not run this code as root in production. I recommend setting up a dedicated account for this whole process and then adjusting the permissions accordingly.

Conclusion

We now have the basics in place for a good AMP. The software is installed and the complaints are being archived conveniently for processing. This comes in handy because we'll need a few examples to work with in the next part of this series, where I'll discuss how to configure and customize the Mail::Abuse software for our network. Later, I'll discuss how to archive the abuse reports and extract some basic statistics from them to learn more about the kind of trouble that our network might be causing others.

Luis has been working in various areas of computer science since the late 1980s. Some people blame him for conspiring to bring the Internet into his home country, where currently he spends most of his time teaching others about Perl and fighting network abuse at the largest ISP there. He also believes that being a sys admin is supposed to be fun.