Implementing an Effective Abuse Management Process -- Part III

Luis E. Muñoz

In the two previous articles on abuse management, I described the Mail::Abuse package, discussed receiving and analyzing abuse reports, and showed how to configure the abuso script for the network. In this third and last part, I show how to test your setup so that Mail::Abuse automatically analyzes the abuse complaints sent to the contact addresses at your site.

You can download Mail::Abuse from the Comprehensive Perl Archive Network (CPAN) using the CPAN shell:

$ perl -MCPAN -e shell
cpan> install Mail::Abuse

Testing the Abuso Script

Testing is important, because the Abuse Management Process (AMP) is very visible and says a lot about your organization. For starters, I suggest you triple-check the templates you use for all your answers. Make sure they do not contain typos or errors. Any email address you give must not bounce. All URIs in them must work. Do not use HTML for the auto-responses.
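
To collect every address and URI from a template before checking them, a small shell function can help. This is only a sketch: the name list_template_links is hypothetical, the patterns are deliberately loose, and it assumes a grep that supports the -o option (GNU and BSD grep do). Review its output by hand.

```shell
#!/bin/sh
# list_template_links FILE
# Print each distinct URI and email address found in a response template,
# one per line, so they can be checked by hand or fed to another tool.
# Hypothetical helper; the regular expressions are approximate on purpose.
list_template_links() {
    {
        grep -oE 'https?://[^[:space:]<>"]+' "$1"
        grep -oE '[[:alnum:]._%+-]+@[[:alnum:].-]+\.[[:alpha:]]{2,}' "$1"
    } | sort -u
}
```

Run it against each template file in turn, then send a test message to every address and open every URI it lists.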

At this stage, we will do some simple testing. We will create a fake RADIUS detail file in /tmp and craft our own abuse reports to see how the abuso script responds with the configuration we set up previously. This makes the results repeatable and the test cases easier to generate. Note that you can also grab a fragment of your real detail file for the testing, but then you would need to craft the test cases on your own:

#!/bin/sh

RADIUS_DETAIL=/tmp/radius/details

# Create the fake RADIUS detail file, with an entry for "now"
mkdir -p ${RADIUS_DETAIL}
date > ${RADIUS_DETAIL}/detail
echo '          Acct-Session-Id = "35000004"
          User-Name = "bob"
          NAS-IP-Address =  192.168.6.1
          NAS-Port = 1
          NAS-Port-Type = Async
          Acct-Status-Type = Stop
          Acct-Session-Time = 144000
          Acct-Input-Octets = 22
          Acct-Output-Octets = 187
          Acct-Terminate-Cause = Host-Request
          Service-Type = Login-User
          Login-Service = Telnet
          Login-IP-Host = 192.168.6.91
          Acct-Delay-Time = 0
' >> ${RADIUS_DETAIL}/detail

Because the detail file was created under /tmp/radius/details, you must change the configuration file accordingly. You can then create a few "fake" reports to test all this, using this script, which will leave the reports in files named report-new.<n> and report-old.<n> under the ./rep directory:

#!/usr/bin/perl

use strict;
use warnings;
use IO::File;
use NetAddr::IP;

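my $rep = 1;

# The reports are written under ./rep, so make sure it exists
-d './rep' or mkdir './rep' or die "Cannot create ./rep: $!\n";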

my $headers = q{From: <some.abuse@tld.INVALID>
To: <abuse@your.domain>
};

for my $ip (map { NetAddr::IP->new($_)->first->addr }
              qw(192.168.6.91 192.168.0/24 192.168.1/24 192.168.2/24
                 192.168.3/24 192.168.4/24 192.168.1.36 192.168.5.0/25
                 192.168.5.128/25 192.168.6/24 10/8))
{
    # "Fresh" report, timestamped now
    my $fh = IO::File->new;
    $fh->open(">./rep/report-new.$rep")
        or die "Cannot write ./rep/report-new.$rep: $!\n";
    print $fh $headers, "Subject: New abuse report for $ip\n\n\n";
    print $fh "$ip on ", scalar(gmtime),
    " did something nasty (virus)\n\n";
    $fh->close;

    # "Stale" report, timestamped 15 days ago
    $fh->open(">./rep/report-old.$rep")
        or die "Cannot write ./rep/report-old.$rep: $!\n";
    print $fh $headers, "Subject: Old abuse report for $ip\n\n";
    print $fh "$ip on ", scalar(gmtime(time - 15*24*3600)),
    " did something nasty (virus)\n\n";
    $fh->close;
    $rep++;
}

Each report produced by this script will use an IP address (actually, the first address in the subnet), a dummy text, and a timestamp to compose an abuse report. Note that we fed this with all the interesting subnets that we know about, plus one we don't. The timestamps will be "fresh" or "stale" for the new and old reports, respectively.

And finally, you can check the process manually. Note that we will be specifying the processors in the command line so that Mail::Abuse::Processor::Explain can tell us what happened. Here I assume that you have the data table ready and in its proper place.

For brevity, I will show only one test. You should test all the cases produced by this script and verify that the explanation produced matches your test data:

$ abuso -c /usr/local/etc/abuso.conf -R Stdin -P Table,Radius,Explain \
    < ./rep/report-new.1

#================================================================
#Incident explanation by Mail::Abuse::Processor::Explain
#$Id: Explain.pm,v 1.2 2004/11/21 02:44:14 lem Exp $
#================================================================

# 192.168.6.91/32 January 21, 10:50:51 2005 (-0400)
+-radius
| +-radius.{Acct-Authentic}=RADIUS
| +-radius.{Acct-Delay-Time}=0
| +-radius.{Acct-Input-Octets}=22
| +-radius.{Acct-Output-Octets}=187
| +-radius.{Acct-Session-Id}="35000004"
| +-radius.{Acct-Session-Time}=144000
| +-radius.{Acct-Status-Type}=Stop
| +-radius.{Acct-Terminate-Cause}=Host-Request
| +-radius.{Login-IP-Host}=192.168.6.91
| +-radius.{Login-Service}=Telnet
| +-radius.{NAS-IP-Address}= 192.168.6.1
| +-radius.{NAS-Port}=1
| +-radius.{NAS-Port-Type}=Async
| +-radius.{Service-Type}=Login-User
| +-radius.{User-Name}="bob"
+-table
| +-table.{corp}
| | +-table.{corp}.{device}=Corporate Dialup Pool
| | +-table.{corp}.{manager}=me@my.domain
+-type=log/virus

#================================================================
#No more incidents to explain. The recovered report body follows.
#================================================================

192.168.6.91 on Fri Jan 21 14:50:51 2005 did something nasty (virus)

With the "old" incidents, you should get nothing back, because the simulated incident is older than the threshold specified in the configuration file:

$ abuso -c ./abuso.conf -R Stdin -P Table,Radius,Explain \
    < ./rep/report-old.1
$

Your results might be slightly different but, in general, you should observe incidents being recognized in reports matching your address space, within the time window allowed, and no incidents otherwise. If you find a situation that you do not understand, you can invoke the verbose and debug mode. Also, some modules allow the setting of a "debug" flag in the configuration file.
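
To run every generated case in one pass, a loop along these lines may help. It is a sketch that reuses the same abuso options as the manual test; the function name run_cases and the ABUSO and ABUSO_CONF variables are my own illustrative choices, not part of Mail::Abuse.

```shell
#!/bin/sh
# run_cases DIR
# Feed every report-* file in DIR to abuso and print, for each file,
# whether any explanation came back ("incident") or not ("nothing").
# ABUSO and ABUSO_CONF are illustrative variables, not part of Mail::Abuse.
ABUSO=${ABUSO:-abuso}
ABUSO_CONF=${ABUSO_CONF:-/usr/local/etc/abuso.conf}

run_cases() {
    for report in "$1"/report-*; do
        [ -f "$report" ] || continue
        # Same invocation as the manual test, one report at a time
        out=$("$ABUSO" -c "$ABUSO_CONF" -R Stdin -P Table,Radius,Explain < "$report")
        if [ -n "$out" ]; then
            echo "$report: incident"
        else
            echo "$report: nothing"
        fi
    done
}
```

Called as run_cases ./rep, it should print "nothing" for every report-old file and for any new report whose address lies outside the address space you configured. Anything else deserves a closer look with the manual test.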

After this simple test, you can point your configuration files to the real detail files and "fake" some more reports to test as shown. Make sure that your real information can be used by abuso before going further.

Archiving the Abuse Reports

Regardless of the details of your specific AMP, it is likely that accessing the abuse reports is an important part of it. abuso can help you archive your reports along with all the relevant data gathered during the analysis phase.

Mail::Abuse::Processor::Store is the module to achieve this. Let's test this with an "old" report first:

$ abuso -c /usr/local/etc/abuso.conf -R Stdin \
    -P Table,Radius,Explain,Store < ./rep/report-old.1
$

At our site, this command stored a processed abuse report under arch/empty/. The filename is a hash made from the contents of the report, so processing the same report again overwrites the previous results. This is actually desirable in practice, because it reduces the chance of storing the same report more than once.
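
The effect of naming files after their contents can be illustrated with a few lines of shell. This is not how Mail::Abuse computes its filenames; cksum merely stands in for whatever digest the module uses, and content_name is a hypothetical helper.

```shell
#!/bin/sh
# content_name FILE
# Print a name derived only from the contents of FILE.
# cksum stands in for the digest Mail::Abuse really uses (an assumption).
content_name() {
    cksum < "$1" | awk '{ print $1 }'
}
```

Two copies of the same complaint yield the same name, so storing the second one replaces the first instead of adding a duplicate.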

The processed abuse report is actually a binary file that contains everything that was within the abuse report object managed by abuso. You can get to the information using the acat command. But before that, let's process a "fresh" report to get samples to work with:

$ abuso -c ./abuso.conf -R Stdin -P Table,Radius,Explain,Store \
    < ./rep/report-new.1
...
$

This report should end up at the bottom of a hierarchy of directories under arch/ that reflects the current date. The location differs because, in the first case, no valid incidents passed the filters applied to the abuse report. We'll call this an "empty" report. The second case produced one incident, whose date was used to decide where to store the processed report.

Empty reports usually are spam or malformed messages, so you may want to take a look at those. You can do this with a tool provided with Mail::Abuse called "acat" (short for abuse-cat). The acat utility has a few options to get to specific information bits of the processed abuse reports. The easiest way to invoke it is like this:

$ acat arch/2005/01/21/b92aa9b9405b20be346febfebff8c291
From: <some.abuse@tld.INVALID>
To: <abuse@your.domain>
Subject: New abuse report for 192.168.6.91

192.168.6.91 on Fri Jan 21 14:50:51 2005 did something nasty (virus)

$

Without any options, the original report is sent to STDOUT. You can also inquire about the "incidents" found with the -i option, as in:

$ acat -i arch/2005/01/21/b92aa9b9405b20be346febfebff8c291
arch/2005/01/21/b92aa9b9405b20be346febfebff8c291: [0] Fri Jan 21
10:50:51 2005, Mail::Abuse::Incident::Log: data=192.168.6.91 on Fri
Jan 21 14:50:51 2005 did something nasty (virus)\n\n
ip=192.168.6.91/32 radius={ Acct-Authentic => RADIUS Acct-Delay-Time
=> 0 Acct-Input-Octets => 22 Acct-Output-Octets => 187
Acct-Session-Id => "35000004" Acct-Session-Time => 144000
Acct-Status-Type => Stop Acct-Terminate-Cause => Host-Request
Login-IP-Host => 192.168.6.91 Login-Service => Telnet NAS-IP-Address
=> 192.168.6.1 NAS-Port => 1 NAS-Port-Type => Async Service-Type =>
Login-User User-Name => "bob" } table={ corp => { device =>
Corporate Dialup Pool manager => me@my.domain } } time=1106319051

type=log/virus

It's all in there in a big chunk. But often, you will want to access a specific bit of information. Let's say you're interested in the specific username. For this, you can use the -m option. In this example, we will get the username associated with the cause for this abuse report:

$ acat -m radius.User-Name \
    arch/2005/01/21/b92aa9b9405b20be346febfebff8c291
   arch/2005/01/21/b92aa9b9405b20be346febfebff8c291 [1]:
radius.User-Name="bob"
$

You can access any other attribute, such as the type of incident, manager name, etc. Separate several attributes with colons to get them at once. Let's get the username and the device name:

$ acat -m radius.User-Name:table.corp.device \
    arch/2005/01/21/b92aa9b9405b20be346febfebff8c291
   arch/2005/01/21/b92aa9b9405b20be346febfebff8c291 [1]:
radius.User-Name="bob" table.corp.device=Corporate Dialup Pool
$
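
Coming back to the empty reports mentioned earlier, a loop such as the following could print each of them for review. It is a sketch that assumes the arch/empty/ location seen at our site; show_empty and the ACAT variable are illustrative names.

```shell
#!/bin/sh
# show_empty ARCHIVE
# Print every stored report under ARCHIVE/empty, each preceded by a
# separator line naming the file. ACAT is an illustrative variable.
ACAT=${ACAT:-acat}

show_empty() {
    find "$1/empty" -type f | sort | while read -r report; do
        echo "==== $report"
        # Keep acat away from the file list arriving on standard input
        "$ACAT" "$report" < /dev/null
    done
}
```

Piping the output of show_empty arch through a pager gives a quick way to spot spam or malformed messages.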

If you have proceeded this far and all has been well, then it's time to have abuso take a peek at your real abuse complaints. We previously left those piling up in a directory via a nice procmail recipe. If you really need Mail::Abuse, you're likely to have some "real" samples in that pile by now.

As a test, process a few real reports manually. Remember to change back your configuration to point to the real archive where you want to keep your real reports. We'll test as shown in the following example:

$ abuso -c /usr/local/etc/abuso.conf -R Stdin \
    -P Table,Radius,Explain,Store < ./archive/2005-01/msg.beef

#================================================================
#Incident explanation by Mail::Abuse::Processor::Explain
#$Id: Explain.pm,v 1.2 2004/11/21 02:44:14 lem Exp $
#================================================================

# 192.168.1.17/32 January 21, 08:29:23 2005 (-0400)
+-table
| +-table.{corp}
| | +-table.{corp}.{manager}=fred@my.domain
| | +-table.{corp}.{purpose}=1st floor workstations
+-type=log/spam
# 192.168.1.17/32 January 21, 08:31:41 2005 (-0400)
+-table
| +-table.{corp}
| | +-table.{corp}.{manager}=fred@my.domain
| | +-table.{corp}.{purpose}=1st floor workstations
+-type=log/spam
# 192.168.1.17/32 January 21, 08:31:42 2005 (-0400)
+-table
| +-table.{corp}
| | +-table.{corp}.{manager}=fred@my.domain
| | +-table.{corp}.{purpose}=1st floor workstations
+-type=log/spam
# 192.168.1.17/32 January 21, 08:32:37 2005 (-0400)
+-table
| +-table.{corp}
| | +-table.{corp}.{manager}=fred@my.domain
| | +-table.{corp}.{purpose}=1st floor workstations
+-type=log/spam
# 192.168.1.17/32 January 21, 08:38:37 2005 (-0400)
+-table
| +-table.{corp}
| | +-table.{corp}.{manager}=fred@my.domain
| | +-table.{corp}.{purpose}=1st floor workstations
+-type=log/spam


#================================================================
#No more incidents to explain. The recovered report body follows.
#================================================================
...

Great. For this example, it looks as if abuso could find likely sources of the problem. I omitted the actual abuse complaint to save space, but it would have followed the "explanation". Note that, in itself, this tool is already a time-saver provided that you have detailed information about your network that you can place in the source table for abuso to correlate automatically.

This report was also stored, so we can take a look at any field we care about. Say we want to know some more information:

$ acat -m table.corp.purpose:table.corp.manager:type:ip \
    archive/2005/01/21/5ce4e5946390a21012d5033907279029
   archive/2005/01/21/5ce4e5946390a21012d5033907279029 [1]:
table.corp.purpose=1st floor workstations
table.corp.manager=fred@my.domain type=log/spam ip=192.168.1.17/32
   archive/2005/01/21/5ce4e5946390a21012d5033907279029 [2]:
table.corp.purpose=1st floor workstations
table.corp.manager=fred@my.domain type=log/spam ip=192.168.1.17/32
   archive/2005/01/21/5ce4e5946390a21012d5033907279029 [3]:
table.corp.purpose=1st floor workstations
table.corp.manager=fred@my.domain type=log/spam ip=192.168.1.17/32
   archive/2005/01/21/5ce4e5946390a21012d5033907279029 [4]:
table.corp.purpose=1st floor workstations
table.corp.manager=fred@my.domain type=log/spam ip=192.168.1.17/32
   archive/2005/01/21/5ce4e5946390a21012d5033907279029 [5]:
table.corp.purpose=1st floor workstations
table.corp.manager=fred@my.domain type=log/spam ip=192.168.1.17/32

As you can see, any number of simple scripts can find this information, grep for relevant data, and route the results to the appropriate person.
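
As an example of such a script, the following sketch lists the stored reports that name a given manager. It assumes the table.corp.manager key from my sample table and relies on a plain substring match against the output of acat -m; reports_for_manager and the ACAT variable are hypothetical names.

```shell
#!/bin/sh
# reports_for_manager ARCHIVE ADDRESS
# Print the path of each stored report in which at least one incident
# lists ADDRESS as table.corp.manager. ACAT is an illustrative variable;
# the match is a plain substring search on acat's output.
ACAT=${ACAT:-acat}

reports_for_manager() {
    find "$1" -type f | sort | while read -r report; do
        if "$ACAT" -m table.corp.manager "$report" < /dev/null |
            grep -qF "table.corp.manager=$2"; then
            echo "$report"
        fi
    done
}
```

The resulting list could then be mailed to the manager or handed to whatever ticketing system you use. Because the match is a substring search, an address that is a prefix of another one will match both.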

At this point, we're ready to start processing the abuse reports. We will use cron for this task. Depending on how much data you have in your RADIUS detail files or actual tables, the speed of your machine, and the classes of your typical abuse reports, abuso will take more or less time to complete. As a general rule, you can assume that during U.S. business hours, you will get roughly three times more abuse reports. This information can help you decide how often to run the abuso process. You do not want more than one instance running at the same time, unless you can be sure they are crunching on different sets of abuse reports.

This script makes it easy to process pending abuse reports. We will assume that you will run abuso each hour. I recommend modifying the abuso.conf file to remove the email processor while testing is performed:

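#!/bin/sh

# Where the pending abuse reports are kept
ARCHIVE_BASE=$HOME/archive/
# Temporary list of the reports handled by this run
LIST=/tmp/complaint-list.$$
# Pick up files changed within the last 60 minutes
AGE=-60

find "$ARCHIVE_BASE" -type f -cmin $AGE > "$LIST"
# Remove the reports only if abuso finished successfully
xargs abuso -c /usr/local/etc/abuso.conf < "$LIST" && \
  xargs rm -f < "$LIST"
rm -f "$LIST"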

We call this script "process-abuse", and we can add it to any reasonably modern crontab file:

7 * * * * $HOME/process-abuse

This will run the abuso process and get rid of the processed reports at the seventh minute of every hour. I like to run hourly tasks at different "minutes", so that they are easier to tell apart. Also, this spreads the load, reducing bottlenecks.

You could also separate the two steps, processing the reports every hour and removing the processed ones on a different schedule. This is entirely up to you, but I've found that the method shown works very well and tends to save more disk space.
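
Since overlapping runs are to be avoided, one way to guard the cron job is a lock directory. The following is a minimal sketch of that idea, not something shipped with Mail::Abuse; with_lock and the lock path are hypothetical.

```shell
#!/bin/sh
# with_lock LOCKDIR COMMAND [ARGS...]
# Run COMMAND only if LOCKDIR can be created; mkdir is atomic, so two
# overlapping cron runs cannot both succeed. Returns 75 when another
# run holds the lock. The lock path is a hypothetical choice.
with_lock() {
    lockdir=$1
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "another run holds $lockdir, skipping" >&2
        return 75
    fi
    "$@"
    status=$?
    rmdir "$lockdir"
    return $status
}
```

Inside process-abuse, the abuso step could then be wrapped as with_lock /tmp/process-abuse.lock followed by the original command. Keep in mind that a run killed halfway leaves the lock directory behind, so it has to be removed by hand before the next run can proceed.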

Now you have two choices. Either flip a few pages in this magazine and read another interesting article for a couple of hours, or you can run $HOME/process-abuse manually. No matter which you choose, once the script has been run, pending abuse reports will have been analyzed and moved out of the way. The .procmailrc file will have forwarded us any errors, so they will be sitting nicely in our mailbox.

Double-check the response messages specified in your configuration file, and once you're satisfied, add Email to the list of processors in your abuso.conf file. Do not forget to test that email actually goes out of your box by processing a few reports manually.

If all has gone well, now you have a fully functional, automatic abuse report analysis tool that will help you do your job and also let your fellow sys admins know about your efforts.

Conclusion

All this work has accomplished something that has been proven many times to be a real life saver in an AMP: automated correlation of the incident source and the incident itself. Although this is important, keep in mind that it is only a part of the AMP. You still have to work at ensuring reliable data about the incident source, such as the IP space maps for your network, DHCP/RADIUS logs, and other pieces of information.

You also need to devote time and effort to dealing with the incidents, which can now be correlated more easily. Keep in mind that the ultimate way to avoid abuse complaints is to stop the abuse as fast as you can. Check periodically for new versions of all the software you use, including Mail::Abuse. Remember that other people use this same software and may already be solving a problem you have, or one you don't yet know you have.

Luis has been working in various areas of computer science since the late 1980s. Some people blame him for conspiring to bring the Internet into his home country, where currently he spends most of his time teaching others about Perl and fighting network abuse at the largest ISP there. He also believes that being a sys admin is supposed to be fun.