
Autosniff -- A Sniffer-Starting Daemon

Ed Ravin

A network sniffer, a program that captures data on a network connection, is a wonderful tool for solving network problems. The only thing wrong with a sniffer is that someone has to start it before the problem happens, and stop it after the problem has been reproduced. This isn't an issue if the person with the network problem (i.e., the "customer") is in the same office with you or can call you at your desk when they're ready to test with you. In a support environment, however, your customers might be on the other side of the world, working while you're asleep or out of the office. Or, they might be using dial-up connections that have a different IP address every time they call, making it hard to focus on their traffic until they're actually dialed in.

One way to capture the network activity from these situations is to start the sniffer and have it capture an entire class of traffic (e.g., all mail traffic). Then, when you hear from the customer that they've reproduced the problem, stop the sniffer, and sift through the trace file looking for the customer's traffic.

This procedure is fraught with danger -- for example, you might capture too much data and run out of disk space, interfering with other programs on the sniffer machine. When you do find the customer's data, you might have ten packets showing the problem and ten thousand packets that are the customer's normal traffic, with no easy way to tell which is which. And then, there's the issue of privacy -- unless you can zoom in on one customer's traffic, and only the traffic that constitutes their network problem, you will end up sifting through data that you don't want to see, like your customer's passwords, mail, and other private information.

This article describes a set of shell scripts that will let your customer start the sniffer, record a trace of the problem, and automatically mail you the results when the data capture is complete.

Autosniff listens on your network for a "trigger event", which is a packet that is unlikely to appear in normal use. After the trigger event occurs, Autosniff starts the sniffer for a specific time interval, listening for packets from the host that sent the trigger (presumably the customer). When the time interval is over, Autosniff stops the trace and sends email to remind you to pick up the trace file.

Generating the trigger event presents a bit of a challenge: it needs to be something simple enough that it can be generated by a non-technical user, on any operating system, and with simple instructions from the sys admin. However, the trigger event also must be reasonably unique, something that doesn't usually appear on the network. I chose to have the customer open a connection to an unlikely port number on the destination host as the trigger event.

For example, suppose I am a network administrator at the "example.com" company and my customer is having trouble using the server "mail.example.com". Let's further suppose the customer is using a Windows or Mac program that supplies cryptic error messages that are of little or no help in diagnosing the problem. I would rather have a network trace to work with, so after starting Autosniff, I give the customer these instructions:

1. At the command line, type in "telnet mail.example.com 64000", or give your Web browser the URL "http://mail.example.com:64000".

2. You'll see an error message of some kind. Just ignore it.

3. Reproduce the problem with your mail program.

4. For the next five minutes, refrain from doing anything that might contact mail.example.com, so we don't accidentally capture normal mail or other irrelevant information.

Any user who has either a Web browser or a telnet client should be able to generate the trigger event. And unless mail.example.com is getting a full portscan every day, chances are that no one else will try to open a connection to that host on port 64000. Nevertheless, the port number must be chosen carefully -- make sure that no service is running on that port and that it is not one of the ports commonly hit by the random probes that every Internet host receives on a daily basis.
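One way to confirm that nothing is already listening on the candidate port is to look through the output of "netstat -an" on the destination host. The function below is only a sketch: port_is_listening is a hypothetical helper, not part of Autosniff, and it assumes the listener lines in your netstat output contain the word LISTEN with the local address ending in ":port" (Linux style) or ".port" (Solaris and BSD style).

```shell
#!/bin/sh
# Sketch: report whether a TCP port shows up as a listener in "netstat -an"
# output. port_is_listening is a hypothetical helper, not part of Autosniff.
# It reads netstat output on stdin, so it can also be run on saved output.
port_is_listening() {
    awk -v p="$1" '
        /LISTEN/ {
            # Match an address field that ends in ":PORT" or ".PORT".
            for (i = 1; i <= NF; i++)
                if ($i ~ ("[.:]" p "$")) found = 1
        }
        END { exit !found }
    '
}

# Usage:
#   netstat -an | port_is_listening 64000 && echo "port 64000 is in use"
```

This only tells you about services on the host itself. Whether the port attracts random probes from outside is something you would have to judge from your own firewall or IDS logs.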

Autosniff was designed to be sys admin friendly. Here's how the "example.com" sys admin might invoke it for a mail server problem:

# autosniff
Enter a short alphanumeric name for this job: JohnDoe
Enter an (optional) description: user can't authenticate to sendmail
Hostname that customer will use to trigger autosniff: mail.example.com
Port number to use for trigger: 64000
Capture filter to use once triggered [ip]: port 25
Number of minutes to run sniffer after trigger [5]:
Mail address for notifications [root]: helpdesk@example.com

Some of Autosniff's questions have default answers (shown inside the square brackets); press Return on an empty line to accept the default. The job name and description identify this particular job: you might have multiple Autosniff jobs running at the same time, and you'll need to be able to tell them apart. This information is included in every email that Autosniff sends.

As described previously, you'll need to pick a trigger hostname and port number. Usually, you'll use the same host that the customer will be connecting to and an oddball port number that is not in use. If you have another Autosniff job waiting to be triggered, don't use the same port number again, because the first customer who calls in will trigger both jobs simultaneously.

You also need to pick a capture filter for focusing on the customer's traffic after the trigger event occurs. The default filter is "ip", that is, all IP traffic to and from the IP address that sends the trigger event. In this example (because we're trying to solve a Sendmail problem), we limit the traffic to port 25, the port that Sendmail listens on.

After you've typed in the trigger information and the capture filter, Autosniff will perform a "sanity check" to catch any syntax errors or bad hostnames. If an error is found, you will be asked to re-enter all the network-related information. Finally, you will be asked how long to run the sniffer once it is triggered and to which email address the notifications should be sent.
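To give a feel for what such a sanity check might involve, here is a minimal sketch of two input checks written as shell functions. These are illustrations only, not the code from the Autosniff listings: valid_port and host_resolves are hypothetical names, and host_resolves assumes the getent command is available (as it is on Linux and Solaris).

```shell
#!/bin/sh
# Sketch of two input checks; both function names are hypothetical and
# are not taken from the Autosniff scripts.

# valid_port: succeed only if $1 is a decimal integer from 1 to 65535.
valid_port() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;    # empty, or contains a non-digit
    esac
    # Reject overlong strings before the numeric comparison.
    [ "${#1}" -le 5 ] && [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

# host_resolves: succeed if the system's name service can resolve $1.
# Assumes getent(1) exists on this platform.
host_resolves() {
    getent hosts "$1" >/dev/null 2>&1
}

# Usage:
#   valid_port 64000 || echo "bad port number"
#   host_resolves mail.example.com || echo "cannot resolve hostname"
```

Checking the syntax of the capture filter itself is harder to do by hand, since only the sniffer really knows what it accepts.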

Once Autosniff is running, it will send you an email to let you know it is waiting for a trigger event. The email will also contain a reminder of what information to send the customer. Note that there is no timeout when waiting for a trigger -- Autosniff will keep running until you manually kill it or reboot the machine.

After Autosniff is finished, it will send another email describing what happened (i.e., whether any data was captured) and the name of the file that has the results (the packet capture file).

Installing Autosniff

Autosniff should run "out of the box" on any Solaris or Linux host, and with only minor tweaking on other UNIX platforms that have tcpdump installed. You can download Autosniff from:

http://www.samag.com/code/

After you've unpacked the Autosniff tarball, you'll find three files (see Listings 1-3):

autosniff
autosniffd
autosniff.conf

To install, put "autosniff" and "autosniffd" somewhere in your path, typically /usr/local/sbin. Then, copy "autosniff.conf" to the /etc directory and customize it as needed for your local site. There are only a few items at the top of the file to worry about, all marked with the comment "###CUSTOMIZE###" in the file:

The DEFAULTMAIL variable -- The default choice for where to send Autosniff's notifications.
The AUTOSNIFFD variable -- This contains the path to autosniffd. If you installed both autosniff and autosniffd in the /usr/local/sbin directory, then you can leave this alone.
The IPARG variable -- If autosniff complains of a "trigger IP address failure", it is probably because the output of tcpdump varies between versions, and autosniff was unable to find the field in tcpdump's output that has the source IP address. To fix this, run this quick test, which will sniff one IP packet:
```
# tcpdump -n -c 1 ip
```
and note which space-delimited field in the tcpdump output contains the source IP address, then adjust IPARG accordingly. In the versions of tcpdump I've tested, the address has been in either the fourth space-delimited field (on Linux) or the second (on NetBSD).
The TCPDUMP_OVERRIDE variable -- Solaris users who want to use tcpdump (rather than snoop, the sniffer program that is supplied with Solaris) should set this to the path of the tcpdump command. You might want to do this if you prefer your packet capture files to be in the pcap (tcpdump) format instead of the Solaris snoop format.
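To illustrate the field-position issue behind IPARG, the sketch below pulls an IPv4 source address out of a given space-delimited field of one line of "tcpdump -n" output. It is not the code from the Autosniff listings: source_ip is a hypothetical helper, and the sample lines in the usage comment are made up to show the two layouts mentioned above.

```shell
#!/bin/sh
# Sketch: print the IPv4 source address found in space-delimited field $1
# of the first line read from stdin. source_ip is a hypothetical helper.
source_ip() {
    awk -v f="$1" '{
        # tcpdump -n prints TCP/UDP endpoints as a.b.c.d.port, so keep
        # the first four dot-separated parts and drop any port that follows.
        n = split($f, part, ".")
        if (n >= 4)
            print part[1] "." part[2] "." part[3] "." part[4]
        exit
    }'
}

# Usage (illustrative input lines, not real captures):
#   echo '12:00:00.000000 192.0.2.10.1234 > 192.0.2.1.64000: S' | source_ip 2
#   echo '12:00:00.000000 eth0 < 192.0.2.10.1234 > 192.0.2.1.64000: S' | source_ip 4
```

If the function prints nothing, or prints something that is clearly not an address, the field number is wrong for your version of tcpdump.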

Which Host Should Run Autosniff?

When you run Autosniff, it should always be on a host that will be in a position to capture the traffic between the customer and the host(s) with which the customer is communicating. If the problem is limited to one UNIX machine, then the easiest method is to run Autosniff on that machine.

If the problem affects multiple machines, or if you're not sure which machine the customer will be communicating with, you can still use Autosniff. However, you must ensure that the machine on which Autosniff runs can "see" the traffic of all the machines you are interested in. Before switched networks became common, this was not an issue -- all traffic on any particular LAN segment was visible to every computer on the segment. With modern switches, however, a host normally sees only the traffic that is specifically destined for that host. Most mid-range and high-end network switches support a "monitor port", where you program the switch to "mirror" packets from other ports (or perhaps all ports) onto a particular port. You will need to be intimately familiar with your network gear to take full advantage of this feature.

How Autosniff Works

Autosniff has two parts, both shell scripts -- a user interface program and a shell script daemon. The user interface prompts you for the information needed to start the capture, double-checks your input, and then starts the daemon with the "nohup" command so it will continue to run after you've logged out.

The daemon first runs tcpdump (or snoop if you're on a Solaris host) with a filter expression for the trigger event (like "host mail.example.com and port 64000") and the "-c 1" option, which tells tcpdump to capture only one packet and then exit. When tcpdump exits, Autosniff tries to read the capture file left behind. If all is well, the file will contain one packet whose source is the IP address currently in use by your customer.
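The trigger-waiting step can be sketched in a few lines of shell. This is an illustration of the idea for the tcpdump case only, not the code from the Autosniff listings; trigger_filter and wait_for_trigger are hypothetical names, and the file path in the usage comment is an assumption.

```shell
#!/bin/sh
# Sketch of waiting for the trigger packet with tcpdump.
# Both function names are hypothetical.

# trigger_filter: build the filter expression from a hostname and a port.
trigger_filter() {
    printf 'host %s and port %s\n' "$1" "$2"
}

# wait_for_trigger: block until one matching packet arrives and save it
# to the file named in $3. Needs root privileges.
#   -n    no name lookups
#   -c 1  exit after one packet
#   -w    write the raw packet to a file
wait_for_trigger() {
    tcpdump -n -c 1 -w "$3" "$(trigger_filter "$1" "$2")"
}

# Usage (as root):
#   wait_for_trigger mail.example.com 64000 /tmp/trigger.pcap
#   tcpdump -n -r /tmp/trigger.pcap     # print the saved packet as text
```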

Then, armed with the customer's IP address, Autosniff starts tcpdump again, with a new filter expression formed from the customer's IP address and the filter that you specified for focusing on the customer's problem. Autosniff puts the tcpdump job in the background and then sleeps for five minutes (or whatever timeout you've given it). When the sleep is over, Autosniff kills the tcpdump process, makes sure there's some data in the sniff file, and mails you the results.
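The background-then-kill pattern described here might look something like the following sketch. It is not taken from the Autosniff listings: capture_filter and run_for are hypothetical names, the way the two filter parts are joined is an assumption, and the address and file path in the usage comment are placeholders.

```shell
#!/bin/sh
# Sketch of the timed capture step; both function names are hypothetical.

# capture_filter: combine the customer's address with the admin's filter.
# The parentheses keep an "or" in the admin's filter from escaping the
# host restriction.
capture_filter() {
    printf 'host %s and (%s)\n' "$1" "$2"
}

# run_for: run a command in the background for $1 seconds, then stop it.
run_for() {
    secs=$1
    shift
    "$@" &                              # start the command in the background
    pid=$!
    sleep "$secs"                       # let it run for the requested time
    kill "$pid" 2>/dev/null || true     # stop it (it may already have exited)
    wait "$pid" 2>/dev/null || true     # reap it so no zombie is left behind
}

# Usage (as root), capturing for five minutes:
#   run_for 300 tcpdump -n -w /tmp/JohnDoe.pcap \
#       "$(capture_filter 192.0.2.10 'port 25')"
```

Because the command to run is passed as ordinary arguments, the same run_for wrapper would work for snoop or any other sniffer, although snoop's filter syntax differs from tcpdump's.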

Conclusion

At my shop, Autosniff has taken much of the pain out of using a sniffer to diagnose customers' networking problems. The simple interface makes it possible for most of our support staff to start the sniffer and later call in a more experienced technician to analyze the packet capture file.

References

Ethereal is a high-quality sniffer and packet display program -- you'll find it useful for decoding files captured by Autosniff: http://www.ethereal.com.

tcpdump is supplied with many operating systems (such as Linux and *BSD systems), but you should make sure you have the latest version because of the occasional security bug: http://www.tcpdump.org.

Ed Ravin has been helping computers talk to each other for the past 15 years or so. Currently, he is a systems administrator at Panix, a small but friendly Internet service provider in New York City that caters to shell users and other technically savvy customers. Ed is also the co-author of Using and Managing UUCP, published by O'Reilly & Associates. He can be reached at: eravin@panix.com.