jun2005.tar

Remote Logging with SSH and Syslog-NG

Hal Pomeranz

One of the points I make repeatedly in my training classes is the value of centralized logging. Keeping an off-line copy of your site's logs on some central, secure log server not only gives you greater visibility from a systems management perspective but also can prove invaluable after a security incident when the local copies of the log files on the target system(s) have been compromised by the attacker.

The difficulty is that the standard Unix syslog daemon uses unauthenticated UDP messages to transmit log messages to remote servers. This makes drilling holes in your firewalls to accept syslog messages from remote locations very undesirable, to say nothing of the security implications of having critical system log messages traveling in clear text over public networks. Use of IPSec or some other strong VPN product can certainly help mitigate these concerns, but if all you care about is obtaining logging information from some remote site, then firing up a full-bore VPN session may seem like overkill.

Security is not the only problem. UDP is not a guaranteed delivery protocol, which means that important log messages can be dropped entirely. While lack of guaranteed delivery can be a factor for syslog messages in LAN environments, the risk becomes much greater when trying to drive remote log messages across highly congested public networks. Simply using a VPN to protect the security of the remote log stream does nothing to address the guaranteed delivery concern. This is where Syslog-NG becomes attractive, because two Syslog-NG servers can share remote logging information using TCP rather than UDP. And, once you're logging via TCP, it is also possible to tunnel this TCP communication via SSH rather than firing up a full VPN -- the "best of both worlds" if you're looking for a quick and dirty solution.

The rest of this article covers the basic configuration for establishing an SSH tunnel between two servers and configuring Syslog-NG at both ends to communicate log messages down this tunnel. Because Syslog-NG can both accept UDP-based log messages from standard Unix syslog daemons and forward those messages to another machine, it is possible to set up a single Syslog-NG server at a remote site that acts as a collector and relay for the log messages generated by all machines at that location. Such a configuration is largely outside the scope of this article, although I'll give you some pointers in that direction as we go along.

Start with SSH

The first step is to get the SSH tunnel set up between the two machines. My personal preference is to originate the SSH tunnel on my central "loghost" machine at the primary site and have it connect to the machine at the "remote" site from where I want to get logs. Typically, this involves drilling out through the firewall at the primary site -- often the site's default firewall rules will allow this connection without any reconfiguration -- and allowing the connection "inward" through the firewall at the remote end, which usually requires some firewall ruleset tweaks on the remote site's firewall.

However, since we want the remote log server to be sending logs back to the central loghost at the primary site, we need to use a reverse tunnel (that's the -R option on the SSH command line) to get things working properly. This is actually one of very few places where I find reverse tunnels to be useful. Figure 1 shows a high-level picture of how the traffic is flowing in this design.

We need to make sure that the SSH session and tunnel are set up automatically when the central log host boots. If the SSH session dies for some reason (intermittent network outage, systems administration "accident", etc.), we'd also like the connection to be re-established as quickly as possible. In situations like this, I like to have the init process fire off the SSH connection with a line similar to the following in /etc/inittab:

log1:3:respawn:/usr/bin/ssh -nNTx
   -R 514:loghost.domain.com:514
   remote.domain.com >/dev/null 2>&1

The example above must appear as a single long line in /etc/inittab -- I've just broken it onto multiple lines for clarity.

Let's examine the SSH command line first. The -R 514:loghost.domain.com:514 on the second line of the example sets up the reverse tunnel from 514/tcp on the remote server to loghost.domain.com:514 -- in other words, port 514/tcp on the central loghost machine. While it seems natural to use 514/tcp for Syslog-NG logging, remember that 514/tcp is the reserved port for the Unix rsh service (rlogin uses 513/tcp), so you're going to run into a port conflict if you still have rsh enabled. I generally turn off unencrypted network protocols like telnet, FTP, and rlogin/rsh/rcp on my servers and use SSH instead, so it's not an issue for me, but you can run this tunnel over any free ports if there is a conflict at your site.

As for the other SSH command-line options shown in the example, the -n flag tells SSH to associate the standard input with /dev/null. There won't be any command-line input because we're essentially going to be running the SSH client as a daemon via init. As you can see at the end of the command line in the example, we're also sending the standard output and standard error to /dev/null (... >/dev/null 2>&1). Since we're never going to be issuing remote commands via this SSH connection (we only care about the tunnel), the -N option tells the SSH client just to set up the tunnel and not to bother preparing a command stream for issuing commands on the remote system. Option -T, meanwhile, says not to bother allocating a pseudo-tty on the remote system. The -x option disables X11 forwarding, just as a defense-in-depth gesture.

Turning our attention to the rest of the /etc/inittab entry, the first field (log1) is just an identifier for this entry in the inittab file. These identifiers can be any sequence of 2-4 alphanumeric characters; the only requirement is that each be distinct from all other identifiers used in the file. I've chosen log1 here because it's usually the case that I have multiple SSH tunnels set up to different remote log sources, and I typically name the inittab entries log1, log2, etc. The second field in the inittab file (3) is the run level where this entry should be fired. Make sure to start this SSH process after the network interfaces have been initialized but before the Syslog-NG daemon has been started.

The respawn option in the third field is the reason I like to use init for spawning processes like this. When the respawn option is enabled, the init process will automatically fire off a new SSH process if the old one dies for any reason. In other words, init acts like a "watchdog"-type daemon and makes sure that the SSH tunnel is always up and running. This is an extremely useful technique, but one that a lot of sys admins seem to have forgotten.
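If you maintain several tunnels, it can help to generate the inittab lines rather than type them. The following is only a sketch: the inittab_entry function is my own helper, not part of any package, and remote2.domain.com is a made-up second log source. It prints each entry as the single long line that init expects, using the same fields and SSH options discussed above.

```shell
#!/bin/sh
# Print a one-line /etc/inittab respawn entry for a reverse SSH tunnel.
# Arguments: id runlevel remote_port loghost loghost_port remote_host
# (inittab_entry is a hypothetical helper; host names are placeholders.)
inittab_entry() {
    printf '%s:%s:respawn:/usr/bin/ssh -nNTx -R %s:%s:%s %s >/dev/null 2>&1\n' \
        "$1" "$2" "$3" "$4" "$5" "$6"
}

# One entry per remote log source: log1, log2, and so on.
inittab_entry log1 3 514 loghost.domain.com 514 remote.domain.com
inittab_entry log2 3 514 loghost.domain.com 514 remote2.domain.com
```

Review the output before appending it to /etc/inittab by hand; the script deliberately does not edit the file itself.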

For the SSH tunnel to work, you must set up public key-based authentication between the central log server where the SSH process will be spawned, and the remote log server where the connection will be established. Use the ssh-keygen command to generate the key for the root account and make sure that you do not set up a pass phrase for this key, since it will have to be used at boot time by the automated SSH process being run from init. Obviously, you must be extremely careful to protect this key file from unauthorized use -- restrict access to your central log server to only other administrators and make sure the permissions on the key file are mode 600 and that the file is owned by root. Copy the public half of the key to the authorized_keys file in root's home directory on the remote server.
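As a rough sketch of the key setup, assuming the key lives in the default RSA identity file under root's home directory (adjust KEY if your site uses another location): make_tunnel_key and key_mode_ok are helper names I've invented for illustration. The first generates a key with an empty pass phrase, and the second checks that the private key is mode 600.

```shell
#!/bin/sh
# Assumed key location: the default RSA identity in root's home directory.
KEY="${KEY:-$HOME/.ssh/id_rsa}"

# Generate a key pair with an empty pass phrase (-N '') so that init can
# use it unattended, then make sure the private half is mode 600.
make_tunnel_key() {
    ssh-keygen -t rsa -N '' -f "$KEY" && chmod 600 "$KEY"
}

# Succeed only if the file's permission string is exactly -rw------- (600).
key_mode_ok() {
    [ "$(ls -ln "$1" | cut -c1-10)" = "-rw-------" ]
}

# Typical use, as root on the central loghost:
#   make_tunnel_key && key_mode_ok "$KEY" && echo "public key is in $KEY.pub"
```

The contents of the .pub file are what you append to root's authorized_keys file on the remote server.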

Once you have your inittab entry and SSH keys all set up, HUP the init process (kill -HUP 1). This should cause the init process to re-read the inittab file and spawn the SSH connection. You should be able to verify that the SSH client is running with the ps command and verify the existence of the tunnel using netstat. Once all that is working, we can start configuring Syslog-NG.
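Here is one way those checks might look. This is a sketch, not a definitive procedure: has_listener is a hypothetical helper that scans netstat -an output for a listening socket on a given port, and the exact ps and netstat output formats vary between operating systems. Because a reverse tunnel makes the SSH daemon on the remote server listen on the forwarded port, the netstat check belongs on the remote end.

```shell
#!/bin/sh
# Read "netstat -an" output on stdin; succeed if some socket is in the
# LISTEN state on TCP port $1. (Hypothetical helper; it matches both the
# "addr:port" and "addr.port" styles of local address.)
has_listener() {
    grep -E "[.:]$1[[:space:]].*LISTEN" >/dev/null
}

# On the central loghost: is the SSH client started by init running?
#   ps -ef | grep '[s]sh -nNTx'
# On the remote server: is the reverse tunnel endpoint listening?
#   netstat -an | has_listener 514 && echo "tunnel endpoint is up"
```

If the ps check shows the SSH process repeatedly appearing and vanishing, init is respawning a client that keeps failing; running the same ssh command by hand without the output redirection usually shows why.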

Configuring Syslog-NG

In general, configuration of Syslog-NG is well covered by Balázs Scheidler's reference manual[1] and Nate Campi's excellent FAQ[2]. So allow me to just present complete configuration examples for the main loghost and remote log server and point out the critical bits.

Let's look at the configuration for the main loghost:

  options { check_hostname(yes);
            keep_hostname(yes);
            chain_hostnames(no); };

  source inputs { internal();
                  unix-stream("/dev/log");
                  udp();
                  tcp(max_connections(100)); };

  destination logpile {
  file("/logs/$HOST/$YEAR/$MONTH/$FACILITY.$YEAR$MONTH$DAY"
        owner(root) group(root) perm(0600)
        create_dirs(yes) dir_perm(0700)); };

  log { source(inputs); destination(logpile); };

As far as the options go, check_hostname(yes) forces Syslog-NG to do a little bit of sanity checking on the incoming remote hostname in the log message. In our destination directive, we'll be creating directories for each system's logs by hostname, and it wouldn't be good if an attacker could embed shell meta-characters in the hostname to cause us problems.

The keep_hostname(yes) option means to use the hostname that's presented in the actual message from the remote log server rather than using the hostname we get by resolving the source of the remote syslog connection. After all, since we expect remote messages to be coming down our SSH tunnel, the source IP address of these messages will be the loopback address (127.0.0.1), and having all messages tagged with localhost is not what we want.

chain_hostnames(no) causes Syslog-NG just to show the original hostname in the message rather than a chain of all the hops the message has made in reaching its final destination. This becomes a lot more relevant when you start relaying messages through multiple servers.

The inputs cover all of the various places from where we can get logging information. internal() is internal messages from the Syslog-NG daemon itself. unix-stream("/dev/log") is the normal /dev/log device that Linux systems use for local logging. Note that if you're on a non-Linux platform like Solaris, HP-UX, or one of the *BSD operating systems, then your local log channel will probably be very different. (Examples of appropriate configurations for various operating systems can be found in the Syslog-NG source distribution.)

Some sites actually run the vendor syslog in parallel with Syslog-NG rather than having to deal with the problem of emulating the standard vendor syslog interfaces -- the vendor syslog daemon can just relay messages to Syslog-NG via the standard UDP Syslog channel, even within the same machine. The udp() line means to listen on the standard 514/udp Syslog channel, and tcp() means to listen on 514/tcp for messages from another Syslog-NG server (or in our case, the SSH tunnel). Note that both the tcp() and udp() options accept the port() option to specify a different port. For example, if you wanted your Syslog-NG server to listen on port 5014/tcp to avoid conflicts with the rsh daemon, you would write:

  tcp(port(5014) max_connections(100));

Note also the use of the max_connections() option to increase the number of simultaneous TCP sessions the logging daemon can handle.

The destination clause allows us to specify a "log sink", or place where we want our logs to end up. Here we're using some built-in Syslog-NG macros to force incoming log messages to be divided out into directories -- first by hostname and then by year and month. Within each directory, messages will go into log files named for the syslog facility to which the message was logged (mail, auth, kern, local0, etc.), with each file having a date stamp attached. Notice that with Syslog-NG automatically creating a new file for each day of logs, we don't even need a separate log rotation program! This is just one more useful feature of Syslog-NG. The other options to the file() directive make sure that directories will be created as needed and set sensible ownerships and permissions on the newly created files and directories.
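To make the file naming concrete, here is a small sketch that expands the same template by hand. The logpile_path function and the host name web1 are hypothetical, and I'm assuming that $MONTH and $DAY expand to two-digit values.

```shell
#!/bin/sh
# Expand the file() template from the logpile destination for one message:
#   /logs/$HOST/$YEAR/$MONTH/$FACILITY.$YEAR$MONTH$DAY
# Arguments: host year month day facility
# (logpile_path is a hypothetical helper for illustration only.)
logpile_path() {
    printf '/logs/%s/%s/%s/%s.%s%s%s\n' "$1" "$2" "$3" "$5" "$2" "$3" "$4"
}

# A mail-facility message from host "web1" on 15 June 2005:
logpile_path web1 2005 06 15 mail
# prints /logs/web1/2005/06/mail.20050615
```

A message logged the next day lands in mail.20050616 in the same directory, which is why no separate rotation step is needed.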

Once we define inputs and destination directives, we combine them into log declarations to actually tell the Syslog-NG daemon what to do with the incoming messages. Here we're just doing the trivial rule that sends all of our incoming messages from all sources into the log file directory hierarchy we defined in the destination directive above.

With the basic configuration of the central loghost out of the way, let's look at a sample configuration for the remote log server on the other end of the SSH tunnel. It's actually not much different from the configuration for the central loghost:

  options { check_hostname(yes);
            keep_hostname(yes);
            chain_hostnames(no); };

  source inputs { internal();
                  unix-stream("/dev/log");
                  udp();
                  tcp(max_connections(100)); };

  destination logpile {
  file("/logs/$HOST/$YEAR/$MONTH/$FACILITY.$YEAR$MONTH$DAY"
        owner(root) group(root) perm(0600)
        create_dirs(yes) dir_perm(0700)); };

  destination remote { tcp("localhost"); };

  log { source(inputs); destination(logpile); };
  log { source(inputs); destination(remote); };

Basically, we have just added an additional destination directive and an additional log directive. The remote destination says to log via TCP to localhost using the default port 514/tcp (because we didn't specify an alternate port). localhost:514 should be the location of our reverse tunnel endpoint. Note that if you used an alternate port for the tunnel endpoint, you can specify it:

  destination remote { tcp("localhost" port(5014)); };

The first log declaration keeps a local copy of all log messages received in a directory structure on the remote log server that parallels the one on the central loghost. The second log directive also relays a copy of all messages back to the central log server via the SSH tunnel. It's up to you whether you keep a local copy of the logs on the remote log server, but most likely the admins at the remote site will appreciate having this copy of the logs.
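A quick end-to-end test is to log a uniquely tagged message on the remote log server and then look for it on the central loghost. The sketch below assumes the /logs hierarchy from the configurations above. The send_marker and find_marker functions, the tunneltest tag, and the marker string are all my own inventions.

```shell
#!/bin/sh
# On the remote log server: log a uniquely tagged test message to local0.
# Argument: marker string
send_marker() {
    logger -p local0.notice -t tunneltest "$1"
}

# On the central loghost: list every file under a log directory that
# contains the marker. Arguments: top-level log directory, marker string
find_marker() {
    find "$1" -type f -exec grep -F -l -- "$2" {} +
}

# Typical use (the marker value is arbitrary):
#   remote#  send_marker "tunnel-check-20050615"
#   loghost# find_marker /logs "tunnel-check-20050615"
```

With keep_hostname(yes) set on both ends, the match on the central loghost should turn up under the remote server's own hostname, in that day's local0 file.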

Note that in the inputs section above, we've configured the standard udp() input for normal UDP syslog messages. This means that other hosts at the remote site can send syslog messages to the remote log server, and those messages will be relayed by the Syslog-NG server back through the SSH tunnel to the central log host at home base. We've also configured the remote log server to listen for messages on the tcp() input channel. Maybe there are other Syslog-NG servers at the remote location, or perhaps there is an SSH tunnel from the remote log server to some other remote site and we're chaining log messages through multiple hops! One caution: as written, the tcp() input and the reverse tunnel endpoint both want 514/tcp on the remote log server. Depending on the platform, whichever starts second may fail to bind the port, so if you need both, move one of them to an alternate port using the port() option described earlier.

Conclusion

I think you'll find this an easy little recipe to implement, yet it achieves a very large goal. Of course, once you have this big pile of logs, you're going to want some sort of tool that actually reads the logs for you and sends you the "interesting" events. You could use a simple tool like Logcheck[3] or Swatch[4], or investigate some of the newer, fancier tools out there like Logsurfer+[5], SEC[6], or Lire[7]. Regardless of which solution you choose, let me assure you that I never regret the effort I expend setting up centralized logging and log monitoring, because the visibility I get as far as what's happening on my networks is enormously useful.

References

[1] Syslog-NG Reference Manual -- http://www.balabit.com/products/syslog_ng/reference/book1.html

[2] Syslog-NG FAQ -- http://www.campin.net/syslog-ng/faq.html

[3] Logcheck -- http://sourceforge.net/projects/sentrytools/

[4] Swatch -- http://swatch.sourceforge.net/

[5] Logsurfer+ -- http://www.crypt.gen.nz/logsurfer/

[6] SEC -- http://kodu.neti.ee/~risto/sec/

[7] Lire -- http://logreport.org/lire/

Hal Pomeranz (hal@deer-run.com) has been doing IT for more than 15 years. His favorite activity is being up at midnight on New Year's Eve so he can hear the disk drives on his log servers spin as the logging directory hierarchy for the new year is created.