Remote Logging with SSH and Syslog-NG
Hal Pomeranz
One of the points I make repeatedly in my training classes is
the value of centralized logging. Keeping an off-line copy of your
site's logs on some central, secure log server not only gives you
greater visibility from a systems management perspective but also
can prove invaluable after a security incident when the local copies
of the log files on the target system(s) have been compromised by
the attacker.
The difficulty is that the standard Unix syslog daemon uses unauthenticated
UDP messages to transmit log messages to remote servers. This makes
drilling holes in your firewalls to accept syslog messages from
remote locations very undesirable, to say nothing of the security
implications of having critical system log messages traveling in
clear text over public networks. Use of IPSec or some other strong
VPN product can certainly help mitigate these concerns, but if all
you care about is obtaining logging information from some remote
site, then firing up a full-bore VPN session may seem like overkill.
Moreover, the fact that UDP is not a guaranteed delivery protocol
also means that important log messages can be dropped entirely.
While lack of guaranteed delivery can be a factor for syslog messages
in LAN environments, the risk becomes much greater when trying to
drive remote log messages across highly congested public networks.
Simply using a VPN to protect the security of the remote log stream
does nothing to address the guaranteed delivery concern. This is
where Syslog-NG becomes attractive, because two Syslog-NG servers
can share remote logging information using TCP rather than UDP.
And, once you're logging via TCP, then it is also possible to tunnel
this TCP communication via SSH rather than firing up a full VPN
-- the "best of both worlds" if you're looking for a quick and dirty
solution.
The rest of this article covers the basic configuration for establishing
an SSH tunnel between two servers and configuring Syslog-NG at both
ends to communicate log messages down this tunnel. Because Syslog-NG
is capable of both accepting UDP-based log messages from standard
Unix syslog daemons as well as forwarding those messages to another
machine, it is possible to set up a single Syslog-NG server -- which
acts as a collector and relay for the log messages generated by
all machines at that location -- at a remote site. Such a configuration
is largely outside of the scope of this article although I'll give
you some pointers in that direction as we go along.
Start with SSH
The first step is to get the SSH tunnel set up between the two
machines. My personal preference is to originate the SSH tunnel
on my central "loghost" machine at the primary site and have it
connect to the machine at the "remote" site from where I want to
get logs. Typically, this involves drilling out through the firewall
at the primary site -- often the site's default firewall rules will
allow this connection without any reconfiguration -- and allowing
the connection "inward" through the firewall at the remote end,
which usually requires some firewall ruleset tweaks on the remote
site's firewall.
However, since we want the remote log server to be sending logs
back to the central loghost at the primary site, we need to use
a reverse tunnel (that's the -R option on the SSH
command line) to get things working properly. This is actually one
of very few places where I find reverse tunnels to be useful. Figure
1 shows a high-level picture of how the traffic is flowing in this
design.
We need to make sure that the SSH session and tunnel are set up
automatically when the central log host boots. If the SSH session
dies for some reason (intermittent network outage, systems administration
"accident", etc.), we'd also like the connection to be re-established
as quickly as possible. In situations like this, I like to have
the init process fire off the SSH connection with a line
similar to the following in /etc/inittab:
log1:3:respawn:/usr/bin/ssh -nNTx
-R 514:loghost.domain.com:514
remote.domain.com >/dev/null 2>&1
The example above must appear as a single long line in /etc/inittab
-- I've just broken it onto multiple lines for clarity.
Let's examine the SSH command line first. The -R 514:loghost.domain.com:514
on the second line of the example sets up the reverse tunnel from
514/tcp on the remote server to loghost.domain.com:514 --
in other words, port 514/tcp on the central loghost machine. While
it seems natural to use 514/tcp for Syslog-NG logging, remember
that 514/tcp is the reserved port for the Unix rsh service
(rlogin uses the neighboring 513/tcp), so you're going to run into a port
conflict if you still have rsh enabled. I generally turn off unencrypted network
protocols like telnet, FTP, and rlogin/rsh/rcp on
my servers and use SSH instead, so it's not an issue for me, but
you can run this tunnel over any free ports if there is a conflict
at your site.
As for the other SSH command-line options shown in the example,
the -n flag tells SSH to associate the standard input with
/dev/null. There won't be any command-line input because
we're essentially going to be running the SSH client as a daemon
via init. As you can see at the end of the command line in
the example, we're also sending the standard output and standard
error to /dev/null as well (... >/dev/null 2>&1).
Since we're never going to be issuing remote commands via this SSH
connection (we only care about the tunnel), the -N option
to SSH tells the SSH client just to set up the tunnel and not to
bother preparing a command stream for issuing commands on the remote
system. Option -T, meanwhile, says not to bother allocating
a pseudo-tty on the remote system. The -x option disables
X11 forwarding, just as a defense-in-depth gesture.
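If you maintain several of these tunnels, it can help to generate the inittab lines rather than type them by hand. The following is only a sketch: the function name tunnel_entry and its argument order are my own invention for illustration, and it assumes the same port number is used on both ends of the tunnel.

```shell
#!/bin/sh
# Sketch: print the single-line inittab entry for one reverse tunnel.
# All names below are hypothetical helpers, not part of SSH or init.
#   $1  inittab identifier (e.g. log1)
#   $2  remote host the SSH client connects to
#   $3  name of the central loghost, as resolved from the loghost itself
#   $4  TCP port used on both ends of the tunnel (defaults to 514)
tunnel_entry() {
    id=$1 remote=$2 loghost=$3 port=${4:-514}
    printf '%s:3:respawn:/usr/bin/ssh -nNTx -R %s:%s:%s %s >/dev/null 2>&1\n' \
        "$id" "$port" "$loghost" "$port" "$remote"
}

# Example (prints one line suitable for pasting into /etc/inittab):
#   tunnel_entry log1 remote.domain.com loghost.domain.com
#   tunnel_entry log2 remote.domain.com loghost.domain.com 5014
```

Keep in mind that binding a port below 1024 on the remote server requires the SSH login there to be root, which is one reason to consider a high port such as 5014.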
Turning our attention to the rest of the /etc/inittab entry,
the first field (log1) is just an identifier for this entry
in the inittab file. These identifiers can be any sequence
of 2-4 alphanumeric characters; the only requirement is that they
be unique from all other identifiers used in the file. I've chosen
log1 here because it's usually the case that I have multiple
SSH tunnels set up to different remote log sources, and I typically
name the inittab entries log1, log2, etc. The
second field in the inittab file (3) is the run level
where this entry should be fired. Make sure to start this SSH process
after the network interfaces have been initialized but before the
Syslog-NG daemon has been started.
The respawn option in the third field is the reason I like
to use init for spawning processes like this. When the respawn
option is enabled, the init process will automatically fire
off a new SSH process if the old one dies for any reason. In other
words, init acts like a "watchdog"-type daemon and makes
sure that the SSH tunnel is always up and running. This is an extremely
useful technique, but one that a lot of sys admins seem to have
forgotten.
For the SSH tunnel to work, you must set up public key-based authentication
between the central log server where the SSH process will be spawned,
and the remote log server where the connection will be established.
Use the ssh-keygen command to generate the key for the root
account and make sure that you do not set up a pass phrase
for this key, since it will have to be used at boot time by the
automated SSH process being run from init. Obviously, you
must be extremely careful to protect this key file from unauthorized
use -- restrict access to your central log server to only other
administrators and make sure the permissions on the key file are
mode 600 and that the file is owned by root. Copy the public half
of the key to the authorized_keys file in root's home directory
on the remote server.
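The key setup described above might look roughly like the following. Treat it as a sketch under assumptions: the default key path /root/.ssh/id_rsa, the RSA key type, and the helper names make_tunnel_key and key_is_private are mine, so adjust them to match your SSH installation. The check covers only the file mode, not the ownership.

```shell
#!/bin/sh
# Sketch: create a passphrase-less key for the tunnel and check its mode.
# Function names and the default path are assumptions for illustration.

# Generate the key pair. -N '' sets an empty passphrase so that init
# can start the SSH client unattended at boot time.
make_tunnel_key() {
    keyfile=${1:-/root/.ssh/id_rsa}
    ssh-keygen -t rsa -N '' -f "$keyfile" &&
    chmod 600 "$keyfile"
}

# Succeed only if $1 is a regular file with mode 600 (rw-------).
key_is_private() {
    [ -f "$1" ] || return 1
    mode=$(ls -l "$1" | cut -c1-10)
    [ "$mode" = "-rw-------" ]
}

# Typical use on the central loghost, as root:
#   make_tunnel_key /root/.ssh/id_rsa
#   key_is_private /root/.ssh/id_rsa || echo "fix the key permissions"
#   cat /root/.ssh/id_rsa.pub | \
#       ssh root@remote.domain.com 'cat >> ~/.ssh/authorized_keys'
```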
Once you have your inittab entry and SSH keys all set up,
HUP the init process (kill -HUP 1). This should cause
the init process to re-read the inittab file and spawn
the SSH connection. You should be able to verify that the SSH client
is running with the ps command and verify the existence of
the tunnel using netstat. Once all that is working, we can
start configuring Syslog-NG.
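As a rough sketch of that verification: the SSH client process shows up in ps on the central loghost, while the reverse tunnel itself should appear as a listening TCP socket on the remote server. The helper name port_listening below is hypothetical, and the pattern assumes netstat prints the word LISTEN for listening sockets, which is the case on the systems I'm familiar with.

```shell
#!/bin/sh
# Sketch: decide from "netstat -an" output (read on stdin) whether a
# TCP port is in the LISTEN state. $1 is the port, defaulting to 514.
# Matches both the "addr:514" style (Linux) and the "addr.514" style
# (Solaris and the BSDs).
port_listening() {
    port=${1:-514}
    grep -E "[:.]${port}[[:space:]].*LISTEN" >/dev/null
}

# On the central loghost, look for the SSH client started by init:
#   ps -ef | grep '[s]sh -nNTx'
# On the remote server, look for the tunnel endpoint:
#   netstat -an | port_listening 514 && echo "tunnel endpoint is up"
```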
Configuring Syslog-NG
In general, configuration of Syslog-NG is well covered by Balázs
Scheidler's reference manual[1] and Nate Campi's excellent FAQ[2].
So allow me to just present complete configuration examples for
the main loghost and remote log server and point out the critical
bits.
Let's look at the configuration for the main loghost:
options {
    check_hostname(yes);
    keep_hostname(yes);
    chain_hostnames(no);
};
source inputs {
    internal();
    unix-stream("/dev/log");
    udp();
    tcp(max_connections(100));
};
destination logpile {
    file("/logs/$HOST/$YEAR/$MONTH/$FACILITY.$YEAR$MONTH$DAY"
         owner(root) group(root) perm(0600)
         create_dirs(yes) dir_perm(0700));
};
log { source(inputs); destination(logpile); };
As far as the options go, check_hostname(yes) forces Syslog-NG
to do a little bit of sanity checking on the incoming remote hostname
in the log message. In our destination directive, we'll be creating
directories for each system's logs by hostname, and it wouldn't be
good if an attacker could embed shell meta-characters in the hostname
to cause us problems.
The keep_hostname(yes) option means to use the hostname
that's presented in the actual message from the remote log server
rather than using the hostname we get by resolving the source of
the remote syslog connection. After all, since we expect remote
messages to be coming down our SSH tunnel, the source IP address
of these messages will be the loopback address (127.0.0.1), and
having all messages tagged with localhost is not what we
want.
chain_hostnames(no) causes Syslog-NG just to show the original
hostname in the message rather than a chain of all the hops the
message has made in reaching its final destination. This becomes
a lot more relevant when you start relaying messages through multiple
servers.
The inputs cover all of the various places from where we
can get logging information. internal() is internal messages
from the Syslog-NG daemon itself. unix-stream("/dev/log")
is the normal /dev/log device that Linux systems use for
local logging. Note that if you're on a non-Linux platform like
Solaris, HP-UX, or one of the *BSD operating systems, then your
local log channel will probably be very different. (Examples
of appropriate configurations for various operating systems can
be found in the Syslog-NG source distribution.)
Some sites actually run the vendor syslog in parallel with Syslog-NG
rather than having to deal with the problem of emulating the standard
vendor syslog interfaces -- the vendor syslog daemon can just relay
messages to Syslog-NG via the standard UDP Syslog channel, even
within the same machine. The udp() line means to listen on
the standard 514/udp Syslog channel, and tcp() means to listen
on 514/tcp for messages from another Syslog-NG server (or in our
case, the SSH tunnel). Note that both the tcp() and udp()
options accept the port() option to specify a different port.
For example, if you wanted your Syslog-NG server to listen on port
5014/tcp to avoid conflicts with the rlogin/rsh daemon, you
would write:
tcp(port(5014) max_connections(100));
Note also the use of the max_connections() option to increase
the number of simultaneous TCP sessions the logging daemon can handle.
The destination clause allows us to specify a "log sink",
or place where we want our logs to end up. Here we're using some
built-in Syslog-NG macros to force incoming log messages to be divided
out into directories -- first by hostname and then by year and month.
Within each directory, messages will go into log files named for
the syslog facility to which the message was logged (mail,
auth, kern, local0, etc.), with each file having
a date stamp attached. Notice that with Syslog-NG automatically
creating a new file for each day of logs, we don't even need a separate
log rotation program! This is just one more useful feature of Syslog-NG.
The other options to the file() directive make sure that
directories will be created as needed and set sensible ownerships
and permissions on the newly created files and directories.
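To make the resulting layout concrete, here is a small sketch that expands the file() template by hand for a single message. It only imitates the macro substitution for illustration (Syslog-NG does this internally), and the function name logpile_path and the hostname www1 are made up.

```shell
#!/bin/sh
# Sketch: mimic how the logpile template expands for one message.
#   $1 hostname, $2 syslog facility, $3 date written as YYYY-MM-DD
logpile_path() {
    host=$1 facility=$2
    year=${3%%-*}        # YYYY
    rest=${3#*-}         # MM-DD
    month=${rest%%-*}    # MM
    day=${rest#*-}       # DD
    printf '/logs/%s/%s/%s/%s.%s%s%s\n' \
        "$host" "$year" "$month" "$facility" "$year" "$month" "$day"
}

# A mail message from host www1 on 15 January 2004 would land in:
#   logpile_path www1 mail 2004-01-15
#   /logs/www1/2004/01/mail.20040115
```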
Once we define inputs and destination directives,
we combine them into log declarations to actually tell the
Syslog-NG daemon what to do with the incoming messages. Here we're
just doing the trivial rule that sends all of our incoming messages
from all sources into the log file directory hierarchy we defined
in the destination directive above.
With the basic configuration of the central loghost out of the
way, let's look at a sample configuration for the remote log server
on the other end of the SSH tunnel. It's actually not too much different
from the configuration for the central loghost:
options {
    check_hostname(yes);
    keep_hostname(yes);
    chain_hostnames(no);
};
source inputs {
    internal();
    unix-stream("/dev/log");
    udp();
    tcp(max_connections(100));
};
destination logpile {
    file("/logs/$HOST/$YEAR/$MONTH/$FACILITY.$YEAR$MONTH$DAY"
         owner(root) group(root) perm(0600)
         create_dirs(yes) dir_perm(0700));
};
destination remote { tcp("localhost"); };
log { source(inputs); destination(logpile); };
log { source(inputs); destination(remote); };
Basically, we have just added an additional destination directive
and an additional log directive. The remote destination
says to log via TCP to localhost using the default port 514/tcp
(because we didn't specify an alternate port). localhost:514
should be the location of our reverse tunnel endpoint. Note that if
you used an alternate port for the tunnel endpoint, you can specify
it:
destination remote { tcp("localhost" port(5014)); };
The first log declaration keeps a local copy of all log messages
received in a directory structure on the remote log server that parallels
the one on the central loghost. The second log directive also
relays a copy of all messages back to the central log server via the
SSH tunnel. It's up to you whether you keep a local copy of the logs
on the remote log server, but most likely the admins at the remote
site will appreciate having this copy of the logs.
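Once both daemons are running, a quick end-to-end test is to log a unique marker on the remote log server and look for it on the central loghost. The sketch below assumes the standard logger command is available on the remote server; the helper names send_marker and marker_arrived, and the marker format, are my own.

```shell
#!/bin/sh
# Sketch: end-to-end check of the relay path (hypothetical helpers).

# Run on the remote log server: log a unique marker line through the
# local syslog channel and print the marker so it can be searched for.
send_marker() {
    marker="tunnel-test-$(date +%Y%m%d%H%M%S)-$$"
    logger -p user.notice "$marker" && printf '%s\n' "$marker"
}

# Run on the central loghost: succeed if the marker text ($1) appears
# in the given log file ($2).
marker_arrived() {
    grep -F -- "$1" "$2" >/dev/null 2>&1
}

# With the logpile template from this article, a user.notice message
# from host "remote" should end up in a file such as
#   /logs/remote/2004/01/user.20040115
# so on the loghost you would run something like:
#   marker_arrived "$marker" /logs/remote/2004/01/user.20040115
```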
Note that in the inputs section above, we've configured the standard
udp() input for normal UDP syslog messages. This means that
other hosts at the remote site can send syslog messages to the remote
log server, and those messages will be relayed by the Syslog-NG
server back through the SSH tunnel to the central log host at home
base. We've also configured the remote log server to listen for
messages on the tcp() input channel. Maybe there are other
Syslog-NG servers at the remote location, or perhaps there is an
SSH tunnel from the remote log server to some other remote site
and we're chaining log messages through multiple hops!
Conclusion
I think you'll find this an easy little recipe to implement, yet
it achieves a very large goal. Of course, once you have this big
pile of logs, you're going to want some sort of tool that actually
reads the logs for you and sends you the "interesting" events. You
could use a simple tool like Logcheck[3] or Swatch[4], or investigate
some of the newer, fancier tools out there like Logsurfer+[5], SEC[6],
or Lire[7]. Regardless of which solution you choose, let me assure
you that I never regret the effort I expend setting up centralized
logging and log monitoring, because the visibility I get as far
as what's happening on my networks is enormously useful.
References
[1] Syslog-NG Reference Manual -- http://www.balabit.com/products/syslog_ng/reference/book1.html
[2] Syslog-NG FAQ -- http://www.campin.net/syslog-ng/faq.html
[3] Logcheck -- http://sourceforge.net/projects/sentrytools/
[4] Swatch -- http://swatch.sourceforge.net/
[5] Logsurfer+ -- http://www.crypt.gen.nz/logsurfer/
[6] SEC -- http://kodu.neti.ee/~risto/sec/
[7] Lire -- http://logreport.org/lire/
Hal Pomeranz (hal@deer-run.com) has been doing IT for
more than 15 years. His favorite activity is being up at midnight
on New Year's Eve so he can hear the disk drives on his log servers
spin as the logging directory hierarchy for the new year is created.