Cover V13, i07

Article
Table 1
Table 2

jul2004.tar

Keeping Data in Sync::rsync -- Part II

Chris Hare

Data management is an ongoing issue that plagues many companies. Regardless of the size of the enterprise, organizations are constantly trying to find ways to move data securely between systems in an automated fashion, keep file systems or data files synchronized, or simply ensure that a group of systems has common data.

Keeping files synchronized in the enterprise generally means copying the file from one system to another. The assumption most organizations make is the availability of high-speed, low-latency networks where copying a large file is performed expediently.

This article is the second in a two-part series on the data synchronization tool rsync. The first part examined what rsync is, how it works, and the interface between the client and the server. This part examines the configuration of the rsync server, access control, and security implications.

Configuring the rsync Server

Starting the rsync daemon can be done by either starting the rsync server with the -daemon option in the system startup scripts, or by adding a line to the inetd or xinetd configuration files. The rsync daemon requires root privileges if you want to:

  • Use chroot() to limit file system access
  • Bind to the default TCP/IP port of 873
  • Set file ownership

Otherwise, the rsync daemon only needs permission to read and write the appropriate data, log, and lock files.

If there will be a large number of connections or high volume of rsync traffic, it is recommended to start rsync in standalone mode, similar to other servers such as the Apache Web server. This removes the need for inetd to start the daemon for each connection. If the traffic volumes and number of connections is small, then typical inetd operation is satisfactory.

However, if the intent is to depend upon inetd to manage the connections, the following line must be added to /etc/services:

rsync 873/tcp
inetd installations use this entry:

rsync stream tcp nowait root /usr/bin/rsync rsyncd --daemon
Replace "/usr/bin/rsync" with the path to where rsync is installed on the specific system. Because many systems today, notably Linux, use the eXtended Internet Daemon, xinetd, the following file would be installed in the /etc/xinetd.d directory to activate rsync:

# default: off
# description: The rsync server is a good addition to
# an ftp server, as it allows crc checksumming etc.
service rsync
{
        disable = no
        socket_type     = stream
        wait            = no
        user            = root
        server          = /usr/bin/rsync
        server_args     = --daemon
        log_on_failure  += USERID
}
Note that when making changes to inetd.conf or adding a file to /etc/xinetd.d to activate the rsync daemon, you must restart inetd or xinetd as appropriate to re-read and activate the configuration changes.

Control and operation of the rsync daemon is managed through the file rsyncd.conf, typically found at /etc/rsyncd.conf. Once the configuration file is set up and the rsync daemon is started, any number of systems can connect to make backups, mirror file systems, distribute files, etc.

The rsyncd.conf file uses "modules" to define what a remote user can access. The file format is similar to the Windows INI files, or to the Samba configuration file. A very simple example would resemble:

[ftp]
        path = /home/ftp
        comment = ftp export area
With this simple configuration file, we can test the connectivity to the server:

[chare@gw etc]$ rsync localhost::
ftp             ftp export area
[chare@gw etc]$
The module name is "ftp" in this example. Using this rsyncd.conf file, a user would be able to connect and retrieve any files in the directory. It is possible for the user to list the available modules, but can a file be retrieved using this configuration?

[chare@gw chare]$ rsync localhost::ftp/
opendir(bin): Permission denied
opendir(etc): Permission denied
drwxr-xr-x        4096 2003/02/01 13:21:08 .
d--x--x--x        4096 2003/02/01 06:31:10 bin
d--x--x--x        4096 2003/02/01 03:15:59 etc
drwxr-xr-x        4096 2003/02/01 03:15:59 lib
drwxr-sr-x        4096 2003/02/01 13:42:51 pub
rsync error: partial transfer (code 23) at main.c(926)
The example lists the files in the ftp directory. Notice the trailing "/", which is required to list the directory contents. Failing to include the "/" will only list the directory information. The partial failure notice occurred because some of the data was restricted by file permissions so it could not be transferred.

[chare@gw chare]$ rsync localhost::ftp/pub/
drwxr-sr-x     4096 2003/02/01 13:42:51 .
-rw-r--r--      760 2002/12/04 23:51:50 HEADER.html
-rw-r--r--      187 2002/12/01 00:14:30 README.html
drwxr-sr-x     4096 2002/12/01 00:14:36 family
drwxr-sr-x     4096 2003/01/08 08:01:05 movies
drwxr-sr-x     4096 2002/12/01 00:14:40 music
drwxr-sr-x     4096 2003/01/14 22:58:02 slides
[chare@gw chare]$
This is a similar example, listing the contents of the pub directory.

[chare@gw chare]$ rsync localhost::ftp/pub/HEADER.html
-rw-r--r--      760 2002/12/04 23:51:50 HEADER.html
[chare@gw chare]$ rsync localhost::ftp/pub/HEADER.html HEADER.html
 [chare@gw chare]$ ls -l HEADER*
-rw-r--r--   1 chare   chare   760 Mar  1 02:00 HEADER.html
[chare@gw chare]$
The previous example illustrates retrieving the file HEADER.html. Verifying the file was retrieved is accomplished with the ls -l command as shown, or by using a Web browser to display the file contents. The rsync server is working properly. However, this example doesn't touch on the complexities and feature set of the rsync server.

From the small rsyncd.conf example, it is evident there are module parameters providing the definition for the module. Each line contains a parameter name followed by an equals sign and the parameter value. Lines beginning with the traditional Unix comment character (#) are ignored. Other special characters for the configuration file include the \ character indicating that a line is continued to the next. Otherwise, all values on the right side of the equals sign are assigned to the parameter as either a string (no quotes are required) or a Boolean, which are assigned either a 0/1 meaning true/false, respectively.

As with other configuration files, there is a global parameters section located at the top of the file before the first module definition. Defining parameters in the global part of the configuration file override any default for that parameter. The global value may be replaced by defining the value for a specific module later in the file. The global configuration is illustrated in this more complex example:

uid = nobody
gid = nobody
use chroot = no
max connections = 4
syslog facility = local5
pid file = /var/run/rsyncd.pid
motd file = /etc/rsyncd.motd

[ftp]
        path = /var/ftp/pub
        comment = whole ftp area
        read only = yes
        list = yes
        ignore nonreadable = yes

[movies]
        path = /var/ftp/pub/movies
        comment = Movies and Video Clips
The lines in the rsyncd.conf file before the first module form the global configuration section. The uid line specifies the user name or user id, file transfers to and from the given module take place when the daemon is run as root. The default is -2, which is the user "nobody", and may be changed on a module-by-module basis. The gid line performs the same function, but specifies the group name or group id.

The "use chroot" entry indicates whether the connection is limited to files and directory corresponding with the specific path name. When setting it "no", the user may navigate anywhere in the file system. The default action is "yes", which is the most secure and prevents exposing the system files to unauthorized users.

The "max connections" line establishes how many simultaneous connections to rsync are allowed. The default is no limit. This example defines a limit of 4, meaning the fifth connection is denied with a message indicating the user should try again later.

The "syslog facility" indicates logging is done using the Unix syslog service with the defined facility. The default facility is "daemon", although any of the available syslog facility levels maybe used.

The "pid file" entry tells rsync to write its process ID. The "motd file" defines a file with a message, typically of a legal nature, displayed when the user connects.

Once the global parameters are defined, the modules are next as shown in the example.

The "path" parameter defines the location on the disk where the module is found. When a connection is made to this module, the user is placed in this directory. The "comment" parameter is what is displayed when the user connects and requests a listing of the available modules. This parameter is required for each module.

The "read only" parameter indicates whether uploads are permitted to this module. If the value is "no", then uploads are permitted. The "list" parameter controls whether the user can see this module when requesting a listing from the rsync server. And finally in this example, the "ignore nonreadable" parameter, when set to true, causes files that are not readable by the user to be ignored. The "ignore nonreadable" parameter is useful when the directory contains files the user cannot access due to file ownership and the associated permissions. Without the "ignore nonreadable" parameter, rsync will try to retrieve the file, resulting in an error message the user will then try to resolve.

The available global parameters are shown in Table 1. As well as defining some of these module options in the global configuration, the table identifies the parameters, which can be altered on a per module basis. Note that the specific module options (shown in Table 2) do not include authentication options that are discussed later in the article. All appropriate module options should be applied to:

  • Provide the greatest amount of access control;
  • limit the exposure of the rsync server, and;
  • limit the data stored within to appropriate or unauthorized access.
rsync Authentication

Rsync has a built-in authentication protocol to provide user-based access control on a module-by-module basis. Typically, access to the rsync server is anonymous, meaning any user can connect and access the data stored in that module. Limiting access requires the authentication controls be defined to restrict global access.

Defining the authentication controls consists of defining a "secrets" file with username and clear-text password pairs, one pair per line. For example, the secrets file could be stored as /etc/rsyncd.secrets and contain:

chare:Al8haBets0up
cote:4fastXfer
hawkins:2muchFun
The secrets file must be stored in a protected location (ideally, /etc) and must be owned by root and readable only by root. Allowing any other user to read the password stored in this file eliminates any value gained from putting access control on the server. Once the secrets file is defined, the module must be configured to use it:

[slides]
        path = /ftp/pub/slides
        comment = Slides and Pictures (requires authentication)
        auth users = chare, hawkins
        secrets file = /etc/rsyncd.secrets
When the user connects to the authentication, required module, a "Password:" prompt is displayed and the user must provide the password. The default username provided is the username from the Shell environment, either USER or LOGNAME. The password is not echoed on the terminal. A sample session would resemble:

[chare@gw chare]$ rsync  localhost::slides//
Password:
drwxr-sr-x        4096 2003/01/14 22:58:02 .
-rw-rw-r--           3 2003/01/15 02:20:01 .count
-rw-rw-r--           3 2002/12/08 21:28:22 .count~
-rw-r--r--         592 2002/12/01 16:47:34 HEADER.html
-rw-r--r--         187 2002/12/01 00:14:47 README.html
-rwxr--r--    35404504 2002/12/01 01:29:07 TRAY10.ZIP
-rwxr--r--    30106988 2002/12/01 01:29:09 TRAY11.ZIP
-rwxr--r--    27389104 2002/12/01 01:29:11 TRAY12.ZIP
[chare@gw chare]$
The users specified in the rsynd.secrets file do not have to exist on the local server. This allows a remote user to have only rsync access to the data. Additionally, it is possible to use shell wildcards in the rsynd.conf file when specifying the list of authorized users. This simplifies managing the user list, although it can also allow access to modules for unintended users.

The authentication protocol used by rsync is a 128-bit MD4 challenge/response system. When the server receives the connection, a challenge is created and sent to the client. When the user enters the password, the rsync client then generates the MD4 hash and transits it back to the server. The server takes the challenge and generates the MD4 hash using the password stored in the "secrets" file, and the values are compared. If they are the same, the password is correct and the user granted access. If the password is wrong, the user gets a message indicating the authentication for the module failed.

The authentication transaction when sniffed over the network is illustrated here. In the following packet, the rsync server is telling the client that authentication is required for the requested module. The challenge is provided. The authentication required and challenge notifications are shown in bold:

03:35:58.508510 localhost.rsync > localhost.33002: P 14:55(41) ack 20 
win 32767 <nop,nop,timestamp 132874500 132874499> (DF)
0x0000   4500 005d 4ccf 4000 4006 efc9 7f00 0001        E..]L.@.@.......
0x0010   7f00 0001 0369 80ea 5cc0 cc40 5c5c dcdc        .....i..\..@\\..
0x0020   8018 7fff ac4e 0000 0101 080a 07eb 8104        .....N..........
0x0030   07eb 8103 4052 5359 4e43 443a 2041 5554        ....@RSYNCD:.AUT
0x0040   4852 4551 4420 6462 6746 6e5a 5a37 7641        HREQD.dbgFnZZ7vA
0x0050   3454 4334 4d68 7a53 4f4c 4341 0a               4TC4MhzSOLCA.
The client passes back to the server the username (chare) and the MD4 hash of the password provided by the user. The username and MD4 hash are shown in bold:

03:36:00.426270 localhost.33002 > localhost.rsync: P 20:49(29) ack 55 
win 32767 <nop,nop,timestamp 132875482 132874500> (DF)
0x0000   4500 0051 5c3e 4000 4006 e066 7f00 0001        E..Q\>@.@..f....
0x0010   7f00 0001 80ea 0369 5c5c dcdc 5cc0 cc69        .......i\\..\..i
0x0020   8018 7fff 3961 0000 0101 080a 07eb 84da        ....9a..........
0x0030   07eb 8104 6368 6172 6520 3636 565a 5843        ....chare.66VZXC
0x0040   5162 376f 6e59 6b63 4e39 724d 366e 5077        Qb7onYkcN9rM6nPw
0x0050   0a                                             .
The rsync server responds with "OK", indicating the authentication was successful and then the requested actions are performed:

03:36:00.427694 localhost.rsync > localhost.33002: P 55:67(12) ack 49 
win 32767 <nop,nop,timestamp 132875482 132875482> (DF)
0x0000   4500 0040 4cd0 4000 4006 efe5 7f00 0001        E..@L.@.@.......
0x0010   7f00 0001 0369 80ea 5cc0 cc69 5c5c dcf9        .....i..\..i\\..
0x0020   8018 7fff 66c6 0000 0101 080a 07eb 84da        ....f...........
0x0030   07eb 84da 4052 5359 4e43 443a 204f 4b0a        ....@RSYNCD:.OK.
While MD4 may be sufficient, stronger encryption may be required for some applications. MD4 is vulnerable to dictionary attacks [1] and cryptanalytic attacks against MD4 are published, so it has been abandoned in other applications in favor of MD5. Additionally, the MD4 hash is applied against only the authentication credentials -- none of the data is encrypted. Therefore, protection from MD4 brute force attacks and data encryption requires using rsync with ssh as the shell transport.

rsync Access Controls

Rsync provides additional access controls using the "host allow" and "host deny" directives in the configuration file. The hosts allow option specifically grants access to the rsync server for the identified hosts, while hosts deny specifically forbids connections. Both can be defined using a hostname or an IP address. If hosts allow is used and the connecting system does not match one of the allowed formats, the connection is rejected. For example, if the hosts allow parameter is defined as:

[slides]
        path = /ftp/pub/slides
        comment = Slides and Pictures (requires authentication)
        auth users = chare, cote
        secrets file = /etc/rsyncd.secrets
        hosts allow = 192.168.0.3
only connections from this IP are allowed:

[chare@gw chare]$ rsync  localhost::slides//
@ERROR: access denied to slides from localhost (127.0.0.1)
rsync: connection unexpectedly closed (72 bytes read so far)
rsync error: error in rsync protocol data stream (code 12) at io.c(150)
[chare@gw chare]$
[chare@gw chare]$ rsync  192.168.0.3::slides//

Password:
drwxr-sr-x        4096 2003/01/14 22:58:02 .
-rw-rw-r--           3 2003/01/15 02:20:01 .count
-rw-rw-r--           3 2002/12/08 21:28:22 .count~
-rw-r--r--         592 2002/12/01 16:47:34 HEADER.html
-rw-r--r--         187 2002/12/01 00:14:47 README.html
-rwxr--r--    35404504 2002/12/01 01:29:07 TRAY10.ZIP
For both the hosts allow and hosts deny parameters, the following definitions may be used to identify a system:

  • A dotted decimal IPv4 address of the form a.b.c.d, or an IPv6 address of the form a:b:c::d:e:f. In this case, the incoming machine's IP address must match exactly.
  • An address/mask in the form address/mask, where address is the IP address and mask is the number of one bit in the netmask. All IP addresses that match the masked IP address will be allowed or denied depending upon the parameter.
  • An address/mask in the form address/mask-address, where address is the IP address and mask-address is the netmask in dotted decimal notation for IPv4, or similar for IPv6 (e.g., ffff:ffff:ffff:ffff:: instead of /64). All IP addresses that match the masked IP address will be allowed or denied depending upon the parameter.
  • A hostname. The hostname, as determined by a reverse lookup, will be matched (case insensitive) against the pattern. Only an exact match is allowed.
  • A hostname pattern using wildcards. These are matched using the same rules as normal Unix filename matching. If the pattern matches then the client is allowed.

The hosts allow and hosts deny can both be used in the same module definition. If both are defined, the hosts allow parameter is checked first, and a match results in the client being permitted to connect. If the host is not in hosts allow, hosts deny is checked and if there is a match, the client is denied access. If there is no match in either parameter, the client is allowed to connect.

The example illustrated allows a single host to connect to this rsync server. Allowing two servers to connect requires using a comma-separated list of IP addresses. If the intent is to allow all machines on a given subnet access, then use of the IP address/netmask syntax is preferable. For example:

hosts allow = 192.168.0.3, 192.168.0.100
hosts allow = 192.168.0/24, 192.168.10/255.255.255.0
Bear in mind, the default is to allow all hosts to connect to the rsync server.

Making a Connection Aside from making a connection using rsync and the remote shell transport, connections to the rsync server itself can be made in a number of ways to suit the specific connectivity requirements. The connectivity options include the following:

  • Connecting to the rsync server without a remote shell transport
  • Connecting to the rsync server using a remote shell transport
  • Running an rsync server over a remote shell transport

Connecting to the rsync server without the remote transport has been illustrated in a number of examples; however, it will be explained here again for clarity. A connection to an rsync server occurs when two colons are included in the lost and pathname, such as:

[chare@gw chare]$ rsync  192.168.0.3::slides//
The two colons instruct the rsync client to connect using the rsync protocol to the designated server on TCP port 873. If there is no rsync server running, the connection will be refused. If required, you can run rsync through a proxy, provided the proxy is defined using the RSYNC_PROXY environment variable and the proxy has been configured to allow proxying to port 873.

If the configuration you require must be non-interactive and a password is required, the environment variable RSYNC_PASSWORD may be used to store the password:

[chare@alpha chare]$ RSYNC_PASSWORD="2bad4UandME"
[chare@alpha chare]$ export RSYNC_PASSWORD
[chare@alpha chare]$ env
TERM=xterm
SHELL=/bin/bash
SSH_CLIENT=192.168.0.12 3494 22
SSH_TTY=/dev/pts/2
USER=chare
MAIL=/var/spool/mail/chare
PATH=/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin:/home/chare/bin
PWD=/home/chare
HOME=/home/chare
RSYNC_PASSWORD=2bad4UandME
LOGNAME=chare
[chare@alpha chare]$
The password is only used for authenticating to the rsync daemon -- it is not used to provide a password for the remote shell. Use caution when using the RSYNC_PASSWORD environment variable because some systems allow users to see all environment variables. In this case, the rsync password would be exposed. Consequently, use of this environment variable is not recommended.

However, since rsync itself provides no data encryption capabilities, it is often desirable to connect to the remote rsync server and still have all the rsync server functionality while using a transport such as ssh. In this case, the user must provide the remote shell transport command, the remote user, and the rsync user. The format for this command is:

$ rsync -av -rsh="ssh -l ssh-user" rsync-user@host::module[/path]local-path
This command line uses the rsync -rsh option to specify ssh as the remote transport. Alternatively, the RSYNC_RSH environment variable can be used to set the remote shell transport information. The -l option to ssh indicates the specified user is to be used to log in to the remote system.

If the remote rsync user is different from the user executing the command, the account for the remote rsync server must be specified by preceding the hostname with the username. The distinction to note here is the ssh-user will be used to access the system through the ssh transport, while the rsync-user will be used to access the rsync server.

Note that if you need this functionality, you must use rsync 2.5.6 or higher. Rsync 2.5.5 did not support this operation.

Using the rsync server with a transport such as ssh eliminates the requirement for configuration of the rsync server within inetd. The ssh command model supports using the "command=COMMAND" entries in the remote user's SSH authorized_keys entries, making it possible to use single-use keys to establish the connection, start the resync server on the remote system, and transfer the files.

The "command" entry in the SSH authorized_keys file would be:

rsync --server --daemon .
The trailing "." in the command entry is important, because rsync is expecting it for proper operation and is a requirement for the SSH configuration.

Security Implications

As you may have noticed in this article, there are security implications when using rsync. These implications are:

  • Use of MD4 as the authentication challenge/response hash. While MD4 is not subject to brute force attacks (because 128-bit keys are still considered an "unreasonably hard problem"), there are cryptanalytic attacks capable of exposing the user's password. However, MD4 is subject to a dictionary attack, which is often the method of choice by attackers when the hash or ciphertext is available. Although MD4 is not prone to brute-force attacks, many applications have replaced it with MD5.
  • The authentication "secrets" are stored in clear text in a file. Passwords should never be stored in clear text. However, rsync can be configured to "force" the permissions on the secrets file. If the permissions are incorrect, specifically meaning "others" can read it, the authentication fails as shown in this log file entry:
    Mar  1 04:47:45 gw rsyncd[18585]: secrets file must not be other-accessible (see strict modes option)
    Mar  1 04:47:45 gw rsyncd[18585]: continuing without secrets file
    Mar  1 04:47:45 gw rsyncd[18585]: auth failed on module slides from gw.chare-cissp.com (192.168.0.3)
    
    Consequently, it is likely that incorrect permissions will be captured quickly, but not until the passwords in the secrets file have been captured.

  • There is no encryption of data as it is transmitted. Arguably, this may not be a function of rsync, although there may be some plans to include SSL/TLS-based session encryption in future revisions of rsync. At this time, however, any requirement for encryption to protect the data as it moves across the LAN, corporate wide area network, and especially the Internet, should be satisfied using Secure Shell (ssh).

  • Firewall operation. Using rsync through a firewall requires the addition of firewalls establishing connectivity from inbound requests to the rsync server. This can include several ports, depending upon how your organization will allow connections to the rsync server. The ports can include:

    Service Name Port
    Rsync server 873 TCP
      873 UDP
    Ssh 22 TCP
    Rsh 514 TCP

    Finally, the Sardonix Open Source auditing team (http://www.sardonix.org) has reviewed the rsync source code and no significant security vulnerabilities were found. Notably, the reviewers commented how the developers have approached their development with security in mind. However, like all development activities, there is still room for improvement, although no significant security exposures were identified.

    Summary of Benefits

    As discussed at the beginning of the article, there are extensive benefits for using rsync. It is open source and free, allowing the organization to include it in products or make changes as required for their specific applications. Rsync is small, and supports data transfers, even of large files, over very slow links, including analog modem connections.

    Rsync can be used over secure transports, allowing for its use in transferring data over the Internet and can provide file distribution capabilities like FTP, albeit with a more compact protocol. Consequently, rsync is a viable tool for a wide range of applications including data sharing, application distribution, and file synchronization.

    Conclusion

    Many data transfer methods are available in the marketplace today, depending upon budget and complexity. Rsync provides a viable alternative to the complex tools available in the commercial marketplace, while offering incredible reliability as a data transport.

    The security risks are well understood with rsync, and when used with encrypted transport tools such as SSH, there is incredible security available to the user or organization. Additionally, the cross-platform availability of rsync solves some problems typically associated with moving data from a Windows to Unix system, or vice versa.

    Since rsync is an open source tool, some organizations will be reluctant to use it. However, there is considerable discussion in various rsync mailing lists regarding features, bugs, enhancements, and general "how to" questions, for even the most avid reader to still learn something about rsync implementation and use.

    Resources

    There is a mailing list for the discussion of rsync and its applications. It is open to anyone to join. See the Web page at: http://lists.samba.org/.

    The main Web site for rsync is http://rsync.samba.org/. The main ftp site is ftp://rsync.samba.org/pub/rsync/. This is also available as rsync://rsync.samba.org/rsyncftp/.

    Acknowledgements

    The author thanks Mignona Cote, a trusted friend and colleague, for her support during the development of this article. Mignona continues to provide ideas and challenges in topic selection and application, always with an eye for practical application of the information gained. Her insight into system and application controls serves her and her team effectively on an ongoing basis.

    Chris Hare's experience encompasses more than 18 years in the computing industry. Chris is the co-author of New Riders Publishing's Inside Unix, Internet Firewalls and Network Security, Building an Internet Server with Linux and The Internet Security Professional Reference. He frequently speaks on Unix, specialized technology and applications, security and audit at conferences. Chris currently lives in Dallas, Texas and is employed with Nortel Networks as an Information Security and Control Consultant. He can be reached at chare@chris-hare.com.

    Footnote

    1 A dictionary attack against the challenge/response system used by rsync involves capturing packets between the rsync client and server. Also, an MD4 generator is used to create MD4 hashes of all possible dictionary words and combinations. The created MD4 hash values are then compared against the collected hash values to determine whether the value is correct. This is a similar process used by the Unix password attack program, Crack.

     

  •