NFSv4 -- An Early Implementation

Mark Perino

When RFC 2624 was published in 1999, people running NFS began looking forward to a significant revision that would address what some saw as the protocol's major deficiencies. RFC 3530 followed in April 2003 and placed NFS version 4 on the protocol standards track.

New Features

To address perceived deficiencies, NFSv4 aims to improve access and performance over the Internet, transit firewalls easily, and scale to a large number of clients per server. Version 4 adds strong security with negotiation built into the protocol and improves cross-platform interoperability with Windows. The protocol is also designed to accept standard extensions that do not compromise backward compatibility. Beyond even those high-level goals, there is a lot in the details of version 4 that makes it an interesting update to the protocol.

The first thing I noticed when using NFSv4 was the difference in the way exports are mounted. NFSv4 uses the concept of a pseudo-fs to build an export tree for all clients. A pseudo-fs tree pieces the exported directories together into one coherent filesystem view. For example, if I wanted to export /vol1/home, /vol2/home, /projects, and /usr/local, a pseudo tree (Figure 1) would be built by gluing together all these filesystems and whatever is below them. Permissions, ACLs, and export restrictions still apply. Why introduce a pseudo-fs in version 4? Part of the answer lies in the changes to the insecure mount protocol, described below.
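To make this concrete, here is a sketch of the view a client would see for the exports above, along the lines of what Figure 1 depicts. The server fabricates the intermediate directories (/vol1, /vol2, /usr) so that the exports hang together under a single root:

    /                    <- pseudo-fs root presented to NFSv4 clients
    |-- vol1
    |   `-- home         <- exported /vol1/home
    |-- vol2
    |   `-- home         <- exported /vol2/home
    |-- projects         <- exported /projects
    `-- usr
        `-- local        <- exported /usr/local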

One of the annoyances of NFS was trying to get it to work securely across a firewall. Because versions 3 and lower relied on RPC and the mount protocol, numerous ports needed to be opened. Most network security administrators had good reason not to trust the security of opening up a hole for RPC on the firewall. With version 4, only the NFS port needs to be opened: all RPC and NFS calls are performed on the well-known NFS port, 2049.
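As a rough illustration of how simple the firewall rule becomes, a Linux iptables rule along these lines (the exact chain and policy details will vary by site) is all an NFSv4 server needs:

    # Allow inbound NFSv4 traffic; version 4 runs everything over TCP port 2049
    iptables -A INPUT -p tcp --dport 2049 -j ACCEPT
    # No portmapper (111) or mountd holes are required, unlike v2/v3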

This feature, combined with the pseudo-fs, closes yet another security hole from the older NFS versions. Suppose /vol1/home/dexter is exported so that only the user dexter has access rights, and under that directory there is a world-readable file, /vol1/home/dexter/secretplan. NFS versions prior to 4 would allow a file handle for secretplan to be created, giving other users a way in. Because version 4 cannot differentiate a mount attempt from any other access, it applies the export restrictions uniformly and disallows the access.

Version 4 also uses volatile file handles to better support filesystems that do not have the concept of a persistent, compact inode number (e.g., the High Sierra filesystem for CD-ROMs). This adds support for more varied filesystems but means that, in special cases, a file handle may change.

Because one of the goals was to improve Internet performance, something had to be done to reduce the number of round-trip operations that are common in NFS. Reducing round trips reduces the penalty paid to latency on a longer distance connection. Version 4 introduces the compound procedure to accomplish this: instead of each procedure waiting for its own acknowledgement, actions that can be grouped together are now sent in one procedure with one acknowledgement.
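For a rough idea of what this looks like on the wire, here is a conceptual sketch (not actual client code) of a single COMPOUND request that walks the pseudo-fs and fetches a filehandle plus its attributes in one round trip; the operation names come from RFC 3530:

    COMPOUND {
        PUTROOTFH        # start at the pseudo-fs root
        LOOKUP "vol1"    # descend the export tree...
        LOOKUP "home"    # ...one component per operation
        GETFH            # return the resulting filehandle
        GETATTR          # and its attributes, all in one reply
    }

Under versions 2 and 3, each of these steps would have been a separate RPC, each paying the full network latency.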

Many of the improvements in version 4 are the result of an increased focus on security. NFSv4 uses the GSS-API framework, with LIPKEY as its default public key mechanism. Whereas previous NFS versions often supported only the NULL and UNIX authentication methods, the version 4 NFS client for Linux and other clients support Kerberos 5. This is done via RPCSEC_GSS, in the same way that Solaris clients already do. Another Kerberos similarity is the way users and groups are identified by a global namespace: instead of passing UIDs and GIDs back and forth, version 4 uses user@domain or group@domain, and the client translates these back into a UID and GID on the host.
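On the Linux client, this string-based mapping is handled by an ID-mapping daemon. As a sketch only (the file path and option names here follow the idmapd.conf layout used by the Linux NFSv4 work; the exact file shipped with any given package set may differ), the configuration amounts to little more than naming the NFSv4 domain:

    # /etc/idmapd.conf -- maps user@domain strings to local UIDs/GIDs
    [General]
    Domain = example.com         # placeholder; use your own domain

    [Mapping]
    Nobody-User = nobody         # fallback for unmappable users
    Nobody-Group = nobody        # fallback for unmappable groups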

One nice thing about most of these security improvements is that they move NFS closer to compatibility with Windows and CIFS. Because the security design is modular, it is also possible that, if a better or more compatible system were needed, it could be easily implemented.

Besides the security improvements, other notable features move version 4 closer to CIFS compatibility. Version 4 implements a feature called delegation, which was inspired by CIFS opportunistic locks. Delegation allows the server to lease out all responsibility for locking and updating a file to the client when only one host is accessing it. When the lease expires, the delegation is revoked, just like Windows oplocks. When a client requests read-only access to a file, multiple clients may be delegated a read-only lease; if another client then requests write access, the leases are revoked. This eliminates the clients' need for periodic checks to see whether the file is still consistent. Delegation is an efficient way to implement client-side caching with a minimum of client/server round-trip traffic.
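Putting the delegation mechanics into a timeline makes the oplock resemblance clear. This is a conceptual sketch, not a wire-accurate trace, though CB_RECALL and DELEGRETURN are the actual RFC 3530 operation names:

    client A: OPEN file       -> server grants A a write delegation
    client A: reads, writes   -> handled locally, no server round trips
    client B: OPEN same file  -> server recalls A's delegation (CB_RECALL)
    client A: flushes changes, returns the delegation (DELEGRETURN)
    server:   grants B access; A reverts to per-operation requests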

Another feature with similarities to CIFS is mandatory locking. Version 4 implements leases for delegations and locking. The server leases files to the client, but the client is responsible for renewing the lease. When the lease on a file expires, the server assumes a communications failure and may allow other hosts to lease the file. To aid data integrity and reduce stale file locks, each client is assigned a 64-bit client_id. If the client reboots, it is assigned a new client_id. When the server sees a new client_id coming from a host with the same identity information, it frees up all locks held under the previous client_id after waiting for the client to reclaim them.

These cover most of the mandatory protocol features, but version 4 also defines recommended attributes that may or may not be implemented, depending on your server and client. Features such as ACL support, archive bits, and others fall into this category. I was unable to test some of these, although they may be available with our client and server combination.

What's Available, What's Coming

So, what clients and servers are available at this time? In February 2003, Network Appliance released code enabling its filers to support NFSv4 with Data ONTAP 6.4. Just a few weeks later, a stable release of the Linux NFSv4 client was back-ported to 2.4 kernels. Version 4 support is already incorporated into the 2.5 development kernels. At the time of writing, the only other publicly available client was for OpenBSD. Of the commercial vendors, both IBM and Sun claim to be testing NFSv4 server and client packages. While not yet available, IBM reports that its implementation will likely be a package for AIX 5L. Packages for Solaris were also not available when this article was written; however, Sun is working to incorporate server and client support for version 4 into the upcoming release of Solaris 10, and it is likely that a server and client package will also be made available for Solaris 9. HP was unable to provide an answer regarding NFSv4 support for Tru64, OpenVMS, or HP-UX.

The Test Environment

The environment used for testing consisted of a NetApp FAS940c filer and a pool of Linux clients running Red Hat 7.3 in San Diego, plus two remote clients on the East Coast. The LAN configuration was the filer connected to switches via dual 1-Gb interfaces, with clients throughout the LAN on individual 100-Mb connections. The remote clients were across a corporate frame relay network and behind a firewall.

Enabling version 4 on the server was done by setting the nfs.v4 options on the filer console. The command "options nfs.v4.id.domain domain" sets the domain against which IDs are checked; in our case, that was our NIS domain, so the filer used NIS as its source for string IDs (user@domain and group@domain). (Although NFSv4 supports Kerberos, I did not finish in-depth testing of it in time for this article.) I also set "options nfs.v4.enable on" to allow the filer to negotiate version 4 connections.
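Putting the two settings together, the console session looked roughly like this (the prompt and domain name shown are placeholders; substitute your own NIS domain):

    filer> options nfs.v4.id.domain our.nis.domain
    filer> options nfs.v4.enable on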

I tested to ensure that, once version 4 was enabled on the server, our version 2 and version 3 clients could still connect. Once I was sure there were no ill effects, I began upgrading the clients.

To upgrade NFS on the Linux clients, I downloaded the 2.4 kernel modules from http://www.citi.umich.edu/projects/nfsv4/. The host running a 2.5 kernel already had version 4 support compiled in. The install was as simple as adding the mount-2.11n-12.nfsv4.i386.rpm, nfsv4-daemons-0.9-1.i386.rpm, and initscripts-6.67-1.nfsv4.i386.rpm packages to our systems. Just as the documentation accompanying the RPMs promised, there was a slight change in the options I could pass to mount compared with previous versions of mount under Linux.
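For reference, a minimal sketch of the install step, assuming the three RPMs named above have been downloaded into the current directory:

    # Upgrade/install the CITI NFSv4 packages in one transaction
    rpm -Uvh mount-2.11n-12.nfsv4.i386.rpm \
             nfsv4-daemons-0.9-1.i386.rpm \
             initscripts-6.67-1.nfsv4.i386.rpm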

NFS Version        Command
NFSv2              mount -o vers=2 server:/export /mnt/server
NFSv3 over UDP     mount -o vers=3,proto=udp server:/export /mnt/server
NFSv4              mount -t nfs4 server:/export /mnt/server

Intuitively, you might expect the command to mount a version 4 filesystem to be "mount -o vers=4 server:/export /mnt/server". However, this didn't work for the Linux client, nor did specifying proto=udp in conjunction with version 4, since TCP is the mandatory transport. Once I got past the syntax issues, both the 2.4 and 2.5 kernel machines saw the shared files as expected. Navigating the pseudo filesystem is not much different from previous versions, because of the layout of our exports. If I attempted to mount an invalid part of the pseudo-fs tree, the mount command gave the error "mount: wrong fs type, bad option, bad superblock on server:/fakedir, or too many mounted file systems".
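To make version 4 mounts persistent across reboots, the same syntax should carry over to /etc/fstab. A sketch (the mount options here are illustrative defaults, not the exact ones we used):

    # /etc/fstab -- note the nfs4 filesystem type instead of nfs
    server:/export   /mnt/server   nfs4   rw,hard,intr   0 0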

The Test Results

The upgrade of the server and clients was straightforward and appeared problem-free in all the basic tests I ran. Jobs that ran before the upgrade still ran, and users on those systems couldn't tell the difference. Signs of trouble didn't appear until later.

One of the major goals mentioned previously was better compatibility. I tried to break a mixed CIFS and NFS share by having different hosts overwrite each other's updates and by deleting files that were in use by an application. In every case, data integrity was protected and cache coherence was maintained. As encouraging as the CIFS-to-NFS compatibility was, I encountered some incompatibilities with applications on the host. Most seemed to be the result of applications being hard-coded to expect NFSv2/3. An example is the nfsstat -m command on the clients: because the statistics for each version are kept separately, and the tool was written before version 4 existed, it does not report any of the v4 I/O stats. The good news is that all the bugs I encountered (and more) are already open and are slated to be fixed in the next Linux client release.

Beyond testing to see what worked and what didn't, I wanted to see whether version 4 delivered improved performance. To measure this, I used IOzone and some automated batch jobs to generate a load and measure throughput from the host. From the server side, I gathered NFS statistics and monitored CPU usage. In initial testing with local clients, I did not see a large performance boost over NFSv3. However, as the number of clients increased, I noticed that CPU utilization on the server did not rise as high with version 4 as with version 3; with six local clients, the CPU load was roughly 10% lower. When I moved to the remote clients, there was a much larger improvement in throughput and latency.
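For anyone who wants to run similar measurements, here is an IOzone invocation in the spirit of these tests (the particular flags, file size, and path are illustrative, not the exact parameters used):

    # Automatic mode, write (0) and read (1) tests, against a file on the NFS mount
    iozone -a -i 0 -i 1 -g 512M -f /mnt/server/iozone.tmp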

I tested during both congested and uncongested times of day, and version 4 always matched or outperformed version 3. The best improvement was in the time to run a batch read/write job that did not use host caching very efficiently: its runtime was reduced by 13%. In all the performance testing, there was only one case where version 3 outperformed version 4. Tab completion and some directory-listing operations were noticeably slower with version 4. I contacted the CITI group at the University of Michigan, which developed the Linux client, and found that they already had a ticket open on that issue as well.

Conclusions

Like other early adopters, we encountered bugs that are common with a new version of a protocol. None of the bugs were serious enough to offset the potential improvements that version 4 offered. The improved security, performance, and features of version 4 are a great step forward.

References

RFC 2624 -- http://www.ietf.org/rfc/rfc2624.txt

RFC 3530 -- http://www.ietf.org/rfc/rfc3530.txt

The NFS version 4 Protocol -- http://www.netapp.com/tech_library/3085.html

NFSv4 -- http://nfsv4.org

Mark Perino lives happily in San Diego, California. He has a B.S. in Industrial Engineering from Rutgers College of Engineering, and worked there for six years. He specializes in data storage technology and is currently an Open Systems Engineer for Hitachi Data Systems. When not working, he enjoys time spent with his fiancée and stepson, and Yosemite in the winter. He can be reached at: mark.perino@hds.com.