LDAP Directory Basics
Lizmari Brignoni
The single sign-on dream is becoming a reality thanks to the trend for applications to use directory servers, commonly referred to as LDAP servers. In this article, I will describe some of the basics of a directory that uses the LDAP protocol (Lightweight Directory Access Protocol). I will also illustrate an implementation using the Netscape Directory server, which is available for Solaris, HP-UX, AIX, Compaq UNIX, IRIX, NT, and Linux.
Directory Basics What is a directory server and when should I use it? A directory server contains a database backend that has been tuned for read-only transactions. Its front end is the LDAP protocol. This type of server is very useful for applications that perform authentication functions, mail routing, certificate storage, information lookups (white and also yellow pages), storage of preferences that do not change too frequently, and other uses. A common misuse of directory servers is to use them as a transactional database. Using the directory for a large volume of write/modify transactions can cause performance issues.
Directory servers that use the LDAP protocol are normally configured to use port 389. This is the default port used by applications such as the Netscape Communicator Address Book, which is a common way of searching the directory and can be considered as an example of white pages.
A secured LDAP (LDAPS) port can also be configured; the standard port is 636. The administration port seems to be randomly selected, so choose a number you'll remember. This is the port used by the GUI to access your server to stop/start it and make configuration changes. By the way, the administration server must run as root if the directory server is going to use the standard ports mentioned above.
Configuration Servers A configuration directory server was introduced in Netscape 4.x. The configuration server stores configuration information about servers on its database. Without it, servers cannot start up. This type of server needs to be installed prior to installing a directory server with user information. A couple of decisions need to be made when installing:
Will the configuration server reside on the same physical machine as the user directory?
Which port will be used for the configuration server?
If you have the hardware, I recommend installing the configuration server separately. This way, other applications using the configuration directory server, such as the Netscape Messaging server, will not be affected if there is a hardware or network problem with the user directory server. A smaller machine can be used, since the server experiences less traffic than the user directory server.
As for the port number, I would use one that is easy to remember and still reminds me it's a directory server. The vendor recommends not using port 389 for the configuration server instance, because it can prevent you from installing a user directory server on the same host. I usually use port 489 for the configuration server for this reason. Just make sure that the port you choose is not currently being used (see Figure 1).
An administration domain will also need to be selected. There are different ways of doing this, and for a detailed description I suggest reviewing the installation guide. Basically, choose one domain unless there is another organization within your company that is interested in managing their own servers.
DIT First, start with the Directory Information Tree (DIT). The tree is where the entries will be stored. This can be compared to tables in a database, but they're not quite the same because of DIT's hierarchical nature as well as its object-oriented design. For simplicity, I'll say that the tree starts at the Organization level (O), which is the parent of all the OrganizationalUnits (OU). It's common practice to use the company name as the value for Organization (O).
There are different approaches for designing a tree:
Regional
Business Function
Other
When following the Regional approach, the information is divided by areas, such as Americas, Europe, etc. It looks like this:
o=4sitesoftware
ou=Americas
ou=Europe
In this environment, the master server contains all entries, but the replicas can be configured to contain the data specific to their region. The advantage of this approach is that users in Europe do not have to come across the network to log in. A disadvantage is that every time a user from Europe wants to send email to someone in the Americas, they must come across the network for the required information.
Another commonly used method for designing a DIT is by Business Function. The replica for the Engineering department could be located in the Engineering area. However, if employees are moving around the company (i.e., the administrative assistant for Engineering now works in the Marketing department), the update application must perform an add transaction to the new branch and a delete transaction on the old branch. This works fine, it's just something we want to minimize for performance reasons.
The approach I took was based on categories. Here's an example of a tree:
o=4sitesoftware
ou=Employees
ou=ConferenceRooms
ou=Groups
I plan to store employee records under ou=Employees and use the description attribute to specify whether it's a permanent employee or a consultant. I will use the ou=ConferenceRooms to store conference room information, which is useful when using a calendaring application. The ou=Groups will be used to store entries for groups of employees. By using groups and Access Control Instructions (ACIs), I will be able to control which members of which groups can access particular information.
For example, I will probably want to create a Help Desk group that can only change passwords. I can control access in such a way that the Help Desk group members are not able to add entries, ensuring that they do not add unauthorized people to their group.
Schema I'll now cover the specifics of a record. What do I want to use to identify each record? Do I want to use an employee number, or maybe their last name?
The first part of the object (record) is its Distinguished Name (dn). The dn will contain a unique identifier for the record as well as the branch of the tree where the record is located (its Relative Distinguished Name (RDN)). By looking at the list of attributes on the slapd.at.conf file, I can see that there are various options that can be used to identify a person.
Let's say that Ross Geller is an employee of 4sitesoftware. Among the different attributes that exist, I decided to use employeeNumber (see Figure 2). His dn would look like this:
dn: employeeNumber=123, \
ou=Employees,o=4sitesoftware
Choosing the attribute to be used in the dn is a matter of preference; remember that you are choosing a unique identifier that will allow you to retrieve other pieces of information from that record. I prefer the employeeNumber because it's easier to ensure uniqueness and there's less maintenance. For example, if someone gets married and changes her last name, the employeeNumber wouldn't change but her name would. So, I need to issue an add/delete transaction to reflect this change if her name is part of her dn. If the person's name is not part of the dn, I simply perform a modify transaction to change the particular attributes affected (her common name(cn attribute) and surname).
Each record will have at least one object class assigned to them. An object class might require that one of its attributes has a value before it can add a record. Other attributes can be optional. This can be customized, as I'll show later. Here's an example of an entry:
dn: employeeNumber=124, ou=Employees,o=4sitesoftware
objectclass: person
sn: Geller
cn: Monica Geller
userpassword: {SHA}5ABC723dfsg=
Notice that in a database, the database administrator chooses the most appropriate name for each field. In a directory, I can and should use the attributes that vendors have already agreed to in the RFCs. In the previous example, I used the person objectclass, which is a standard. To view a list of all the predefined attributes and objectclasses on the Netscape Directory server, view the slapd.at.conf and slapd.oc.conf, located in /installationdirectory/slapd-yourservername/config/.
Extending the Schema There will be times when there isn't a predefined attribute I can use. I can define an attribute and then define an objectclass to contain the attributes I need. In tree and schema design, I prefer to add attributes and minimize the number of branches in a tree. In other words, use an attribute called companyName, instead of creating a branch (OU) for each company you will be dealing with.
To illustrate this, imagine I am defining an attribute to store the technical skills each employee has; this way when I need resources, I can quickly get a list of all the people that know Perl, for example. I can add a line like this one on the slapd.user_at.conf:
attribute skills skills-oid cis
Here I am defining an attribute by assigning it a name, a unique identifier (skills-oid), and a case ignore string type (cis). At this point, there's no need to provide the details about OIDs; they can be numbers, and they must be unique. It is also very convenient to work with case-insensitive strings instead of case-exact strings and other types available.
I will probably have to add a few custom attributes based on what the directory is going to be used for. These attributes must be assigned to an objectclass, which can be done by editing the /installationdirectory/slapd-yourservername/config/ \
slapd.user_oc.conf. Define an objectclass for the attribute above:
objectclass 4sitesoftwareEmployee
oid 4sitesoftwareEmployee-oid
superior person
allows
skills
This new objectclass (4sitesoftwareEmployee) has the person object class as its parent. This means that when defining a record that has this class, it must also have the parent class defined on the record and meet all criteria for that class (must include all the required attributes for that class). This is the reason why you'll see a record with multiple object classes.
Indexes I'll now discuss directory searches. To assist in searching, the directory uses indexes. You can edit the indexes to suit your needs. In this example, I want quick response when I perform a search on the skills attribute. I need to edit the slapd.ldbm.conf file (used to be slapd.dynamic_ldbm.conf on 3.1) and add a line like this one:
index skills pres,eq
There are different types of indexes, and for the sake of brevity I'll just say that I chose presence and equality. In other words, I will be able to search and see if there is any employee that is skilled on Perl or search for skills=Perl and get a report of all the Perl experts in the company.
Adding Entries Via LDIF There are different ways of performing updates to the entries in the directory. There are C libraries, Perldap, Java libraries (JNDI), and others, which are out of the scope of this article. However, for the occasional update of a few records, a simple text-based method can be useful. By using an LDAP Data Interchange Format (LDIF) file, entries can be added, modified, or deleted.
To add a new employee to the directory, I can use an editor such as vi to create a text file. I'll call it thefile.ldif:
dn: employeeNumber=125, ou=Employees,o=4sitesoftware
changetype: add
objectclass: person
objectclass: 4sitesoftwareEmployee
sn: Green
cn: Rachel Green
userpassword: test
skills: perl
To add this user to the directory, I need the directory's root or another authorized account. The person that installed the directory server will be the one configuring this priviledged user and assigning it a password. A common entry for the root is cn=Directory Manager,o=4sitesoftware. By performing the following command, this user will be added to the directory:
./installationdirectory/shared/bin/ \
ldapmodify -D "cn=Directory \
Manager,o=4sitesoftware" -w mypassword -f thefile.ldif
If the directory server is set up to use Secure Hash Algorithm, it will store the password from the ldif file in an encrypted format. Now that I have added the entry, I can perform searches to ensure the entry really exists:
./installationdirectory/shared/bin/ \
ldapsearch /-b o=4sitesoftware
employeenumber=125
or
./installationdirectory/shared/bin \
/ldapsearch -b o=4sitesoftware \
"cn=rachel green"
Note that I can perform the searches in lower case since those attributes are defined as case ignore string (cis). Also, I can search for any attribute defined in the entry.
Groups and Multivalued Attributes Using groups provides the advantage of controlling access in a centralized manner. Imagine that I want to add a programmers group with Monica and Rachel as the members. This can be accomplished via the Netscape Web interface or by using an LDIF file:
dn: cn=Programmers, ou=Groups, o=4sitesoftware
changetype: add
objectclass: top
objectclass: groupOfUniqueNames
cn: Programmers
uniquemember: employeeNumber=124, ou=Employees,o=4sitesoftware
uniquemember: employeeNumber=123, ou=Employees,o=4sitesoftware
uniquemember: employeeNumber=125, ou=Employees,o=4sitesoftware
As you can see, I am using an attribute multiple times. This is called a multivalued attribute. It can come in handy when I need to store things like common names:
dn: employeeNumber=126, ou=Employees,o=4sitesoftware
changetype: add
objectclass: person
objectclass: 4sitesoftwareEmployee
sn: Tribbiani
cn: Joey Tribbiani
cn: John Tribbiani
userpassword: {SHA}4XYZ869djrv=
skills: nt
The caveat is maintaining multivalued attributes. Because they are referred to as the same name (cn), I need to specify which one of them I am changing or deleting.
Access Control I'll now cover access control basics. Access Control Instructions (ACIs) can be set up via the Netscape interface (Console in 4.1, browser in 3.1). They can also be done via LDIF. For simplicity, I will mention the different types of access and some common practices. It is common to allow anonymous access for all attribute searches except userpassword. If other attributes need to be protected, they need to be specifically listed. I prefer to group them into one aci rather than creating an aci for each attribute to be protected.
Another example is the help desk or administrator's group that should be able to change passwords. Also, a user should be able to change his/her own password but not see it. It is also common to define a directory administrator's group that has root access. This allows privileged users to make changes and be traced by their unique identifier. It can be useful when trying to determine who made changes.
Replication The entries in the directory will be stored in the master directory. However, a copy of all these entries is stored in a replica (also commonly referred to as consumer). The master directory also stores the data related to a replication agreement. Frequency, subtrees to be replicated, and server names are stored in the master directory. This is configured via the GUI, the Netscape console.
Replicas can push or pull information. My experience has been using the push mechanism in which an application (let's say a messaging server), sends a password change up to its directory replica, which in turn sends it to its master (which has the writeable copy of the database). The master then sends the change back down to all its replicas. This is referred to as Supplier-Initiated Replication (SIR).
Replication intervals vary and depend on the uses of the directory. In certain environments, for security reasons, replication might occur as soon as a change is received. (Deletions need to be immediate, and deleting a user on the master will send an immediate deletion transaction to its replicas.) In other environments, replication can be planned to occur at intervals. It is the supplier server's responsibility to update the replicas (also known as consumers). Figure 3 illustrates how a configuration and a user directory server can be replicas of each other for redundancy.
In my experience, replication problems are one of the most common problems experienced, and are usually fixed by rebuilding replicas from an ldif file obtained from the master. This is not complicated but it is time consuming, since usually the applications need to get redirected via DNS or /etc/hosts to point to another server while the directory server it was using is rebuilt. The out of sync condition usually occurs when the network is unstable and a replica cannot figure out what is the last record processed.
Synchronization and Maintenance Synchronization status can be viewed via the Netscape console. However, you can only view one replica at a time. By performing a search on each server, I can look for the copiedfrom attribute and determine which servers are not in sync. This is the query for the master server:
ldapsearch -p 389 -h supplierdirectoryname \
-b "cn=monitor" -s base
This is the query that would need to be performed for each replica:
ldapsearch -p 389 -h replicadirectoryname \
-b o=4sitesoftware -s \
base "copiedfrom=*" copiedfrom
These commands can be put in a script that would perform all the searches from a central location, parse the results, and send an alert to the administrator if the replicas were not synchronized by a delta of 200, for example. Note that a change in one record does not mean that the number will increase by 1 necessarily. What's important is that the master and replica numbers are relatively close.
I strongly advise backing up the master directory in case it has to be rebuilt. This can be done as a cron job by using the db2ldif tool that is part of the Netscape directory package. I take the extra precaution of transferring a copy of this file via ftp to another server to ensure my data is easily accessible if needed and backed up twice. This is done via a cron job also. The vendor has also suggested to set up the server to write a copy of the database to another disk.
Metadirectories Metadirectories usually means that someone had to write a program to interface with dissimilar directories. Such a program can be something you purchase or a home-grown program. Netscape has some products, such as Solaris Extensions for Netscape Directory Server, that allow NIS information to be stored in their directory server. The Netscape Meta-Directory is another one of their products, which provides two-way synchronization between their directory server and messaging systems, network operating systems, and databases.
An example of a home-grown metadirectory is where the HR group owns a lot, if not all, of the entries you will need in your directory. Imagine they agreed to provide you with an ASCII file of the changes that happen each day (additions, changes, deletions). A program has to parse the file, make the modifications (or write an ldif), and create reports to ensure that the loading program ran successfully.
Another example is the opposite situation from above, in which there is a Web page where the employee changes his phone number. Then, a program writes to the directory and pushes it to the HR system, which might be using a relational database.
Conclusion Designing a directory can be an interesting challenge, and I am glad that use of LDAP directories is spreading. It is a fairly simple technology that gets the job done, if used correctly.
About the Author
Lizmari Brignoni has been a systems administrator for six years and a directory administrator for the last two and a half years. She can be reached at: Lbrignoni@netscape.net. |