The OpenLDAP Proxy Server
Reinhard E. Voglmaier
Most people think of proxy servers only as servers that access
resources on behalf of their users, but they can do much more. In
this article, I will discuss LDAP proxy servers in particular and
describe the functionality of this type of proxy. They can add access
control, serve resources to their users, verify that users are who
they claim to be, restrict access to resources, and rewrite requests
using regular expressions. LDAP proxy servers also provide attribute
mapping; this means they can map one attribute to another or hide
an attribute altogether. These servers are frequently used for load
balancing and fault tolerance, and can also have a cache to store
results of frequently requested queries.
Another strong point of LDAP proxy servers is that the proxy can
hide the complexity of the directory architecture from the end user.
This allows administrators to offer different views of the same
directory to different users. All these services, obviously, come
at a cost, so it's no wonder that software vendors offer their LDAP
proxy implementations for a high price.
Fortunately, there is an open source choice -- OpenLDAP. The OpenLDAP
implementation is based on the former "University of Michigan LDAP
Server" and can be downloaded along with complete documentation
from the official OpenLDAP Web site [1]:
http://www.openldap.org
The OpenLDAP group, headed by Kurt Zeilenga, is the main player in
the development of the LDAP standards and provides, with OpenLDAP,
a reference implementation of an LDAP server. The Web site offers a searchable
archive of how-tos, links to related software projects, and a number
of LDAP-related discussion groups.
In this article, I will give an overview of what LDAP proxy servers
do, then jump directly to the architecture of the OpenLDAP implementation
and, in particular, how the proxy fits into this design. I will
also explain how to build and configure the OpenLDAP proxy. I will
conclude with a few real-world scenarios. I think that the OpenLDAP
framework is applicable not just to little projects with a small
budget; it also scales very well, and I've done performance tests
with several hundred thousand entries.
You can find out more about the OpenLDAP proxy server in the documentation
shipped with OpenLDAP; detailed instructions can also be found in
The ABC's of LDAP (Voglmaier, Auerbach Publications [2]).
This article assumes you are already familiar with the LDAP protocol.
All the configuration files can be downloaded as usual from the
Sys Admin Web site (http://www.sysadminmag.com).
The OpenLDAP Proxy Server
As stated previously, an LDAP proxy server accesses services (in
our case, LDAP services) on behalf of a client's request. This architecture
is used frequently if the user is behind a firewall and wishes to
access resources outside, normally on the Internet. More generally,
the LDAP proxy provides a way of giving controlled access via the
LDAP protocol to resources outside the actual domain; therefore,
you may use it to join different domains in your intranet (e.g.,
different LANs located in different countries of your enterprise
intranet).
Every LDAP server consists logically of two parts: a frontend
and a backend. The frontend speaks the LDAP protocol [3] with the
LDAP clients and contacts the backend upon the client's requests;
the backend accesses the repository that actually holds the data
and provides it to the frontend. Figure 1 shows the OpenLDAP
architecture.
This architecture offers enormous flexibility. You can access
different data stores just using different backends. If you need
to access a different data store, not yet contained in the OpenLDAP
framework, you can roll your own backend (and, I hope, make it available
to the open source community).
OpenLDAP itself is shipped with a number of backends. The most
frequently used ones seem to be the database backends BDB and LDBM.
There is not much documentation available about which of these performs
better under which conditions, so I won't address that here. OpenLDAP
offers a backend that can store data in an RDBMS. You can also store
the data in flat files using a shell backend or Perl functions.
More about the available backends can be found in the documentation
shipped with the OpenLDAP server.
Finally, there are the two backends this article is all about:
the "ldap" module and the "meta" module. Both modules provide proxy
services -- the "ldap" module is the basic module, and the "meta"
module works on top of the "ldap" module and offers more sophisticated
proxy services. The idea is to use the backend not to access a repository
directly but to contact another LDAP server that holds the data.
Figure 2 shows this architecture.
The OpenLDAP Proxy Cache
Directory services are designed for fast data retrieval; the speed
of update processes, however, is not an important issue, since a
directory is not meant to be updated frequently. To speed up searches,
you can use the query caching extension of the OpenLDAP proxy server.
This extension holds the result of a query in a cache in memory
and looks at each query first to see whether it could recycle previous
query results. Figure 3 shows this architecture. The "Cache Manager"
makes the decision as to whether the result of a previous query
can totally or partially be reused to answer the current query.
This solution is new with the OpenLDAP 2.2.x version.
Unlike static Web content cached by Web proxy servers, the entries
cached in an LDAP proxy server are obtained from queries. Different
queries can produce similar result sets. The result set of a query
B can also be a subset of the result set of a previous query A; query
B is then said to be contained in query A. The "cache manager", therefore,
is responsible for understanding this "query containment". You can
find more about query containment in the resources by Apurva Kumar
[5, 6].
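To make the idea of query containment concrete, here is a minimal
sketch in Python. It is not how the OpenLDAP cache manager is
implemented; it only illustrates the principle under strong
assumptions: a query is a hypothetical (base, attribute, value)
tuple with subtree scope, values are either exact or end in a single
"*", and DNs are compared naively. Requested attributes, scopes, and
compound filters are ignored.

```python
def normalize_dn(dn):
    """Lower-case a DN and drop spaces around its components (naive)."""
    return ",".join(part.strip().lower() for part in dn.split(","))


def base_contained(base_b, base_a):
    """True if subtree base_b equals or lies below subtree base_a."""
    b, a = normalize_dn(base_b), normalize_dn(base_a)
    return b == a or b.endswith("," + a)


def value_contained(value_b, value_a):
    """True if every value matched by value_b is also matched by value_a.

    Only exact values and patterns with a single trailing '*' are handled.
    """
    vb, va = value_b.lower(), value_a.lower()
    if va.endswith("*"):
        return vb.rstrip("*").startswith(va[:-1])
    return vb == va


def query_contained(query_b, query_a):
    """Each query is a (base, attribute, value) tuple with subtree scope."""
    base_b, attr_b, value_b = query_b
    base_a, attr_a, value_a = query_a
    return (attr_b.lower() == attr_a.lower()
            and base_contained(base_b, base_a)
            and value_contained(value_b, value_a))
```

With these helpers, a search for sn=Voglmaier is reported as contained
in a search for sn=Vogl* on the same base, while the reverse is not.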
The job of the cache manager is therefore to handle the following:
1. Query Containment: Determines whether the current query is
contained in a previous one.
2. Cache Replacement: As new query results are added to the cache,
it continues to grow. Once the cache size grows over a high-threshold
watermark, the cache manager begins to remove cache entries. It
does this following a simple "least recently used" (LRU) policy: the
least recently used query and the entries belonging to it are removed.
The cache manager does this until the cache reaches a low-threshold
watermark.
3. Consistency Control: The cache content should not be held forever;
the cache manager uses a policy to decide when to clean up results
from the cache. OpenLDAP uses a parameter "TTL" (Time to Live) that
allows you to configure how long items should be held in the cache.
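The replacement and consistency policies described above can be
sketched in a few lines of Python. This is only an illustration, not
the OpenLDAP code: the class name QueryCache, its method names, and
the choice to count the TTL from the moment a result is stored are
all assumptions made for the example.

```python
from collections import OrderedDict


class QueryCache:
    """Toy query cache: LRU eviction between two watermarks, plus a TTL."""

    def __init__(self, high_watermark, low_watermark, ttl):
        self.high = high_watermark    # evict once more entries than this
        self.low = low_watermark      # evict until no more entries than this
        self.ttl = ttl                # seconds a stored query stays valid
        self.queries = OrderedDict()  # query -> (entries, time stored)

    def size(self):
        """Total number of cached entries over all queries."""
        return sum(len(entries) for entries, _ in self.queries.values())

    def get(self, query):
        """Return cached entries, marking the query as recently used."""
        if query not in self.queries:
            return None
        self.queries.move_to_end(query)
        return self.queries[query][0]

    def put(self, query, entries, now):
        """Store a result, then evict least recently used queries if needed."""
        self.queries[query] = (list(entries), now)
        self.queries.move_to_end(query)
        if self.size() > self.high:
            while self.queries and self.size() > self.low:
                self.queries.popitem(last=False)

    def expire(self, now):
        """Consistency check: drop queries stored longer ago than the TTL."""
        stale = [q for q, (_, stored) in self.queries.items()
                 if now - stored > self.ttl]
        for q in stale:
            del self.queries[q]
```

In this sketch, put() plays the role of cache replacement and
expire() the role of the periodic consistency check; a real cache
manager would call the latter from a timer rather than by hand.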
Please note that the cache feature is still considered a work
in progress; at the time of this writing, not every feature works
as you may expect from the documentation. Also, the documentation
for the cache features is not yet complete -- some changes may still
be made.
Installing the OpenLDAP Proxy Server
Now that we have seen a lot of theory, let's begin with practical
work. After you unpack the OpenLDAP distribution,
you must compile your server. Many operating systems already contain
a ready-to-use version of the OpenLDAP distribution. This comes
in handy, but will not give you the latest release and will not
necessarily result in a system tailored for your needs. Therefore,
I suggest you compile it yourself. I will limit my description to
the Unix platform, but note that there is a group porting OpenLDAP
to Microsoft Windows.
As with most open source programs, the first step is to customize
the distribution via the "configure" script followed by a number
of parameters. You do this to adapt the installation to your particular
environment. Then, you type make depend and, finally, make.
There is also a make test available, which runs a number
of tests against your compiled binaries. The make test, however,
works well only if you compile the complete distribution. In this
example, it won't work, because we won't compile in things we don't
need.
The parameters determine the configuration of your LDAP server;
the configure --help command gives a listing with all available
options. I will not cover the compilation of the LDAP server in
general; you can find more information in the OpenLDAP documentation.
Listing 1 shows the parameters used to customize the compilation
of an OpenLDAP proxy server with the functionality required for
this article. The meaning of each of these parameters is as follows:
--prefix -- Prefix where to install the software
--enable-slurpd -- Enable build of replication daemon
--enable-bdb -- Enable Berkeley DB backend
--enable-ldbm -- Enable DBM database backend
--enable-ldap -- Enable LDAP backend
--enable-meta -- Enable metadirectory backend
--enable-rewrite -- Enable DN rewriting in back-ldap and back-meta
--with-proxycache -- Enable caching
Note that the configure command shown pulls in a number of modules
that may not be required for a simple proxy server. In fact, the
simplest configuration would be just to use the --enable-ldap=yes
switch, which would provide a simple proxy without any rewriting
or cache functionality.
I recommend saving your configure command in a file (e.g., ConfigureProxy.sh),
so you can reuse it later and keep track of how you compiled the
LDAP server. Next, launch the configure script, the make depend,
and the make commands. If everything proceeds successfully,
install the server with the make install command.
Using the OpenLDAP Proxy Server
Now that we have successfully installed the OpenLDAP proxy server,
let's look at a simple example. We will configure the proxy server
simply to forward all requests to the LDAP server holding the data.
Listing 2 shows the slapd.conf file configuring this behavior. As
you can see, the configuration is very easy.
In the first line, we declare that we are using the LDAP backend. In
the second line, we define the baseDN that the proxy server should
handle. And at the end, we tell the LDAP proxy the URI that it must
contact for requests to this baseDN. (The syntax of the URI follows
the LDAP standard URI as defined in LDAP URL Format [4].) The proxy
sends every LDAP request that has the baseDN "dc=LdapAbc,dc=com"
to the LDAP server installed on host LdapServer.LdapAbc.com and
listening on port 349. Meanwhile, requests for the baseDN "dc=LdapAbc,dc=org"
are sent to the LDAP server installed on LdapServer.LdapAbc.com
but listening on port 749. All requests (i.e., search, modify, add,
and delete) are proxied.
As you may notice, you can define more than one backend, which
allows load balancing. You can install different subtrees of the
directory on different servers and glue them together with the OpenLDAP
proxy server. You could have, for example, an OpenLDAP server, a
Sun ONE server, and a Novell LDAP server, and speak to these three
different servers via the OpenLDAP proxy server. The client does
not even notice the complexity behind the proxy.
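The routing decision the proxy makes can be pictured with a short
Python sketch. It is an illustration only: the table below is
reconstructed from the description above (the URI form ldap://host:port
is assumed, since the listing itself is not reproduced here), and
the function name select_backend and the naive DN handling are my
own choices for the example.

```python
# Suffix -> remote server, reconstructed from the description in the text.
BACKENDS = {
    "dc=ldapabc,dc=com": "ldap://LdapServer.LdapAbc.com:349",
    "dc=ldapabc,dc=org": "ldap://LdapServer.LdapAbc.com:749",
}


def normalize_dn(dn):
    """Lower-case a DN and drop spaces around its components (naive)."""
    return ",".join(part.strip().lower() for part in dn.split(","))


def select_backend(dn, backends=BACKENDS):
    """Return the URI of the backend whose suffix covers dn, or None."""
    target = normalize_dn(dn)
    best = None
    for suffix in backends:
        if target == suffix or target.endswith("," + suffix):
            # Prefer the longest (most specific) matching suffix.
            if best is None or len(suffix) > len(best):
                best = suffix
    return backends[best] if best is not None else None
```

A request for an entry below "dc=LdapAbc,dc=com" is thus sent to port
349, one below "dc=LdapAbc,dc=org" to port 749, and a request for any
other suffix is not served by this proxy.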
Mapping Attributes and objectclasses
Until now, the proxy passed every request straight through from
the LDAP client to the LDAP server, and every response straight
back again from the LDAP server to the LDAP client. We can, however,
map attributes and objectclasses in such a way as to give the client
a different view of the data hosted on the LDAP server. See Listing
3. Here we want to provide only the name, the surname, and the telephone
number; the rest of the attributes should be hidden from the client.
The attributes "givenName", "sn", and "telephoneNumber" are mapped
to themselves in the first three lines, "cn" is mapped to "sn" (line
4), and every other attribute is removed with the last instruction:
map attribute *
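The effect of such a mapping on a search result can be sketched in
Python. This is not the proxy's implementation and does not reproduce
Listing 3: the table ATTRIBUTE_MAP and the function client_view are
hypothetical, the "cn" mapping is left out for brevity, and the
sample entry in the usage note is made up.

```python
# Hypothetical mapping: attribute name on the server -> name shown to the
# client. Attributes that do not appear here are hidden from the client.
ATTRIBUTE_MAP = {
    "givenname": "givenName",
    "sn": "sn",
    "telephonenumber": "telephoneNumber",
}


def client_view(entry):
    """Return only the mapped attributes of an entry, under mapped names."""
    view = {}
    for name, values in entry.items():
        # LDAP attribute names are case-insensitive, so look up in lower case.
        mapped = ATTRIBUTE_MAP.get(name.lower())
        if mapped is not None:
            view[mapped] = list(values)
    return view
```

Given an entry that also carries, say, a "mail" attribute, client_view
returns only givenName, sn, and telephoneNumber; everything else is
dropped, which is the role of the catch-all instruction.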
We can also map entire objectclasses. The following instruction:
map objectclass groupOfNames group
maps the groupOfNames objectclass used by the LDAP client to the objectclass
"group", which is used, for example, by Microsoft Active Directory.
The Rewrite Engine
The rewrite engine is a very powerful tool. Consider the example
configuration shown in Listing 4. The suffix "dc=LdapAbc, dc=org"
now points to the same LDAP server that "dc=LdapAbc, dc=com" points
to. To begin, we need to switch on the rewrite engine. Then, we
must provide the rewrite rule; in this case, we change the "dc=org"
domain to the "dc=com" domain that the directory server really hosts.
The syntax is very similar to that used by the well-known Unix tool
sed; indeed, the rewrite engine uses regular expressions. You will
notice also the "rewrite context" clause; let's see what this means.
Every LDAP server executes a number of operations (bind, search,
compare, add, etc.). Each of these operations can be executed in
a particular context. The operations, and thus the contexts, are divided
into client->server and server->client operations. You can
define a rewrite rule individually for every such context. The above-mentioned
operations are all client->server operations (i.e., the data
flow is from the client to the server). A server->client operation
is shown in the last lines of our example configuration file --
searchResult. The default context is a general context for server->client
operations. If we didn't configure the searchResult context, the
results would return the "com" suffix and not the "org" suffix we
requested. Therefore, we rewrite the results, and the client does
not see that it has been served from the "com" domain.
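The two directions of rewriting can be imitated with Python's re
module. This is only a sketch of the idea: the rule syntax of the
OpenLDAP rewrite engine differs from Python's, Listing 4 is not
reproduced here, and the names REQUEST_RULE, RESULT_RULE, and rewrite
are assumptions made for the example.

```python
import re

# Requests (client->server): the client's "dc=org" suffix becomes "dc=com".
REQUEST_RULE = (
    re.compile(r"(.*)dc=LdapAbc,\s*dc=org$", re.IGNORECASE),
    r"\1dc=LdapAbc,dc=com",
)

# Search results (server->client): "dc=com" is turned back into "dc=org".
RESULT_RULE = (
    re.compile(r"(.*)dc=LdapAbc,\s*dc=com$", re.IGNORECASE),
    r"\1dc=LdapAbc,dc=org",
)


def rewrite(dn, rule):
    """Apply one (pattern, replacement) rule; non-matching DNs pass through."""
    pattern, replacement = rule
    return pattern.sub(replacement, dn, count=1)
```

Applying REQUEST_RULE to a DN on its way to the server and
RESULT_RULE to every DN coming back gives the round trip described
above: the client asks for "org", the server answers for "com", and
the client sees "org" again.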
The rewrite engine can also do much, much more. Explaining all
of its functionality, however, is far beyond the scope of this article.
Please consult the OpenLDAP documentation for more details.
The Proxy Cache Engine
The proxy cache extension is available from version 2.2.x. The
proxy cache engine works with every backend; however, it's intended
to be used with the "ldap" or "meta" backend. The proxy server holds
its cache in a special database on disk. Thus, you must configure
which database you want to use (LDBM, BDB, or HDB). Listing 5 shows
an example configuration file.
Let's look at the details. There are four new instructions that
configure the behavior of the cache. The first instruction is: "overlay
proxycache"; it's just the beginning of the configuration of the
proxy cache.
The second instruction is the proxyCache instruction. The syntax
is:
proxyCache <DB> <maxentries> <nattrsets> <entrylimit> <period>
where DB is the database type you will use (BDB, LDBM, or the
recent newcomer, HDB), maxentries is the maximum number of
entries the cache may hold, and nattrsets is the maximum number
of attribute sets that will be defined (you define them with the proxyAttrset
parameter shown later). Entrylimit is the maximum number of
entries a query may return and still be cached, and, finally, period
is the interval in seconds between consistency checks (i.e., the cache
manager checks for every entry whether its time to live is over).
The third instruction needed is the proxyAttrset instruction.
It has the syntax:
proxyAttrset <index> <attrs...>
where index is the index that identifies the set and attrs
is a number of attributes. As the name suggests, it defines an attribute
set. You will define as many different sets as you have declared in
the proxyCache statement. The proxyAttrset is used to define cacheable
templates. Let's look at cacheable templates together with the next
instruction, the proxyTemplate.
Remember when I mentioned query containment? The more specific
query is contained in the more general one; for example, the query
("sn=Voglmaier") is contained in the query ("sn=Vogl*"). Or, a search
in the subtree "ou=people,dc=LdapAbc,dc=org" is contained in the
search in the subtree "dc=LdapAbc,dc=org". To simplify things a
little, the OpenLDAP group uses the concept of templates to determine
query containment. This is done with the proxyTemplate instruction.
A template is just a search filter that does not have an assertion
value. Examples are:
Filter: (sn=Voglmaier), Template: (sn=),
Filter: (&(sn=Vogl*)(givenName=Rein*)), Template: (&(sn=)(givenName=))
The proxyTemplate has the syntax:
proxyTemplate <prototype_string> <attrset_index> <TTL>
where the prototype string is the template mentioned above, the TTL
is the time to live (i.e., the time an entry may be kept in the cache
without being used), and the attrset index, indicating the
attribute set, is the same as defined before. Therefore, the attribute
set and the proxyTemplate together define which entry is to be stored
for how long in the cache.
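Deriving a template from a filter is simple enough to show in a few
lines of Python. The sketch below is an illustration, not the
OpenLDAP code: the function names and the set CACHEABLE_TEMPLATES are
hypothetical, and the regular expression assumes that assertion
values contain no unescaped parentheses (which holds for valid LDAP
filters).

```python
import re


def filter_to_template(search_filter):
    """Strip assertion values: '(sn=Vogl*)' becomes '(sn=)'."""
    return re.sub(r"=[^()]*\)", "=)", search_filter)


# Hypothetical set of templates an administrator declared cacheable.
CACHEABLE_TEMPLATES = {"(sn=)", "(&(sn=)(givenName=))"}


def is_cacheable(search_filter):
    """True if the filter's template is one of the declared templates."""
    return filter_to_template(search_filter) in CACHEABLE_TEMPLATES
```

With these two templates, searches such as (sn=Vogl*) are candidates
for caching, while a search on an undeclared attribute, for example
(mail=...), is always passed on to the remote server.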
Enough theory -- let's look at the example in Listing 5. Here
we use an LDAP backend; the suffix is our usual LdapAbc group, and
we will contact the LDAP uri we used in the example previously.
Thus, the first three lines are the ones we have just seen. The
overlay proxycache does nothing other than begin the
definition of the whole thing. The proxycache instruction
defines a BDB cache that holds up to 100000 entries. Furthermore,
the cache holds only one attribute set and caches only results that
return fewer than 1000 entries.
Every 100 seconds, the cache manager checks whether any of the
entries in the cache are older than their TTL permits; if so, those
entries are removed. The proxyAttrset defines the only
attribute set (with index 0) our cache will hold. We have two templates
combined with this attribute set -- one searching for the sn,
the other combining sn and givenName with an "&"
relation. From the cachesize instruction on, the remaining instructions
configure the BDB database and are therefore BDB specific.
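To keep the five positional parameters of the proxyCache instruction
apart, it can help to name them. The following Python sketch does
just that. The directive line used in the usage note is reconstructed
from the values described above, not copied from Listing 5, and the
names ProxyCacheSettings and parse_proxycache are hypothetical.

```python
from typing import NamedTuple


class ProxyCacheSettings(NamedTuple):
    db: str          # database type holding the cache
    maxentries: int  # maximum number of entries in the cache
    nattrsets: int   # number of attribute sets that will be defined
    entrylimit: int  # largest result set that is still cached
    period: int      # seconds between consistency checks


def parse_proxycache(line):
    """Split a proxyCache directive into its five named parameters."""
    keyword, db, *numbers = line.split()
    if keyword.lower() != "proxycache" or len(numbers) != 4:
        raise ValueError("expected: proxyCache <DB> and four numbers")
    return ProxyCacheSettings(db, *map(int, numbers))
```

For example, parse_proxycache("proxycache bdb 100000 1 1000 100")
yields a BDB cache with at most 100000 entries, one attribute set, an
entry limit of 1000, and a consistency check every 100 seconds.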
Chaining
I will close this article by mentioning a further important concept
used in LDAP. When an LDAP server is asked about data that it does
not hold, but it knows who does hold the data, the server will return
this information to the client in the form of a referral. This technique
is used, for example, in partitioning the LDAP directory tree. One
reason for partitioning may be to improve performance. Instead of
sending a referral to the client, the directory server could also
proxy this request to the server holding the data. This technique
is called "chaining". To do this, you must configure the proxy backend
for all subtrees you will proxy, and the subtrees must be configured
before their parents. Listing 6 shows an example configuration file.
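The ordering rule (subtrees before their parents) is easy to get
wrong in a long configuration file, so a small check can be useful.
The Python sketch below is an assumption-laden illustration: the
function names are hypothetical, DNs are split naively on commas, and
nothing here reads an actual slapd.conf.

```python
def normalize_dn(dn):
    """Lower-case a DN and drop spaces around its components (naive)."""
    return ",".join(part.strip().lower() for part in dn.split(","))


def is_ancestor(parent, child):
    """True if child lies strictly below parent in the directory tree."""
    return normalize_dn(child).endswith("," + normalize_dn(parent))


def subtrees_before_parents(suffixes):
    """True if no suffix is listed after one of its ancestors."""
    for i, earlier in enumerate(suffixes):
        for later in suffixes[i + 1:]:
            if is_ancestor(earlier, later):
                return False
    return True


def order_for_config(suffixes):
    """Sort suffixes so that deeper subtrees come first."""
    # A subtree always has more components than any of its ancestors.
    return sorted(suffixes, key=lambda s: len(s.split(",")), reverse=True)
```

Listing "dc=LdapAbc,dc=org" before "ou=people,dc=LdapAbc,dc=org"
fails the check; order_for_config puts the "ou=people" subtree first,
which satisfies the rule.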
Conclusion
In this article, I presented the OpenLDAP proxy server and its
configuration. I showed that it is not only useful to let queries
go out of domain boundaries, but that it can do much more. I described
how the proxy can be used for load balancing as well as for rewriting
LDAP queries and LDAP answers. I also briefly discussed chaining,
showing that it can be combined with other backends as well.
OpenLDAP can be a powerful tool used in very different situations
to address problems that can be difficult to solve another way.
And, because the OpenLDAP suite is open source, it is free, including
the source code. So, if you run into problems, you can consult the
source code yourself and are free to extend the software to better
fit your reality. If you have any doubts or questions, as always,
I will be glad to help you.
Resources
1. OpenLDAP project -- http://www.openldap.org
2. Reinhard E. Voglmaier, The ABC's of LDAP, Auerbach Publications,
November 2003
3. Lightweight Directory Access Protocol (v3), RFC 2251 -- http://www.ietf.org
4. The LDAP URL Format, RFC 2255 -- http://www.ietf.org
5. Apurva Kumar, IBM India Research Lab: The OpenLDAP Proxy Cache
-- http://www.openldap.org/pub/kapurva/proxycaching.pdf
6. Draft-apurva-ldap-query-containment-01, Apurva Kumar: Schema
to Support Query Containment in LDAP Directories
Reinhard Voglmaier studied physics at the University of Munich
in Germany and graduated from Max Planck Institute for Astrophysics
and Extraterrestrial Physics in Munich. After working in the IT
department at the German University of the Army in the field of
computer architecture, he was employed as a Specialist for Automation
in Honeywell and then as a Unix Systems Specialist for performance
questions in database/network installations in Siemens Nixdorf.
Currently, he is responsible for LDAP Services at GlaxoSmithKline,
Italy. He's also the author of The ABC's of LDAP (Auerbach
Publications, November 2003). He can be reached at: rv33100@gsk.com.