April 28, 1999
Executive Summary
Directory technology has become increasingly vital
to corporate IT departments, and the ubiquitous implementation of LDAP
as an access protocol holds out the promise of directory information
integrated or extended across applications. Rapidly evolving metadirectory
technology revolving around LDAP and the benefits it offers for corporate
intranets have captured the attention of the press and analysts. However,
LDAP technology is actually much more important to the Internet.
Web applications rely on directory information in four principal
ways: (1) user authentication; (2) access control; (3) customization
of content; and (4) user and server configuration management. Directory
technologies that have been developed to meet the needs of corporate
customers do not have the characteristics that will make them successful
as Internet directory services.
The most commonly cited issues are
performance, scalability, ease of application development, and appropriateness
for web applications. Other key issues are openness, flexibility,
extensibility, and compliance with mainstream standards. A directory
technology capable of providing a foundation for web applications
and electronic commerce must have all of these qualities.
IT and development
managers, as well as developers, evaluating directory technologies
for web applications and electronic commerce often consider using
relational database systems (RDBMS). However, RDBMS technology is
not appropriate for web applications. A detailed technical comparison
of RDBMS technology and native LDAP technology implemented in Netscape
Directory Server shows unequivocally that LDAP technology is far superior
to RDBMS technology for web and electronic commerce applications.
Directories for Web Applications: The Right Tool for the Right Job
Web and electronic commerce applications must authenticate users
before granting access to services, control what each user can
access, customize content, and manage configuration information
for each user. One implementation that demonstrates this is http://my.netscape.com.
On the My Netscape web site each user is authenticated and content
is customized based on the configuration information stored for that
user.
Web applications require a data repository to store user account,
access control, and configuration information. Historically, existing
database technologies have been used for business applications, but
these applications typically did not extend across companies and none
scaled up to the Internet. Traditional database technologies provide
data repositories rather than directory services.
Requirements for web applications on the Internet
are very different and call not only for better directory technologies
but also for database technologies that are scalable, flexible, extensible,
and capable of supporting perhaps tens of millions of users. For web
applications, directory services are a superior alternative to legacy
data repositories.
Netscape's native LDAP database directory is optimized
for directory clients (read access) and is extremely well suited for
web applications. Using Netscape Directory Server, web developers can
easily authenticate users, control access with a high degree of granularity,
and customize web content for very large numbers of users. In contrast,
an RDBMS is optimized for general-purpose SQL clients typically used for
data manipulation.
RDBMS implementations are much more complex
than directories. RDBMS vendors have extended their database technologies
to provide directory services. While extensions of existing database
technologies are useful for many purposes, they are not appropriate
for web applications. Developers of web applications for the Internet
must choose the right tool for the right job. The LDAP-centric directory
technology implemented in Netscape Directory Server offers much more
to web application developers.
Analyzing Directory Technologies for Web Application Development
There are many business and financial benefits to
moving from paper-based business transactions to paperless electronic
communications whether for intranet (within a company), extranet (between
businesses), or Internet (between a business and its customers). Many
of the applications that make this possible demand unprecedented levels
of performance, scalability, extensibility, and manageability.
While a majority of web applications require directory
services, some require a combination of directory and database capabilities.
The requirements of a specific application dictate the most appropriate
solution. Several factors are involved in evaluating directory and database
technologies for web applications.
Performance
Performance is the most critical aspect of a directory when used for
web applications. The nature of the Internet and of web applications
demands an extremely scalable and fast repository of account information.
A repository of data representing user preferences or configuration
information must be able to sustain a continuous high volume of accesses.
The HTTP protocol is stateless and thus does not maintain sessions
between accesses to information. However, the action of requesting
a single page may result in many HTTP requests between client and
server. Where user authentication and granular access controls are
in place the data repository used must be able to sustain extremely
high rates of access.
Scalability
Increasingly, Internet-based web applications access large amounts
of data such as customer account information for e-commerce, inventory
or asset management, billing and payment information, bank account
information, or telephone charges. The scalability of the directory
service is the most critical factor for these kinds of applications
and for e-commerce in general.
Manageability
The Internet creates unique requirements for applications available
to external customers, one of which is the hours of operation. Most
Internet-based web applications must operate 24 hours a day, 7 days
a week. As a result directory services must be highly available
for these applications and all management functions must be done
while the service is online, including operations that require modification
to the directory service itself.
Programmability
Web applications demand that developers rapidly create and update
applications with zero down time. Advanced architecture and open technology
combined with readily available tools and APIs make Netscape Directory
Server an excellent foundation on which to build applications.
Netscape's LDAP Database Technology
Netscape Directory Server was designed to serve the needs of web servers
and web applications where there may be many different applications
that need to access the same data. There are several unique features
of Netscape Directory Server that make it particularly well suited
for account management and profile storage.
- Optimized Native LDAP Database
- Sophisticated Directory Search
- Superior Performance
- Scalable to Millions of Users
- Standardized Access Protocol
- Secure Communication with Business Partners
- Open Development Tools and APIs
- Online Management
- Multi-Valued Attributes
- Flexible Directory Schema
- Simple but Powerful Data Model
- Dynamic Triggers
- Specialized Account Management Capabilities
- Configurable, Reliable Replication
- Ease of Administration
Optimized Native LDAP Database
The Netscape data store is optimized for LDAP clients performing typical
directory queries in which reads outnumber writes by an order of magnitude
or more. In Netscape's internal deployment of the directory (which
supports messaging as well as phonebook lookups), 99.99% of operations
are reads. In contrast, RDBMS systems are optimized
for heavy, concurrent, record-oriented access from SQL clients where
read and write operations are more evenly balanced.
Sophisticated Directory Search
Netscape Directory Server is optimized for directory lookups and consequently
supports fast lookups across the full range of LDAP queries including
substring and phonetic queries. Netscape Directory Server supports very
efficient full substring queries (find names that contain the string
"smith") as well as phonetic queries (find names that sound
like "McCartey"), while most relational databases (Oracle
is an exception) do not.
In contrast, RDBMS systems support rich, complex SQL queries. A directory
is not a good replacement for a relational database for client-server
applications that use complex SQL queries against a relational data
model. A directory, however, is optimal for straightforward, attribute-value
based queries, including Boolean combinations, on a hierarchical data
model.
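The attribute-value queries described above are written as LDAP search filter strings. The following is a minimal sketch of how an application might compose them; the helper names (escape, equals, contains, sounds_like, all_of, any_of) are hypothetical and not part of any Netscape SDK. Only the filter syntax itself is standard.

```python
# Illustrative helpers for composing LDAP search filter strings.
# The helper names are hypothetical; only the filter syntax is standard.

_ESCAPES = {"\\": r"\5c", "*": r"\2a", "(": r"\28", ")": r"\29", "\0": r"\00"}


def escape(value: str) -> str:
    """Escape characters that have special meaning inside a filter value."""
    return "".join(_ESCAPES.get(ch, ch) for ch in value)


def equals(attr: str, value: str) -> str:
    """Exact-match filter, e.g. (uid=bjensen)."""
    return f"({attr}={escape(value)})"


def contains(attr: str, value: str) -> str:
    """Substring filter, e.g. (cn=*smith*)."""
    return f"({attr}=*{escape(value)}*)"


def sounds_like(attr: str, value: str) -> str:
    """Approximate-match filter, e.g. (cn~=McCartey)."""
    return f"({attr}~={escape(value)})"


def all_of(*filters: str) -> str:
    """Boolean AND of the given filters."""
    return "(&" + "".join(filters) + ")"


def any_of(*filters: str) -> str:
    """Boolean OR of the given filters."""
    return "(|" + "".join(filters) + ")"


query = all_of(
    equals("objectClass", "person"),
    any_of(contains("cn", "smith"), sounds_like("cn", "McCartey")),
)
print(query)  # (&(objectClass=person)(|(cn=*smith*)(cn~=McCartey)))
```

How an approximate ("sounds like") match is evaluated is left to the server, so the same filter may return different entries on different directory products.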
Superior Performance
Netscape Directory Server's performance and scalability have been
independently tested and compared to RDBMS systems. Testing has shown
that the performance that can be achieved from Netscape's Directory
Server far exceeds that of the equivalently configured database. Typical
results are shown in the table below.
Many aspects of a directory
affect the actual performance that can be obtained from it,
such as the number of attributes returned with each entry, the type
of search (for example, exact match or sounds-like), and the size of the database.
Under ideal conditions, with all indexes fitting into memory and searches
for an exact match on userid returning just the dn, uid, and password,
we have seen performance as high as 7000 search operations per second
on a 4-processor Xeon NT machine. While the performance of Netscape's
Directory Server is extremely high, actual mileage may vary depending
on your application.
Scalable to Millions of Users
Scaling to very large data sets is something that databases are good
at, although Netscape Directory Server can scale as well or better.
Netscape's Directory Server has been tested independently with over
25 million entries and internal Netscape testing has shown scalability
to 50 million entries using data representative of a typical user
account profile (a firstname, lastname, address, phone numbers and
password field).
There are many issues with storing a data set
this large in a single repository, directory or database, but it has
been done and is an area where Netscape focuses considerable effort
to ensure that the Directory can scale to the sizes required by these
large web applications.
Standardized Access Protocol
There are several why multiple web applications may need to share data
or where subsets of data used by applications need to be exposed to
customers or business partners. LDAP is especially suited to this function
since its access protocol is worldwide standardized.
Secure Communication with Business Partners
Access to data across companies can be simplified through the standardization
of the data access protocol to that data repository. Since LDAP is defined
to typically run on port 389 and can be run over SSL, it is easy to
configure communication across companies, for example, SSL communication
over the Internet through corporate firewalls. Netscape Directory Server
can protect the data repository by requiring access from a certain IP
address or hostname as well as requiring access over SSL for business
partners.
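The combination of restrictions described above (a permitted source address plus a requirement for SSL) can be pictured as a simple predicate. The sketch below is only an illustration of that logic, not the server's access control syntax; the PartnerRule type, the is_allowed function, and the addresses are all hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class PartnerRule:
    """Hypothetical rule: where a partner may connect from, and how."""
    network: str              # permitted source network in CIDR notation
    require_ssl: bool = True  # refuse connections that are not over SSL


def is_allowed(rule: PartnerRule, client_ip: str, over_ssl: bool) -> bool:
    """Allow access only from the permitted network and, if required, over SSL."""
    in_network = ipaddress.ip_address(client_ip) in ipaddress.ip_network(rule.network)
    return in_network and (over_ssl or not rule.require_ssl)


# 192.0.2.0/24 is a documentation-only address range used here as a stand-in.
rule = PartnerRule(network="192.0.2.0/24")
print(is_allowed(rule, "192.0.2.17", over_ssl=True))    # True
print(is_allowed(rule, "192.0.2.17", over_ssl=False))   # False
print(is_allowed(rule, "198.51.100.5", over_ssl=True))  # False
```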
Open Development Tools and APIs
Netscape Directory Server is easily programmable using open and readily
available tools and APIs. Netscape's LDAP SDK requires programmers
to learn only a few simple but powerful commands in order to access
any LDAP-accessible data source, whereas the number
of database commands varies from one database vendor to another. Compared
with other data access protocols, LDAP provides a simplified interface
to data while providing enough functionality and control to meet the
demands of the most sophisticated applications.
Many database programmers write to an abstraction
layer in order to insulate themselves from the vendor-specific driver
for a particular database. These abstraction layer drivers, such as ODBC, typically
reduce the performance of the application since the data access must
first go through the abstraction layer, then to the vendor-specific
driver, then to the data source. In contrast, Netscape Directory Server
supports native LDAP queries, thus increasing the performance of applications
that use LDAP rather than proprietary APIs.
Online Management
The Internet operates around the world and around the clock thus commercial
web applications must remain available 24 hours per day, 7 days per
week. This requirement can be met with Netscape Directory Server through
its unique features for online data and service management. Netscape
Directory Server supports online management of directory data through
built-in support of online schema modification, online configuration
changes, and online backup.
Online schema modification - Modifications
to the schema can be made by any authorized user by issuing standard
LDAP operations against the schema.
Dynamic index creation and update -
Netscape Directory Server supports the dynamic creation and management
of indexes while the server is running.
Online configuration changes - The configuration of the Directory
service can be modified while the server is online.
Online Backup - The directory can be backed up in several ways, and
the internal process that backs up the data can run while the
server is running. Alternatively, directory replication or an external
application can keep a backup server's data in sync with the master
for server failover operations. A transactional data store is used to
ensure that the internal database does not get corrupted due to any
adverse condition that may arise, thus ensuring that any modify operation
that was acknowledged will be preserved.
Multi-Valued Attributes
Database fields typically have a single value stored and if the application
needs to have multiple values for any field, then a new field must
be created to store the value. A typical example is the phone number
field where a person may have many phone numbers and may have multiple
business numbers depending on where they are currently.
Netscape Directory Server offers a solution through multi-valued attributes.
The Directory Server allows any attribute to have more than one value
associated. Additionally, these multiple values can be given an identifier
so as to allow the application the ability of requesting which value
to return. This can be used for many purposes, including multiple representations
of a specific attribute such as a text or binary representation (for example, a
voice or text greeting or vacation responder), or simply to represent
a different language spelling of a person's name, for example, English,
German, French, or Japanese.
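To make the idea concrete, the sketch below models an entry whose attributes hold several values, some of them tagged with an identifier. It is an in-memory illustration only: the Entry class and its methods are hypothetical, and the "lang-en" and "lang-de" tags are assumed identifier names chosen for the example.

```python
from collections import defaultdict


class Entry:
    """Hypothetical in-memory entry with multi-valued, optionally tagged attributes."""

    def __init__(self, dn):
        self.dn = dn
        self._values = defaultdict(list)  # (attribute, tag) -> list of values

    def add(self, attr, value, tag=""):
        """Add one more value to an attribute, optionally under a tag."""
        self._values[(attr.lower(), tag.lower())].append(value)

    def get(self, attr, tag=None):
        """Return the values for one tag, or every value when no tag is given."""
        attr = attr.lower()
        if tag is not None:
            return list(self._values.get((attr, tag.lower()), []))
        return [v for (a, _), vals in self._values.items() if a == attr for v in vals]


entry = Entry("uid=jmueller,ou=People,o=example.com")
entry.add("telephoneNumber", "+1 650 555 0100")
entry.add("telephoneNumber", "+49 89 555 0100")  # second value, no new field needed
entry.add("cn", "Joerg Mueller", tag="lang-en")
entry.add("cn", "Jörg Müller", tag="lang-de")

print(entry.get("telephoneNumber"))  # both phone numbers
print(entry.get("cn", tag="lang-de"))  # only the German spelling
```

The application asks for the tagged value it wants and the entry keeps a single logical attribute, rather than growing a new column for each variant.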
Flexible Directory Schema
A database represents data in a relational model, while most directories
typically represent their data in a hierarchical model, trading
unnecessary richness for simplicity. It is easier to administer (load,
modify, manage access control rights, define owners for, replicate etc.)
directory-type data with the attribute-value, hierarchical data model.
Simple but Powerful Data Model
Netscape Directory Server supports the flexible LDAP data model which
is typically (though not necessarily) hierarchical -- just like a file
system. A hierarchical, attribute-value based data model allows you
to store and retrieve information such as user/group data, preferences,
configuration data, and many other data types simply. As a simple rule
of thumb -- if you can visualize your data as a single table with rows
and columns or a phonebook, the directory is a perfect fit.
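The file-system analogy can be sketched in a few lines. The example below keeps entries in a dictionary keyed by distinguished name and treats "everything under a branch" as a suffix match. It is an illustration under simplifying assumptions: the entries and the o=example.com tree are invented, and plain string comparison ignores the normalization a real server applies to names.

```python
# Hypothetical directory tree, keyed by distinguished name (DN).
directory = {
    "o=example.com": {"o": ["example.com"]},
    "ou=People,o=example.com": {"ou": ["People"]},
    "ou=Groups,o=example.com": {"ou": ["Groups"]},
    "uid=bjensen,ou=People,o=example.com": {"uid": ["bjensen"], "cn": ["Barbara Jensen"]},
    "uid=kvaughan,ou=People,o=example.com": {"uid": ["kvaughan"], "cn": ["Kirsten Vaughan"]},
}


def subtree(entries, base_dn):
    """Return the base entry and every entry beneath it, like listing a folder."""
    base = base_dn.lower()
    return sorted(
        dn for dn in entries
        if dn.lower() == base or dn.lower().endswith("," + base)
    )


def search(entries, base_dn, attr, value):
    """Attribute-value lookup restricted to one branch of the tree."""
    return [dn for dn in subtree(entries, base_dn) if value in entries[dn].get(attr, [])]


print(subtree(directory, "ou=People,o=example.com"))
print(search(directory, "o=example.com", "uid", "bjensen"))
```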
A database's relational data model is needlessly complex for these straightforward
applications. Think about users, groups, membership relationships, application
preferences, and other typical data that is stored in a directory: an
attribute-value, hierarchical data model is perfect for these applications.
Of course, for complex applications that need a relational model (such
as an Oracle Financials-type application), a database is the right choice.
It's simply a matter of choosing the right tool for the right job.
Dynamic Triggers
Databases provide a means of executing business logic in response to
an action on the data repository through triggers. In the same way,
Directory Server provides a persistent search mechanism that gives
the developer a way to set up a dynamic trigger based on an application
level search. This persistent search can be used to keep other data
sources in sync with any changes or to simply notify some application
or person when a particular item or group of items changes in the Directory.
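The trigger-like behavior of a persistent search can be pictured as a standing query with a callback. The sketch below is an in-memory stand-in written to show the pattern, not the server's interface: the Directory class, its method names, and the sample entries are all hypothetical.

```python
class Directory:
    """Hypothetical in-memory directory that notifies standing searches of changes."""

    def __init__(self):
        self.entries = {}
        self._watchers = []  # list of (matches, on_change) pairs

    def persistent_search(self, matches, on_change):
        """Register a standing search; on_change fires for each matching change."""
        self._watchers.append((matches, on_change))

    def modify(self, dn, attrs):
        """Store the entry, then notify every standing search that it matches."""
        self.entries[dn] = attrs
        for matches, on_change in self._watchers:
            if matches(dn, attrs):
                on_change(dn, attrs)


directory = Directory()
notifications = []

# Watch for any change to an entry located in building 12.
directory.persistent_search(
    matches=lambda dn, attrs: "12" in attrs.get("building", []),
    on_change=lambda dn, attrs: notifications.append(dn),
)

directory.modify("uid=alee,o=example.com", {"building": ["12"]})
directory.modify("uid=bchan,o=example.com", {"building": ["7"]})
print(notifications)  # ['uid=alee,o=example.com']
```

A synchronization tool would replace the callback with code that writes the change to the other data source.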
Specialized Account Management Capabilities
Since the majority of uses for a Directory are for account management
and access control, extra effort has been invested in designing Netscape
Directory Server to support account management, users, passwords and
groups. This design focus has led to several features that are unique
to directories as described below.
Password Policy - Directory Server supports
a password policy so that a common
policy on how passwords are managed can be enforced. This function is very specific
to a directory service and is not provided by a typical database
server.
Dynamic Groups - Since Directory Server must manage accounts and access
controls, dynamic groups can help ease the management overhead commonly
associated with groups and access control rules. Dynamic groups are
a way of defining a group of people or objects based on search criteria.
For example, an email address for all people in building 12 would be a dynamic
group of all people whose building attribute is equal to 12.
IETF Schema standards - The IETF has defined standards for account objects
so as to ease account management. Compare this to a typical database
application where account information storage is left totally up to
the developer and varies from application to application.
Client Failover - LDAP clients can consult a list of backup directories
in case the main directory is unavailable. This is done automatically
by the application's LDAP client library.
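The building 12 example given for dynamic groups can be sketched as a membership list that is computed at query time rather than stored. This is an illustration only; the sample entries, the building and mail attribute values, and the dynamic_group_mail function are assumptions made for the example.

```python
# Hypothetical people entries; attribute values are lists, as in a directory.
people = {
    "uid=alee,ou=People,o=example.com": {"building": ["12"], "mail": ["alee@example.com"]},
    "uid=bchan,ou=People,o=example.com": {"building": ["7"], "mail": ["bchan@example.com"]},
    "uid=cdiaz,ou=People,o=example.com": {"building": ["12"], "mail": ["cdiaz@example.com"]},
}


def dynamic_group_mail(entries, attr, value):
    """Resolve the group when asked: mail for every entry whose attr holds value."""
    return sorted(
        mail
        for attrs in entries.values()
        if value in attrs.get(attr, [])
        for mail in attrs.get("mail", [])
    )


print(dynamic_group_mail(people, "building", "12"))
# ['alee@example.com', 'cdiaz@example.com']

# A new hire in building 12 joins the group without anyone editing a member list.
people["uid=dkim,ou=People,o=example.com"] = {"building": ["12"], "mail": ["dkim@example.com"]}
print(dynamic_group_mail(people, "building", "12"))
# ['alee@example.com', 'cdiaz@example.com', 'dkim@example.com']
```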
Configurable, Reliable Replication
Unlike a database, where data integrity must be maintained at all
times, even when replicating to another database, directory servers
are designed to have multiple copies of the directory available in
multiple places on the network. This provides directory applications
with multiple directory sources where they can read the same data
in the event that one system becomes unavailable.
Administrators can create multiple redundant copies of the data store.
Copies of the directory are replicated by keeping track of changes in
a master directory, then replaying those changes (using LDAP) against
the relevant consumers. Work is already underway in the IETF to standardize
replication so that this mechanism works between directories from multiple
vendors. In contrast, database replication solutions are proprietary.
You cannot replicate between Oracle, Informix, Sybase, or IBM databases
without third-party (or additional-cost) products.
Netscape Directory Server can continue replication in the face of a
wide variety of failure conditions, including network outages, server
crashes, supplier crashes, and consumer crashes. When systems resume
normal operation, the directory simply picks up where it left off, making
sure changes are propagated to all servers.
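The "pick up where it left off" behavior follows naturally from replaying an ordered change log. The sketch below shows one way this could work, assuming each change carries a sequence number and each consumer remembers the last number it applied; the Supplier, Consumer, and replicate names are hypothetical and the mechanism inside the product may differ.

```python
class Supplier:
    """Master copy: applies writes and records them in an ordered change log."""

    def __init__(self):
        self.data = {}
        self.changelog = []  # entries are (change_number, dn, attrs)

    def write(self, dn, attrs):
        self.data[dn] = attrs
        self.changelog.append((len(self.changelog) + 1, dn, attrs))


class Consumer:
    """Replica: remembers the number of the last change it applied."""

    def __init__(self):
        self.data = {}
        self.last_applied = 0

    def apply(self, number, dn, attrs):
        self.data[dn] = attrs
        self.last_applied = number


def replicate(supplier, consumer, max_changes=None):
    """Replay pending changes in order; max_changes simulates an interruption."""
    pending = supplier.changelog[consumer.last_applied:]
    if max_changes is not None:
        pending = pending[:max_changes]
    for number, dn, attrs in pending:
        consumer.apply(number, dn, attrs)
    return len(pending)


supplier, consumer = Supplier(), Consumer()
supplier.write("uid=a,o=example.com", {"cn": ["A"]})
supplier.write("uid=b,o=example.com", {"cn": ["B"]})
supplier.write("uid=a,o=example.com", {"cn": ["A. Renamed"]})

replicate(supplier, consumer, max_changes=1)  # outage after the first change
replicate(supplier, consumer)                 # resume: replays changes 2 and 3
print(consumer.data == supplier.data)         # True
```

Because the consumer tracks only a position in the log, an interrupted session needs no special recovery step: the next session simply starts from that position.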
Netscape Directory Server supports replication between servers using
LDAP over SSL, thus allowing the replication process to operate over
untrusted networks like the public Internet. Organizations can therefore
distribute directory data to any other server regardless of location
or connection type. For example, they can replicate to remote
offices over the Internet, reducing the network requirements needed
to support these remote locations.
Ease of Administration
RDBMS systems must be managed by an experienced and usually expensive database
administrator (DBA). Tuning the Directory Server is easy by comparison.
Administrators tune the directory by deciding (1) which attributes to
index and (2) how large caches should be. By following a series of straightforward
recommendations, moderately experienced IS professionals can manage
a directory. You do not need an expensive, seasoned DBA to manage your
directory service.
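Why the choice of indexed attributes matters can be seen with a toy equality index. The sketch below is not how the server stores its indexes; it only illustrates the trade-off, using hypothetical entries and function names: an indexed attribute is answered by one dictionary lookup, while an unindexed one requires scanning every entry.

```python
from collections import defaultdict

# Hypothetical entries; attribute values are lists, as in a directory.
entries = {
    "uid=alee,o=example.com": {"uid": ["alee"], "building": ["12"]},
    "uid=bchan,o=example.com": {"uid": ["bchan"], "building": ["7"]},
    "uid=cdiaz,o=example.com": {"uid": ["cdiaz"], "building": ["12"]},
}


def build_equality_index(all_entries, attr):
    """Map each value of attr to the set of entries holding it (built once)."""
    index = defaultdict(set)
    for dn, attrs in all_entries.items():
        for value in attrs.get(attr, []):
            index[value.lower()].add(dn)
    return index


def indexed_lookup(index, value):
    """Answer an exact-match query with a single dictionary lookup."""
    return sorted(index.get(value.lower(), ()))


def scan_lookup(all_entries, attr, value):
    """Answer the same query without an index by visiting every entry."""
    return sorted(
        dn for dn, attrs in all_entries.items()
        if value.lower() in (v.lower() for v in attrs.get(attr, []))
    )


uid_index = build_equality_index(entries, "uid")
print(indexed_lookup(uid_index, "cdiaz"))        # ['uid=cdiaz,o=example.com']
print(scan_lookup(entries, "building", "12"))    # same kind of answer, full scan
```

Each additional index speeds up searches on that attribute but costs memory and extra work on every write, which is why the decision is left to the administrator.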
Conclusion
For directory services and web application development, Netscape Directory
Server offers many advantages over traditional RDBMS systems. Netscape's
native LDAP database directory is optimized for these demanding applications.
Netscape Directory Server offers superior performance, scalability,
manageability, and programmability. At the same time, Netscape Directory
Server provides a rich set of capabilities designed and optimized for
web applications. In contrast, RDBMS systems are optimized for general-purpose
SQL clients that typically perform data manipulation.
Choosing the right tool for the right job depends on the application.
In most cases the need for fast, scalable directory services and the
demands of web applications make Netscape Directory Server the ideal
choice for Internet, extranet, intranet, and e-commerce applications.