Examining the LDIF File Format

by David Friedland

LDIF (LDAP Data Interchange Format) is a standard, plain-text file format for representing LDAP (Lightweight Directory Access Protocol) data for address books, spreadsheets and other structured data, in a form that can be easily manipulated with a text editor.

LDIF is typically used to import and export directory information between LDAP-based directory servers, or to describe a set of changes to be applied to a directory. LDIF is also frequently used by e-mail agents as a file format for directory information, such as address books.

LDIF was designed in the early 1990s by Tim Howes, Mark C. Smith, and Gordon Good at the University of Michigan. It was updated and extended in the late 1990s for use with LDAP Version 3.

This later version of LDIF is called Version 1 and is formally specified in RFC 2849, an IETF Standards Track RFC authored by Mr. Good and published in June 2000. It remains the current Proposed Standard.

A number of extensions to LDIF have been proposed over the years, one of which has been formally specified and published by the IETF: RFC 4525, authored by Kurt Zeilenga, extends LDIF to support the LDAP Modify-Increment extension.

LDIF stores everything in an ASCII text file to mitigate many of the cross-platform issues that were common when it was created. However, modern spreadsheet software, such as OpenOffice.org Calc and Gnumeric, offers more character encodings for export and import.
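Keeping the file ASCII-only does not rule out non-ASCII data: RFC 2849 (not this article) specifies that a value outside its safe character set is written in base64 and marked with a double colon after the attribute name. The following Python sketch illustrates that rule under simplifying assumptions. The function name ldif_attr_line is hypothetical, and line folding is left out.

```python
import base64


def ldif_attr_line(name: str, value: str) -> str:
    """Return one unfolded LDIF line for a single attribute value.

    Hypothetical helper for illustration; follows the RFC 2849 rule that
    unsafe values are base64-encoded and introduced with "::".
    """
    raw = value.encode("utf-8")
    # A value may not start with a space, a colon, or a less-than sign.
    unsafe_start = raw[:1] in (b" ", b":", b"<")
    # NUL, LF, CR, and any byte above 127 are not allowed in a plain value.
    unsafe_body = any(b > 127 or b in (0, 10, 13) for b in raw)
    # A trailing space would be lost, so such values are encoded as well.
    trailing_space = raw.endswith(b" ")
    if unsafe_start or unsafe_body or trailing_space:
        encoded = base64.b64encode(raw).decode("ascii")
        return f"{name}:: {encoded}"
    return f"{name}: {value}"


print(ldif_attr_line("cn", "Bjorn Doe"))  # plain ASCII, written as-is
print(ldif_attr_line("cn", "Björn Doe"))  # non-ASCII, written in base64
```

The second call yields a line beginning with "cn:: ", which a reader decodes from base64 back to UTF-8 text.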

Each content record is represented as a group of attributes, with records separated by blank lines.

The individual attributes of a record are displayed as single logical lines (each represented as one or more physical lines via a line-folding mechanism) comprising “name: value” pairs. Values stored in multi-valued attributes cannot be replaced directly: the attribute's values must be deleted, and “add:” must then be used repeatedly to supply all the required values.
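As a rough illustration of the delete-then-add pattern just described, the Python sketch below builds an LDIF change record as text. The function name replace_values_record and the sample values are hypothetical. The record layout (a "changetype: modify" line, then "delete:" and "add:" blocks each closed by a line containing only "-") follows RFC 2849, and the sketch assumes every value is plain ASCII that needs no encoding or folding.

```python
def replace_values_record(dn: str, attr: str, new_values: list[str]) -> str:
    """Build a modify record that clears an attribute and re-adds values.

    Hypothetical helper for illustration only.
    """
    lines = [
        f"dn: {dn}",
        "changetype: modify",
        # "delete:" with no values listed removes every existing value.
        f"delete: {attr}",
        "-",
        f"add: {attr}",
    ]
    # One "name: value" line per value that should exist afterwards.
    lines += [f"{attr}: {value}" for value in new_values]
    lines.append("-")
    return "\n".join(lines) + "\n"


print(replace_values_record(
    "cn=John E Doe, o=University of Higher Learning, c=US",
    "cn",
    ["John E Doe", "Johnny Doe"],
))
```

RFC 2849 also defines a "replace:" directive that swaps an attribute's entire value set in one step; the sketch sticks to the delete-and-add form described above.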

The following example, provided by IBM, is an LDIF file containing three entries. Notice that each field, as in XML, is prefixed with a tag and, unlike in most sequential formats, begins on a new line:

dn: cn=John E Doe, o=University of High
 er Learning, c=US
cn: John E Doe
cn: John Doe
objectclass: person
sn: Doe

dn: cn=Bjorn L Doe, o=University of High
 er Learning, c=US
cn: Bjorn L Doe
cn: Bjorn Doe
objectclass: person
sn: Doe

dn: cn=Jennifer K. Doe, o=University of High
 er Learning, c=US
cn: Jennifer K. Doe
cn: Jennifer Doe
objectclass: person
sn: Doe
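
To make the record structure concrete, here is a minimal Python sketch that reads content records like those in the example. It assumes that continuation lines begin with a single space, as RFC 2849 requires, and that the input has no comments, no version line, and no base64 or URL values. The function name parse_ldif is hypothetical, and the sample is a two-record excerpt of the example.

```python
def parse_ldif(text: str) -> list[dict[str, list[str]]]:
    """Parse simple LDIF content records into dicts of value lists."""
    # Step 1: unfold. A physical line that starts with a space continues
    # the previous logical line; the leading space itself is dropped.
    logical: list[str] = []
    for line in text.splitlines():
        if line.startswith(" ") and logical and logical[-1] != "":
            logical[-1] += line[1:]
        else:
            logical.append(line)

    # Step 2: split records on blank lines and collect "name: value" pairs.
    records: list[dict[str, list[str]]] = []
    current: dict[str, list[str]] = {}
    for line in logical:
        if line == "":
            if current:
                records.append(current)
                current = {}
            continue
        name, _, value = line.partition(":")
        # Attribute names are case-insensitive; repeated names accumulate,
        # which is how multi-valued attributes such as cn are represented.
        current.setdefault(name.lower(), []).append(value.lstrip(" "))
    if current:
        records.append(current)
    return records


sample = (
    "dn: cn=John E Doe, o=University of High\n"
    " er Learning, c=US\n"
    "cn: John E Doe\n"
    "cn: John Doe\n"
    "objectclass: person\n"
    "sn: Doe\n"
    "\n"
    "dn: cn=Bjorn L Doe, o=University of High\n"
    " er Learning, c=US\n"
    "cn: Bjorn L Doe\n"
    "cn: Bjorn Doe\n"
    "objectclass: person\n"
    "sn: Doe\n"
)

for record in parse_ldif(sample):
    print(record["dn"][0], record["cn"])
```

Each record becomes a dictionary that maps an attribute name to the list of its values, so the two cn lines of an entry end up in one list and the folded dn is rejoined into a single string.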


Sources for this article include Wikipedia, IRI, Inc., and IBM.

Please send your feedback on this article, and any suggestions for future newsletter business articles, to news@iri.com.
