Over the past 30 years the Domain Name System has become an integral part of the operation of the Internet. Because the DNS is ubiquitous and performs well, many new applications over the years have used it to publish information. But as the DNS and its applications have grown farther from the original use of publishing information about Internet hosts, questions have arisen about what kinds of application data are appropriate to publish in the DNS, and how one should design an application to work well with the DNS.

The DNS is basically a client-server system that provides a map from a tree-structured name space into sets of typed records. That is, a client sends a name and a record type to a server that has authoritative data for that name, and the server returns a set of records. The set may be empty (often known as NOERROR, from the name of the status code), or the name may not exist at all (NXDOMAIN), which is semantically different.
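To make the difference between an empty answer and a nonexistent name concrete, here is a minimal sketch of the lookup model in Python. It is a toy, not a DNS implementation: the `ZONE` contents, the `Status` enum, and the `lookup` function are all made up for illustration, and the data lives in a dictionary rather than on a server.

```python
from enum import Enum


class Status(Enum):
    NOERROR = "NOERROR"
    NXDOMAIN = "NXDOMAIN"


# Hypothetical zone data: (name, record type) -> set of record values.
ZONE = {
    ("example.com", "A"): {"192.0.2.1"},
    ("example.com", "MX"): {"10 mail.example.com"},
    ("mail.example.com", "A"): {"192.0.2.25"},
}


def name_exists(name):
    # A name exists if it has records itself or if some name below it does.
    return any(n == name or n.endswith("." + name) for (n, _rrtype) in ZONE)


def lookup(name, rrtype):
    """Return (status, record set) for a name and a record type."""
    name = name.rstrip(".").lower()
    if not name_exists(name):
        # No such node in the tree at all.
        return Status.NXDOMAIN, set()
    # The node exists; the set is empty if it has no records of this type.
    return Status.NOERROR, ZONE.get((name, rrtype.upper()), set())
```

With this data, `lookup("example.com", "TXT")` gives NOERROR with an empty set, while `lookup("nope.example.com", "A")` gives NXDOMAIN.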

The names are a sequence of string labels representing a path from the root of the DNS to a node. Although the DNS protocol is almost 8-bit clean (other than case folding the 26 upper and lower case letters), by convention all labels are composed of 7-bit ASCII printing characters. Names are conventionally written with a dot separating the labels, although a dot is a valid character in a label.
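The reason a dot can appear inside a label is that names travel on the wire as length-prefixed labels, not as dotted strings. The sketch below shows that encoding and the ASCII-only case folding, following the format in RFC 1035 (each label is 1 to 63 octets, and a zero octet ends the name). The function names are invented for this example, and it ignores details such as name compression and the overall name length limit.

```python
def encode_name(labels):
    """Encode a list of byte-string labels in DNS wire format."""
    out = bytearray()
    for label in labels:
        if not 1 <= len(label) <= 63:
            raise ValueError("a label must be 1 to 63 octets long")
        out.append(len(label))  # one length octet, then the label itself
        out += label
    out.append(0)  # the zero-length root label ends the name
    return bytes(out)


def fold_case(label):
    """Fold only the 26 ASCII upper case letters; other octets are untouched."""
    return bytes(b + 32 if 65 <= b <= 90 else b for b in label)


def same_name(a, b):
    """Compare two names label by label, ignoring ASCII case."""
    return [fold_case(x) for x in a] == [fold_case(x) for x in b]


# A label containing a dot is not the same as two labels.
one_label = encode_name([b"a.b", b"example"])
two_labels = encode_name([b"a", b"b", b"example"])
```

Here `one_label` and `two_labels` are different byte strings even though both would conventionally be written as `a.b.example`, which is why the dotted notation needs an escape for a literal dot.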

The records returned have types known as RRTYPEs. Each RRTYPE has a well-defined format for the fields in the record. Common RRTYPEs are A, which contains an IPv4 address, and MX, which contains the name of a host that handles mail for a domain and a number that gives the relative priority of multiple hosts handling mail for a domain. Although it is possible to make an "any" query, because of the way that DNS caches work, the only reliable way to determine if a particular kind of record is present at a name is to make a query for that record type.
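As a rough illustration of typed records, the sketch below models A and MX records as small Python classes and picks out the mail hosts in priority order. The class and function names are made up for this example; the only outside fact it relies on is the usual mail convention that a lower MX number means a more preferred host.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class A:
    address: str  # an IPv4 address in dotted form


@dataclass(frozen=True)
class MX:
    preference: int  # lower numbers are tried first
    exchange: str    # name of a host that handles mail


def records_of_type(records, rrtype):
    """Keep only the records of one type, as a type-specific query would."""
    return [r for r in records if isinstance(r, rrtype)]


def mail_hosts(records):
    """Return the mail host names, most preferred first."""
    mx = sorted(records_of_type(records, MX), key=lambda r: r.preference)
    return [r.exchange for r in mx]


# Hypothetical records at one name.
RECORDS = [
    MX(20, "backup.example.com"),
    A("192.0.2.1"),
    MX(10, "mail.example.com"),
]
```

Each type has its own fixed set of fields, so code that consumes the records needs to know the type it asked for.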

Delegation and caches

The DNS divides up its work using delegation and caches. A DNS server can delegate the tree at and below any node, known as a zone, to one or more other servers. In response to a query for a name in the delegated tree, the delegating server returns a list of the servers to which the tree is delegated. The client then resends the query to one of those servers. Chains of three or four levels of delegation are common. Each zone can be (and typically is) managed separately from the zones above it and below it.
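The referral loop described above can be sketched in a few lines. This is a toy model under loud assumptions: the three "servers" are entries in a dictionary with invented names, there is no network, and a real client would also have to deal with server addresses, timeouts, and several candidate servers per zone.

```python
# Hypothetical servers. Each knows the subtrees it has delegated and the
# records it holds itself.
SERVERS = {
    "root-server": {
        "delegations": {"com": ["com-server"]},
        "records": {},
    },
    "com-server": {
        "delegations": {"example.com": ["example-server"]},
        "records": {},
    },
    "example-server": {
        "delegations": {},
        "records": {("www.example.com", "A"): ["192.0.2.1"]},
    },
}


def in_subtree(name, apex):
    return name == apex or name.endswith("." + apex)


def ask(server, name, rrtype):
    """One query to one server: either a referral or an answer."""
    data = SERVERS[server]
    for apex, servers in data["delegations"].items():
        if in_subtree(name, apex):
            return "referral", servers
    return "answer", data["records"].get((name, rrtype), [])


def resolve(name, rrtype, start="root-server", max_hops=10):
    """Follow referrals until some server answers; also return the path taken."""
    server = start
    path = [server]
    for _ in range(max_hops):
        kind, payload = ask(server, name, rrtype)
        if kind == "answer":
            return payload, path
        server = payload[0]  # resend the same query to a delegated server
        path.append(server)
    raise RuntimeError("too many referrals")
```

Resolving `www.example.com` here visits all three servers in turn, which matches the three-level chains that are common in practice.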

Rather than send queries directly to the authoritative servers, most DNS clients instead query a local cache which queries the authoritative servers on the client's behalf. This has two practical advantages. One is that the cache handles all of the delegation following, as well as some other name indirection due to CNAME and DNAME records, allowing a much simpler query library in the client. The other is that the cache remembers and reuses the results of previous queries, notably including queries that provided delegation information. DNS caches are believed to be highly effective, although actual data on their effectiveness is surprisingly sparse. Hence applications with high query rates and few repeated queries may cause problems for DNS caches, both because of the amount of DNS traffic caused directly, and because the results will replace other entries in caches, causing more DNS traffic to re-fetch results that would otherwise have been available in the cache.
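The way one-off queries push useful entries out of a cache can be shown with a small model. The sketch below assumes a fixed-size cache that evicts the least recently used entry; that policy, the `TinyCache` class, and the `fake_fetch` function are all assumptions for illustration, and real caches also expire entries by their time to live, which is left out here.

```python
from collections import OrderedDict


class TinyCache:
    """A fixed-size cache that evicts the least recently used entry."""

    def __init__(self, capacity, fetch):
        self.capacity = capacity
        self.fetch = fetch  # function that queries the authoritative servers
        self.entries = OrderedDict()
        self.upstream_queries = 0

    def lookup(self, name, rrtype):
        key = (name, rrtype)
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as recently used
            return self.entries[key]
        self.upstream_queries += 1
        answer = self.fetch(name, rrtype)
        self.entries[key] = answer
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop the oldest entry
        return answer


def fake_fetch(name, rrtype):
    # Stand-in for real resolution; every name gets the same made-up answer.
    return ["192.0.2.1"]


cache = TinyCache(capacity=3, fetch=fake_fetch)
cache.lookup("popular.example", "A")       # miss, fetched upstream
cache.lookup("popular.example", "A")       # hit, no upstream traffic
for i in range(3):
    cache.lookup(f"unique-{i}.example", "A")  # three one-off queries
cache.lookup("popular.example", "A")       # miss again: it was evicted
```

The popular name costs two upstream queries instead of one, on top of the three caused directly by the one-off names.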

Provisioning systems

The set of records in each authoritative server is provided via some kind of provisioning system. The original model in RFCs 1034 and 1035, the 1987 documents that are the basic definition of the DNS, assumed that the data for each zone would be hand-edited into a text file known as a master file, which is then loaded into one of the authoritative servers, known as the master for the zone. The rest of the zone's servers, known as slaves, use AXFR queries to copy over the zone, and recopy it whenever the zone changes. A more recent modified version of AXFR called IXFR allows slave servers to copy just the changes to modified zones.
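The difference between copying a whole zone and copying just the changes can be sketched with plain sets. This is only a model of the idea, not the AXFR or IXFR wire protocol: the zone contents and the function names are invented, and a zone is treated as a set of (name, type, value) tuples.

```python
def zone_diff(old, new):
    """Return (removed, added) record sets that turn old into new."""
    return old - new, new - old


def apply_diff(zone, removed, added):
    """Apply a change set to a copy of the zone held by another server."""
    return (zone - removed) | added


# Hypothetical zone contents before and after an edit on the master.
OLD_ZONE = {
    ("example.com", "MX", "10 mail.example.com"),
    ("mail.example.com", "A", "192.0.2.25"),
    ("www.example.com", "A", "192.0.2.1"),
}
NEW_ZONE = {
    ("example.com", "MX", "10 mail.example.com"),
    ("mail.example.com", "A", "192.0.2.26"),  # address changed
    ("www.example.com", "A", "192.0.2.1"),
}

removed, added = zone_diff(OLD_ZONE, NEW_ZONE)
```

A full copy moves all three records, while the change set holds one removal and one addition. For a large zone with few edits the savings are proportionally much greater.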

Although there are still plenty of zones that work this way, it doesn't scale very well for servers with very large zones, or that serve many different zones. Since the details of server provisioning, including the distinction between master and slave servers, are not apparent to DNS clients, many servers are provisioned in other ways, such as storing the zone data in a database, using a web-based front end to manage database or text zone data, or in some cases such as rbldnsd (the server used for most DNS blacklists and whitelists) storing the zone data in an entirely different form and creating the DNS responses on the fly.

Although provisioning systems have not gotten a lot of attention from the DNS community, they are a key part of the DNS system. Most notably, provisioning systems are the main reason that it is difficult to deploy new DNS RRTYPEs, since the provisioning systems tend to have specific support for each kind of record, and need to be upgraded to handle each new record type. Limitations of provisioning systems are also a major reason that non-ASCII domain names are encoded as ASCII A-labels rather than putting the UTF-8 U-labels directly in the DNS.
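To see what the ASCII encoding of a non-ASCII name looks like, Python's built-in `idna` codec can convert between the two forms. Note that this codec implements the older IDNA 2003 rules, so treat it as an illustration of the idea rather than a current IDNA implementation; the name `bücher.example` is just a sample.

```python
# "\u00fc" is the letter u with an umlaut, written as an escape so the
# example does not depend on the source file's encoding.
u_label_name = "b\u00fccher.example"

# Encode to the ASCII form that is actually stored in the DNS.
a_label_name = u_label_name.encode("idna")

# Decode the ASCII form back to the form a user would recognize.
round_trip = a_label_name.decode("idna")

print(a_label_name)  # b'xn--bcher-kva.example'
```

The A-label contains only ASCII letters, digits, and hyphens, so it passes through provisioning systems that have never heard of non-ASCII names.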

In the following installments, we'll look in more detail at some of the ways the DNS works, and how that affects its suitability for different kinds of applications.

Part II - Exact and Approximate Name Matching

Part III - Name Structure and Delegation

Part IV - Global Consistency

Part V - Large Data

Part VI - Overloaded Record Types

Part VII - Related Names Are Not Related

Part VIII - Names Outside the DNS