Troubleshooters.Com and T.C Linux Library Present

djbdns Intro

Introduction

There's something wonderful about a simple machine. My first car was a 1959 Plymouth with a flathead 6 engine. You could do a complete tuneup in 20 minutes, using only an adjustable wrench and a gapping tool. Thermostat replacement was another 20 minutes. Almost any job was 20 minutes or less. No computers, no smog control, no frills. A simple car made for simple maintenance. My car felt, for want of a better word, healthy.

I open the hood on modern cars and feel sick to my stomach. Changing the thermostat requires disassembly of an hour's worth of smog and computer control. Special tools for every job. More and more, cars are getting to be black boxes. Sometimes I long for that healthy feeling of my '59 Plymouth.

Not to brag, but once upon a time I was somewhat conversant with Bind, the most common DNS facility in the Linux world. As a matter of fact, I wrote the DNS chapter in Red Hat Linux Unleashed 6 and 7. Bind isn't simple. The /etc/named.conf file lists all the domains (zones), and has a certain (non-trivial) file format. The files in /var/named each represent a single zone, and have a different (but confusingly similar) file format to that of /etc/named.conf.

Those files are the maximum extent I "work on" Bind. I would no more think of messing with Bind source code or making other "tweaks" than I would consider changing the head gasket on a 2004 Chevy Astro.

Then I encountered the djbdns DNS system, and I remembered that 1959 Plymouth feeling. Every file format is trivially simple. Many files implement a key-value pair, with the filename being the key and its (one line of) contents being the value. With djbdns, I get that healthy feeling that I can do anything, in 20 minutes or less, using nothing but a simple set of installation instructions.

If djbdns resembles my 1959 Plymouth, it also resembles VimOutliner. VimOutliner is the outline processor I first cobbled together after converting to Linux. Being only one guy, I didn't have time to write a big, fancy, monolithic app. So instead, I took the existing Vim program and wrote some Vim scripts and Perl scripts to give it outlining capabilities.

I didn't have much room to brag, because VimOutliner wasn't really my code -- it was a melding of features from other people's code. VimOutliner used Vim's folding for headline expansion. It used ctags for interoutline linking. Headline promotion and demotion were done by VI using VI keystrokes. Even body text was simply an adaptation of Vim's syntax sensitive comment handling. VimOutliner is very maintainable, performs remarkably well, with few bugs, because it's assembled from heavily tested code in the community.

Kind of like djbdns. Djbdns makes heavy use of UNIX -- its filesystem, its FIFO pipes, its networking, its pipe/filter philosophy. Djbdns is comprised primarily of a build tree. Configuration options requiring key/value pairs are implemented as files, with the filename being the key and the file's contents being the value. For more complex configurations, such as the setup of SOA, NS, A, PTR and MX records, djbdns uses a simple line-centric, colon-delimited format with a column 0 flag. It's not ultimately extensible like XML, but it's instantly recognizable to a human, easy to document, easy to edit with an editor, easy to parse, easy to create with front-ends, fast and efficient. The last step in configuration is often use of the UNIX make utility -- why reinvent the wheel?

If you're a guy who likes things simple, who requires only that your systems work every time, without gratuitous fanciness to slow you down, djbdns just might be for you. The only thing is, it doesn't come on any Linux distro that I know of. You need to install it yourself.

NOTE

The preceding paragraph was written in 2005, along with most of this document. Today, in 2011, djbdns is now in the public domain so it's legal to include it with a distribution, and many distros have djbdns packages. And many of those don't work. For what it's worth, here in 2011 I still think it's better to compile your own djbdns.

Fortunately, installing it yourself isn't hard. The purpose of this document is to make it easy.

The djbdns DNS server is a wonderful replacement for bind/named. Note the advantages:

Secure
Modular
Simple (considering what a DNS server must do)
Simple configuration
Easy to create your own configuration scripts

Djbdns has a reputation for being hard to set up. That's not really comparing apples with apples. If we had to compile and configure named/bind, most of us would pull out our hair. The fact that bind/named comes preconfigured on most Linux distros gives us a false sense of ease. Yet even with it configured, just setting up our name resolution with bind/named can be tricky, and navigating the different files tiresome.

This document walks you through downloading, compiling, installing, setting up and configuring djbdns. I think you'll find it easier than you imagined.

This document also elaborates on the format of the djbdns data file that defines name resolution for the server.

Points to Keep In Mind

The djbdns server is relatively easy if you just remember a few points:

djbdns is written by Daniel J. Bernstein
Beware of distro-based djbdns packages
With djbdns, the caching dns server is completely separate from the authoritative dns server
dig is your friend
djbdns runs on daemontools
Set up the caching server first
Enable and disable are done by symlink
Set up in the proper order
Troubleshoot modularly

djbdns is written by Daniel J. Bernstein

This is important because, judging from his programs, Daniel J. Bernstein (usually abbreviated "djb") thinks very differently than everyone else. Having little respect for the Unix division of files into /etc's config, /usr/bin's executable files, and /var's constantly updated files, djb instead tends to put all files related to a given application in one tree. Configuration, scripts, logs -- all in one tree.

And you know what? It kind of makes sense. It's like OOP for executables, with all methods (executables) and properties (config) bunched together. If OOP makes sense for source code, why not for executables? Also, remember back in the DOS days you could move a program from one computer to another just by copying the tree it was in? How cool was that? I'm not saying you can necessarily do that with djbdns, because not all of it is statically linked. But, for instance, if all your computers had 64 bit Ubuntu 11.04, you could probably compile up the whole thing on one machine and copy the tree between machines. Then all you'd need to do is the dnscache-conf and tinydns-conf steps, the /service directory linking, and maybe a few other easy tasks.

Djb's installation methods are also unorthodox. Most Linux/Unix/BSD installations are ./configure;make;make install. The make install part copies all needed files elsewhere on the computer, and then you can totally delete the tree in which you built the application. Don't do that with djb apps. Much of his runnable code is contained in the build tree, with various symlinks pointing to that build tree. Delete the build tree, and your djb applications fail.

Now here's your challenge: Seeing as djb puts everything in one build tree -- where do you put that tree? What djb recommends is to put the build tree under a new directory called /project you create under the root directory. That's cool because you know it will be mounted from the moment of boot, so you can use djb's svscan software in place of init. But many people don't like to pollute their root directory. Djb recommends putting, for want of a better word, their executable objects, under the /etc directory. Same thing -- it's mounted the instant you boot, and of course all config is in the right place. And there are ways, not discussed in this document, that you can move the log files to a point under /var.

Besides the one-tree build, djb's daemontools app creates a mandatory /command directory, and a mandatory /service directory, both under the root directory. You can't change either of these without considerable mucking around with the way djb has set everything up, and personally, I wouldn't recommend trying to move them.

I'll leave it to others to argue whether the djb way is the right way or the wrong way. Instead, I'll simply say you need to be aware of how he does it.

Beware of distro-based djbdns packages

Your mileage may vary, but my experience on Mandriva and on several Ubuntu versions is that the djbdns packages in their repositories just don't work. Oh, maybe you can troubleshoot them into working, but isn't the whole point of packages to simplify installation of software that "just works"?

I recommend downloading djb's tarballs from http://cr.yp.to/djbdns.html and following djb's instructions. Yeah, it will take some time. Yeah, you'll have to troubleshoot. But in my experience, you'll get to a working djbdns faster that way than trying to whip distro-supplied packages into working.

By the way, this document should help quite a bit.

The caching dns server is completely separate from the authoritative dns server

Djbdns uses the dnscache program as a caching server. This typically listens to a port on the IP address of the computer hosting it, and is what that computer's /etc/resolv.conf points at. Other computers on the subnet may also point to it.

Djbdns uses the tinydns program as an authoritative server that translates between names and IP addresses on the subnet. This typically listens to a port on 127.0.0.1 and typically, but not always, is queried only by the caching DNS server (that's dnscache) on the same machine. To get dnscache to query tinydns, put tinydns' IP address (127.0.0.1) in a file named after the domain this tinydns is authoritative over, for example domain.cxm, and place that file inside the dnscache/root/servers directory. When dnscache is asked about a name ending with domain.cxm, it queries the IP address listed in the domain.cxm file to find the IP address of that name. There's a similar provision for reverse lookup on the subnet of the domain (in this case 192.168.100), where 127.0.0.1 is placed in a file called 100.168.192.in-addr.arpa.
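The two files just described can be sketched as a small shell helper. The function names reverse_zone and link_zone are made up for illustration and are not part of djbdns; the sketch also assumes a /24 subnet given as its first three octets.

```shell
#!/bin/sh
# Sketch only: tell dnscache which server is authoritative for a local
# domain and for that domain's reverse zone. Function names are hypothetical.

# reverse_zone 192.168.100  prints  100.168.192.in-addr.arpa
reverse_zone() {
    # Split the three-octet prefix on dots and print the octets reversed.
    echo "$1" | awk -F. '{ print $3 "." $2 "." $1 ".in-addr.arpa" }'
}

# link_zone SERVERS_DIR DOMAIN PREFIX AUTH_IP
link_zone() {
    servers_dir=$1 domain=$2 prefix=$3 auth_ip=$4
    # One file per zone: the file name is the zone, the content is the server IP.
    echo "$auth_ip" > "$servers_dir/$domain" || return 1
    echo "$auth_ip" > "$servers_dir/$(reverse_zone "$prefix")" || return 1
}

# Example (assumed paths and addresses, run as root):
#   link_zone /service/dnscache/root/servers domain.cxm 192.168.100 127.0.0.1
#   svc -t /service/dnscache    # restart dnscache so it rereads root/servers
```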

In other words, the caching server (dnscache) and the authoritative server (tinydns) are completely distinct. They work quite well without each other, and they can be troubleshot independently of each other! That's a huge advantage when debugging.

dig is your friend

Your number one troubleshooting aid in debugging djbdns is the dig executable. Its best feature is that when the first argument is the IP address of the DNS server you're testing (preceded by an @ sign), it looks ONLY on that server for an answer. This means you can pinpoint your testing on this server. For instance,

dig @192.168.100.2 troubleshooters.com

The preceding asks the DNS server at 192.168.100.2, which happens to be dnscache, the IP of troubleshooters.com. It doesn't matter what's in your /etc/resolv.conf file. It doesn't matter what other DNS servers on the subnet are doing. It pinpoints your dnscache at 192.168.100.2.

Here's another possibility. For whatever reason, pings on your LAN computer wincli.domain.cxm fail to resolve, but pings on its 192.168.100.5 IP address succeed. It's a DNS problem. What now? What you have is a mixed up jumble of one or more local caching servers, nameservers listed in /etc/resolv.conf, who-knows-what authoritative name servers -- oh geez, what a mess! Now apply dig to your tinydns:

dig @127.0.0.1 wincli.domain.cxm

If the preceding succeeds, you can move on to /etc/resolv.conf, to your local dnscache server, and to whether your local dnscache server has a dnscache/root/servers/domain.cxm file containing 127.0.0.1 so that the dnscache server queries the tinydns server. Any way you look at it, your job has become easier because you know there's nothing wrong with the ability of your authoritative server to serve out local DNS.

But what if the preceding command had failed? Your job is still much easier, because you now know for sure you have a busted tinydns, so you can work full time fixing it. After all, without tinydns, the most perfect /etc/resolv.conf and the most perfectly configured caching server couldn't answer a query about a local domain.cxm computer.

dig is such a good tool you'd do well to memorize all its varied usages. However, if you need to prioritize your time, learn these two commands:

Forward lookup (Domain name to IP address)		dig @dnsserverIP domainname
Reverse lookup (IP address to Domain name)		dig @dnsserverIP -x IPaddress
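The "ask tinydns first, then dnscache" reasoning above can be sketched as a short script. The function names ask and triage are hypothetical, and the addresses in the example are the ones used in this document; treat it as a starting point, not a finished diagnostic.

```shell
#!/bin/sh
# Sketch only: query the authoritative server first, then the cache, so a
# failure points at one module. Function names are hypothetical.

# ask SERVER_IP NAME  succeeds only if that one server returns an answer.
ask() {
    # +short prints just the answer data; +time/+tries keep a dead server quick.
    ans=$(dig @"$1" +short +time=2 +tries=1 "$2" 2>/dev/null) || return 1
    [ -n "$ans" ]
}

# triage NAME TINYDNS_IP DNSCACHE_IP
triage() {
    name=$1 tinydns_ip=$2 dnscache_ip=$3
    if ! ask "$tinydns_ip" "$name"; then
        echo "tinydns at $tinydns_ip gives no answer for $name: fix tinydns first"
        return 1
    fi
    if ! ask "$dnscache_ip" "$name"; then
        echo "tinydns answers but dnscache at $dnscache_ip does not: check root/servers"
        return 2
    fi
    echo "both servers answer for $name"
}

# Example: triage wincli.domain.cxm 127.0.0.1 192.168.100.2
```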

djbdns runs on daemontools

Mankind has come up with many innovative, and sometimes problematic, ways to automatically run programs in the background. Certainly the whole system in /etc/init.d and /etc/rc.<number>, where <number> is the runlevel, is an old mainstay of Linux. Then there's the inetd system and its successor, the xinetd system. And Ubuntu's upstart system, with config files in /etc/init.

Given djb's unique way of thinking, it's not surprising that he wrote his own system to run programs in the background. It's called daemontools, and daemontools is what runs djbdns. The executable that does most of daemontools' work is called svscan. Watch this:

slitt@mydesk:~$ ps ax | grep svscan | grep -v grep
 1026 ?        Ss     0:00 /bin/sh /command/svscanboot
 1031 ?        S      0:00 svscan /service
slitt@mydesk:~$

What happened here is the svscanboot program runs on boot. It runs svscan after setting svscan's default directories, i/o type, environment variables and the like. svscanboot defaults the "service directory" to /service.

Once svscan is running, every few seconds (no more than five) it scans the directory that's its argument (/service in all default djb setups) for symlinks to subdirectories. If it finds a new symlink, svscan starts the service in the symlinked directory, assuming that service hasn't been stopped and everything's written correctly.

What this means is that defining which daemontools daemons run is as easy as creating or deleting symlinks to a properly formed application definition directory.
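Since enabled services are just symlinks, a few lines of shell can show what is currently enabled. This is a sketch: list_enabled is a made-up name, and it only reports symlinks, which is how this document sets services up.

```shell
#!/bin/sh
# Sketch only: list the symlinked service directories in a scan directory.
# The function name is hypothetical; /service is the default named in the text.

# list_enabled [SCAN_DIR]
list_enabled() {
    scan_dir=${1:-/service}
    for entry in "$scan_dir"/*; do
        # Report only symlinks that resolve to a directory.
        if [ -L "$entry" ] && [ -d "$entry" ]; then
            echo "$(basename "$entry") -> $(readlink "$entry")"
        fi
    done
}

# Example: list_enabled /service
```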

Typically such directories have subdirectories called root, env, and log. They also have a shellscript called run, which runs the actual executable.

More will be written about daemontools later in this document.

Enable and disable are done by symlink

This is not documented well enough on djb's website or anywhere else. The actual point of enablement occurs when you symlink a service's directory to the /service directory. It is at that point where daemontools explores that new service's tree, placing a properly configured supervise directory below any directory containing a run command.

This will be explored in detail in the "/service directory structure, and quick reenablement" section of this document. The main thing to remember: reenable the service, as discussed in that section, if you see unhelpful error messages speaking of "text busy" or "file not found" when the named file is there, if something gripes about a missing file in the supervise directory, or if the svc and/or svstat commands repeatedly yield error messages, especially if the problem survives svc -d and svc -u or even reboots.

Set up in the proper order

A prime determinant in how quickly and easily you get djbdns running is the order in which you do things. Here's the order I recommend:

From DJB's website, download the source tarballs for daemontools, ucspi-tcp and djbdns.
Download all necessary documentation so you can view it locally.
Shut down and disable bind.
Install and test daemontools.
Install and test ucspi-tcp.
Install djbdns (tinydns and dnscache).
Configure dnscache on host's IP

dnscache-conf dnscache dnslog <dnscache-DIR> <Host's IP>
Do not continue until caching DNS works perfectly

Configure tinydns on 127.0.0.1

tinydns-conf tinydns dnslog <tinydns-DIR> 127.0.0.1
Fill the /service/tinydns/root/data file either with an editor, or from bind data, or from previously made djbdns data
cd <tinydns-DIR>/root; make

# This compiles data into data.cdb

svc -t /service/tinydns
Test forward dns with dig @127.0.0.1 boxname.domain.cxm
Test reverse dns with dig @127.0.0.1 -x ip_address_of_boxname
Do not continue until authoritative dns works perfectly from server 127.0.0.1

Link dnscache to tinydns

echo 127.0.0.1 > /service/dnscache/root/servers/domain.cxm

Assuming this computer has authority over domain.cxm

Test

dig @127.0.0.1 wincli.domain.cxm

Tests your tinydns

dig @192.168.100.2 wincli.domain.cxm

Tests your dnscache ability to call your tinydns

Test reverse DNS

dig @127.0.0.1 -x 192.168.100.5

Test reverse dns on your tinydns

dig @192.168.100.2 -x 192.168.100.5

Tests your dnscache ability to call your tinydns

Also test forward with dnsip
Also test reverse with dnsname

Set /etc/resolv.conf to point to your box's IP address (192.168.100.2 in this case)
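Because each step in the order above depends on the one before it, a quick check of which tools are already on the PATH can show where you stand. The helper below is a sketch with a made-up name; the tool names in the example are programs shipped by daemontools, ucspi-tcp and djbdns, plus dig.

```shell
#!/bin/sh
# Sketch only: report which of the named tools are not on PATH.
# missing_tools is a hypothetical helper, not part of any djb package.

# missing_tools TOOL...  prints each missing tool; returns nonzero if any.
missing_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example:
#   missing_tools svscan svc svstat tcpserver dnscache-conf tinydns-conf dig
```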

Troubleshoot modularly

Know your modules, and troubleshoot accordingly. Test dnscache alone, without worrying about tinydns. Use the dig @ command so as to test directly to the IP address assigned dnscache (probably that computer's IP address). Likewise, test tinydns alone, once again using dig @ so as to test directly to the IP address assigned tinydns (probably 127.0.0.1). Only after both work should you test at the address in resolv.conf.

Remember both dnscache and tinydns have their own logs:

/service/dnscache/log/main/current
/service/tinydns/log/main/current

A typical small office setup looks something like this:

Authoritative query process block diagram

Cache query process block diagram

As you install, configure, test and troubleshoot, try to keep these block diagrams in mind.

/service directory structure, and quick reenablement

As mentioned previously, actual enablement of a service occurs at the moment when you symlink the directory containing all the app information to the /service directory tree. For instance, assuming your dnscache-conf command listed /var/service/dnscache as the directory, the following command would actually install it:

ln -s /var/service/dnscache /service

Once you do that, on its next loop through /service, the svscan program notices the new dnscache subdirectory, configures a supervise directory in every new subdirectory containing an executable run command, performs some more setup work, and then turns the new service on. Performing that symlink is not a passive action -- there are (desirable) side effects.

In order to troubleshoot any DJB app, you should be aware of the structure of the /service tree...

In the listing that follows this discussion, /service/dnscache is the directory for service dnscache. It contains four directories:

env
log
root
supervise

env stores environment variables for the service, with the filename being the environment variable name, and the contents being its value. The log directory stores the service's rotating log files. The root directory stores various necessities, but is not absolutely necessary for every type of djb service. The supervise directory stores state for control of the service, and is entirely created and maintained by daemontools. You should NEVER try to directly create or change files within the supervise directory. Only daemontools itself should mess with any supervise directory.
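To see the "filename is the name, contents are the value" convention at a glance, a few lines of shell can print an env directory as NAME=value pairs. This is a sketch; show_env is a made-up helper, not a daemontools command, and it only reads files.

```shell
#!/bin/sh
# Sketch only: print an env directory as NAME=value lines
# (file name = variable name, first line of the file = value).
# show_env is a hypothetical helper.

# show_env ENV_DIR
show_env() {
    for f in "$1"/*; do
        [ -f "$f" ] || continue           # skip anything that is not a plain file
        value=
        IFS= read -r value < "$f" || :    # first line only; tolerate no newline
        echo "$(basename "$f")=$value"
    done
}

# Example: show_env /service/dnscache/env
```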

The supervise directory is created when you symlink the actual service tree to the /service directory. For instance:

ln -s /var/service/tinydns /service/tinydns

In the preceding, /var/service/tinydns is an actual, physical directory created by you. /service/tinydns is the symlink -- a name referring to /var/service/tinydns.

As daemontools loops, it checks the /service tree for any new subdirectories. If it finds one, it looks for an executable named run in that subdirectory, and if found, daemontools creates a properly configured supervise directory.

If you look carefully, you'll see a supervise directory inside the log directory. That's because service logging is, by itself, a service, with its own run command.

For any directory, the run command typically grabs environment variables from the env directory, then runs exec on the real executable, possibly assigning a different user and group to the execution.
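A minimal run script along those lines might look like what the helper below writes. It is a sketch under assumptions: write_run is a made-up name, the generated script uses daemontools' envdir program to load the env directory, and it leaves out the user and group switching that a real djbdns run script performs.

```shell
#!/bin/sh
# Sketch only: write a bare-bones run script into a service directory.
# write_run is a hypothetical helper; real djbdns run scripts do more
# (for example, dropping root privileges).

# write_run SERVICE_DIR PROGRAM
write_run() {
    svc_dir=$1 program=$2
    cat > "$svc_dir/run" <<EOF
#!/bin/sh
# Send stderr to stdout so the log service sees both.
exec 2>&1
# envdir loads the variables in ./env, then exec replaces this shell.
exec envdir ./env $program
EOF
    chmod 755 "$svc_dir/run"
}

# Example: write_run /var/service/myapp /usr/local/bin/myapp
```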

Ponder this setup for a while, and you'll see its brilliant modularity. You can take any app that runs on the command line, call it in a run command, issue any special variables or state within the env directory, set up any special logging in the run command in the log directory, and you're pretty much done. But there are a few issues...

/service/dnscache
|-- env
|   |-- CACHESIZE
|   |-- DATALIMIT
|   |-- IP
|   |-- IPSEND
|   `-- ROOT
|-- log
|   |-- main
|   |   |-- @4000000041e4052f2f098eac.s
|   |   |-- @4000000041e41ced00c0ae2c.s
|   |   |-- @4000000041e43bba10c2c38c.s
|   |   |-- @4000000041e45bf32cc426d4.s
|   |   |-- @4000000041e47bcb0702cacc.s
|   |   |-- @4000000041e483d31899c484.s
|   |   |-- @4000000041e487b918df939c.s
|   |   |-- @4000000041e48a5a191efa14.s
|   |   |-- @4000000041e48cbc1a000f2c.s
|   |   |-- current
|   |   |-- lock
|   |   `-- state
|   |-- run
|   |-- status
|   `-- supervise
|       |-- control
|       |-- lock
|       |-- ok
|       `-- status
|-- root
|   |-- ip
|   |   |-- 127.0.0.1
|   |   `-- 192.168.100
|   `-- servers
|       |-- 100.168.192.in-addr.arpa
|       |-- @
|       `-- domain.cxm
|-- run
|-- seed
`-- supervise
    |-- control
    |-- lock
    |-- ok
    `-- status

8 directories, 34 files

The main issue is that sometimes that wonderful directory structure gets garbled. When that happens, you need to undo the link, turn off the services, and redo the link. Let's say the service involved is dnscache. Here's what you'd do to "uninstall" the dnscache system:

```
cd /service/dnscache
rm /service/dnscache
svc -dx . log
cd ..
```

Step 1 gets you into the dnscache directory. Step 2 deletes the symlink, but the services for dnscache and its logging system are still running in memory. Step 3 downs the dnscache and its log in memory (hopefully). Step 4 gets you into a directory that still exists.

Sometimes, when things are really busted, step 3 doesn't kill the app from memory. Use ps ax to view all processes, and kill the ones that run the service. For instance, these processes run dnscache, and when things go wrong step 3 might not kill them:

24048 ?        S      0:00 supervise dnscache
24058 ?        S      0:00 /usr/local/bin/dnscache

The following are processes for tinydns:

 3121 ?        S      0:00 supervise tinydns
 3129 ?        S      0:00 /usr/local/bin/tinydns

While the processes for a service are running, reinstalling that service yields problematic results. If you're lucky they'll be so problematic that the service cannot be started, stopped, or svscan'ed. If you're unlucky, they'll be subtly unnoticeable, leaving you open to intermittent problems or even security breaches. Always make sure these processes are not running before trying to reinstall.

Once you've "uninstalled" the dnscache system, "reinstalling" it is as simple as recreating the symlink:

ln -s /var/service/dnscache /service/dnscache
sleep 5

Step 1 recreates the link. Step 2 is just a reminder that you should not do ANYTHING with the dnscache system for at least 5 seconds, because that's how long it will take daemontools to find that dnscache is a new directory under the /service directory, create all necessary supervise directories, perform other logistical work, and lastly, run dnscache and its logging system. Many weird problems occur when someone ups or downs the service within a few seconds of creating the symlink, or when someone creates the symlink before application configuration is complete. Remember, the symlink is more than a directory redirection -- creating it actually causes the app to be enabled.
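If a fixed sleep feels too blunt, one alternative is to poll until daemontools has created the supervise files. The function below is a sketch with a made-up name; it assumes any stale supervise directory was removed beforehand, because it only checks that supervise/ok exists.

```shell
#!/bin/sh
# Sketch only: wait until SERVICE_DIR/supervise/ok appears, or give up.
# wait_for_supervise is a hypothetical helper, not a daemontools command.

# wait_for_supervise SERVICE_DIR [MAX_SECONDS]
wait_for_supervise() {
    svc_dir=$1 max=${2:-10} waited=0
    while [ ! -e "$svc_dir/supervise/ok" ]; do
        [ "$waited" -ge "$max" ] && return 1   # timed out
        sleep 1
        waited=$((waited + 1))
    done
}

# Example:
#   ln -s /var/service/dnscache /service/dnscache
#   wait_for_supervise /service/dnscache 10 && svstat /service/dnscache
```

Daemontools also ships an svok command that checks whether supervise is running on a directory, which may be a better test on a live system.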

Once you're done, you can confirm the contents of any supervise directories with the ls command, as shown in these examples:

[root@mydesk dnscache]# ls -ldF /service/dnscache/supervise/*
prw-------  1 root root  0 Jan 12 11:26 /service/dnscache/supervise/control|
-rw-------  1 root root  0 Jan 12 11:26 /service/dnscache/supervise/lock
prw-------  1 root root  0 Jan 12 11:26 /service/dnscache/supervise/ok|
-rw-r--r--  1 root root 18 Jan 12 11:26 /service/dnscache/supervise/status
[root@mydesk dnscache]# ls -ldF /service/dnscache/log/supervise/*
prw-------  1 root root  0 Jan 12 11:26 /service/dnscache/log/supervise/control|
-rw-------  1 root root  0 Jan 12 11:26 /service/dnscache/log/supervise/lock
prw-------  1 root root  0 Jan 12 11:26 /service/dnscache/log/supervise/ok|
-rw-r--r--  1 root root 18 Jan 12 11:26 /service/dnscache/log/supervise/status
[root@mydesk dnscache]#

Fortunately, reenablement is quick and easy. As a matter of fact, it can be performed by a simple shellscript, which deletes the symlink, deletes supervise directories so that daemontools can install them correctly, sets run scripts to the correct permissions, downs the service, shows processes so you can verify that all are down and manually kill them if not, and then re-symlinks, waits 5 seconds, and shows you the results:

#!/bin/sh

#### DEFINE APP AND DIRECTORIES HERE
APP=dnscache
LOCALSERVICEDIR=/var/service
SERVICEDIR=/service

#### DOWN THE DJB SERVICE
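# Bail out if the service symlink is missing, so svc never runs in the wrong directory
cd "$SERVICEDIR/$APP" || exit 1
rm -f "$SERVICEDIR/$APP"
svc -dx . log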

#### IN CASE THE DJB DOWN DIDN'T WORK, MANUALLY KILL IF NECESSARY
while test "$input" != "c"; do
	echo
	echo
	ps ax | grep $APP
	echo
	echo In the preceding processes, if you see either supervise $APP
	echo or /usr/local/bin/$APP
	echo or any other process running $APP
	echo 'you must kill it before continuing (open another terminal)'
	echo
	echo -n 'Press c then Enter to continue (after any necessary killing)==>'
	read input
done
echo '   Continuing...'


#### REMOVE THE supervise DIRECTORIES
rm -rf $LOCALSERVICEDIR/$APP/supervise
rm -rf $LOCALSERVICEDIR/$APP/log/supervise

#### SET THE run FILES TO 755 FOR PROPER REINSTALLATION
chmod 755 $LOCALSERVICEDIR/$APP/run
chmod 755 $LOCALSERVICEDIR/$APP/log/run

#### REINSTALL
ln -s $LOCALSERVICEDIR/$APP $SERVICEDIR/$APP
sleep 5

#### PRINT THE RESULTS
mycommand="svstat $SERVICEDIR/$APP"
echo
echo $mycommand
$mycommand 
echo
echo If the preceding svc and svstat commands give no error messages, 
echo your supervise directory is probably OK.

When to reenable

Unhelpful error messages

svc: warning: unable to control /service/dnscache/: file does not exist

Scuse me? /service/dnscache/ certainly does exist. Could you please be more explicit?

Unfortunately no. For whatever reasons, most djbdns error messages fail to name the exact file that's missing or wrong. In this case it was /service/dnscache/supervise/control, which I deleted just to make this point. Messages containing "text busy" and "file not found" without naming the file, or naming a directory that exists, are definitely unhelpful. If things are really crazy, perform this quick and easy reenablement.

Note that if a service constantly restarts itself (svstat will show this), that's probably due to a directory being in an env directory, which is illegal. Delete or move the directory, and the problem will probably clear itself up without further action on your part.
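A stray directory inside env is easy to miss by eye, so a small check can help. The sketch below uses a made-up function name and simply reports anything in a service's env directory that is not a plain file.

```shell
#!/bin/sh
# Sketch only: flag entries in SERVICE_DIR/env that are not plain files.
# bad_env_entries is a hypothetical helper.

# bad_env_entries SERVICE_DIR  prints offenders; returns nonzero if any.
bad_env_entries() {
    found=0
    for entry in "$1"/env/*; do
        [ -e "$entry" ] || continue       # empty directory: the glob matched nothing
        if [ ! -f "$entry" ]; then
            echo "$entry"
            found=1
        fi
    done
    return "$found"
}

# Example: bad_env_entries /service/dnscache
```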

Installation procedures for a small office

For a terse, helpful, outdated and not totally accurate version of installation procedures, please see the Installation Outline. It can be helpful.

Installation Overview

Your djbdns installation can be easy or hard. If you want it easy, it's important to do it in the correct order:

Download source tarballs for daemontools, ucspi-tcp and djbdns
Install daemontools
Install ucspi-tcp
Install djbdns
Configure dnscache on an alias to the network card
Configure tinydns on 127.0.0.1
Link dnscache to tinydns

Daemontools and ucspi-tcp are systems that launch most DJB software, including djbdns. Daemontools is a system for launching daemons, very similar to the scripts in the /etc/rc.d tree. ucspi-tcp is a system for running background software, very similar to the inetd and xinetd systems on a normal Linux system. Daemontools and ucspi-tcp coexist with /etc/rc.d, inetd and xinetd perfectly.

Daemontools includes the svc command, which can turn tinydns and/or dnscache on or off (svc -u, svc -d), or can restart it (svc -t). The svstat command reports whether each service is up or down.

Command	Action
svc -u /service/tinydns	Turn on tinydns. It will turn on after reboot.
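svc -d /service/tinydns	Turn off tinydns and keep it off until the next svc -u. The setting lives only in the running supervise process, so tinydns starts again after a reboot unless a file named down exists in its service directory.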
svc -t /service/tinydns	Restart tinydns if it is on, but not if it's off. This is used to restart a service after making a configuration change.
svc -t /service/*	Restart all daemons under the /service directory -- typically both tinydns and dnscache.
svstat /service/*	Check the on/off status of all daemons under the /service directory - typically both tinydns and dnscache.

Download source for daemontools, ucspi-tcp and djbdns
Install daemontools
Install ucspi-tcp
Install djbdns
	Which consists of:
		tinydns: The authoritative resolver
		dnscache: The recursive (caching) resolver
Configure dnscache on an alias to the network card
	Do not continue until external name resolution works
Configure tinydns on 127.0.0.1
	127.0.0.1 is best for SOHO where only subnet clients ask
Link dnscache to tinydns

djbdns Data Line Definitions

The important fact to remember in the data file (typically /service/tinydns/root/data) is that many lines create more than one DNS record. For instance, lines starting with a dot create an SOA record, an NS record, and an A record. This might seem confusing at first, but in fact it enables you to create fairly complete DNS setups, and such setups are incredibly readable. Once you create such a file, you'll never want to mess with bind's /etc/named.conf and /var/named/* again.
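As an illustration of one line standing for several records, the sketch below spells out the two records that an = line (described later in this section) represents. It is only a teaching aid with a made-up function name; on a real system the tinydns-data program does this work when you run make.

```shell
#!/bin/sh
# Sketch only: show the A and PTR records that one "=" data line stands for.
# expand_equals is a hypothetical helper; it ignores ttl, timestamp and lo.

# expand_equals '=fqdn:ip'
expand_equals() {
    body=${1#=}                    # drop the leading = flag
    fqdn=${body%%:*}               # first property
    rest=${body#*:}
    ip=${rest%%:*}                 # second property
    # Reverse the four octets and append in-addr.arpa for the PTR name.
    ptr=$(echo "$ip" | awk -F. '{ print $4 "." $3 "." $2 "." $1 ".in-addr.arpa" }')
    echo "A   $fqdn -> $ip"
    echo "PTR $ptr -> $fqdn"
}

# Example: expand_equals '=wincli.domain.cxm:192.168.100.5'
```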

Another important fact is that djbdns has many intelligent defaults, meaning often you needn't specify anything. This is true right down to the TTL (Time To Live). If TTL isn't specified, djbdns sets a short default, and as time goes on, if nothing changes, that default is set longer and longer.

Lines in the data file look like this:

xproperty1:property2:property3:property4:property5

In the preceding, x is a single character that defines the line type, which in turn defines the DNS record types it creates. Each property defines a property of the record(s). Different line types take different numbers of properties, up to 12 for the Z type.

On any given line, you needn't include colons or anything else after the last property you set to a non-default value. To leave an earlier property at its default, put nothing in its place, so that two colons appear together:

xproperty1::property3

The preceding sets property 2 to default, and properties 4 and above (if any) to default values.
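The flag-plus-colons layout is simple enough to take apart with plain shell. The sketch below uses a made-up function name and prints the line type and each property, marking empty ones as defaults; it does not know how many properties a given line type accepts.

```shell
#!/bin/sh
# Sketch only: split a data line into its type flag and its properties.
# show_fields is a hypothetical helper. The body runs in a subshell
# (parentheses), so the IFS change does not leak into the caller.

# show_fields LINE
show_fields() (
    line=$1
    rest=${line#?}                 # everything after the first character
    kind=${line%"$rest"}           # the first character: the line type
    echo "type: $kind"
    IFS=:
    set -f                         # no filename expansion while splitting
    set -- $rest                   # split on colons; empty middle fields survive
    n=1
    for prop in "$@"; do
        echo "property $n: ${prop:-(default)}"
        n=$((n + 1))
    done
)

# Example: show_fields 'xproperty1::property3'
```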

The following table shows the properties of each line type:

1	2	3	4	5	6	7	8	9	10	11	12
.	fqdn	ip	x	ttl	timestamp	lo
&	fqdn	ip	x	ttl	timestamp	lo
=	fqdn	ip	ttl	timestamp	lo
+	fqdn	ip	ttl	timestamp	lo
@	fqdn	ip	x	dist	ttl	timestamp	lo
#	comment
^	fqdn	p	ttl	timestamp	lo
Z	fqdn	mname	rname	ser	ref	ret	exp	min	ttl	timestamp	lo
-	fqdn	ip	ttl	timestamp	lo
'	fqdn	s	ttl	timestamp	lo
C	fqdn	p	ttl	timestamp	lo
%	lo	ipprefix

The following list shows the syntax of each line type:

.fqdn:ip:x:ttl:timestamp:lo

&fqdn:ip:x:ttl:timestamp:lo

=fqdn:ip:ttl:timestamp:lo

+fqdn:ip:ttl:timestamp:lo

@fqdn:ip:x:dist:ttl:timestamp:lo

#comment

^fqdn:p:ttl:timestamp:lo

Zfqdn:mname:rname:ser:ref:ret:exp:min:ttl:timestamp:lo

-fqdn:ip:ttl:timestamp:lo

'fqdn:s:ttl:timestamp:lo

Cfqdn:p:ttl:timestamp:lo

%lo:ipprefix

The following table shows the meaning of each property type, including the line types, which of course are the first field:

.	A line starting with a dot maps to 3 records. This line is replaced by an SOA record, an NS record, and an A record. .fqdn:ip:x:ttl:timestamp:lo The SOA is for domain fqdn, with name server name x, time to live ttl, timestamp timestamp, and location lo. The contact email defaults to hostmaster@fqdn. The NS record is for domain fqdn, with name server IP address ip, name server name x (see x for special handling of dots), time to live ttl, timestamp timestamp, and location lo. The A record associates name server x with IP address ip. If you don't like the defaults for dot lines, substitute line types such as &, Z and =, which map to fewer record types.
&	Maps to 2 records. This line is replaced by an NS record and an A record. &fqdn:ip:x:ttl:timestamp:lo The NS record is for domain fqdn, with name server ip address ip and name server name x (see x for special handling of dots). You can also specify ttl, timestamp and lo.
=	Maps to 2 records. This line is replaced by an A record and a PTR record. In other words, it defines forward and reverse name resolution for a specific host. =fqdn:ip:ttl:timestamp:lo Unlike in . and & lines, in = lines fqdn represents a machine, not just a domain. In other words, if the machine is wincli, and the domain is domain.cxm, fqdn would be wincli.domain.cxm. The IP address it maps to is ip. Remember, this mapping is both forward and reverse with this type of line. You can also specify ttl, timestamp and lo. This is a very easy shorthand because it eliminates the need to reverse the IP address and add in-addr.arpa for reverse DNS.
+	Maps to 1 record, an A record. This is exactly like an = line except there's no PTR record, so it doesn't define reverse DNS. +fqdn:ip:ttl:timestamp:lo
@	Maps to an MX record. tinydns-data also creates an A record for the mail exchanger when ip is supplied. @fqdn:ip:x:dist:ttl:timestamp:lo Here fqdn, ip and x work as they do on an & line, except that x names a mail exchanger rather than a name server (see x for special handling of dots). The dist parameter represents email distance. You can also specify ttl, timestamp and lo.
#	This line is a comment, so the second field is the text of the comment. #comment Remember this is visible only in the source. It is not a comment about a domain, visible in DNS. See single quote lines for that.
^	Maps to 1 record, a PTR record. ^fqdn:p:ttl:timestamp:lo Here, fqdn is an in-addr.arpa name, like 5.100.168.192.in-addr.arpa, which would represent IP address 192.168.100.5. p is the fully qualified domain of the host, such as wincli.domain.cxm. You can also specify ttl, timestamp and lo.
Z	This line maps to a single SOA record. Zfqdn:mname:rname:ser:ref:ret:exp:min:ttl:timestamp:lo You would use this instead of a . line if you wanted non-default values for the properties. For instance, the dot line defaults rname to hostmaster@fqdn. See the individual property entries below to learn about the rest of the Z line's properties, most of which cannot be specified on a . line.
-	A line starting with a minus sign is exactly like one starting with a plus sign, except it has been commented out temporarily. This is used by some automated tinydns config tools.
'	A line starting with a single quote maps to a single TXT record. 'fqdn:s:ttl:timestamp:lo Here s is a string containing the desired text.
C	This line maps to a CNAME record. Cfqdn:p:ttl:timestamp:lo Here fqdn is the fully qualified hostname of the alias, while p is the fully qualified host name of the main name you want it to point to. The idea is that you can change the IP address behind all of a host's CNAMEs by changing it once, for the main name. CNAMEs are falling out of favor these days, so don't use C lines unless you have a good reason to. Use a + line instead.
%	This line defines a location, which can be used to specify a location (lo) in other lines. %lo:ipprefix
comment	The comment property is the text of a # line.
dist	This is the mail exchanger distance, which helps mail exchanger algorithms decide how best to route an email.
exp	Expire time, used only in Z lines.
fqdn	In lines defining an SOA, NS or MX record ( . , &, Z, @) this is the full domain name of the domain. It does not specify a specific host. Such a specific host is specified by another property, typically the x or mname properties. This is true even if the line also defines an individual host as the name server. In lines that do not define an SOA, NS or MX record, it is the fully qualified domain name of the host.
ip	The IP address. For =, + and - lines, it is the IP of the specific host. For @ lines it's the IP address of the primary mail exchanger. For . and & lines it's the IP address of the name server.
ipprefix	An IP prefix defining a subnet. For instance, IP prefix 192.168.100 would include 0 through 255 in subnet 192.168.100. IP prefix 192.168 would include subnets 0 through 255 in subnet 192.168.
lo	Location. Most lines can be restricted to queries coming from a certain location, where location is typically an IP address range or a domain or server. If a line has a lo property, that location must have been created with a % line.
min	Minimum time, used only in Z lines.
mname	Primary name server fully qualified domain name (full hostname), used only in Z lines.
p	This is a property indicating a fully qualified domain name in a reverse DNS situation, or in a CNAME. Therefore, it's used only in ^ lines and C lines.
ref	Refresh time, used only in Z lines.
ret	Retry time, used only in Z lines.
rname	Email of the DNS contact person, used only in Z lines, defaults to hostmaster@domainname (fqdn) in the . line.
s	A string used to represent the text in a TXT record, this is used only in a ' line (a singlequote line).
ser	Serial number, used only in Z lines.
timestamp	A property marking a time. If ttl is zero, it represents the time at which the current line will end (time to die). If ttl is non-zero or omitted, timestamp represents the time before which the line will be ignored. This property is expressed in hexadecimal. See the djbdns website for details.
ttl	Time To Live. The time after which the line "expires" and must be reloaded by clients, slaves and the like.
x	This is the hostname of a server. On . and & lines, if it contains no dots it is prepended to ns.fqdn. On @ lines, if it contains no dots it is prepended to mx.fqdn. On all lines, if it contains one or more dots, it is used literally as the name of the name server or mail server.
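Two of the rules in the preceding table are purely mechanical: the in-addr.arpa name needed as the fqdn of a ^ line, and the expansion of the x property. Here is a rough shell sketch of both. The names reverse_name and server_name are hypothetical helpers of mine, reverse_name handles plain dotted-quad IPv4 addresses only, and treating an empty x as plain ns.fqdn or mx.fqdn is my assumption about the default.

```sh
#!/bin/sh
# reverse_name IP
# Hypothetical helper: turn a dotted-quad IPv4 address into the
# in-addr.arpa name that a ^ line expects as its fqdn.
reverse_name() {
    echo "$1" | awk -F. '{ print $4 "." $3 "." $2 "." $1 ".in-addr.arpa" }'
}

# server_name X FQDN KIND
# Hypothetical helper: expand the x property as the x entry describes.
# KIND is "ns" for . and & lines, "mx" for @ lines.
server_name() {
    case "$1" in
        "")  echo "$3.$2" ;;       # assumption: empty x means plain ns.fqdn or mx.fqdn
        *.*) echo "$1" ;;          # contains a dot: used literally
        *)   echo "$1.$3.$2" ;;    # no dots: prepended to ns.fqdn or mx.fqdn
    esac
}
```

For example, reverse_name 192.168.100.5 prints 5.100.168.192.in-addr.arpa, and server_name a domain.cxm ns prints a.ns.domain.cxm.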

Troubleshooting djbdns

Tools for troubleshooting djbdns

DNS client tools

ping is your best tool for testing forward resolution, as it gives the fewest false indications of a defect. Simply run it like this:

[slitt@mydesk djbdns]$ ping www.yahoo.com
PING www.yahoo.akadns.net (216.109.118.65) 56(84) bytes of data.

In the preceding, note that the important part was that it resolved the IP address, *NOT* whether or not the ping ultimately succeeds.

Perhaps the most reliably useful tool is the dig command. This command can query both forward and reverse dns, and can send that query to a DNS server at any IP address you desire. With djbdns, that means that you can query tinydns without involving either resolv.conf or dnscache. Using this isolation, you can greatly reduce both the number of variables and the time to solution. Here are some dig examples, using wincli.domain.cxm defined at 192.168.100.5 in my authoritative dns:

Each example names what it tests, followed by the command to run and its output.

Forward resolution using resolv.conf and dnscache. The question section queries on wincli.domain.cxm, and the answer section delivers 192.168.100.5. Note that just below query time, the server is listed at 192.168.100.103, port 53. This is the IP address of the alias to which I bound dnscache.

[slitt@mydesk djbdns]$ dig wincli.domain.cxm
; <<>> DiG 9.2.3 <<>> wincli.domain.cxm
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 11723
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;wincli.domain.cxm. IN A
;; ANSWER SECTION:
wincli.domain.cxm. 79024 IN A 192.168.100.5
;; Query time: 0 msec
;; SERVER: 192.168.100.103#53(192.168.100.103)
;; WHEN: Mon Jan 10 10:43:44 2005
;; MSG SIZE rcvd: 51
[slitt@mydesk djbdns]$

Reverse resolution using resolv.conf and dnscache. The question section queries on 5.100.168.192.in-addr.arpa, which means query the reverse dns for IP address 192.168.100.5, and the answer section delivers wincli.domain.cxm.

[slitt@mydesk djbdns]$ dig -x 192.168.100.5
; <<>> DiG 9.2.3 <<>> -x 192.168.100.5
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 4853
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;5.100.168.192.in-addr.arpa. IN PTR
;; ANSWER SECTION:
5.100.168.192.in-addr.arpa. 86400 IN PTR wincli.domain.cxm.
;; Query time: 1 msec
;; SERVER: 192.168.100.103#53(192.168.100.103)
;; WHEN: Mon Jan 10 10:42:42 2005
;; MSG SIZE rcvd: 75
[slitt@mydesk djbdns]$

Forward resolution querying tinydns directly at 127.0.0.1. Here we're performing the same task as the previous forward resolution, except we're cutting /etc/resolv.conf and dnscache out of the loop by directly querying 127.0.0.1, which is the IP address to which I've bound tinydns.

[slitt@mydesk djbdns]$ dig @127.0.0.1 wincli.domain.cxm
; <<>> DiG 9.2.3 <<>> @127.0.0.1 wincli.domain.cxm
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 31681
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1
;; QUESTION SECTION:
;wincli.domain.cxm. IN A
;; ANSWER SECTION:
wincli.domain.cxm. 86400 IN A 192.168.100.5
;; AUTHORITY SECTION:
domain.cxm. 259200 IN NS ns.domain.cxm.
;; ADDITIONAL SECTION:
ns.domain.cxm. 86400 IN A 192.168.100.103
;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mon Jan 10 10:45:38 2005
;; MSG SIZE rcvd: 84
[slitt@mydesk djbdns]$

Reverse resolution querying tinydns directly at 127.0.0.1. Once again, we've cut resolv.conf and dnscache out of the loop, for more precise troubleshooting.

[slitt@mydesk djbdns]$ dig @127.0.0.1 -x 192.168.100.5
; <<>> DiG 9.2.3 <<>> @127.0.0.1 -x 192.168.100.5
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 44423
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 1, ADDITIONAL: 1
;; QUESTION SECTION:
;5.100.168.192.in-addr.arpa. IN PTR
;; ANSWER SECTION:
5.100.168.192.in-addr.arpa. 86400 IN PTR wincli.domain.cxm.
;; AUTHORITY SECTION:
100.168.192.in-addr.arpa. 259200 IN NS ns.domain.cxm.
;; ADDITIONAL SECTION:
ns.domain.cxm. 86400 IN A 192.168.100.103
;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Mon Jan 10 10:53:55 2005
;; MSG SIZE rcvd: 108
[slitt@mydesk djbdns]$
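When you run dig over and over, the answer section is usually the only part you care about. Here is a small sketch of a filter that keeps just that section. The function name answer_section is hypothetical, and the filter assumes dig output laid out like the examples above, where a line starting with two semicolons (or a blank line) marks the end of a section.

```sh
#!/bin/sh
# answer_section
# Hypothetical helper: read dig output on stdin and print only the
# records listed under ANSWER SECTION.
answer_section() {
    awk '
        /^;; ANSWER SECTION:/ { in_answer = 1; next }  # records start on the next line
        /^;;/ || /^$/         { in_answer = 0 }        # another header or a blank line ends them
        in_answer             { print }
    '
}
```

You would use it like this: dig @127.0.0.1 wincli.domain.cxm | answer_section. If nothing prints, the query produced no answer section at all, which is itself a useful symptom.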

The djbdns distribution itself supplies some handy clients, although they sometimes give unexpected results. So rely on dig and ping first; beyond that, the djbdns-provided clients often speed up troubleshooting.

The dnsip program resolves a domain name into one or more IP addresses (multiple addresses are space delimited) with no extra fluff. This is great for scripts. Going the other way, the dnsname program resolves an IP address into a single domain name.
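Because these two clients print nothing but the answer, they combine nicely in a script. The following is a sketch only: round_trip is a name I made up, and it assumes the reverse-lookup client is installed under the name dnsname and that both clients are on your PATH.

```sh
#!/bin/sh
# round_trip HOST
# Hypothetical helper: resolve HOST with dnsip, then resolve each
# returned address back to a name with dnsname, printing one
# "ip name" pair per line.
round_trip() {
    # dnsip prints its addresses space delimited on one line, so the
    # unquoted expansion below splits them into separate words.
    for ip in $(dnsip "$1"); do
        echo "$ip $(dnsname "$ip")"
    done
}
```

If the name printed next to an address is not the one you started with, forward and reverse resolution disagree for that host.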

The dnstrace program appears to enable one to trace a query through all its steps -- a great aid in troubleshooting. Unfortunately, it often hangs for me. Likewise, the dnsq program hangs so often for me that I don't use it -- I don't trust its results. I'm sure these hangs are due to a lack of understanding on my part, but I like easy to use tools that yield unmistakable results.

Restarting tools

After making changes, those changes need to be written to the system before they'll take effect. This is much like most other servers -- you change a config file and then restart the service.

The svc command, part of DJB's daemontools, is used to stop, start and restart services that run under daemontools, including the djbdns servers. Here are some examples:


svc -d directory	Takes down the service defined by the directory	svc -d /service/tinydns takes down tinydns
svc -u directory	Brings up the service defined by the directory	svc -u /service/tinydns brings up tinydns
svc -t directory	This sends a term signal to the service defined by the directory. Normally this would kill the app, but due to the functionality of daemontools the app starts right back up again, so this is a restart and is much more convenient than svc -d /dir; svc -u /dir.	svc -t /service/tinydns restarts tinydns
tinydns-data	This reads the data file in the current directory (meaning you must be in /service/tinydns/root) and compiles it into data.cdb, the database file tinydns answers from. This must be done any time the data file is changed, or the changes won't be recognized. After doing this, perform a svc -t directory to restart tinydns itself.	cd /service/tinydns/root; tinydns-data reloads data from the data file
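The last two rows of the preceding table always travel together, so they are easy to wrap in one function. This is a minimal sketch, not an official djbdns tool: the name reload_tinydns is mine, and the default service directory of /service/tinydns is the one used throughout this document.

```sh
#!/bin/sh
# reload_tinydns [SERVICE_DIR]
# Hypothetical helper: rebuild tinydns's data from SERVICE_DIR/root/data
# and restart the service. SERVICE_DIR defaults to /service/tinydns.
reload_tinydns() {
    svcdir=${1:-/service/tinydns}
    # tinydns-data works on the data file in the current directory, so
    # run it in a subshell that has changed into root. Stop on failure
    # rather than restarting tinydns for nothing.
    ( cd "$svcdir/root" && tinydns-data ) || {
        echo "tinydns-data failed in $svcdir/root" >&2
        return 1
    }
    # Send the term signal; daemontools starts the service again.
    svc -t "$svcdir"
}
```

Run it as root after every edit of the data file. Because the cd happens in a subshell, your own shell stays in whatever directory you were working in.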

Symptoms and Solutions

Here are some common symptoms that can drive you nuts, and quickie solutions...

Can't resolve outside websites

Verify internet connectivity

You can't resolve if you can't connect. Sometimes, especially with dialup machines, your connection drops and what looks like a DNS problem is really a general connectivity problem. Here's an excellent script for verifying connectivity:

#!/bin/sh
SERVERFILE=/service/dnscache/root/servers/@
MYLINE=$1
if test "$MYLINE" = ""; then
	echo
	echo Syntax is ./verify_connectivity.sh lineno
	echo where lineno is the line of the list of IP addresses at
	echo $SERVERFILE. The top line is lineno 1.
	echo
	echo For your convenience, this time I\'ll set lineno to 1.
	echo
	MYLINE=1;
fi
ping `head -n$MYLINE $SERVERFILE | tail -n1`

Verify dnscache is up

svstat /service/dnscache
svc -t /service/dnscache

If svstat says it's up, notice for how long. If it's regularly up for only 0 or 1 seconds, you have the looping restart problem. If it's down, bring it up with svc -u /service/dnscache; the svc -t command restarts a service that is already running. If either command gives error messages such as can't read file, can't write file or "text busy", look at the supervise directories.
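If you'd rather have a script read the uptime than eyeball it, a tiny filter will do. This sketch assumes svstat prints lines shaped like "/service/dnscache: up (pid 1234) 12 seconds" or "/service/dnscache: down 30 seconds, normally up"; check that against your own svstat output before trusting it. The function name svstat_uptime is hypothetical.

```sh
#!/bin/sh
# svstat_uptime
# Hypothetical helper: read svstat output on stdin and print, for each
# line, the number of seconds the service has been up, or "down".
# Assumes the line format described in the text above.
svstat_uptime() {
    awk '{
        if ($2 == "up")
            print $5      # fields: name, up, (pid, 1234), seconds, ...
        else
            print "down"
    }'
}
```

Use it as svstat /service/dnscache | svstat_uptime. Run it a few times in a row: a number that keeps coming back as 0 or 1 points to the looping restart problem.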

A service constantly restarts itself

When svstat shows a service always up for less than 10 seconds, you have a constantly restarting service. Most of the time this is due to a directory in the env directory. If there's a directory in the env or log/env directory, move it or remove it, and the problem should go away without further action on your part.
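Hunting for a stray directory by hand is easy to get wrong, so here is a short sketch that lists any directory sitting directly inside a service's env or log/env directory. The function name env_dirs is my own, and the sketch does not look at hidden (dot) directories.

```sh
#!/bin/sh
# env_dirs SERVICE_DIR
# Hypothetical helper: print every directory found directly inside
# SERVICE_DIR/env or SERVICE_DIR/log/env. Succeeds (returns 0) only
# when at least one such directory was found.
env_dirs() {
    found=1
    for entry in "$1"/env/* "$1"/log/env/*; do
        # An unmatched glob is left as a literal pattern, which fails
        # the -d test, so empty or missing env directories are harmless.
        if [ -d "$entry" ]; then
            echo "$entry"
            found=0
        fi
    done
    return $found
}
```

For instance, env_dirs /service/dnscache prints nothing on a healthy setup. Anything it does print is a candidate to move or remove.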

Can't resolve local addresses

If you can resolve Internet addresses (www.yahoo.com), but can't resolve your own addresses, round up the usual suspects:

Verify that tinydns is running
Test the dnscache to tinydns link
Test with a simple data file
Quick reinstall

Test the dnscache to tinydns link and /etc/resolv.conf

The way local (authoritative) resolution typically works is that the request is sent to the IP address reserved for dnscache, but then dnscache forwards it to the IP address for tinydns. In small office situations where the IP address isn't actually registered with a domain name registrar, the IP address used by tinydns is often 127.0.0.1. If the link that forwards from dnscache to tinydns breaks, that prevents local authoritative resolution. Fortunately, there's a simple test.

Assume the IP used by tinydns is 127.0.0.1. In that case, you can cut dnscache out of the system with the following commands:

dig @127.0.0.1 myclient.mydomain.mytld
dig @127.0.0.1 -x 192.168.100.5

In the preceding, 192.168.100.5 is assumed to be the IP address for myclient.mydomain.mytld. Substitute the fully qualified domain name and IP address of one of your own hosts.
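The same isolation test can be scripted so that both paths are queried side by side. This is a sketch under a few assumptions: compare_paths is a hypothetical name, it uses dig's +short option to print only the answer data, and it only compares forward lookups.

```sh
#!/bin/sh
# compare_paths HOST TINYDNS_IP
# Hypothetical helper: ask tinydns directly and ask through
# /etc/resolv.conf (and therefore dnscache), then report both answers.
# Succeeds only when tinydns answered and both answers agree.
compare_paths() {
    direct=$(dig +short "@$2" "$1")
    cached=$(dig +short "$1")
    echo "tinydns directly: ${direct:-no answer}"
    echo "via resolv.conf:  ${cached:-no answer}"
    [ -n "$direct" ] && [ "$direct" = "$cached" ]
}
```

Run it as compare_paths myclient.mydomain.mytld 127.0.0.1. A good answer on the first line and no answer on the second puts the blame on /etc/resolv.conf or on the dnscache to tinydns link rather than on tinydns.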

If the preceding commands yield correct resolution, the problem is either in the IP address contained in your /etc/resolv.conf, or in your dnscache to tinydns link. For the former, check that /etc/resolv.conf contains the address you configured for dnscache. If /etc/resolv.conf is OK, look in /service/dnscache/root/servers. It should contain a file called @, which is a list of all the root dns servers. It should also contain a file called mydomain.mytld, which contains the IP address with which you configured tinydns (127.0.0.1 in a small office). That is the dnscache to tinydns link for forward resolution. Finally, it should contain a file named something like 100.168.192.in-addr.arpa, where 100.168.192 is 192.168.100 backwards; substitute the numbers for your own subnet. Once again, that file's contents should be the IP address configured to work with tinydns. That is your dnscache to tinydns link for reverse resolution.
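Checking those link files can be reduced to one small function. The following is a sketch, with check_link being a name I invented; it only confirms that the file exists and has the expected IP address on a line by itself.

```sh
#!/bin/sh
# check_link SERVERS_DIR NAME EXPECTED_IP
# Hypothetical helper: succeed when SERVERS_DIR/NAME exists and
# contains EXPECTED_IP on a line of its own.
check_link() {
    file="$1/$2"
    if [ ! -f "$file" ]; then
        echo "missing: $file"
        return 1
    fi
    # -F treats the dots in the IP address literally, -x demands a
    # whole-line match, -q keeps grep quiet.
    if grep -Fxq -- "$3" "$file"; then
        echo "ok: $file"
    else
        echo "wrong IP in $file"
        return 1
    fi
}
```

With the names used above, you would run check_link /service/dnscache/root/servers mydomain.mytld 127.0.0.1 for the forward link and check_link /service/dnscache/root/servers 100.168.192.in-addr.arpa 127.0.0.1 for the reverse link.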

Once again, if you find a problem with /etc/resolv.conf or the dnscache to tinydns link, fix it. Otherwise, you have an actual problem with tinydns.

Test with a simple data file

If tinydns is running, and there's no problem with the dnscache to tinydns link or /etc/resolv.conf, the problem might be in your data file (/service/tinydns/root/data). The best way to test that is to back it up, and create a new file (call it data.hello) containing the following:

# SOA, NS and A records for mydomain.mytld
.mydomain.mytld:192.168.1.100
# SOA, NS and A records for the reverse zone 1.168.192.in-addr.arpa
.1.168.192.in-addr.arpa:192.168.1.100
# A and PTR records for mybox
=mybox.mydomain.mytld:192.168.1.11
# A and PTR records for anotherbox
=anotherbox.mydomain.mytld:192.168.1.22

Then copy your current data file to data.bup, and run the following script:

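#!/bin/sh
# Work inside the tinydns root, where the data files live
cd /service/tinydns/root || exit 1
# Swap in the simple test data and rebuild from it
cp -fp data.hello data
tinydns-data
# Restart both servers
svc -t /service/tinydns
svc -t /service/dnscache
echo 'digging...'
# Query tinydns directly, forward then reverse
dig @127.0.0.1 mybox.mydomain.mytld
dig @127.0.0.1 -x 192.168.1.11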
In the preceding, if tinydns is set to an IP address other than 127.0.0.1, change it in the script. If the script gives correct answer sections for both forward and reverse, your problem was due to something in your old data file, so troubleshoot it. Otherwise, there's some other problem with your tinydns.

WARNING

Authoritative DNS created from the data.hello file listed above works only at the tinydns IP address, NOT at the dnscache IP address, regardless of how you configure your links. That's because the domain name is different, as is the nameserver address. Therefore, be sure to use the @127.0.0.1 (or whatever IP address is assigned to your tinydns setup) in order to perform this test.

Quick reinstall

If you're still having problems that appear not to be caused by the data file, the link, or resolv.conf, try the quick reinstall as discussed in the /service directory structure, and quick reinstallation section earlier in this document. But first, investigate the supervise directories with these two scripts:

showdir.sh

#!/bin/sh
echo $1
ls -ldF $1/*
echo

look_at_supervise.sh

#!/bin/sh

STARTDIR=$1
if test "$STARTDIR" = ""; then
	echo 'Syntax is ./look_at_supervise.sh <startdir>'
	echo 'where <startdir> is tinydns or dnscache or'
	echo 'any other directory under /service'
	echo 'for your convenience, setting startdir to "."'
	STARTDIR='.'
fi

STARTDIR='/service/'$STARTDIR
find $STARTDIR -follow -name supervise -exec ./showdir.sh {} \;

For instance, to find the contents of all supervise directories in the /service/dnscache tree, you'd perform this command:

[root@mydesk root]# ./look_at_supervise.sh dnscache
/service/dnscache/log/supervise
prw-------  1 root root  0 Jan 12 11:26 /service/dnscache/log/supervise/control|
-rw-------  1 root root  0 Jan 12 11:26 /service/dnscache/log/supervise/lock
prw-------  1 root root  0 Jan 12 11:26 /service/dnscache/log/supervise/ok|
-rw-r--r--  1 root root 18 Jan 12 11:26 /service/dnscache/log/supervise/status

/service/dnscache/supervise
prw-------  1 root root  0 Jan 12 14:22 /service/dnscache/supervise/control|
-rw-------  1 root root  0 Jan 12 11:26 /service/dnscache/supervise/lock
prw-------  1 root root  0 Jan 12 11:26 /service/dnscache/supervise/ok|
-rw-r--r--  1 root root 18 Jan 12 14:22 /service/dnscache/supervise/status

[root@mydesk root]#

The results in the preceding box are the correct ones. You get fifos named control and ok, you get a permission 600 file called lock, and a permission 644 file called status. If you get any other results, the supervise directory is probably defective and the service needs a reinstall as discussed in the /service directory structure, and quick reinstallation section earlier in this document.
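The four checks just described can be automated. Here is a minimal sketch that tests one supervise directory against them. The names mode_of and check_supervise are hypothetical, and the sketch compares the permission string printed by ls, so it assumes the usual ls -l layout.

```sh
#!/bin/sh
# mode_of FILE
# Hypothetical helper: print the 10-character type and permission
# string that ls shows for FILE, such as -rw-r--r--.
mode_of() {
    ls -ld "$1" 2>/dev/null | cut -c1-10
}

# check_supervise DIR
# Hypothetical helper: report anything in a supervise directory that
# differs from the correct listing shown above. Returns 0 when the
# directory looks healthy.
check_supervise() {
    dir=$1
    bad=0
    for fifo in control ok; do
        [ -p "$dir/$fifo" ] || { echo "$dir/$fifo: not a fifo"; bad=1; }
    done
    [ "$(mode_of "$dir/lock")" = "-rw-------" ] ||
        { echo "$dir/lock: expected -rw-------"; bad=1; }
    [ "$(mode_of "$dir/status")" = "-rw-r--r--" ] ||
        { echo "$dir/status: expected -rw-r--r--"; bad=1; }
    return $bad
}
```

As root, run check_supervise /service/dnscache/supervise and check_supervise /service/dnscache/log/supervise. Silence means the directory matches the listing; any line of output names the file to look at.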

Predefined Diagnostic

Predefined diagnostics are one of the quickest ways to find the root cause in the majority of problems. Even if you don't find the root cause, at least you'll have the root cause scope narrowed, making it easier to use the Universal Troubleshooting Process to find the root cause.

Click here to see a predefined diagnostic.

Master/Slave DNS Relationships

Bind has a very nice facility by which many authoritative slaves can receive their configuration from a single master, without any administrative effort other than changing the master's config files and incrementing its serial number. These are called "Zone Transfers". This is one of the facilities that makes the DNS system so scalable. Djbdns can do this too, as is discussed at http://cr.yp.to/djbdns/tcp.html#intro-axfr.

However, Daniel J. Bernstein makes the point that such no-admin zone transfers have, in his words, "terrible compression and no security". Bernstein recommends a more manual approach employing rsync and ssh. Basically, change the Makefile on the master server to automatically transfer the