Today I put online two diagrams depicting the architecture of the Unix operating system, one for the 1972 First Research Edition and one for FreeBSD, one of its direct descendants. Here are the details on how I created these diagrams.

The basis of my study was the evolution of Unix facilities over the past five decades. I catalogued the facilities as they appeared in the manual pages, in the form of nine tables, one for each section of the Unix manual. For early editions of Unix I hand-cleaned OCR'd scans of the printed manuals, while for later versions I wrote scripts to extract the available manual pages from the source code tree. I obtained the source code trees for successive releases from the Unix history repository, a Git repository hosted on GitHub, which contains the history of Unix from its unnamed 1970 PDP-7 edition until today. The corresponding data and code are available online.
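As a rough illustration of the extraction step for later releases, here is a minimal sketch that walks a checked-out source tree and groups manual pages by section. It assumes that manual pages are files whose names end in a section digit from 1 to 9 (for example `ls.1`); the function name `catalogue_man_pages` and this naming rule are assumptions made for the example, not a description of the actual scripts.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: a manual page is any file named "<facility>.<section>",
# where <section> is a single digit from 1 to 9 (e.g. "ls.1", "fork.2").
MAN_PAGE = re.compile(r"^(?P<name>.+)\.(?P<section>[1-9])$")


def catalogue_man_pages(tree_root):
    """Return {section number: sorted facility names} for one source tree."""
    sections = defaultdict(set)
    for path in Path(tree_root).rglob("*"):
        if not path.is_file():
            continue
        match = MAN_PAGE.match(path.name)
        if match:
            sections[int(match["section"])].add(match["name"])
    # Sort both the sections and the names so that successive releases
    # can be compared line by line.
    return {section: sorted(names) for section, names in sorted(sections.items())}
```

Running such a function on the tree of each release and comparing the results would show when each facility first appears in the manual.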

Then came the problem of depicting the sophisticated architecture in a diagram. Whenever possible, I try to express complex diagrams in a declarative fashion, as text, and then automatically generate their graphical form. For example, I've devised UMLGraph, which allows the declarative specification and drawing of UML class and sequence diagrams. For this case I didn't know of a suitable tool that would allow me to draw nested block diagrams. A query on tex.stackexchange.com didn't turn up any useful results, so I decided to create a new small domain-specific language for this purpose. The language allows you to specify nested boxes labeled horizontally or vertically. The existence of this language helped me a lot when I created and perfected the FreeBSD diagram over many iterations, and also when I tore the diagram down to bits in order to create the First Edition one. I'm also sure it will prove beneficial for accepting corrections and contributions. The data and code for drawing the diagrams are also available online.
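To give a feel for what a declarative description of nested boxes can look like, here is a minimal sketch. The brace notation, the `hbox`, `vbox`, and `box` keywords, and the `parse` function are all invented for this example; they are not the syntax of the actual language.

```python
import re

# Hypothetical notation: one declaration per line, braces open and close nesting.
DIAGRAM = '''
hbox "Kernel" {
  vbox "I/O" {
    box "tty"
    box "disk"
  }
  box "Scheduler"
}
'''

LINE = re.compile(r'^\s*(?P<kind>hbox|vbox|box)\s+"(?P<label>[^"]*)"\s*(?P<open>\{)?\s*$')
CLOSE = re.compile(r'^\s*\}\s*$')


def parse(text):
    """Turn the textual description into a list of nested dictionaries."""
    root = {"kind": "root", "label": "", "children": []}
    stack = [root]  # the innermost open box is always stack[-1]
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        if CLOSE.match(line):
            if len(stack) == 1:
                raise ValueError(f"line {number}: unmatched closing brace")
            stack.pop()
            continue
        match = LINE.match(line)
        if match is None:
            raise ValueError(f"line {number}: cannot parse {line!r}")
        node = {"kind": match["kind"], "label": match["label"], "children": []}
        stack[-1]["children"].append(node)
        if match["open"]:
            stack.append(node)
    if len(stack) != 1:
        raise ValueError("unclosed box at end of input")
    return root["children"]
```

Because the diagram lives in a text file, moving a subsystem to another box is a matter of cutting and pasting a few lines, and changes show up as ordinary diffs.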

I implemented the language as a Python script that parses an input file using regular expressions. As the language grew more complex, I found that associating each box type with a specific Python class made the code clearer and easier to maintain. For example, when I rotate a box 90 degrees counter-clockwise, its built-in vertical text needs to rotate 180 degrees so that it will not appear upside down. A small subclass allowed me to succinctly express this behavior.
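The idea of one class per box type, with a subclass that adjusts how text rotates, can be sketched as follows. The class names, the `vlabel` keyword, and the angle bookkeeping are assumptions made for this example and only one possible reading of the behavior described above; the real script may be organized differently.

```python
import re


class Box:
    """A labeled box; all angles are in degrees, counter-clockwise."""

    text_angle = 0  # the label is drawn horizontally by default

    def __init__(self, label):
        self.label = label
        self.box_angle = 0
        self.label_angle = self.text_angle

    def text_correction(self):
        """Extra rotation applied to the label each time the box is turned."""
        return 0

    def rotate_ccw(self):
        """Turn the box 90 degrees counter-clockwise, carrying the label along."""
        self.box_angle = (self.box_angle + 90) % 360
        self.label_angle = (self.label_angle + 90 + self.text_correction()) % 360


class VerticalLabelBox(Box):
    """A box whose label reads bottom to top (hypothetical box type)."""

    text_angle = 90

    def text_correction(self):
        # Without this, turning the box would leave the label at 180 degrees,
        # that is, upside down.
        return 180


# Hypothetical keywords mapped to the class that implements each box type.
BOX_CLASSES = {"box": Box, "vlabel": VerticalLabelBox}
DECLARATION = re.compile(r'^\s*(?P<kind>\w+)\s+"(?P<label>[^"]*)"')


def make_box(line):
    """Create the right kind of box from one declaration line."""
    match = DECLARATION.match(line)
    if match is None or match["kind"] not in BOX_CLASSES:
        raise ValueError(f"cannot parse {line!r}")
    return BOX_CLASSES[match["kind"]](match["label"])
```

With this arrangement the rotation code never has to test which kind of box it is handling; the special case lives entirely in the subclass.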

Here is the architecture diagram of the 1972 First Research Edition Unix. As you can see, although the system consisted of only a few thousand lines of code, it already had the form and function of the system we recognize today as Unix.

And this is the architecture diagram of the modern FreeBSD system. This system (without the ports) comprises tens of millions of lines of code. Many of the boxes that in the first edition contain individual functions now contain complete subsystems. Many areas, such as networking, are entirely new. However, the system's architecture still carries the key ideas of its ancestor.