My first real job, at the actual company called AIRS and the original owner of the domain airs.com, was working on what was then called a fourth generation language system. Nowadays we would simply call it a database. It was pretty powerful, and we had customers, but we only ran on a system called the Alpha Micro which ran a proprietary operating system called AMOS. So that’s why you’ve never heard of AIRS, or if not the only reason at least a sufficient one.

Anyhow, what I wanted to say here is, we had a database, but it was not a relational database. It was what then called a network database. Nowadays, we would simply say “huh?” But in fact I thought then and continue to think now that relational databases are basically a mistake.

The idea behind relational databases, as invented back in the ’70s by E.F. Codd at IBM, is that you can access them via a structured query language. We now know that language at SQL. A relational database is conceptually just a set of tables. SQL lets you express a database search by conceptually joining tables together in a few different ways (the most common way being matching some index field) and then describing some condition to be satisfied on the joined table. SQL is nice because you can compose complex queries in a fairly logical manner.

The problem is that data in the real world doesn’t typically come in table format. In order to put it into table format you have to insert all various key fields. That’s not so bad, actually, but when you have one-to-many or many-to-many relationships, SQL requires that you describe them as table joins. When the tables get large, actually joining a table becomes expensive–prohibitively expensive. So instead the database has to figure out an efficient way to do the query which doesn’t involve doing an actual join, but instead involves traversing the tables in some manner collecting the desired results. When the query is at all complicated, that becomes a hard problem. So for practical use relational database implementations require you to define various indexes which they can use, as well as more sophisticated solutions which I won’t go into.

The end result is that when you anticipate a user’s query and create the appropriate indexes, the query is fast. When the user does something you didn’t anticipate, the query is slow–for a large database, prohibitively slow. The user who doesn’t know which indexes you created has no idea which queries will be fast and which will be slow. So SQL gives you this logically powerful query language which is supposed to be simple to use, but in actual practice has traps for the unwary user.

In a network database, you have to express your one-to-many and many-to-many relationships explicitly. And when you do a query, you have to follow the links defined by the database–there are no joins. The result is that an efficient query is easy to write, and an inefficient query is hard to write. This makes it harder to do data mining, but it’s actually not that bad. You can still support a structured query language, even SQL itself, you just use a system which rewrites it into something which can be implemented before it gets to the database rather than after. (At AIRS we had a programming language for queries which we called Logic, but most people used a visual query creator, with saved standard queries, etc.).

I can’t argue that SQL is really nice to use. But I can and do argue that relational databases are not the right way to organize data for efficient access.