Archiveopteryx

Your editor, like many LWN readers, deals in large quantities of electronic mail. As a result, tools which can help with the mail flood are always of interest. One tool which has been on the radar for some time is Archiveopteryx, a database-backed mail store which is meant to deal with high mail volumes. Archiveopteryx does not seem to have a hugely high profile, but it does have a dedicated user base and a steady development pace; Archiveopteryx 3.1.3 was released on March 10.

The idea behind Archiveopteryx is simple enough: build a mail store around the PostgreSQL database, then provide access to it through the usual protocols. Installation is relatively easy for a site which already has PostgreSQL in place; a simple "make install" does the bulk of the work. A straightforward configuration file allows for control over protocols, ports, etc., and there is an administrative program which can be used to set up users within the mail store.

On the protocol side, Archiveopteryx supports POP and IMAP for access to email. It can handle mail receipt directly through SMTP, but that is not normally how one would do things; there is still value in having a real mail transfer agent in the process. The preferred mode is to use the LMTP protocol to accept mail from the MTA; there is also a command-line utility which can be used for that purpose if need be. The installation instructions include straightforward recipes for configuring Archiveopteryx to work with a number of MTAs. Archiveopteryx also supports the Sieve filtering standard and the associated protocol for managing scripts.
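As a rough illustration of the LMTP hand-off described above, here is a minimal Python sketch using the standard library's smtplib.LMTP client. The host, port, and addresses are assumptions made for illustration and would need to match the local Archiveopteryx configuration; in a real deployment the MTA, not a script, performs this delivery.

```python
import smtplib
from email.message import EmailMessage

# Hypothetical values; adjust to match the local mail store configuration.
LMTP_HOST = "localhost"
LMTP_PORT = 2026


def build_message(sender, recipient, subject, body):
    """Build a simple plain-text message to hand to the mail store."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def deliver_via_lmtp(msg, host=LMTP_HOST, port=LMTP_PORT):
    """Deliver one message over LMTP, as an MTA would after accepting it.

    Returns the dict of refused recipients reported by smtplib
    (empty when every recipient was accepted).
    """
    with smtplib.LMTP(host, port) as client:
        return client.send_message(msg)
```

Calling deliver_via_lmtp(build_message(...)) requires a listening LMTP server; building the message alone does not touch the network.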

Those who set up a large-scale mail store can be expected to have some archived mail sitting around. Archiveopteryx provides an aoximport tool for importing this email into the system. Your editor found it to be overly simple and inflexible, though. It is unable to create subfolders when importing an entire folder tree (they must already be in place or the import fails), and it failed to import the bulk of the messages when working with a Dovecot-managed maildir mailbox. The importer, perhaps, is like the Debian installer: users tend to only need it once, so it gets relatively little work once the basic functionality is in place.

Archiveopteryx works well as an IMAP server, and it is indeed fast when dealing with folders containing many messages. Operations like deleting or refiling groups of messages go notably faster than with Dovecot on the same server. On the other hand, your editor was unable to get the Sieve script functionality to work at all; this is probably more a matter of incomplete configuration than fundamental problems with Archiveopteryx itself, but it was still a discouraging development.

That ties into the biggest disappointment with Archiveopteryx, though, which is probably totally unjustified: your editor would like this tool to be something that it is not. If one is going to go to the trouble of storing all of one's email in a complex database, it would be nice to be able to do fast, complex searches on that email. That way, the next time it becomes necessary to, say, collect linux-kernel zombie posts, a quick search will do. Archiveopteryx seems to have a search feature built into it, but actually using that feature appears to be limited to exporting messages with the aoxexport tool. The IMAP protocol is not particularly friendly toward the implementation of fast, server-side searching, but it still seems like something better should be possible.
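For reference, this is roughly what asking an IMAP server to do the searching looks like from the client side. It is a generic sketch using Python's standard imaplib module and shows nothing specific to Archiveopteryx; the host, credentials, and mailbox name are placeholders, and how quickly such a search runs depends entirely on the server.

```python
import imaplib


def quote(value):
    """Quote a string for use inside an IMAP SEARCH criterion."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_criteria(sender=None, subject=None, text=None):
    """Combine optional FROM/SUBJECT/TEXT keys into one SEARCH expression."""
    parts = []
    if sender:
        parts += ["FROM", quote(sender)]
    if subject:
        parts += ["SUBJECT", quote(subject)]
    if text:
        parts += ["TEXT", quote(text)]
    return "(" + " ".join(parts) + ")" if parts else "ALL"


def search_mailbox(host, user, password, mailbox, criteria):
    """Run a server-side SEARCH and return the matching message numbers."""
    with imaplib.IMAP4_SSL(host) as conn:
        conn.login(user, password)
        conn.select(mailbox, readonly=True)
        status, data = conn.search(None, criteria)
        if status != "OK":
            raise RuntimeError("IMAP SEARCH failed")
        return data[0].split()
```

A search for the posts mentioned above might pass build_criteria(subject="zombie") along with a hypothetical mailbox name such as "linux-kernel".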

All that should not detract from what Archiveopteryx does well: store and serve email in large volumes using standard protocols. As a tool for ISPs and for others needing to make email available to lots of users, it seems highly useful; it is clearly meant to scale in ways that servers like Dovecot are not.

There is one remaining problem, though: the future of Archiveopteryx is not entirely assured. For years, this program has been developed by a company called Oryx, which offered commercial support for it. In June 2009, though, the developers behind Oryx announced that the company was shutting down, with the final closure expected in October of this year. They say:

So we're gradually closing down Oryx, BUT NOT ARCHIVEOPTERYX. We'll relicense it using either the BSD or Apache 2 licenses and continue making new releases for years to come. We both feel obliged to keep the existing archives viable.

(The code is currently licensed under OSLv3.)

A sense of obligation may keep Archiveopteryx going for a while, but if it's going to be something that people can count on for years into the future, it will have to develop a more active development community. Archiveopteryx has the look of a solidly company-controlled project - the project's git repository is overwhelmingly dominated by commits from the two principal developers. Such projects are always at a bit of risk if the backing company runs into trouble. But Archiveopteryx is free software, and highly useful free software at that; it seems like its user community should be able to carry it forward.

