To follow up on the last entry, what I am using Django for is (re)writing our account request handling system. The simple version of how people get accounts here is a three-step process. The would-be user tells us at least their desired Unix login, their name, their email address, and which professor is sponsoring them. We ask the professor if they actually want to sponsor the person's account; if the professor says yes, we create the account and email the requester with details.

(There are a few non-professor account sponsors for things like staff accounts. Professors can sponsor accounts for whoever they want, including people not otherwise associated with the university.)

What I want to automate is the process of submitting requests and having them approved (which seems like a great fit for a simple application built on a modern web framework); we'll continue to do the actual account creation by hand, using a set of scripts we have for it. So far, this is a pretty straightforward two-table, two-form application design: one table for submitted account requests, one table for sponsors, one form for submitting an account request, and a second form for sponsors to approve or reject accounts. Of course, now we get to the complications.

The big complication is that the current 'sponsors' information bundles three or four separate things together: what name people ask to sponsor their account, who actually approves the account, and what home directory new users should be put into (and what Unix group they should be assigned to). The name is usually a professor's, but it can also be a generic thing like 'Professional Masters Student' or 'Graduate Chair'; this means that the same person may have several sponsor entries that they approve accounts for. Home directories are complicated because some professors (and special sponsors) have their own home directories for sponsored accounts, but others put new accounts in the general home directories for their research group.

(DRY suggests that it would be a bad idea to manually replicate a research group's home directory information into the sponsor entries for each of its professors. The OO way out of this is different from the SQL way out.)

Then there are the workflow complications:

Points of Contact can approve accounts in place of one of their professors. I don't know how to cleanly represent this in a schema at all if I want to reuse the same form that sponsors use. (Besides, I already have the case that one person can approve requests for multiple 'sponsors' entries.)

The mass intake of new graduate students is handled differently. The Graduate Office prepares a list of new students and who is theoretically supervising them, then the supervisors approve their new students, and finally we email all of the approved people to ask them to come pick their login. This creates a couple of schema complications. First, an account request's approval status is different from whether or not it is 'complete' (has enough information to be created). New grad student accounts start out both unapproved and incomplete (since we don't know what login the new grad student wants), become approved but incomplete, and are finally completed when the new grad student picks a login. Second, there needs to be some way for new grad students to access their approved but incomplete account request so that they can fill in their desired login name, and some sort of authentication for this access. (I just realized that this implies that the login cannot be the primary key on the 'requests' table, although it still has to be unique.)

Sometimes sponsors just outright make new accounts for people, including picking their login (this is most common with new administrative staff). Making them first fill in the request form and then immediately approve it is kind of silly; they should be able to fill in a preapproved request.

Oh yeah, we need an audit trail for when various things happened and who did them. Should this audit trail simply be text messages, or should I try to give it more structure?
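
(One way the grad student intake and preapproved request cases might look at the schema level is sketched below. This is only a rough illustration using Python's sqlite3 module rather than Django models, and every table, column, and function name in it is made up for the example. It assumes that 'complete' simply means 'has a login', and that access to an incomplete request is authenticated with a random token that would be emailed to the person; neither is a settled decision.)

```python
import secrets
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE requests (
        id       INTEGER PRIMARY KEY,         -- surrogate key, since login may be unknown
        name     TEXT NOT NULL,
        email    TEXT NOT NULL,
        login    TEXT UNIQUE,                 -- NULL until the person picks one
        approved INTEGER NOT NULL DEFAULT 0,  -- sponsor's decision, separate from completeness
        token    TEXT NOT NULL UNIQUE         -- hypothetical secret for the 'pick your login' link
    )
""")

def add_request(name, email, login=None, approved=False):
    """Insert a request and return (row id, access token)."""
    token = secrets.token_urlsafe(16)
    cur = conn.execute(
        "INSERT INTO requests (name, email, login, approved, token) "
        "VALUES (?, ?, ?, ?, ?)",
        (name, email, login, int(approved), token),
    )
    return cur.lastrowid, token

def state(req_id):
    """Return (approved, complete); complete here just means a login is set."""
    login, approved = conn.execute(
        "SELECT login, approved FROM requests WHERE id = ?", (req_id,)
    ).fetchone()
    return bool(approved), login is not None

def pick_login(token, login):
    """Let the token holder fill in a login on an approved, incomplete request."""
    cur = conn.execute(
        "UPDATE requests SET login = ? "
        "WHERE token = ? AND approved = 1 AND login IS NULL",
        (login, token),
    )
    return cur.rowcount == 1
```

(In this sketch a Graduate Office import would call add_request() with no login, and a sponsor directly creating an account would call it with both a login and approved=True. The UNIQUE constraint still rejects duplicate logins, while SQLite allows any number of rows whose login is still NULL.)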

(So far I am assuming that core staff will use the general Django administrative interface to do things like add new sponsors.)

All of these complications leave me looking at a scheme where either the tables are multiplying and cross-connecting, or things are mutating into objects that look less and less like anything with a good SQL representation.

(Talking to the duck here has already been useful in making me realize a few things about the problem.)

Sidebar: the OO way versus the SQL way of handling home directories

The OO way is that the 'sponsors' object has both a 'group' field and a 'homedirs' field that can be empty. The 'group' field points to an object for the research group, which has a 'homedirs' field of its own. If sponsors.homedirs is non-empty, we use that; otherwise, we use sponsors.group.homedirs (which must be non-empty).
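
A minimal sketch of that fallback in plain Python, not tied to Django's ORM. The class names, the effective_homedirs() method, and the example paths used with it are all invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class Group:
    name: str
    homedirs: str  # must be non-empty

@dataclass
class Sponsor:
    name: str
    group: Group
    homedirs: str = ""  # empty means "use the research group's"

    def effective_homedirs(self) -> str:
        # Prefer the sponsor's own home directory area, else fall back to the group's.
        return self.homedirs or self.group.homedirs
```

With something like this, a professor with their own home directories sets Sponsor.homedirs, and everyone else in the group inherits the group's value without it being copied into each sponsor entry.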

The SQL way is probably to have a separate mapping table that translates entities to homedirs. Both groups and sponsors have entries in this table (we require that their names be non-overlapping, which is not a problem in practice). Rows in the 'sponsors' table have a foreign key that points to an entry in the mapping table, either the sponsor's group's entry or the sponsor's individual entry.
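
A rough sketch of the mapping table approach, again using Python's sqlite3 module purely for illustration. The table and column names, the homedir_for() helper, and the sample rows are assumptions, not the current system's actual schema:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked
db.executescript("""
    -- One row per entity (research group or individual sponsor) with its own homedirs.
    CREATE TABLE homedirs (
        entity  TEXT PRIMARY KEY,
        homedir TEXT NOT NULL
    );
    -- Each sponsor points at either its group's entry or its own entry.
    CREATE TABLE sponsors (
        name           TEXT PRIMARY KEY,
        homedir_entity TEXT NOT NULL REFERENCES homedirs(entity)
    );
    INSERT INTO homedirs VALUES ('systems-group', '/h/100');
    INSERT INTO homedirs VALUES ('prof-b', '/h/200');
    INSERT INTO sponsors VALUES ('Prof A', 'systems-group');
    INSERT INTO sponsors VALUES ('Prof B', 'prof-b');
""")

def homedir_for(sponsor_name):
    """Look up a sponsor's home directory area through the mapping table."""
    row = db.execute(
        "SELECT h.homedir FROM sponsors AS s "
        "JOIN homedirs AS h ON h.entity = s.homedir_entity "
        "WHERE s.name = ?",
        (sponsor_name,),
    ).fetchone()
    return row[0] if row else None
```

Here the group's home directory is stored once, and changing it means updating a single row in the mapping table rather than every sponsor entry.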

(The SQL mapping table approach is roughly how the current system handles this.)