Automating an audit trail

As raised in a recent discussion on django-developers, this code is one solution for creating an audit trail for a given model. It is running on multiple production sites, but is still incomplete. See Caveats below for more information. The code below requires an SVN checkout as of r8223 or later.

Usage

Copy the code at the bottom of this article into a location of your choice. It's just a one-file utility, so it doesn't require an app directory or anything. The examples below assume it's called audit.py and is somewhere on your PYTHONPATH.

In your models file, there are only a couple of things to do. First, import your audit module, or just import AuditTrail from it. Then add an AuditTrail to the model of your choice, assigning it to whatever name you like. That's all that is needed to set up the audit trail and get Python-level access to it. If you need to view the audit information in the admin interface, add show_in_admin=True as an argument to AuditTrail.

    from django.db import models
    import audit

    class Person(models.Model):
        first_name = models.CharField(max_length=255)
        last_name = models.CharField(max_length=255)
        salary = models.PositiveIntegerField()

        history = audit.AuditTrail()

        def __str__(self):
            return "%s %s" % (self.first_name, self.last_name)

This simple addition does the rest, allowing you to run syncdb and install the audit model. Once it's installed, the following code will work as shown below. As you will see, Person.history becomes a manager that's used to access the audit trail. The type of manager available depends on how you access it. From an instance, the audit trail is automatically filtered to return only results related to that instance. From the model class itself, the results are not filtered in any way, which makes class-level access the likely approach for reporting across several audited items.
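The class-versus-instance behaviour described above is the kind of thing Python's descriptor protocol makes possible. The following is a minimal, Django-free sketch of that idea, not the code from audit.py: Trail, HistoryDescriptor, the integer pk counter, and the in-memory list of entries are all hypothetical stand-ins chosen for illustration.

```python
import itertools


class Trail(object):
    """Stand-in for a manager over a shared, newest-first list of entries."""

    def __init__(self, entries, pk=None):
        self._entries = entries
        self._pk = pk  # None means "do not filter"

    def all(self):
        if self._pk is None:
            return list(self._entries)
        return [e for e in self._entries if e["pk"] == self._pk]

    def count(self):
        return len(self.all())


class HistoryDescriptor(object):
    """Filtered Trail when read from an instance, unfiltered from the class."""

    def __init__(self):
        self.entries = []

    def __get__(self, instance, owner):
        if instance is None:  # accessed as Person.history
            return Trail(self.entries)
        return Trail(self.entries, pk=instance.pk)  # accessed as person.history


class Person(object):
    history = HistoryDescriptor()
    _pk_counter = itertools.count(1)

    def __init__(self, name, salary):
        self.pk = next(Person._pk_counter)
        self.name = name
        self.salary = salary
        self.save()

    def save(self):
        # Read the raw descriptor from the class dict, bypassing __get__.
        descriptor = vars(Person)["history"]
        descriptor.entries.insert(0, {"pk": self.pk, "salary": self.salary})


john = Person("John Public", 50000)
john.salary = 65000
john.save()
tom = Person("Tom Smith", 25000)
print(john.history.count(), tom.history.count(), Person.history.count())
# prints: 2 1 3
```

The counts match the interactive session in this article: two entries for the first person, one for the second, and three when asking the class.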

    >>> from myapp.models import Person
    >>> person = Person.objects.create(first_name='John', last_name='Public', salary=50000)
    >>> person
    <Person: John Public>
    >>> person.history.count()
    1
    >>> person.salary = 65000
    >>> person.save()
    >>> person.history.count()
    2
    >>> for item in person.history.all():
    ...     print("%s: %s" % (item, item.salary))
    ...
    John Public as of 2007-08-14 20:31:21.852000: 65000
    John Public as of 2007-08-14 20:30:58.959000: 50000
    >>> person2 = Person.objects.create(first_name='Tom', last_name='Smith', salary=25000)
    >>> person2
    <Person: Tom Smith>
    >>> person.history.count()
    2
    >>> person2.history.count()
    1
    >>> Person.history.count()
    3

As you can see, the audit trail is listed with the most recent state first. Each entry also includes a timestamp of when the edit took place.

Saves and deletes are both tracked. To filter on the type of change, use Person.history.filter(_audit_change_type=code), where code is 'I' for inserts, 'U' for updates, or 'D' for deletes.

ForeignKeys and OneToOneFields are now supported, both for saving and for accessing the audit data. However, the contents of the related table are not archived along with the audit entry, so accessing an audit entry will fail if the object its ForeignKey points to has been deleted. This holds even if you are auditing the related model as well, since there is no way to link the two audit tables together.

Tracking Extra Information

Sometimes you need to track more information than is available in just the model. For instance, you may want to know who is performing the change on a particular entry, or track some sort of state information about the system. AuditTrail now supports this through the concept of "track fields". These can be specified on a per-model basis or a global basis, and the per-model options will stack with the global ones (but per-model options cannot override global ones currently). Here's an example:

    import random

    from django.db import models
    import audit

    def some_callback(instance):
        # Called at save/delete time with the instance being audited.
        return str(random.randrange(1, 99)) + 'trackable_val'

    class Person(models.Model):
        first_name = models.CharField(max_length=255)
        last_name = models.CharField(max_length=255)
        salary = models.PositiveIntegerField()

        history = audit.AuditTrail(track_fields=(
            ('extra_1', models.CharField(max_length=50), 'hardcoded_value'),
            ('extra_2', models.CharField(max_length=50), some_callback),
        ))

        def __str__(self):
            return "%s %s" % (self.first_name, self.last_name)

The track_fields argument is a tuple of 3-tuples, each structured as (field_name, type_of_field, value). type_of_field can be any currently functioning field type, although see the Caveats for issues related to ForeignKeys. value can be either a static value or a callback function, which is called with the instance at the time of the save/delete. This means that if you want to do something involving threadlocals at runtime (for getting things out of the request, for instance), you can do it via this callback.
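As a rough illustration of the static-value-or-callback rule and of the threadlocals idea, here is a small sketch that runs without Django. It is an assumption about how such values could be resolved, not an excerpt from audit.py: set_current_user, current_user, and resolve_track_values are hypothetical helpers, and the field type slot holds a placeholder string instead of a real model field.

```python
import threading

# Per-thread storage, e.g. filled in by a middleware at the start of a request.
_state = threading.local()


def set_current_user(username):
    _state.user = username


def current_user(instance):
    # Track-field callback: receives the instance being saved or deleted.
    return getattr(_state, "user", None)


def resolve_track_values(track_fields, instance):
    """Return {field_name: value}, calling value(instance) when it is callable."""
    resolved = {}
    for name, _field_type, value in track_fields:
        resolved[name] = value(instance) if callable(value) else value
    return resolved


track_fields = (
    ("source", "field type placeholder", "web"),          # static value
    ("changed_by", "field type placeholder", current_user),  # callback
)

set_current_user("alice")
values = resolve_track_values(track_fields, instance=object())
print(values["source"], values["changed_by"])
# prints: web alice
```

Because the callback runs when the change is recorded, it picks up whatever the current thread has stored at that moment, which is what makes the request-user use case workable.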

Assume we ran the example above with this new model. You could then do the following:

    >>> p_hist = person.history.all()
    >>> p_hist
    [<PersonAudit: John Public as of 2007-08-27 09:29:14>, <PersonAudit: John Public as of 2007-08-27 09:28:57>]
    >>> p_hist[0].extra_1
    'hardcoded_value'
    >>> p_hist[0].extra_2
    '27trackable_val'

Currently, you cannot filter on these trackable columns, though this should be fixable.

Global Track Fields

What if you have a field you want tracked on every model that supports history? No problem! In the root of your project, create a file called settings_audit.py, and put something like this in it:

    import random

    from django.db import models

    def callback_func_ptr2(original_instance):
        return str(random.randrange(1, 99)) + 'hardcoded_global_2'

    # Populate the fields that every Audit model in this app will use.
    GLOBAL_TRACK_FIELDS = (
        ('global_1', models.CharField(max_length=50), 'hardcoded_global_1'),
        ('global_2', models.CharField(max_length=20), callback_func_ptr2),
    )

GLOBAL_TRACK_FIELDS is set up exactly the same way as the track_fields option passed into AuditTrail, and has the same uses and limitations.
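To make the relationship between global and per-model track fields concrete, here is a hedged sketch of one way an optional settings_audit module could be loaded and its fields stacked with a model's own. The helper names load_global_track_fields and combine_track_fields are hypothetical, and rejecting a name clash with ValueError is a choice made for this sketch; the article only says that per-model options cannot override global ones, not how a clash is handled.

```python
import importlib


def load_global_track_fields(module_name="settings_audit"):
    """Return GLOBAL_TRACK_FIELDS from the module, or () if it is absent."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return ()
    return tuple(getattr(module, "GLOBAL_TRACK_FIELDS", ()))


def combine_track_fields(global_fields, model_fields):
    """Stack per-model fields after the global ones, refusing overrides."""
    global_names = set(name for name, _field, _value in global_fields)
    for name, _field, _value in model_fields:
        if name in global_names:
            # Assumption of this sketch: a clash is an error, not a silent skip.
            raise ValueError("track field %r is already defined globally" % name)
    return tuple(global_fields) + tuple(model_fields)


# Placeholder strings stand in for real model field instances.
global_fields = (("global_1", "field placeholder", "hardcoded_global_1"),)
model_fields = (("extra_1", "field placeholder", "hardcoded_value"),)

combined = combine_track_fields(global_fields, model_fields)
print([name for name, _field, _value in combined])
# prints: ['global_1', 'extra_1']
```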

Caveats

This needs testing! It has only been used in a few cases, so there's plenty of room for strangeness. It has specifically not been tested for things like safe (de-)serialization.

In order to copy the fields from the original model to the audit model, it uses some hackery I'm not particularly proud of. It seems to work for all the cases I would have hoped it would, but it relies on the arguments passed to the Field class being named the same as the attributes stored on the Field object after it's created. If there's ever a time that's not the case, it will fail completely on that Field type.
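The assumption behind that caveat can be shown with toy classes. This is only an illustration of the failure mode, written with inspect.signature from modern Python; it is not how audit.py performs the copy, and ToyField, RenamingField, and copy_field are hypothetical names.

```python
import inspect


class ToyField(object):
    """Constructor arguments are stored under identical attribute names."""

    def __init__(self, max_length=None, null=False):
        self.max_length = max_length
        self.null = null


class RenamingField(object):
    """Stores its argument under a different name, which breaks the copy."""

    def __init__(self, limit=10):
        self.maximum = limit


def copy_field(field):
    """Rebuild a field by reading each constructor argument off the instance."""
    params = inspect.signature(type(field).__init__).parameters
    kwargs = dict(
        (name, getattr(field, name)) for name in params if name != "self"
    )
    return type(field)(**kwargs)


original = ToyField(max_length=255, null=True)
clone = copy_field(original)
print(clone.max_length, clone.null)
# prints: 255 True

try:
    copy_field(RenamingField(limit=5))
except AttributeError as exc:
    # There is no "limit" attribute to read back, so the copy cannot proceed.
    print("copy failed:", exc)
```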

It fails completely on ManyToManyFields, something I've yet to remedy. That's definitely a must-have, but I haven't worked out the best way to go about it. And since this whole thing isn't something I'm particularly interested in, I'm probably going to leave that up to somebody else to work out.

Likewise, it fails when there are multiple ForeignKeys pointing to the same Model, as it doesn't support / compensate for related_name.

It currently copies and overrides the model's __str__ method, so that it can helpfully describe each entry in the audit history. This means, however, that if your __str__ method relies on any other methods (such as get_full_name or similar), it won't work and will need to be adjusted.
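The __str__ caveat is easy to reproduce outside Django. In this sketch, make_audit_class is a hypothetical stand-in for the audit model generation: it builds a bare class that has the same two data attributes as the original but carries over only __str__, which is the situation the caveat describes.

```python
class FragilePerson(object):
    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name

    def get_full_name(self):
        return "%s %s" % (self.first_name, self.last_name)

    def __str__(self):
        return self.get_full_name()  # relies on another method


class RobustPerson(object):
    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name

    def __str__(self):
        return "%s %s" % (self.first_name, self.last_name)  # fields only


def make_audit_class(model_cls):
    """Build a bare class that carries over only __str__ from model_cls."""

    def __init__(self, first_name, last_name):
        self.first_name = first_name
        self.last_name = last_name

    attrs = {"__init__": __init__, "__str__": vars(model_cls)["__str__"]}
    return type(model_cls.__name__ + "Audit", (object,), attrs)


RobustAudit = make_audit_class(RobustPerson)
print(str(RobustAudit("John", "Public")))
# prints: John Public

FragileAudit = make_audit_class(FragilePerson)
try:
    str(FragileAudit("John", "Public"))
except AttributeError as exc:
    # get_full_name was never copied to the audit class.
    print("copied __str__ failed:", exc)
```

A __str__ that only touches field values survives the copy, so rewriting it in that style is the simplest adjustment.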

Code

Hopefully there are enough comments to make sense of what's going on. More information can be found here.