14:40

7 November 2009

Django has quite a bit of code in it devoted to transaction management. Although the documentation goes into quite a bit of depth on transactions, I’ve never felt that the docs by themselves let you build a good mental model of how transactions actually work. So, I decided to approach it experimentally: Build a small Django app, and see how the various options actually work.

For all of this, I’m using Django 1.1.1 and PostgreSQL 8.4.1, through the psycopg2 interface.

For the application, we have a models.py :

from django.db import models class Address(models.Model): street_address = models.CharField(max_length=80) city = models.CharField(max_length=80) state = models.CharField(max_length=2) zip = models.CharField(max_length=5) def __unicode__(self): return self.street_address + " " + self.city + " " + self.state + " " + self.zip class Order(models.Model): customer_name = models.CharField(max_length=80) shipping_address = models.ForeignKey(Address) def __unicode__(self): return "Order " + unicode(self.id) + " going to " + \ self.customer_name + ", " + unicode(self.shipping_address)

and a views.py :

from django.http import HttpResponse from djangotrans.example.models import Order, Address from django.db import transaction, connection def create_order(request): address = Address(street_address="1112 E Broad St", city="Westfield", state="NJ", zip="07090") address.save() order = Order(customer_name="Gomez Addams", shipping_address=address) order.save() return HttpResponse("Created: " + unicode(order)) def show_order(request, order_id): order = Order.objects.get(pk=order_id) return HttpResponse(unicode(order))

and a urls.py :

from django.conf.urls.defaults import * urlpatterns = patterns('djangotrans.example.views', (r'^createorder/$', 'create_order'), (r'^showorder/(?P<order_id>\d+)$', 'show_order'), )

The Basic Rule

In absence of any mention of transaction management at all, Django opens a new transaction with each database operation, and closes it when the operation is complete. If all you do is read operations on the model, it does a ROLLBACK to close the transaction; if you do a data-changing operation, it does a COMMIT . For example, in create_order , the pattern looks something like:

# BEGIN address = Address(street_address="1112 E Broad St", city="Westfield", state="NJ", zip="07090") address.save() # COMMIT # BEGIN order = Order(customer_name="Gomez Addams", shipping_address=address) order.save() # COMMIT

… and in show_order :

# BEGIN order = Order.objects.get(pk=order_id) # ROLLBACK

This is simple, straight-forward… and wrong. The two operations in save_order should be wrapped in a single transaction. If we saved the address, but didn’t then save the order, we’d have an “orphan” address, and that would be just tragic. So, what to do? Fortunately, Django provides a rather large toolbox full of other ways of managing transactions.

Transaction Middleware

First, let’s take a look at the transaction middleware. This is a middleware component that you add to settings.py :

MIDDLEWARE_CLASSES = ( 'django.middleware.common.CommonMiddleware', 'django.contrib.sessions.middleware.SessionMiddleware', 'django.contrib.auth.middleware.AuthenticationMiddleware', 'django.middleware.transaction.TransactionMiddleware', )

This component sets a flag in Django turning off the automatic COMMIT / ROLLBACK behavior, and instead does an automatic BEGIN when each request starts, and then a COMMIT (if the view function returns normally and did a data-changing operation) or ROLLBACK (when the view function throws an exception, or returns normally having not modified data.)

Doing this to our extremely sophisticated application above, we get a much more pleasing pattern of transaction management:

# BEGIN address = Address(street_address="1112 E Broad St", city="Westfield", state="NJ", zip="07090") address.save() order = Order(customer_name="Gomez Addams", shipping_address=address) order.save() # COMMIT

… and in show_order :

# BEGIN order = Order.objects.get(pk=order_id) # ROLLBACK

Much better.

Transaction Decorators

Django provides a set of transaction decorators that can be added to view functions to explicitly control how transactions work for that particular function. They work the same way whether or not the transaction middleware is present. They are:

@transaction.autocommit — This sets the default Django behavior: A COMMIT after each save() , and a ROLLBACK after each database read. If the transaction middleware isn’t present, this doesn’t do anything; if the transaction middleware is present, this reverts this particular function back to the default behavior.

— This sets the default Django behavior: A after each , and a after each database read. If the transaction middleware isn’t present, this doesn’t do anything; if the transaction middleware is present, this reverts this particular function back to the default behavior. @transaction.commit_on_success — This sets the transaction middleware behavior of commiting all of the work done in a single view method (if data modification occurs and the view returns successfully), or doing a single ROLLBACK at the end (if data modification does not occur, or the method throws an exception). It’s redunant if the transaction middleware is enabled.

— This sets the transaction middleware behavior of commiting all of the work done in a single view method (if data modification occurs and the view returns successfully), or doing a single at the end (if data modification does not occur, or the method throws an exception). It’s redunant if the transaction middleware is enabled. @transaction.commit_manually — This overrides the current transaction behavior, and requires that the view function call either transaction.commit() or transaction.rollback() at the appropriate place or places (they can be called multiple times within a single view function). If a view function exits with data-modifying work done after the last commit() or rollback() call, the decorator will throw a TransactionManagementError exception.

So, What’s Not to Like?

That seems to be a pretty comprehensive set of transaction control mechanisms. So, what else could one want? Let’s take a look at what is actually sent to the database when we show an order:

LOG: statement: SET DATESTYLE TO 'ISO' LOG: statement: SHOW client_encoding LOG: statement: SHOW default_transaction_isolation LOG: statement: BEGIN; SET TRANSACTION ISOLATION LEVEL READ COMMITTED LOG: statement: SET TIME ZONE E'America/Los_Angeles' LOG: statement: SELECT "example_order"."id", "example_order"."customer_name", "example_order"."shipping_address_id" FROM "example_order" WHERE "example_order"."id" = 47 LOG: statement: SELECT "example_address"."id", "example_address"."street_address", "example_address"."city", "example_address"."state", "example_address"."zip" FROM "example_address" WHERE "example_address"."id" = 79 LOG: statement: ROLLBACK

The transaction here isn’t actually required at all, setting aside the other overhead. Can we get rid of it? In part two, we’ll see how we can.