Django Patterns: Pluggable Backends

As the first installment in a series on common patterns in Django development, I'd like to discuss the Pluggable Backend pattern. The pattern addresses the common problem of providing extensible support for multiple implementations of a lower-level function, for example caching, database querying, etc.

Problem

The use of this pattern often coincides with places where the application needs to be configurable to use one of many possible solutions, as in the case of database engine support. Consider the following:

The application needs to support Backend A and Backend B, but if you look closely at the methods they expose, there are some discrepancies:

Backend A has slightly more verbose method names than Backend B

Backend B does not accept a default value on the get() method
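To make these discrepancies concrete, here is a minimal sketch of two backends with mismatched APIs. Both classes are hypothetical in-memory stand-ins written for illustration. The names get_data(), set_data() and BackendB.KeyError follow the names used in the code further down, while the dictionary storage is purely an assumption.

```python
class BackendA(object):
    """Hypothetical backend with the more verbose method names."""

    def __init__(self):
        self._store = {}

    def get_data(self, key, default=None):
        return self._store.get(key, default)

    def set_data(self, key, value):
        self._store[key] = value


class BackendB(object):
    """Hypothetical backend whose get() does not accept a default."""

    class KeyError(Exception):
        """Backend-specific error raised when a key is missing."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        if key not in self._store:
            raise BackendB.KeyError(key)
        return self._store[key]

    def set(self, key, value):
        self._store[key] = value
```

Any code that wants to read a value with a fallback has to know which of the two it is talking to, which is exactly the problem the rest of this post works through.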

In Django we see this pattern all over the place, for example in the cache, database, email, session, file storage, and authentication backends.

Analysis

Here is a first stab at getting our Application to talk to backend A and B:

    class ApplicationBackend(object):
        def __init__(self, backend):
            # here we are
            self.backend = backend

        def get(self, key, default=None):
            if isinstance(self.backend, BackendA):
                return self.backend.get_data(key, default)
            elif isinstance(self.backend, BackendB):
                try:
                    return self.backend.get(key)
                except BackendB.KeyError:
                    return default

        def set(self, key, value):
            pass  # ... etc ...

Notice how tightly coupled our Application is to backends A and B. If backend C comes along, we are back in our code adding extra elif checks all over the place. What if an end user wants to write support for a proprietary backend? They would have to go into your code and add the special-casing. There has to be a better way!

Solution

The solution is to add a layer of abstraction between your application and the backend that unifies the APIs.

Most commonly, you will specify your API in a BaseBackend (in this case the BaseAdapter class is the BaseBackend). You will also specify some way of routing between the interface you expose through the Application and the Adapter that communicates directly with the Backend. In Django this is usually done by specifying a path to your module then dynamically importing the backend at runtime.

Let's see some code:

    class ApplicationBackend(object):
        """
        Our application ships all logic off to the adapter, which has a
        single, unified interface
        """
        def __init__(self, adapter):
            self.adapter = adapter  # we'll cover dynamic loading below

        def get(self, key, default=None):
            return self.adapter.get(key, default)

        def set(self, key, value):
            self.adapter.set(key, value)

As you can see, our application only has to know how to talk to the BaseAdapter, which in this case implements two methods, get() and set() . Which adapter our application uses is configured at instantiation. Here is what the BaseAdapter looks like. It provides a default implementation of get() and set() , but could just as easily raise a NotImplementedError and force every subclass to define its own implementation:

    class BaseAdapter(object):
        def __init__(self):
            self.backend = self.get_backend()

        def get_backend(self):
            # every adapter must say how to reach its own backend
            raise NotImplementedError

        def get(self, key, default=None):
            """
            Since Python does not have interfaces, it's common to raise
            NotImplementedError when specifying a base class that you wish
            to act as an interface. If you did not want to specify a default
            behavior but leave all implementation up to your adapters, you
            would raise an exception here
            """
            return self.backend.get(key, default)

        def set(self, key, value):
            """Same applies here as for the get() method"""
            return self.backend.set(key, value)
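For comparison, here is a sketch of the stricter variant mentioned above, where the base class provides no default behavior and every method raises NotImplementedError. The names StrictBaseAdapter and DictAdapter are hypothetical and exist only for this illustration.

```python
class StrictBaseAdapter(object):
    """Hypothetical base class that acts purely as an interface."""

    def get(self, key, default=None):
        raise NotImplementedError('subclasses must implement get()')

    def set(self, key, value):
        raise NotImplementedError('subclasses must implement set()')


class DictAdapter(StrictBaseAdapter):
    """Example subclass backed by a plain dict (illustration only)."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value
```

The trade-off is that a forgotten method fails loudly the first time it is called, instead of silently falling through to a default that may not suit the backend.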

Now it is just a matter of writing our specific adapters for Backend A and Backend B:

    class AdapterA(BaseAdapter):
        """
        Since BackendA uses different method names, we need to override the
        default behavior specified by the BaseAdapter
        """
        def get_backend(self):
            return self.connect_to_backend_a()

        def get(self, key, default=None):
            return self.backend.get_data(key, default)

        def set(self, key, value):
            return self.backend.set_data(key, value)


    class AdapterB(BaseAdapter):
        """
        Since BackendB does not support a default value for the get()
        operation, we'll be sure to wrap the call in a try/except, catching
        the error and returning default.
        """
        def get_backend(self):
            return self.connect_to_backend_b()

        def get(self, key, default=None):
            try:
                # call the backend directly: BaseAdapter.get() would pass a
                # default along, which BackendB.get() does not accept
                return self.backend.get(key)
            except self.backend.KeyError:
                return default
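The following self-contained sketch wires the pieces together so you can run them. The two backends are hypothetical in-memory stand-ins, and get_backend() simply instantiates them rather than opening a real connection. AdapterB calls its backend's get() with a single argument, since that backend does not accept a default.

```python
class BackendA(object):
    """Hypothetical in-memory backend with verbose method names."""
    def __init__(self):
        self._store = {}

    def get_data(self, key, default=None):
        return self._store.get(key, default)

    def set_data(self, key, value):
        self._store[key] = value


class BackendB(object):
    """Hypothetical in-memory backend whose get() takes no default."""
    class KeyError(Exception):
        pass

    def __init__(self):
        self._store = {}

    def get(self, key):
        if key not in self._store:
            raise BackendB.KeyError(key)
        return self._store[key]

    def set(self, key, value):
        self._store[key] = value


class BaseAdapter(object):
    def __init__(self):
        self.backend = self.get_backend()

    def get(self, key, default=None):
        return self.backend.get(key, default)

    def set(self, key, value):
        return self.backend.set(key, value)


class AdapterA(BaseAdapter):
    def get_backend(self):
        return BackendA()  # assumption: no real connection needed

    def get(self, key, default=None):
        return self.backend.get_data(key, default)

    def set(self, key, value):
        return self.backend.set_data(key, value)


class AdapterB(BaseAdapter):
    def get_backend(self):
        return BackendB()  # assumption: no real connection needed

    def get(self, key, default=None):
        try:
            return self.backend.get(key)
        except self.backend.KeyError:
            return default


class ApplicationBackend(object):
    def __init__(self, adapter):
        self.adapter = adapter

    def get(self, key, default=None):
        return self.adapter.get(key, default)

    def set(self, key, value):
        self.adapter.set(key, value)


def exercise(adapter):
    """Run the same application code against whichever adapter is given."""
    app = ApplicationBackend(adapter)
    app.set('color', 'blue')
    return app.get('color'), app.get('missing', 'fallback')


results = [exercise(AdapterA()), exercise(AdapterB())]
```

The exercise() function never mentions a backend by name, so supporting a third backend would only mean writing a third adapter.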

In the code snippets above, our Application loaded its adapter at initialization. Django rarely does this, favoring a setting instead. Let's look at an example of how you might allow a module path to be used to specify the default backend:

    # django.utils.importlib was removed in Django 1.9; the standard
    # library's importlib provides the same import_module function
    from importlib import import_module

    from django.conf import settings

    # provide a sane default?
    APP_ADAPTER = getattr(settings, 'APP_ADAPTER', 'app.backends.adapter_a.AdapterA')

    def get_adapter():
        # grab the classname off of the backend string
        package, klass = APP_ADAPTER.rsplit('.', 1)

        # dynamically import the module, in this case app.backends.adapter_a
        module = import_module(package)

        # pull the class off the module and return it (the caller instantiates it)
        return getattr(module, klass)
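If you want to try the dotted-path technique outside of a Django project, the sketch below does the same thing with only the standard library. The helper name load_class is hypothetical, and collections.OrderedDict is used purely as a convenient importable class, not as a real adapter. Django also ships a similar helper, django.utils.module_loading.import_string, which may be preferable inside a project.

```python
from importlib import import_module


def load_class(dotted_path):
    """Import and return the class named by a 'package.module.ClassName' path."""
    try:
        module_path, class_name = dotted_path.rsplit('.', 1)
    except ValueError:
        # no dot in the string, so there is no module part to import
        raise ImportError('%r is not a dotted module path' % dotted_path)

    module = import_module(module_path)

    try:
        return getattr(module, class_name)
    except AttributeError:
        raise ImportError(
            'Module %r does not define %r' % (module_path, class_name))


# stand-in for a setting such as APP_ADAPTER
adapter_class = load_class('collections.OrderedDict')
adapter = adapter_class()
```

Turning both failure modes into ImportError gives whoever misconfigures the setting a single, predictable exception to look for.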

Strengths and Weaknesses

This pattern allows loose coupling between the API you expose and the underlying code that does the actual work. The loose coupling also makes our interface extensible, as additional implementations can be written without needing to touch the actual application code.

The biggest weakness is feature loss, as this is generally a lowest-common-denominator solution. Suppose backend A has some awesome features that are not supported by backend B. In the interest of maintaining a consistent interface, you're stuck either leaving those features out or implementing them yourself in AdapterB.
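As a rough illustration of "implementing them yourself", suppose backend A offered a bulk lookup that backend B lacks. Everything in this sketch is hypothetical, including the get_many() method and the simplified backends and adapters, which take their backend as a constructor argument to keep the example short.

```python
class BackendA(object):
    """Hypothetical backend with a native bulk lookup."""
    def __init__(self, data):
        self._data = dict(data)

    def get_many(self, keys):
        return dict((k, self._data[k]) for k in keys if k in self._data)


class BackendB(object):
    """Hypothetical backend that only supports single-key lookups."""
    def __init__(self, data):
        self._data = dict(data)

    def get(self, key):
        return self._data[key]  # raises the builtin KeyError when missing


class AdapterA(object):
    def __init__(self, backend):
        self.backend = backend

    def get_many(self, keys):
        # the backend already does the work in one call
        return self.backend.get_many(keys)


class AdapterB(object):
    def __init__(self, backend):
        self.backend = backend

    def get_many(self, keys):
        # emulate the missing feature with one lookup per key
        found = {}
        for key in keys:
            try:
                found[key] = self.backend.get(key)
            except KeyError:
                continue
        return found


data = {'a': 1, 'b': 2}
result_a = AdapterA(BackendA(data)).get_many(['a', 'b', 'c'])
result_b = AdapterB(BackendB(data)).get_many(['a', 'b', 'c'])
```

The interface stays consistent, but the emulated version makes one call per key, so the cost of the feature differs between backends even though the results match.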

Other Uses

There are other uses for this pattern besides talking to various cache/db/storage backends. Spammers recently targeted some of my company's sites, hitting things like Comments and Blog Entries. We needed a solution that would work for these content types, as well as for any we decided to add down the road. The various models that needed filtering had a lot in common, such as the user that created them (and their email address and IP) and the content field that contained the spammy links, but the fields were named differently or lived on a related model. We could have used introspection or done a special-case solution, but instead I opted for a pluggable approach.

The big difference between this example and the example above is that the routing logic is baked right into the BaseBackend , which in this case is the SpamFilter itself. So the SpamFilter class contains not only the logic for handling spammy content, but also contains a registry of more specialized spam filters. The workflow is something like this:

Create custom filters for any models you need to special case, in this case comments and blog entries

Register those filters with the SpamFilter so it can use them when comments or blog entries come through

Whenever new user-generated content is created, send it to the SpamFilter.check_spam() method

The SpamFilter instance will see if it has a filter for the new piece of content, falling back to a default implementation

The code works like this:

    from django.core.mail import mail_managers


    class SpamFilter(object):
        # registry shared by every SpamFilter instance: {model: filter_class}
        _filters = {}

        def add_filter(self, model, filter_class):
            self._filters[model] = filter_class

        def remove_filter(self, model):
            del self._filters[model]

        def get_filter_for_object(self, model_instance):
            # return the proper filter to use for this model instance - if one
            # does not exist, fall back to the default implementation provided
            # by the SpamFilter class (which uses introspection)
            for (model, filter_class) in self._filters.items():
                if isinstance(model_instance, model):
                    return filter_class()
            return self

        def check_spam(self, model_instance):
            # grab the correct filter to use for this model
            spam_filter = self.get_filter_for_object(model_instance)

            # use our custom backend to get the right fields off the object
            user = spam_filter.get_user(model_instance)
            content = spam_filter.get_content(model_instance)

            # call out to Akismet, or whatever here
            object_is_spam = self.make_api_call(user, content)

            if object_is_spam:
                # if the object is spam, allow the spam filter to specify a
                # callback that will do the appropriate thing. with comments
                # this generally means marking is_public = False, with blogs
                # it means setting the status to a special spam flag
                spam_filter.object_is_spam(model_instance)

            return object_is_spam

        def get_user(self, model_instance):
            # introspect the model - subclasses should override
            pass

        def get_content(self, model_instance):
            # introspect the model - subclasses should override
            pass

        def object_is_spam(self, model_instance):
            # default behavior when spam is found is to mail the managers
            mail_managers(...)


    class CommentFilter(SpamFilter):
        def get_user(self, model_instance):
            return model_instance.user

        def get_content(self, model_instance):
            return model_instance.comment

        def object_is_spam(self, model_instance):
            model_instance.is_public = False
            model_instance.save()


    # Comment is the comment model being filtered (import omitted here)
    spam_filter = SpamFilter()
    spam_filter.add_filter(Comment, CommentFilter)
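To see the registry in action without a Django project, here is a trimmed-down, runnable sketch. The Comment class is a plain Python stand-in for the real model, and make_api_call() is a hypothetical stub that flags any content containing a link; a real implementation would call out to a spam service instead. The default get_user(), get_content() and object_is_spam() bodies are also assumptions made for the example.

```python
class SpamFilter(object):
    _filters = {}  # registry shared by all instances: {model: filter_class}

    def add_filter(self, model, filter_class):
        self._filters[model] = filter_class

    def get_filter_for_object(self, obj):
        for model, filter_class in self._filters.items():
            if isinstance(obj, model):
                return filter_class()
        return self  # no specialized filter registered, use the defaults

    def check_spam(self, obj):
        spam_filter = self.get_filter_for_object(obj)
        user = spam_filter.get_user(obj)
        content = spam_filter.get_content(obj)
        is_spam = self.make_api_call(user, content)
        if is_spam:
            spam_filter.object_is_spam(obj)
        return is_spam

    def make_api_call(self, user, content):
        # hypothetical stub standing in for a real spam-checking service
        return 'http://' in content

    def get_user(self, obj):
        return getattr(obj, 'user', None)

    def get_content(self, obj):
        return getattr(obj, 'content', '')

    def object_is_spam(self, obj):
        obj.flagged = True  # assumed default action for the sketch


class Comment(object):
    """Plain stand-in for a comment model."""
    def __init__(self, user, comment):
        self.user = user
        self.comment = comment
        self.is_public = True


class CommentFilter(SpamFilter):
    def get_user(self, obj):
        return obj.user

    def get_content(self, obj):
        return obj.comment

    def object_is_spam(self, obj):
        obj.is_public = False


spam_filter = SpamFilter()
spam_filter.add_filter(Comment, CommentFilter)

ham = Comment('alice', 'Nice post!')
spam = Comment('bob', 'Cheap pills at http://spam.example')
results = (spam_filter.check_spam(ham), spam_filter.check_spam(spam))
```

Because the specialized filter knows the comment text lives on the comment attribute, the shared check_spam() logic never needs to learn the field names of any particular model.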

Conclusion

I hope you found this information useful. It is one of the more common patterns I see both in Django and in the wider sphere of reusable apps. I'm planning a couple more entries in this vein, Django Patterns, so keep an eye out for new posts! As always, any comments, feedback, suggestions, errata, etc are appreciated. Thanks for reading.

