Self-referencing many-to-many through

The through argument on Django's ManyToManyField lets you describe the relationships between objects by attaching data to them. I covered its implementation and usage in an earlier post (Describing Relationships, Django's ManyToMany Through), so I won't repeat those details here. What I want to show in this post is how to create ManyToMany relationships between objects of the same kind, and how to describe those relationships using through models.

Asymmetrical Relationships - the Twitter model

On Twitter you follow people. Maybe some people follow you, but each relationship points in one direction only: it is asymmetrical. In Django you can implement this using a ManyToMany relationship. We don't need a special through model for this, but suppose we wanted to attach some metadata to those relationships. Below is sample code for a Twitter-style database of people and their relationships with one another. The relationships carry a status column denoting whether a particular user is following another or blocking another:

from django.db import models


class Person(models.Model):
    name = models.CharField(max_length=100)
    relationships = models.ManyToManyField(
        'self', through='Relationship', symmetrical=False,
        related_name='related_to')

    def __unicode__(self):  # define __str__ instead on Python 3
        return self.name


RELATIONSHIP_FOLLOWING = 1
RELATIONSHIP_BLOCKED = 2
RELATIONSHIP_STATUSES = (
    (RELATIONSHIP_FOLLOWING, 'Following'),
    (RELATIONSHIP_BLOCKED, 'Blocked'),
)


class Relationship(models.Model):
    # on_delete is required from Django 2.0 on; CASCADE was the implicit default before.
    from_person = models.ForeignKey(
        Person, on_delete=models.CASCADE, related_name='from_people')
    to_person = models.ForeignKey(
        Person, on_delete=models.CASCADE, related_name='to_people')
    status = models.IntegerField(choices=RELATIONSHIP_STATUSES)

Taking a look at the models, what's important to note is that on the Person model I've created a ManyToMany to self through Relationship. The symmetrical attribute is set to False. Before Django 3.0 this was a must whenever you used an intermediary model, because Django can't know how to describe the other side of the relationship when the through model may have any number of fields besides ForeignKeys. Which brings up the next model, Relationship. Relationship has two foreign keys to Person, and a status, which indicates the type of relationship 'from_person' has to 'to_person'. Now, let's add some methods to the Person model to make it easier to talk about how these relationships can be used:

    def add_relationship(self, person, status):
        relationship, created = Relationship.objects.get_or_create(
            from_person=self,
            to_person=person,
            status=status)
        return relationship

    def remove_relationship(self, person, status):
        Relationship.objects.filter(
            from_person=self,
            to_person=person,
            status=status).delete()
        return

Adding and removing relationships requires no magic - we can deal directly with the Relationship model and create or delete instances of it. If we wanted to find out who is following a user, though, it's sort of obnoxious to query Relationship and then extract the people from the returned queryset. This is where the ManyToMany comes in. We can query the 'relationships' (and its partner 'related_to') to look at Relationship objects and return the people they refer to. Here are some more methods for the Person model:

    def get_relationships(self, status):
        return self.relationships.filter(
            to_people__status=status,
            to_people__from_person=self)

    def get_related_to(self, status):
        return self.related_to.filter(
            from_people__status=status,
            from_people__to_person=self)

    def get_following(self):
        return self.get_relationships(RELATIONSHIP_FOLLOWING)

    def get_followers(self):
        return self.get_related_to(RELATIONSHIP_FOLLOWING)

Looking at the actual SQL helps me understand what these ORM incantations actually mean. Creating a relationship between two users is a simple INSERT into the relationships table. But reading relationships out and referring them back to people in a meaningful and efficient way is the biggest win of using the ManyToMany. Here is the SQL for getting who a person is following:

SELECT twitter_person.id, twitter_person.name
FROM twitter_person
INNER JOIN twitter_relationship
    ON (twitter_person.id = twitter_relationship.to_person_id)
WHERE (twitter_relationship.from_person_id = 1
    AND twitter_relationship.status = 1)

Compare that with what the ORM would run if we fetched a Relationship queryset and then iterated over it to look up each 'to_person':

SELECT twitter_relationship.id, twitter_relationship.from_person_id,
       twitter_relationship.to_person_id, twitter_relationship.status
FROM twitter_relationship
WHERE (twitter_relationship.status = 1
    AND twitter_relationship.from_person_id = 1)

-- followed by this for every twitter user returned:
SELECT * FROM twitter_person WHERE id = X

It's generally much more efficient to use the JOIN and execute just one query. The 'get_relationships' and 'get_related_to' methods are simple wrappers around filter that build the appropriate query. Here's an example of what you might do:

In [1]: from twitter.models import Person

In [2]: john = Person.objects.create(name='John')

In [3]: paul = Person.objects.create(name='Paul')

In [4]: from twitter.models import RELATIONSHIP_FOLLOWING

In [5]: john.add_relationship(paul, RELATIONSHIP_FOLLOWING)
Out[5]: <Relationship: Relationship object>

In [6]: john.get_following()
Out[6]: [<Person: Paul>]

In [7]: paul.get_followers()
Out[7]: [<Person: John>]

In [8]: paul.add_relationship(john, RELATIONSHIP_FOLLOWING)
Out[8]: <Relationship: Relationship object>

In [9]: paul.get_following()
Out[9]: [<Person: John>]

In [10]: yoko = Person.objects.create(name='Yoko')

In [11]: john.add_relationship(yoko, RELATIONSHIP_FOLLOWING)
Out[11]: <Relationship: Relationship object>

In [12]: paul.remove_relationship(john, RELATIONSHIP_FOLLOWING)

In [13]: john.get_following()
Out[13]: [<Person: Paul>, <Person: Yoko>]

In [14]: paul.get_following()
Out[14]: []
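If you'd like to see the one-query-versus-many difference without setting up a Django project, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names are taken from the SQL above; the run helper and the two following_* functions are hypothetical names made up for this demo, and the hand-written SQL only approximates what the ORM would send.

```python
import sqlite3

# In-memory stand-in for the tables behind Person and Relationship.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE twitter_person (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE twitter_relationship ("
    "id INTEGER PRIMARY KEY, from_person_id INTEGER, "
    "to_person_id INTEGER, status INTEGER)"
)
conn.executemany(
    "INSERT INTO twitter_person (id, name) VALUES (?, ?)",
    [(1, "John"), (2, "Paul"), (3, "Yoko")],
)
# John follows Paul and Yoko (status 1 = following).
conn.executemany(
    "INSERT INTO twitter_relationship (from_person_id, to_person_id, status) "
    "VALUES (?, ?, ?)",
    [(1, 2, 1), (1, 3, 1)],
)

queries = []  # every statement we send, so round trips can be counted


def run(sql, params=()):
    """Execute one statement and record it (hypothetical demo helper)."""
    queries.append(sql)
    return conn.execute(sql, params).fetchall()


def following_with_join(person_id):
    """One query: join people to the relationships that point at them."""
    rows = run(
        "SELECT p.id, p.name FROM twitter_person AS p "
        "INNER JOIN twitter_relationship AS r ON p.id = r.to_person_id "
        "WHERE r.from_person_id = ? AND r.status = 1 ORDER BY p.id",
        (person_id,),
    )
    return [name for _id, name in rows]


def following_row_by_row(person_id):
    """One query for the relationships, then one more per related person."""
    rels = run(
        "SELECT to_person_id FROM twitter_relationship "
        "WHERE from_person_id = ? AND status = 1 ORDER BY to_person_id",
        (person_id,),
    )
    names = []
    for (to_id,) in rels:
        rows = run("SELECT name FROM twitter_person WHERE id = ?", (to_id,))
        names.extend(name for (name,) in rows)
    return names


joined = following_with_join(1)
join_queries = len(queries)

queries.clear()
looped = following_row_by_row(1)
loop_queries = len(queries)

print(joined, join_queries)  # ['Paul', 'Yoko'] 1
print(looped, loop_queries)  # ['Paul', 'Yoko'] 3
```

Both approaches return the same people, but the row-by-row version costs one extra query for every person followed.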

Now, let's add one more thing to the mix. Say that if two people are following each other, we'll call them 'friends'. You can implement this by combining the two queries behind get_followers and get_following:

    def get_friends(self):
        return self.relationships.filter(
            to_people__status=RELATIONSHIP_FOLLOWING,
            to_people__from_person=self,
            from_people__status=RELATIONSHIP_FOLLOWING,
            from_people__to_person=self)
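In SQL terms, a mutual follow amounts to joining the relationship table twice, once for each direction. The sketch below shows that idea with sqlite3; it is hand-written SQL under my own aliases (outgoing, incoming) and a hypothetical friends_of helper, not necessarily the exact statement Django generates for get_friends.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE twitter_person (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE twitter_relationship ("
    "id INTEGER PRIMARY KEY, from_person_id INTEGER, "
    "to_person_id INTEGER, status INTEGER)"
)
conn.executemany(
    "INSERT INTO twitter_person (id, name) VALUES (?, ?)",
    [(1, "John"), (2, "Paul"), (3, "Yoko")],
)
# John and Paul follow each other (status 1); John follows Yoko one-way;
# Yoko has blocked John (status 2).
conn.executemany(
    "INSERT INTO twitter_relationship (from_person_id, to_person_id, status) "
    "VALUES (?, ?, ?)",
    [(1, 2, 1), (2, 1, 1), (1, 3, 1), (3, 1, 2)],
)

FRIENDS_SQL = """
    SELECT p.name
    FROM twitter_person AS p
    INNER JOIN twitter_relationship AS outgoing
        ON p.id = outgoing.to_person_id       -- I follow p
    INNER JOIN twitter_relationship AS incoming
        ON p.id = incoming.from_person_id     -- p follows me
    WHERE outgoing.from_person_id = :me AND outgoing.status = 1
      AND incoming.to_person_id = :me AND incoming.status = 1
    ORDER BY p.name
"""


def friends_of(person_id):
    """Names of people who follow person_id and are followed back."""
    return [name for (name,) in conn.execute(FRIENDS_SQL, {"me": person_id})]


john_friends = friends_of(1)
paul_friends = friends_of(2)
yoko_friends = friends_of(3)

print(john_friends)  # ['Paul']
print(paul_friends)  # ['John']
print(yoko_friends)  # []
```

Yoko doesn't show up as John's friend: he follows her, but the row going the other way is a block rather than a follow.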

Symmetrical Relationships - the Facebook model

Django's ManyToManyField allows you to specify a 'symmetrical' attribute, but before Django 3.0 you could not use it together with a 'through' model. The approach below sidesteps the question by managing both sides of the relationship by hand. We can actually use most of the model definitions from above. The only change will be to the ManyToMany field:

class Person(models.Model):
    name = models.CharField(max_length=100)
    relationships = models.ManyToManyField(
        'self', through='Relationship', symmetrical=False,
        related_name='related_to+')

It's hard to spot the difference. Note the plus sign at the end of related_name. This indicates to Django that the reverse relationship should not be exposed. Since the relationships are symmetrical, this is the desired behavior: after all, if I am friends with person A, then person A is friends with me. Django won't create the symmetrical relationships for you, so a bit needs to be added to the add_relationship and remove_relationship methods to explicitly handle the other side of the relationship:

    def add_relationship(self, person, status, symm=True):
        relationship, created = Relationship.objects.get_or_create(
            from_person=self,
            to_person=person,
            status=status)
        if symm:
            # avoid recursion by passing `symm=False`
            person.add_relationship(self, status, False)
        return relationship

    def remove_relationship(self, person, status, symm=True):
        Relationship.objects.filter(
            from_person=self,
            to_person=person,
            status=status).delete()
        if symm:
            # avoid recursion by passing `symm=False`
            person.remove_relationship(self, status, False)

Now, whenever we create a relationship going one way, its complement is created (or removed). Since the relationships go in both directions, we can get rid of the following/followers stuff and simply use:

    def get_relationships(self, status):
        return self.relationships.filter(
            to_people__status=status,
            to_people__from_person=self)
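To make the mirrored-row bookkeeping concrete, here is a small pure-Python sketch that swaps the database for a set of tuples. It is not Django code: the Person class below is a stand-in, the edges set plays the role of the Relationship table, and FRIENDS is a hypothetical status value. The point is only to show how the symm flag creates or removes the complementary row exactly once.

```python
FRIENDS = 1  # hypothetical status value for this sketch


class Person:
    # Shared set standing in for the Relationship table:
    # each entry is (from_name, to_name, status).
    edges = set()

    def __init__(self, name):
        self.name = name

    def add_relationship(self, person, status, symm=True):
        Person.edges.add((self.name, person.name, status))
        if symm:
            # Mirror the row; symm=False stops the call from bouncing back.
            person.add_relationship(self, status, False)

    def remove_relationship(self, person, status, symm=True):
        Person.edges.discard((self.name, person.name, status))
        if symm:
            # Remove the mirrored row as well, again without recursing.
            person.remove_relationship(self, status, False)

    def get_relationships(self, status):
        # Only outgoing rows are read; the mirror row covers the other side.
        return sorted(
            to for frm, to, st in Person.edges
            if frm == self.name and st == status
        )


john, paul = Person("John"), Person("Paul")

john.add_relationship(paul, FRIENDS)
after_add = (john.get_relationships(FRIENDS), paul.get_relationships(FRIENDS))

paul.remove_relationship(john, FRIENDS)
after_remove = (john.get_relationships(FRIENDS), paul.get_relationships(FRIENDS))

print(after_add)     # (['Paul'], ['John'])
print(after_remove)  # ([], [])
```

Adding from either side makes the relationship visible from both, and removing from either side clears both rows.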

Using it in the admin

You may want to access the Relationships in the context of a Person in the admin. Since the Relationship model has two foreign keys to Person, the underlying code that instantiates the inlines will blow up unless you specify which ForeignKey to use. Here's how to make it work:

# admin.py
from django.contrib import admin
from twitter.models import Person, Relationship


class RelationshipInline(admin.StackedInline):
    model = Relationship
    fk_name = 'from_person'


class PersonAdmin(admin.ModelAdmin):
    inlines = [RelationshipInline]


admin.site.register(Person, PersonAdmin)

Doing other cool stuff

The pattern here can be used to do a lot more than just describe who's friends with whom. One possible improvement is to normalize the status types into their own model, so they can be defined more dynamically. You could even make a Relationship's status a ManyToMany itself. Another possible use would be if you had a tree-like structure but wanted to describe relationships between Nodes that may not be direct descendants of one another. Anyways, that's about it for this post - I hope you found it useful!

Relationships App
