Descriptors, introduced in Python 2.2, provide a way to add managed attributes to objects. They are not used much in everyday programming, but it’s important to learn them to understand a lot of the “magic” that happens in the standard library and third-party packages.

The problem

Imagine we are running a bookshop with an inventory management system written in Python. The system contains a class called Book that captures the author, title and price of physical books.

    class Book(object):
        def __init__(self, author, title, price):
            self.author = author
            self.title = title
            self.price = price

        def __str__(self):
            return "{0} - {1}".format(self.author, self.title)

Our simple Book class works fine for a while, but eventually bad data starts to creep into the system. The system is full of books with negative prices or prices that are too high because of data entry errors. We decide that we want to limit book prices to values between 0 and 100. In addition, the system contains a Magazine class that suffers from the same problem, so we want our solution to be easily reusable.

The descriptor protocol

The descriptor protocol is simply a set of three methods. A class that defines at least one of them qualifies as a descriptor:

    __get__(self, instance, owner)

    __set__(self, instance, value)

    __delete__(self, instance)

__get__ accesses a value stored in the object and returns it.

__set__ sets a value stored in the object and returns nothing.

__delete__ deletes a value stored in the object and returns nothing.
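To see when each method fires, here is a minimal sketch. It assumes a hypothetical Traced descriptor and Widget class, neither of which belongs to the inventory system. Instead of managing real data, the descriptor only records which protocol method was called.

```python
class Traced(object):
    """Hypothetical descriptor that only records which protocol method ran."""

    def __init__(self):
        self.calls = []

    def __get__(self, instance, owner):
        self.calls.append("get")
        return 42  # placeholder value, nothing is really stored

    def __set__(self, instance, value):
        self.calls.append("set")

    def __delete__(self, instance):
        self.calls.append("delete")


class Widget(object):
    attr = Traced()


w = Widget()
w.attr        # attribute read   -> Traced.__get__
w.attr = 1    # attribute write  -> Traced.__set__
del w.attr    # attribute delete -> Traced.__delete__

# Read the descriptor object from the class __dict__ so that
# this lookup does not itself trigger __get__.
calls = Widget.__dict__["attr"].calls
print(calls)
```

Reading the descriptor through `Widget.__dict__` is a deliberate choice here: a plain `Widget.attr` would call `__get__` again and add one more entry to the list.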

Using these methods, we can write a descriptor called Price that limits the value stored in it to between 0 and 100.

    from weakref import WeakKeyDictionary

    class Price(object):
        def __init__(self):
            self.default = 0
            self.values = WeakKeyDictionary()

        def __get__(self, instance, owner):
            return self.values.get(instance, self.default)

        def __set__(self, instance, value):
            if value < 0 or value > 100:
                raise ValueError("Price must be between 0 and 100.")
            self.values[instance] = value

        def __delete__(self, instance):
            del self.values[instance]

A few details in the implementation of Price deserve mentioning.

An instance of a descriptor must be added to a class as a class attribute, not as an instance attribute. Therefore, to store different data for each instance, the descriptor needs to maintain a dictionary that maps instances to instance-specific values. In the implementation of Price, that dictionary is self.values.

A normal Python dictionary stores strong references to the objects it uses as keys. Those references by themselves are enough to prevent the objects from being garbage collected. To prevent Book instances from hanging around after we are finished with them, we use WeakKeyDictionary from the standard weakref module. Once the last strong reference to an instance goes away, the associated key-value pair is discarded.
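The effect of the weak keys can be checked in isolation. The sketch below assumes a hypothetical Item class standing in for Book, and calls gc.collect() so that the result does not rely on reference counting alone.

```python
import gc
from weakref import WeakKeyDictionary


class Item(object):
    """Hypothetical stand-in for Book; any weak-referenceable object works."""


values = WeakKeyDictionary()
item = Item()
values[item] = 12
count_before = len(values)  # one entry while `item` is alive

del item      # drop the last strong reference to the key
gc.collect()  # force a collection pass

count_after = len(values)   # the entry disappears with its key
print(count_before, count_after)
```

With a plain dict in place of WeakKeyDictionary, the entry (and the Item object) would stay alive for as long as the dictionary does.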

Using descriptors

As we saw in the last section, descriptors are linked to classes, not to instances, so to add a descriptor to the Book class, we must add it as a class variable.

    class Book(object):
        price = Price()

        def __init__(self, author, title, price):
            self.author = author
            self.title = title
            self.price = price

        def __str__(self):
            return "{0} - {1}".format(self.author, self.title)

The price constraint for books is now enforced.

    >>> b = Book("William Faulkner", "The Sound and the Fury", 12)
    >>> b.price
    12
    >>> b.price = -12
    Traceback (most recent call last):
      File "<pyshell#68>", line 1, in <module>
        b.price = -12
      File "<pyshell#58>", line 9, in __set__
        raise ValueError("Price must be between 0 and 100.")
    ValueError: Price must be between 0 and 100.
    >>> b.price = 101
    Traceback (most recent call last):
      File "<pyshell#69>", line 1, in <module>
        b.price = 101
      File "<pyshell#58>", line 9, in __set__
        raise ValueError("Price must be between 0 and 100.")
    ValueError: Price must be between 0 and 100.
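Because the validation lives in the descriptor rather than in Book, the Magazine class mentioned earlier can reuse it by declaring its own Price attribute. The following is only a sketch: the post never shows Magazine, so its constructor arguments (title, issue, price) and the sample values are assumptions.

```python
from weakref import WeakKeyDictionary


class Price(object):
    def __init__(self):
        self.default = 0
        self.values = WeakKeyDictionary()

    def __get__(self, instance, owner):
        return self.values.get(instance, self.default)

    def __set__(self, instance, value):
        if value < 0 or value > 100:
            raise ValueError("Price must be between 0 and 100.")
        self.values[instance] = value

    def __delete__(self, instance):
        del self.values[instance]


class Magazine(object):
    # A separate Price instance from the one attached to Book.
    price = Price()

    def __init__(self, title, issue, price):  # assumed fields
        self.title = title
        self.issue = issue
        self.price = price  # goes through Price.__set__


m = Magazine("Example Monthly", 1, 9)
try:
    m.price = 250           # out of range, so __set__ raises
    rejected = False
except ValueError:
    rejected = True

print(m.price, rejected)    # the old value is kept after the failed write
```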

How descriptors are accessed

So far we’ve managed to implement a working descriptor that manages the price attribute on our Book class, but how it works might not be clear. It all feels a bit too magical, but not to worry. It turns out that descriptor access is quite simple:

When we evaluate b.price to retrieve the value, Python finds price on the Book class, recognizes that it is a descriptor and calls its __get__ method, passing b as instance and Book as owner.

When we change the value of the price attribute, e.g. b.price = 23, Python again recognizes that price is a descriptor and replaces the assignment with a call to the descriptor’s __set__ method.

And when we delete the price attribute stored against an instance of Book, e.g. del b.price, Python interprets that as a call to the descriptor’s __delete__ method.
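The three cases can be spelled out by hand. The sketch below fetches the descriptor object from the class __dict__ and calls its methods explicitly, which is roughly what the attribute syntax does on our behalf. The Book class is trimmed to the price attribute to keep the example short.

```python
from weakref import WeakKeyDictionary


class Price(object):
    def __init__(self):
        self.default = 0
        self.values = WeakKeyDictionary()

    def __get__(self, instance, owner):
        return self.values.get(instance, self.default)

    def __set__(self, instance, value):
        if value < 0 or value > 100:
            raise ValueError("Price must be between 0 and 100.")
        self.values[instance] = value

    def __delete__(self, instance):
        del self.values[instance]


class Book(object):  # trimmed version, for illustration only
    price = Price()


b = Book()
descriptor = type(b).__dict__["price"]     # the Price object itself

descriptor.__set__(b, 23)                  # same effect as: b.price = 23
explicit = descriptor.__get__(b, type(b))  # same effect as: b.price
implicit = b.price

descriptor.__delete__(b)                   # same effect as: del b.price
after_delete = b.price                     # falls back to the default
print(explicit, implicit, after_delete)
```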

The number 1 descriptor gotcha

Unless we fully understand the fact that descriptors are linked to classes and not to instances, and therefore need to maintain their own mapping of instances to instance-specific values, we might be tempted to write the Price descriptor as follows:

    class Price(object):
        def __init__(self):
            self.__price = 0

        def __get__(self, instance, owner):
            return self.__price

        def __set__(self, instance, value):
            if value < 0 or value > 100:
                raise ValueError("Price must be between 0 and 100.")
            self.__price = value

        def __delete__(self, instance):
            del self.__price

But once we start instantiating multiple Book instances, we’re going to have a problem.

    >>> b1 = Book("William Faulkner", "The Sound and the Fury", 12)
    >>> b1.price
    12
    >>> b2 = Book("John Dos Passos", "Manhattan Transfer", 13)
    >>> b1.price
    13

The key is to understand that there is only one instance of Price for Book, so every time the value in the descriptor is changed, it changes for all instances. That behaviour in itself is useful for creating managed class attributes, but it is not what we want in this case. To store separate instance-specific values, we need the WeakKeyDictionary used in the original implementation of Price.

The property built-in function

Another way of building descriptors is to use the property built-in function. Here is the function signature:

    property(fget=None, fset=None, fdel=None, doc=None)

fget, fset and fdel are methods to get, set and delete attributes, respectively. doc is a docstring.

Instead of defining a single class-level descriptor object that manages instance-specific values, property works by combining instance methods from the class. Here is a simple example of a Publisher class from our inventory system with a managed name property. Each method passed into property has a print statement to illustrate when it is called.

    class Publisher(object):
        def __init__(self, name):
            self.__name = name

        def get_name(self):
            print("getting name")
            return self.__name

        def set_name(self, value):
            print("setting name")
            self.__name = value

        def delete_name(self):
            print("deleting name")
            del self.__name

        name = property(get_name, set_name, delete_name, "Publisher name")

If we make an instance of Publisher and access the name attribute, we can see the appropriate methods being called.

    >>> p = Publisher("Faber & Faber")
    >>> p.name
    getting name
    'Faber & Faber'
    >>> p.name = "Random House"
    setting name
    >>> del p.name
    deleting name
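The same class can also be written with property applied as a decorator, which is the form most often seen in practice. This is a sketch of an equivalent Publisher, with the print calls left out for brevity. The docstring of the getter takes the place of the doc argument.

```python
class Publisher(object):
    def __init__(self, name):
        self.__name = name

    @property
    def name(self):
        """Publisher name"""
        return self.__name

    @name.setter
    def name(self, value):
        self.__name = value

    @name.deleter
    def name(self):
        del self.__name


p = Publisher("Faber & Faber")
first = p.name            # calls the getter
p.name = "Random House"   # calls the setter
second = p.name

# The class attribute is still a property object, i.e. a descriptor.
is_property = isinstance(Publisher.__dict__["name"], property)
print(first, second, is_property)
```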

That’s it for this basic introduction to descriptors. If you want a challenge, take what you have learned and try to reimplement the @property decorator. There is enough information in this post to allow you to figure it out.