A while ago some of us tried to find the best solution for a tail recursion decorator. The story began when Crutcher Dunnavant came up with a surprising decorator that was able to eliminate tail recursion. His solution used stack inspections and could flatten the call stack s.t. the maximum height of the stack became 2. I gave an alternative, faster implementation that omitted stack inspections. This solution was further improved by Michele Simionato and George Sakkis. The story already ends here because the performance penalty of those decorators was still quite considerable. On the samples we used the undecorated recursive function was more than twice as fast as the decorated one. So the decorator added more than 100% overhead.

Today I tried something new. I reimplemented Georges decorator in Cython which is considerably faster than Python and then I compared the performance of undecorated code, code that is decorated with the pure Python decorator and the Cython version of it.

Here you can download the relevant source code for the decorators.

The following implementation shows the Cython decorator defined in tail_recursive.pyx

cdef class tail_recursive: cdef int firstcall cdef int CONTINUE cdef object argskwd cdef object func def __init__ ( self , func ) : self . func = func self . firstcall = True self . CONTINUE = id ( object ( ) ) def __call__ ( self , *args , **kwd ) : if self . firstcall : self . firstcall = False try : while True : result = self . func ( *args , **kwd ) if result == self . CONTINUE : # update arguments args , kwd = self . argskwd else : # last call return result finally : self . firstcall = True else : # return the arguments of the tail call self . argskwd = args , kwd return self . CONTINUE

Here are some performance tests.

As a sample function I used a tail recursive factorial

def factorial ( n , acc = 1 ) : "calculate a factorial" return ( acc if n == 0 else factorial ( n- 1 , n*acc ) )

The timining function is defined by

import time def mtime ( foo ) : a = time . time ( ) for j in range ( 10 ) : for i in range ( 900 ) : foo ( i ) return time . time ( ) -a

The results I got were:

8.484 -- undecorated 9.405 -- with Cython decorator + 10% 17.93 -- with Python decorator + 111%

tail_recursive

def even ( n ) : if n == 0 : return True else : return odd ( n- 1 ) def odd ( n ) : if n == 0 : return False else : return even ( n- 1 )

Next I checked out a pair of mutual recursive functions. Notice that in those pairs only one function may be decorated by

Here are the results:

2.969 -- undecorated 3.312 -- with Cython decorator + 11% 7.437 -- with Python decorator + 150%

factorial

These are about the same proportions as forexample.

My conclusion is that one can expect about 10% performance penalty for Cythons tail_recursive decorator. This is a quite good result and I don’t shy away from recommending it.