There are many aspects of Python that appeal to mathematicians. To mention only a few: built-in support for tuples, lists, and sets, all with notation that closely mirrors conventional mathematical notation, and list comprehensions, which are very similar to set comprehensions and the "set-builder" notation used for them.

Another set of features that is very appealing to the mathematically minded is Python's iterators and generators, together with the related itertools package. These tools make it easy to write elegant code that deals with such mathematical objects as infinite sequences, stochastic processes, recurrence relations, and combinatorial structures. This post covers my notes on iterators and generators, and goes over some of the experiments I did while learning more about these features of Python.

Iterators

Iterators are objects that allow iteration over a collection. Such collections need not consist of objects that already exist in memory, and because of this, they need not be finite. For the official documentation, see "Iterator Types" in Python's documentation.

Let's be a bit more precise in our definitions. An iterable is defined as an object that has an __iter__ method, which is required to return an iterator object. An iterator, in turn, is an object that has the two methods __iter__ and next (or __next__ in Python 3), with the former returning an iterator object and the latter returning the next element of the iteration. As far as I know, iterators always simply return self from their __iter__ method, because they are their own iterators.

Generally speaking, you should avoid calling __iter__ and next directly. By using for loops or list comprehensions, Python will call them for you automatically. If you have to call them manually, use Python's built-in functions iter and next and pass the iterator or container as the argument. For example, if c is an iterable, use iter(c) instead of c.__iter__(), and similarly, if a is an iterator, use next(a) instead of a.next(). This is similar to how you would use len.

Speaking of len, it is worth mentioning that iterators do not have to have a well-defined length, and often do not. Consequently, they often do not implement __len__. If you would like to count the number of items in an iterator, you will have to do so manually, or use sum. An example of this is shown at the end of this article, under the "itertools Module" section.

Some iterables are not iterators and instead have other objects as their iterators. For example, the list object is an iterable but not an iterator (it implements __iter__ but not next). Iterators for list objects are of type listiterator, as you can see in the example below. Also notice that while list objects have a well-defined length, listiterator objects do not.
```python
>>> a = [1, 2]
>>> type(a)
<type 'list'>
>>> type(iter(a))
<type 'listiterator'>
>>> it = iter(a)
>>> next(it)
1
>>> next(it)
2
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>> len(a)
2
>>> len(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type 'listiterator' has no len()
```

When an iterator is exhausted, Python's interpreter expects it to raise a StopIteration exception. However, as mentioned, iterators can iterate over an infinite collection. For such iterators, it is the user's responsibility to ensure that their use does not lead to an infinite loop. An example of this is shown in a bit.

A very basic example of an iterator is given below. It simply starts counting at 0 and goes up indefinitely; it is a simpler version of itertools.count.

```python
class count_iterator(object):
    n = 0

    def __iter__(self):
        return self

    def next(self):
        y = self.n
        self.n += 1
        return y
```

A usage example follows. Note that the last line attempts to convert the iterator object to a list, which results in an infinite loop, since this particular iterator never ends.

```python
>>> counter = count_iterator()
>>> next(counter)
0
>>> next(counter)
1
>>> next(counter)
2
>>> next(counter)
3
>>> list(counter)  # This will result in an infinite loop!
```

Finally, to be accurate we should make an amendment to the above: objects that do not have the method __iter__ defined can still be iterable if they define __getitem__. In this case Python's built-in function iter returns an iterator of type iterator for the object, which uses __getitem__ to go over the items. If __getitem__ raises either StopIteration or IndexError, the iteration stops. Let's see an example of this:

```python
class SimpleList(object):
    def __init__(self, *items):
        self.items = items

    def __getitem__(self, i):
        return self.items[i]
```

And its use:

```python
>>> a = SimpleList(1, 2, 3)
>>> it = iter(a)
>>> next(it)
1
>>> next(it)
2
>>> next(it)
3
>>> next(it)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
```

Now let's look at a more interesting example: generating the Hofstadter Q sequence from given initial conditions using an iterator. Hofstadter first mentioned this nested recurrence in his book "Gödel, Escher, Bach: An Eternal Golden Braid", and the problem of proving that the sequence is well-defined for all values of n has been open ever since. The code below uses an iterator to generate the sequence given by the nested recurrence Q(n) = Q(n - Q(n-1)) + Q(n - Q(n-2)), given a list as the initial conditions. For example, qsequence([1, 1]) generates exactly the Hofstadter sequence. We use the StopIteration exception to indicate that the sequence cannot go any further, because an invalid index would be required to generate the next element. For example, if the initial conditions are [1, 3], the sequence "terminates" immediately.

```python
class qsequence(object):
    def __init__(self, s):
        self.s = s[:]

    def next(self):
        try:
            q = self.s[-self.s[-1]] + self.s[-self.s[-2]]
            self.s.append(q)
            return q
        except IndexError:
            raise StopIteration()

    def __iter__(self):
        return self

    def current_state(self):
        return self.s
```

And here's how you can use it:

```python
>>> Q = qsequence([1, 1])
>>> next(Q)
2
>>> next(Q)
3
>>> [next(Q) for __ in xrange(10)]
[3, 4, 5, 5, 6, 6, 6, 8, 8, 8]
```
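Returning to the earlier point that iterators often do not implement __len__: the only general way to count the items of a finite iterator is to consume it. A minimal sketch (the helper name count_items is my own, not from this post):

```python
def count_items(iterable):
    # Count the items of a finite iterable by consuming its iterator.
    return sum(1 for _ in iterable)

it = iter([10, 20, 30])
print(count_items(it))  # prints 3
```

Note that counting exhausts the iterator: after the call, next(it) raises StopIteration, so count the items only when you no longer need them.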

Generators

Generators are iterators that are defined using a simpler function notation. More specifically, a generator is a function that uses the yield expression somewhere in it. Generators cannot return values; instead, they yield results when they are ready. Python automates the process of remembering a generator's context, that is, where its current control flow is, what the values of its local variables are, and so on. Each time the generator is called using next, it yields the next value in the iteration. The __iter__ method is implemented automatically, meaning generators can be used anywhere an iterator is needed. Let's look at an example. This implementation is functionally equivalent to our iterator class above, but it is much more compact and readable.

```python
def count_generator():
    n = 0
    while True:
        yield n
        n += 1
```

Now let's see it in action:

```python
>>> counter = count_generator()
>>> counter
<generator object count_generator at 0x106bf1aa0>
>>> next(counter)
0
>>> next(counter)
1
>>> iter(counter)
<generator object count_generator at 0x106bf1aa0>
>>> iter(counter) is counter
True
>>> type(counter)
<type 'generator'>
```

Now, let's implement Hofstadter's Q sequence using a generator. Notice that the implementation is much simpler. However, we can no longer implement methods such as current_state above. As far as I know, it is NOT possible to access variables stored in the context of a generator from outside the generator, and as such current_state cannot be accessed from the generator object. (OK, there is the gi_frame.f_locals dictionary, but it is implementation specific, namely specific to CPython, and is not a standard part of the language, so it is not recommended for use.) One possible solution in these circumstances is to yield both the result and the list so far, but I will leave that as an exercise.

```python
def hofstadter_generator(s):
    a = s[:]
    while True:
        try:
            q = a[-a[-1]] + a[-a[-2]]
            a.append(q)
            yield q
        except IndexError:
            return
```

Notice that a simple return statement with no parameter ends the iteration of a generator. Internally, this raises a StopIteration exception.

The next example comes from my article on randomness extraction. Here we use two generators: first to implement a Bernoulli process, an infinite sequence of random boolean values with True having probability p and False having probability q = 1 - p, and then to implement a von Neumann extractor, which takes a Bernoulli process with 0 < p < 1 as a source of entropy and returns a fair Bernoulli process with p = 0.5.

```python
import random

def bernoulli_process(p):
    if p > 1.0 or p < 0.0:
        raise ValueError("p should be between 0.0 and 1.0.")
    while True:
        yield random.random() < p

def von_neumann_extractor(process):
    while True:
        x, y = next(process), next(process)
        if x != y:
            yield x
```

Finally, generators are great tools for implementing discrete dynamical systems. The following example shows how the famous tent map can be implemented nicely using generators. (On a side note, notice how numerical inaccuracies begin to creep in relatively fast and then grow exponentially, a key feature of dynamical systems such as the tent map.)

```python
>>> def tent_map(mu, x0):
...     x = x0
...     while True:
...         yield x
...         x = mu * min(x, 1.0 - x)
...
>>> t = tent_map(2.0, 0.1)
>>> for __ in xrange(30):
...     print next(t)
...
0.1
0.2
0.4
0.8
0.4
0.8
0.4
0.8
0.4
0.8
0.4
0.8
0.4
0.8
0.4
0.8
0.4
0.799999999999
0.400000000001
0.800000000003
0.399999999994
0.799999999988
0.400000000023
0.800000000047
0.399999999907
0.799999999814
0.400000000373
0.800000000745
0.39999999851
0.79999999702
```

Another somewhat similar example is the Collatz sequence.
```python
def collatz(n):
    yield n
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        yield n
```

Notice that in this example we do not have to raise StopIteration manually, since it is raised automatically when control flow reaches the end of the function. The Collatz generator in use:

```python
>>> # If the Collatz conjecture is true then list(collatz(n)) for any n will
... # always terminate (though your machine might run out of memory first!)
>>> list(collatz(7))
[7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1]
>>> list(collatz(13))
[13, 40, 20, 10, 5, 16, 8, 4, 2, 1]
>>> list(collatz(17))
[17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1]
>>> list(collatz(19))
[19, 58, 29, 88, 44, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1]
```
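For infinite generators like the counting and tent map examples above, calling list() would loop forever, and calling next repeatedly is tedious. A convenient middle ground is itertools.islice, which lazily takes a finite prefix. A small sketch (written in Python 3 style with range rather than xrange):

```python
import itertools

def count_generator(start=0):
    # An infinite counting generator, like the one defined earlier.
    n = start
    while True:
        yield n
        n += 1

# islice lazily takes the first five values, so no infinite loop occurs.
first_five = list(itertools.islice(count_generator(), 5))
print(first_five)  # prints [0, 1, 2, 3, 4]
```

Because islice itself returns an iterator, prefixes of infinite sequences can be piped into further functions (sum, max, and so on) without ever materializing the whole sequence.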

Recursive Generators

Generators can be recursive, just like any other function. Let's see an example of this by implementing our own simpler version of itertools.permutations, namely a generator that generates all permutations of a given list of items. (Note: in practice, use itertools.permutations instead of the method given here. It's faster!) The basic idea is simple: for each element in the list, we move it to the front by swapping it with the first element, and then we permute the rest of the list.

```python
def permutations(items):
    if len(items) == 0:
        yield []
    else:
        pi = items[:]
        for i in xrange(len(pi)):
            pi[0], pi[i] = pi[i], pi[0]
            for p in permutations(pi[1:]):
                yield [pi[0]] + p
```

```python
>>> for p in permutations([1, 2, 3]):
...     print p
...
[1, 2, 3]
[1, 3, 2]
[2, 1, 3]
[2, 3, 1]
[3, 1, 2]
[3, 2, 1]
```
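The same recursive pattern works for other combinatorial structures. As a sketch of my own (not from the original post), here is a recursive generator for all subsets of a list: every subset of the tail is yielded both without and with the head element.

```python
def subsets(items):
    # Base case: the empty list has exactly one subset, the empty list.
    if not items:
        yield []
    else:
        head, tail = items[0], items[1:]
        for s in subsets(tail):
            yield s             # subsets that exclude the head
            yield [head] + s    # subsets that include the head
```

For a list of n items this yields all 2**n subsets. As with permutations, itertools offers a faster alternative in practice (itertools.combinations over all lengths).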

Generator Expressions

Generator expressions let you define generators using an even simpler, inline notation. This notation is very similar to Python's list comprehension notation. For example, the following provides a generator that iterates over all perfect squares. Note that the result of a generator expression is an object of type generator, and as such it implements both the next and __iter__ methods.

```python
>>> import itertools
>>> g = (x ** 2 for x in itertools.count(1))
>>> g
<generator object <genexpr> at 0x1029a5fa0>
>>> next(g)
1
>>> next(g)
4
>>> iter(g)
<generator object <genexpr> at 0x1029a5fa0>
>>> iter(g) is g
True
>>> [next(g) for __ in xrange(10)]
[9, 16, 25, 36, 49, 64, 81, 100, 121, 144]
```

We can also implement the Bernoulli process from the previous section as a generator expression, in this example with p = 0.4. If a generator expression requires another iterator as its "loop counter", itertools.count is a good choice when the expression is to produce an infinite sequence; otherwise, xrange can be used.

```python
>>> g = (random.random() < 0.4 for __ in itertools.count())
>>> [next(g) for __ in xrange(10)]
[False, False, False, True, True, False, True, False, False, True]
```

As mentioned, generator expressions can be passed to any function that requires an iterator as an argument. For example, we can sum the first ten perfect squares by writing:

```python
>>> sum(x ** 2 for x in xrange(10))
285
```

More examples of generator expressions are given in the next section.
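Besides sum, the built-in predicates any and all also combine nicely with generator expressions, and any short-circuits, so it even terminates on an infinite generator expression as long as a witness exists. A couple of small examples of my own (written in Python 3 style with range):

```python
import itertools

# all consumes the (finite) generator expression, checking every value.
print(all(n % 2 == 0 for n in range(0, 20, 2)))  # prints True

# any stops as soon as the first witness is found (here n = 1001),
# so this terminates even though itertools.count(1) is infinite.
print(any(n * n > 10 ** 6 for n in itertools.count(1)))  # prints True
```

If no witness existed, the second call would of course loop forever, so this idiom still requires knowing that the search succeeds.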

Further Reading

Pramode's article on generators contains a very interesting example implementing an efficient but simple sieve of Eratosthenes. David Beazley also has an excellent page on generators; make sure to check it out for more interesting examples and use cases.