While learning Python I came across generators and iterators, along with a handful of questions: which of the two is faster, what the yield keyword does, and so on. I googled and found a bunch of blog posts, but each explained things in its own way, which left me more frustrated than before. So I started again from scratch, and it took a concerted effort to understand the underlying concept.

What are Generator functions?

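A function that contains the yield keyword anywhere in its body, inside a loop or not, is a generator function. Calling it does not run the body; it returns a generator object. A function with no yield is just an ordinary function, not an iterator.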
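As a quick check of what makes a function a generator function, here is a minimal sketch using the standard-library inspect.isgeneratorfunction. The names count_up, one_shot, and plain are made up for illustration, and the code assumes Python 3.

```python
import inspect


def count_up(limit):
    """Generator function: the body contains yield (here inside a loop)."""
    n = 0
    while n < limit:
        yield n
        n += 1


def one_shot():
    """Also a generator function: yield does not need to be in a loop."""
    yield "only value"


def plain(limit):
    """Ordinary function: no yield, so it simply returns a list."""
    return list(range(limit))


print(inspect.isgeneratorfunction(count_up))  # True
print(inspect.isgeneratorfunction(one_shot))  # True
print(inspect.isgeneratorfunction(plain))     # False
print(list(count_up(3)))                      # [0, 1, 2]
```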

How is it different from an iterator?

Basically, an iterator is any object that provides two methods:

__iter__ returns the iterator object itself. This is what lets the object be used in for and in statements.

__next__ (spelled next in Python 2, which the session further down uses) returns the next value from the iterator. If there are no more items to return, it should raise the StopIteration exception.
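To make the two methods concrete, here is a small hand-written iterator. Countdown is a hypothetical class, not something from the standard library, and the sketch assumes Python 3 (in Python 2 the second method would be named next).

```python
class Countdown:
    """Hand-written iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        # Signal the end of the sequence with StopIteration.
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
print(next(countdown))  # 3
print(list(countdown))  # [2, 1]
```

Because __iter__ returns the object itself, the same instance works directly in a for loop or in list().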

In other words, an iterator hands out its elements one at a time and raises StopIteration once they run out. An exhausted iterator stays exhausted; to start from the beginning again you have to create a new one.

A generator, on the other hand, runs until it reaches a yield, hands back that value, saves its current state, and stays paused until next() is called on it again.

What is meant by saving the current state?

By saving the state I mean that the values of all local variables used in the function, and the position reached in its logic (which loop iteration it is in, and so on), are kept in memory while the function is paused. When next() is called again, execution resumes directly after the yield statement, with the previously saved state restored.
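The saved state can be inspected from outside. The sketch below assumes Python 3 and uses inspect.getgeneratorstate and inspect.getgeneratorlocals from the standard library; running_total is a made-up generator used only for illustration.

```python
import inspect


def running_total():
    total = 0
    for step in (10, 20, 30):
        total += step
        yield total


gen = running_total()
print(inspect.getgeneratorstate(gen))  # GEN_CREATED (body has not started)

print(next(gen))                       # 10
print(inspect.getgeneratorstate(gen))  # GEN_SUSPENDED (paused at yield)

# The local variables survive while the generator is paused.
saved = inspect.getgeneratorlocals(gen)
print(saved["total"], saved["step"])   # 10 10

print(next(gen))                       # 30 (resumed with total still 10)
```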

Let’s try to understand it by dry running a program, simple_generator.py:

# simple_generator.py


def simpleGeneratorFun():
    pre1 = 1
    pre2 = 1
    sumValue = 1
    yield sumValue
    print('First stop')

    for i in range(0, 5):
        yield sumValue
        print('Second Stop')
        import pdb; pdb.set_trace()
        sumValue = pre1 + pre2
        pre1 = pre2
        pre2 = sumValue

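The above is a simple generator function. I have used pdb to step through it so that we can see clearly what happens. The session below was recorded on Python 2, where a generator is advanced with a.next(); on Python 3 you would write next(a) instead. Let’s do this: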

abhi@thinkpad ~/pythonbook $ ipython

In [1]: from simple_generator import simpleGeneratorFun

In [2]: a = simpleGeneratorFun()

In [3]: a.next()
Out[3]: 1

In [4]: a.next()
First stop
Out[4]: 1

So far, execution has gone past line 9:

    sumValue = 1          # <--- 7th line
    yield sumValue        # <--- 8th line
    print('First stop')   # <--- 9th line

The first time we called a.next(), the program ran up to line 8. The variable sumValue had already been assigned the value 1 on line 7. When the program reached the yield statement, it saved the values of the local variables, returned the yielded value (in our case sumValue), and the function paused.

    print('First stop')                  # <--- 9th line

    for i in range(0, 5):                # <--- 11th line
        yield sumValue                   # <--- 12th line
        print('Second Stop')             # <--- 13th line
        import pdb; pdb.set_trace()      # <--- 14th line

When we called a.next() the second time, the program resumed execution directly from line 9, with the values of all its variables restored. It then reached another yield statement on line 12, returned the value of sumValue (still 1), and paused again.
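The same resume-after-yield behaviour can be observed without a debugger. In this Python 3 sketch, two_stops and the events list are hypothetical names; the list records which parts of the body have run so far.

```python
events = []


def two_stops():
    events.append("started")
    yield 1
    events.append("resumed after first yield")
    yield 2


gen = two_stops()
print(events)     # [] (calling the function runs none of its body)

print(next(gen))  # 1
print(events)     # ['started']

print(next(gen))  # 2
print(events)     # ['started', 'resumed after first yield']
```

The code between the two yield statements only runs on the second next() call, just as 'First stop' was only printed on the second a.next().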

In [5]: a.next() # 3rd time
Second Stop
-> sumValue = pre1 + pre2
(Pdb) pre1
1 # value of pre1 variable
(Pdb) pre2
1 # value of pre2 variable
(Pdb) sumValue
1 # value of sumValue variable
(Pdb) continue
Out[5]: 2

This may seem a bit trickier. The debugger stopped at line 14, i.e. import pdb; pdb.set_trace(). All I did there was check the values of the variables defined in the function. When I then typed continue at the (Pdb) prompt, the rest of the loop body ran to its end, the loop moved on to its next iteration, and the function paused again at the yield on line 12, this time returning 2.

    for i in range(0, 5):
        yield sumValue                   # <--- line no 12
        print('Second Stop')
        import pdb; pdb.set_trace()
        sumValue = pre1 + pre2
        pre1 = pre2
        pre2 = sumValue                  # <--- line no 17

Since the loop is supposed to iterate 5 times, it runs lines 12 to 17 again on each pass. Every pass reaches the yield statement, returns sumValue, saves the state of the variables, and pauses. This goes on until the loop is complete.
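To see every value the loop produces without stepping through pdb, here is a Python 3 sketch of the same function with the print and pdb lines dropped. The snake_case names are my own renaming, not the original code.

```python
def simple_generator_fun():
    pre1 = 1
    pre2 = 1
    sum_value = 1
    yield sum_value          # first pause, before the loop

    for _ in range(5):
        yield sum_value      # one pause per loop iteration
        sum_value = pre1 + pre2
        pre1 = pre2
        pre2 = sum_value


gen = simple_generator_fun()
print(next(gen))  # 1
print(next(gen))  # 1
print(list(gen))  # [2, 3, 5, 8]
```

list() keeps calling next() until the generator is finished, so it collects the four remaining values in one go.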

In [6]: a.next()
Second Stop
-> sumValue = pre1 + pre2
(Pdb) pre1
1
(Pdb) pre2
2
(Pdb) sumValue
2
(Pdb) continue
Out[6]: 3

In [7]: a.next()
Second Stop
-> sumValue = pre1 + pre2
(Pdb) pre1
2
(Pdb) pre2
3
(Pdb) sumValue
3
(Pdb) c
Out[7]: 5

In [8]: a.next()
Second Stop
-> sumValue = pre1 + pre2
(Pdb) pre1
3
(Pdb) pre2
5
(Pdb) sumValue
5
(Pdb) a.next()
(Pdb) c
Out[8]: 8

In [9]: a.next()
Second Stop
-> sumValue = pre1 + pre2
(Pdb) pre2
8
(Pdb) pre2
8
(Pdb) pre1
5
(Pdb) sumValue
8
(Pdb) c
---------------------------------------------------------------------------
StopIteration

If you count, ‘Second Stop’ has been printed five times. Once the loop is complete there is no code left to execute, so the function returns, and the generator raises StopIteration and is finished.
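In everyday code you rarely see StopIteration directly, because for loops catch it for you. The sketch below (Python 3, with a made-up three_values generator) shows the usual ways of dealing with the end of a generator.

```python
def three_values():
    for value in (1, 2, 3):
        yield value


gen = three_values()
for value in gen:            # the for loop handles StopIteration itself
    print(value)

try:
    next(gen)                # the generator is already exhausted
except StopIteration:
    print("exhausted")

print(next(gen, "default"))  # default (a fallback instead of an exception)
print(list(three_values()))  # [1, 2, 3] (a new call starts over)
```

A finished generator never restarts; calling the generator function again creates a fresh generator object.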

Generators & Iterators:

Let us now write both a function without yield (named iterator here) and a generator function, to clear up the fuzzy part.

def iterator(x):
    for i in range(10):
        if i == 5:
            return               # <---- (a)

def generator(x):
    for i in range(10):
        if i == 5:
            return
        else:
            yield i              # <---- (b)

a = iterator(10)                 # <---- (c)
b = generator(10)                # <---- (d)

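In the above code:

(a) No yield keyword is used, so despite its name this is an ordinary function, not an iterator.

(b) A yield statement is used, hence this is a generator function.

(c) a is None, because iterator() finishes without returning a value.

(d) b is a generator object; none of the function body has run yet.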
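A short Python 3 check of what the two calls actually return may help here. It repeats the two functions from above unchanged and uses types.GeneratorType from the standard library.

```python
import types


def iterator(x):
    for i in range(10):
        if i == 5:
            return


def generator(x):
    for i in range(10):
        if i == 5:
            return
        else:
            yield i


a = iterator(10)
b = generator(10)

print(a)                                   # None
print(isinstance(b, types.GeneratorType))  # True
print(list(b))                             # [0, 1, 2, 3, 4]
```

The function without yield runs to its bare return and hands back None, which is why it exposes neither __iter__ nor a next method.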

In [22]: dir(a)

Out[22]:

['__class__',

'__delattr__',

'__doc__',

'__format__',

'__getattribute__',

'__hash__',

'__init__',

'__new__',

'__reduce__',

'__reduce_ex__',

'__repr__',

'__setattr__',

'__sizeof__',

'__str__',

'__subclasshook__']

In [23]: dir(b)

Out[23]:

['__class__',

'__delattr__',

'__doc__',

'__format__',

'__getattribute__',

'__hash__',

'__init__',

'__iter__',

'__name__',

'__new__',

'__reduce__',

'__reduce_ex__',

'__repr__',

'__setattr__',

'__sizeof__',

'__str__',

'__subclasshook__',

'close',

'gi_code',

'gi_frame',

'gi_running',

'next',

'send',

'throw']

dir(): This built-in function tries to return a list of valid attributes of the object.

Syntax: dir([object])

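A generator function returns a generator object, whereas the function without yield returned None, so dir(a) only lists the attributes of None. Exploring the methods of the generator object b, you will find __iter__ and next (the Python 2 name for __next__) among the others, which are exactly the two methods an iterator needs. The extra methods close, send, and throw are specific to generators.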

Hence, we can conclude that every generator is an iterator with a few extra methods on top: Generators are an extended version of Iterators.
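This conclusion can be checked directly. The sketch assumes Python 3 and uses the Iterator abstract base class from collections.abc; the generator function here is a simplified stand-in for the one above.

```python
from collections.abc import Iterator


def generator(x):
    for i in range(x):
        yield i


b = generator(3)

print(isinstance(b, Iterator))  # True (a generator is an iterator)
print(iter(b) is b)             # True (__iter__ returns the object itself)

# The extra methods that a plain iterator does not have.
extras = [name for name in ("send", "throw", "close") if hasattr(b, name)]
print(extras)                   # ['send', 'throw', 'close']

list_iterator = iter([1, 2, 3])
print(isinstance(list_iterator, Iterator))  # True
print(hasattr(list_iterator, "send"))       # False
```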

I have added 2 more examples in a GitHub Gist that you can hack around with.
