# Generators

Python generators are **a simple way of creating iterators**. All the overhead associated with building an iterator (implementing a class with `__iter__()`

and `__next__()`

, keeping track of internal states etc.) are automatically handled by generators in Python.

## Generator functions

A generator is **a function that returns an object (iterator) which we can iterate over (one value at a time)**.

### Example

```
# Build and return a list
def firstn(n):
num, nums = 0, []
while num < n:
nums.append(num)
num += 1
return nums
sum_of_first_n = sum(firstn(1000000))
```

The code is quite simple and straightforward, but it **builds the full list in memory** (all n "10 megabyte" integers). One solution is to resort to the **generator pattern** and implement an iterable object:

```
# Using the generator pattern (an iterable)
class firstn(object):
def __init__(self, n):
self.n = n
self.num, self.nums = 0, []
def __iter__(self):
return self
# Python 3 compatibility
def __next__(self):
return self.next()
def next(self):
if self.num < self.n:
cur, self.num = self.num, self.num+1
return cur
else:
raise StopIteration()
sum_of_first_n = sum(firstn(1000000))
```

This will perform as we expect, but there is **a lot of boilerplate**. Python provides** generator functions as a convenient shortcut to building iterators**:

```
# A generator that yields items instead of returning a list
def firstn(n):
num = 0
while num < n:
yield num
num += 1
sum_of_first_n = sum(firstn(1000000))
```

It is very similar to the implementation that built a list in memory, but has the memory usage characteristic of the iterator implementation.

## Generator expressions

Generator expressions provide **an additional shortcut to build generators out of expressions similar to that of list comprehensions**. A list comprehension can also be turned into a generator expression by replacing the square brackets with parentheses.

### Example

```
# List comprehension
doubles = [2 * n for n in range(50)]
# Same as the list comprehension above
doubles = list(2 * n for n in range(50))
```

By allowing generator expressions, we don't have to write a generator function if we do not need the list. If only list comprehensions were available, and we needed to lazily build a set of items to be processed, we will have to write a generator function.

## Performance imporovements

The performance improvement from the use of generators is **the result of the lazy (on demand) generation of values**, which translates to lower memory usage. Furthermore, we **do not need to wait until all the elements have been generated before we start to use them**. This is similar to the benefits provided by iterators, but the generator makes building iterators easy.

### Example

Both `range`

and `xrange`

represent a range of numbers, and have the same function signature, but `range`

returns a `list`

while `xrange`

returns a generator (at least in concept; the implementation may differ).

```
# Using a non-generator (Python 2.x only)
sum_of_first_n = sum(range(1000000))
# Using a generator
sum_of_first_n = sum(xrange(1000000))
```

When we use `range`

we build a 1,000,000 element list in memory and then find its sum. This is a waste, considering that we use these 1,000,000 elements just to compute the sum.

On the other hand, when we use `xrange`

, we do not incur the cost of building a 1,000,000 element list in memory. The generator created by xrange will generate each number, which `sum`

will consume to accumulate the sum.

__Note__: A generator **will provide performance benefits only if we do not intend to use that set of generated values more than once**.

```
# Note: Python 2.x only
s = sum(xrange(1000000))
p = product(xrange(1000000))
```

The same expensive process is performed twice. In cases like this, building a list in memory might be worth it:

```
# Note: Python 2.x only
nums = list(xrange(1000000))
s = sum(nums)
p = product(nums)
```

### Examples

```
# The for loop will generate each i (i.e. 1,2,3,4,5, ...), add it to total, and throw it away
# before the next i is generated. This is opposed to iterating through range(...), which creates
# a potentially massive list and then iterates through it.
total = 0
for i in irange(1000000):
total += i
```

```
# Generators can be composed
# square is a generator
square = (i * i for i in irange(1000000))
# Add the squares
total = 0
for i in square:
total += i
```

```
```