Introduction to Iteration in Python
At its core, iteration is the process of taking items from a collection or stream of data, one by one. The
for loop in Python is the most common example of iteration. But behind every for loop,
there's a powerful and consistent design pattern at work: the Iterator Protocol.
Understanding this protocol is key to mastering Python's data handling capabilities
and unlocking the power of generators.
This guide will cover:
● The Iterator Protocol: The rules that allow objects to be iterable.
● Generators and the yield keyword: A simple and elegant way to create iterators.
1. The Iterator Protocol: __iter__ and __next__
Python's iteration mechanism is governed by two special methods, often called
"dunder" (double underscore) methods. An object must implement these two
methods to be an iterator.
● Iterable: An object that you can loop over. It has an __iter__() method that returns
an iterator. Examples include lists, tuples, dictionaries, sets, and strings.
● Iterator: An object that represents a stream of data. It produces the next value in
the sequence when you call next() on it. It must have both an __iter__() method
and a __next__() method.
Method            Description
__iter__(self)    Called when an iterator is required (e.g., at the start of a
                  for loop). On an iterable, it returns a new iterator. On an
                  iterator, it returns the iterator object itself, which allows
                  iterators to be used where iterables are expected.
__next__(self)    Returns the next item in the stream. If there are no more
                  items, it must raise the StopIteration exception.
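The difference between the two roles can be checked directly on a built-in list. This is a minimal sketch; the variable names numbers and it are illustrative only.

```python
numbers = [10, 20, 30]  # a list is an iterable, but not itself an iterator

print(hasattr(numbers, "__iter__"))  # True
print(hasattr(numbers, "__next__"))  # False

it = iter(numbers)  # ask the iterable for an iterator
print(hasattr(it, "__next__"))  # True
print(iter(it) is it)  # True: an iterator's __iter__ returns itself

print(next(it))  # 10
print(next(it))  # 20
```

The list can hand out as many independent iterators as needed, while each iterator tracks its own position.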
The for loop uses this protocol under the hood. When you write for item in my_list:,
Python first calls iter(my_list), which in turn calls my_list.__iter__() to get an iterator.
Then, it repeatedly calls next() on that iterator to get each item until a StopIteration
exception is caught, which signals the end of the loop.
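The steps above can be written out by hand as a while loop. This is a rough equivalent for illustration, not the interpreter's actual implementation; my_list and collected are placeholder names.

```python
my_list = ["a", "b", "c"]
collected = []

# Roughly what "for item in my_list: collected.append(item)" does
iterator = iter(my_list)  # calls my_list.__iter__()
while True:
    try:
        item = next(iterator)  # calls iterator.__next__()
    except StopIteration:
        break  # the for loop ends quietly at this point
    collected.append(item)

print(collected)  # ['a', 'b', 'c']
```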
Creating a Custom Iterator
To see the protocol in action, let's build a custom iterator that mimics the range()
function.
class MyRange:
    """A simple iterator class that generates numbers up to a stop value."""

    def __init__(self, start, stop):
        self.current = start
        self.stop = stop

    # Make the class an iterator by implementing __iter__
    def __iter__(self):
        # This method must return the iterator object itself
        return self

    # Implement __next__ to produce the next value
    def __next__(self):
        if self.current < self.stop:
            num = self.current
            self.current += 1
            return num
        else:
            # Signal that the iteration is finished
            raise StopIteration


# Using the custom iterator with a for loop
for number in MyRange(0, 5):
    print(number)
# Output from the for loop:
# 0
# 1
# 2
# 3
# 4

# Manual iteration to see the StopIteration exception
numbers = MyRange(0, 2)
iterator = iter(numbers)
print(next(iterator))  # Output: 0
print(next(iterator))  # Output: 1
try:
    print(next(iterator))
except StopIteration:
    print("Iteration has ended.")
This class-based approach works, but it's verbose. We have to manually manage the
state (self.current) and explicitly raise StopIteration. This is where generators provide
a much cleaner solution.
2. Creating Generators with yield
Generators offer a significantly simpler and more elegant way to create iterators.
Instead of writing a class with __iter__ and __next__, you can write a simple function
that uses the yield keyword.
● Generator Function: Any function in Python that contains a yield statement.
● Generator Object: The object that is returned when you call a generator function. This
object is a special kind of iterator.
The yield Keyword
The yield keyword is the magic behind generators. It works like a return statement, but
with a crucial difference:
● return: Terminates the function completely.
● yield: Pauses the function's execution and "yields" a value to the caller. The
function's entire state (local variables, instruction pointer) is saved. The next time
next() is called on the generator, the function resumes execution right after the
yield statement.
Calling a generator function does not execute the function body. Instead, the call
immediately returns a generator object. The code inside the function only runs when
next() is called on the generator, either directly or by a for loop.
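The pause-and-resume behavior can be observed by recording when each part of the body runs. This is a small sketch; the two_steps function and the log list are hypothetical names introduced only for this demonstration.

```python
log = []  # records when each part of the generator body actually runs


def two_steps():
    log.append("started")
    yield 1
    log.append("resumed after first yield")
    yield 2
    log.append("finished")


gen = two_steps()
print(log)  # []  (calling the function ran none of the body)

print(next(gen))  # 1
print(log)  # ['started']

print(next(gen))  # 2
print(log)  # ['started', 'resumed after first yield']
```

Each next() call runs the body only as far as the following yield, and the local state is kept in between.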
Example: A Generator for MyRange
Let's rewrite our MyRange iterator as a generator function.
def my_range_generator(start, stop):
    """A generator function that yields numbers up to a stop value."""
    current = start
    while current < stop:
        yield current
        current += 1


# Using the generator function
# Calling the function returns a generator object
gen = my_range_generator(0, 5)
print(f"Type of gen: {type(gen)}")  # Output: Type of gen: <class 'generator'>

# The for loop works on the generator just like any other iterator
for number in gen:
    print(number)
# Output from the for loop:
# 0
# 1
# 2
# 3
# 4
This code is much more concise and readable. The state (current) is a simple local
variable, and the while loop logic is clear. The Python interpreter handles the __iter__,
__next__, and StopIteration implementation for us automatically.
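To check that a generator object really carries the protocol methods, they can be inspected directly. The count_to function below is a hypothetical helper written only for this check.

```python
def count_to(stop):
    """Hypothetical generator used only for this check."""
    current = 0
    while current < stop:
        yield current
        current += 1


gen = count_to(2)

# The generator object already has both protocol methods
print(hasattr(gen, "__iter__"), hasattr(gen, "__next__"))  # True True
print(iter(gen) is gen)  # True

print(next(gen))  # 0
print(next(gen))  # 1

# Reaching the end of the function body raises StopIteration for us
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True
print(exhausted)  # True
```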
Lazy Evaluation: The Power of Generators
The most significant advantage of generators is lazy evaluation. They produce items
one at a time and only when requested. This makes them incredibly memory-efficient,
especially when dealing with:
● Very large datasets (e.g., reading a multi-gigabyte file line by line).
● Infinite sequences (e.g., generating all Fibonacci numbers).
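As a sketch of the first case, a generator can hand out one line at a time instead of loading a whole file. The read_lines function is a hypothetical helper, and io.StringIO stands in for a large file that would normally be opened with open().

```python
import io


def read_lines(file_obj):
    """Yield one line at a time, without the trailing newline."""
    for line in file_obj:
        yield line.rstrip("\n")


# io.StringIO is a stand-in for a real (possibly huge) file object
fake_file = io.StringIO("alpha\nbeta\ngamma\n")
lines = read_lines(fake_file)

print(next(lines))  # alpha
print(next(lines))  # beta
```

Only the line currently being processed needs to be in memory, so the same function should work unchanged on a file far larger than available RAM.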
Example: An infinite Fibonacci sequence generator
def fibonacci_generator():
    """Generates an infinite sequence of Fibonacci numbers."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


fib = fibonacci_generator()

# Print the first 10 Fibonacci numbers
print("\nFirst 10 Fibonacci numbers:")
for _ in range(10):
    print(next(fib), end=" ")  # Output: 0 1 1 2 3 5 8 13 21 34
It would be impossible to store an infinite sequence in a list, but a generator handles it
with ease because it only ever keeps the two most recent numbers (a and b) in memory.
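One common way to take a bounded slice of an infinite generator is itertools.islice from the standard library. This is an alternative to calling next() in a loop; the generator is repeated here so the snippet runs on its own.

```python
from itertools import islice


def fibonacci_generator():
    """Generates an infinite sequence of Fibonacci numbers."""
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b


# islice stops after 10 items, so the infinite loop is never exhausted
first_ten = list(islice(fibonacci_generator(), 10))
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```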
Comparison: Custom Iterators vs. Generators
Feature           Custom Iterator (Class)               Generator (Function with yield)
Implementation    Requires a class with __init__,       A simple function using the
                  __iter__, and __next__ methods.       yield keyword.
State Management  State must be manually saved in       State is automatically saved by
                  instance variables (e.g.,             Python between yield calls.
                  self.current).
Code Complexity   More verbose, with more boilerplate   Concise, simple, and more readable.
                  code.
Use Case          When you need a complex state         For most custom iteration tasks.
                  machine or want to expose other       It's the standard, Pythonic way.
                  methods on the iterator object.
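As an illustration of the class-based use case, here is a sketch of an iterator that exposes an extra method. The Countdown class and its reset() method are hypothetical examples, not part of any library.

```python
class Countdown:
    """Iterator that counts down from start to 1, with an extra reset() method."""

    def __init__(self, start):
        self.start = start
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value

    def reset(self):
        """Rewind to the starting value so the iterator can be reused."""
        self.current = self.start


timer = Countdown(3)
print(list(timer))  # [3, 2, 1]
timer.reset()
print(next(timer))  # 3
```

A generator object cannot be rewound like this; a fresh one has to be created by calling the generator function again.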
In summary, all generators are iterators, but not all iterators are generators.
Generators are simply an easier syntax for creating iterators. For most use cases, a
generator function is the preferred choice.
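The relationship between the two can be checked with the abstract base classes in collections.abc. This is a small sketch; gen_func and list_iter are illustrative names.

```python
from collections.abc import Generator, Iterator


def gen_func():
    yield 1


gen = gen_func()
list_iter = iter([1, 2, 3])  # a built-in iterator that is not a generator

print(isinstance(gen, Iterator))  # True
print(isinstance(gen, Generator))  # True
print(isinstance(list_iter, Iterator))  # True
print(isinstance(list_iter, Generator))  # False
```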