I wrote an article some time ago on the iterator protocol that powers Python’s for loops. One thing I left out of that article was how to make your own iterators.
In this article I’m going to discuss why you’d want to make your own iterators and then show you how to do so.
First let’s quickly address what an iterator is. For a much more detailed explanation, consider watching my Loop Better talk or reading the article based on the talk.
An iterable is anything you’re able to loop over.
An iterator is the object that does the actual iterating.
You can get an iterator from any iterable by calling the built-in iter function on the iterable.
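As a minimal sketch (the list contents are an arbitrary illustration), getting an iterator from a list might look like this:

```python
numbers = [1, 2, 3, 5, 7]      # any iterable works; a list is used here
my_iterator = iter(numbers)    # ask the iterable for an iterator
print(type(my_iterator))       # a list_iterator object in CPython
```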
You can use the built-in next function on an iterator to get the next item from it (you’ll get a StopIteration exception if there are no more items).
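A small sketch of that behavior, using a made-up three-item list:

```python
numbers = [1, 2, 3]
iterator = iter(numbers)

print(next(iterator))  # 1
print(next(iterator))  # 2
print(next(iterator))  # 3

try:
    next(iterator)     # nothing left to provide
except StopIteration:
    exhausted = True
    print("no more items")
```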
There’s one more rule about iterators that makes everything interesting: iterators are also iterables and their iterator is themselves. I explain the consequences of that more fully in that Loop Better talk I mentioned above.
Iterators allow you to make an iterable that computes its items as it goes. That means you can make iterables that are lazy, in that they don’t determine what their next item is until you ask them for it.
Using an iterator instead of a list, set, or another iterable data structure can sometimes allow us to save memory. For example, we can use itertools.repeat to create an iterable that provides 100 million 4’s to us:
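One way this might be written (the variable name is just an illustration):

```python
from itertools import repeat

# times=... limits the otherwise endless repetition
lots_of_fours = repeat(4, times=100_000_000)
print(next(lots_of_fours))  # 4
```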
This iterator takes up 56 bytes of memory on my machine:
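A sketch of how that could be measured with sys.getsizeof; the exact number depends on the Python version and platform, so no specific output is shown:

```python
import sys
from itertools import repeat

lots_of_fours = repeat(4, times=100_000_000)
size_in_bytes = sys.getsizeof(lots_of_fours)
print(size_in_bytes)  # small, and independent of how many items it will provide
```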
An equivalent list of 100 million 4’s takes up many megabytes of memory (around 800 megabytes on a typical 64-bit CPython build, at 8 bytes per reference):
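The sketch below is deliberately scaled down to 1 million items (an assumption made so it runs quickly); the list’s size grows in proportion to its length, unlike the iterator’s:

```python
import sys

# Scaled down from 100 million to 1 million items so the sketch runs quickly
SCALE = 1_000_000
lots_of_fours = [4] * SCALE
list_bytes = sys.getsizeof(lots_of_fours)
print(list_bytes)  # grows with the number of items stored
```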
While iterators can save memory, they can also save time. For example, if you wanted to print out just the first line of a 10 gigabyte log file, you could do this:
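A runnable sketch of the idea; the temporary file and its contents are hypothetical stand-ins for a real log file:

```python
import os
import tempfile

# Stand-in for a giant log file (hypothetical contents, created just for this sketch)
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write("first line\nsecond line\nthird line\n")
    log_path = tmp.name

with open(log_path) as log_file:
    first_line = next(log_file)  # fetch just the first line, not every line

print(first_line, end="")
os.remove(log_path)
```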
File objects in Python are implemented as iterators. As you loop over a file, data is read into memory one line at a time. If we instead used the readlines method to store all lines in memory, we might run out of system memory.
So iterators can save us memory, but iterators can sometimes save us time also.
Additionally, iterators have abilities that other iterables don’t. For example, the laziness of iterators can be used to make iterables that have an unknown length. In fact, you can even make infinitely long iterators.
For example, the itertools.count utility will give us an iterator that will provide every number from 0 upward as we loop over it:
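A short sketch; the cutoff of 5 is an arbitrary choice so the loop stops:

```python
from itertools import count

seen = []
for n in count():
    if n >= 5:        # without a break this loop would never end
        break
    seen.append(n)
    print(n)
```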
That itertools.count object is essentially an infinitely long iterable. And it’s implemented as an iterator.
So we’ve seen that iterators can save us memory, save us CPU time, and unlock new abilities for us.
Let’s make our own iterators. We’ll start by re-inventing the itertools.count iterator object.
Here’s an iterator implemented using a class:
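A sketch consistent with the description in this section; the attribute name num is an assumption:

```python
class Count:
    """Iterator that counts upward forever."""

    def __init__(self, start=0):
        self.num = start          # attribute name chosen for illustration

    def __iter__(self):
        return self               # an iterator is its own iterator

    def __next__(self):
        num = self.num
        self.num += 1             # so the following call returns a larger number
        return num
```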
This class has an initializer that initializes our current number to 0 (or whatever is passed in as the start). The things that make this class usable as an iterator are the __iter__ and __next__ methods.
When an object is passed to the str built-in function, its __str__ method is called. When an object is passed to the len built-in function, its __len__ method is called.
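For instance, with an arbitrary list:

```python
numbers = [1, 2, 3]

print(str(numbers), numbers.__str__())   # same text both ways
print(len(numbers), numbers.__len__())   # same length both ways
```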
Calling the built-in iter function on an object will attempt to call its __iter__ method. Calling the built-in next function on an object will attempt to call its __next__ method.
The iter function is supposed to return an iterator. So our __iter__ function must return an iterator. But our object is an iterator, so it should return itself. Therefore our Count object returns self from its __iter__ method because it is its own iterator.
The next function is supposed to return the next item in our iterator or raise a StopIteration exception when there are no more items. We’re returning the current number and incrementing the number so it’ll be larger during the next __next__ call.
We can manually loop over our Count iterator by calling next on it repeatedly.
We could also loop over our Count object using a for loop, as with any other iterable.
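Assuming a Count class shaped like the one described above (the num attribute is an assumption), both styles of looping might look like this:

```python
class Count:
    """Same shape as the class described above."""

    def __init__(self, start=0):
        self.num = start

    def __iter__(self):
        return self

    def __next__(self):
        num = self.num
        self.num += 1
        return num


# Manual looping with next
c = Count()
first, second = next(c), next(c)
print(first, second)  # 0 1

# Looping with for; the iterator never ends, so we break out ourselves
looped = []
for n in Count(10):
    if n > 12:
        break
    looped.append(n)
print(looped)  # [10, 11, 12]
```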
This object-oriented approach to making an iterator is cool, but it’s not the usual way that Python programmers make iterators. Usually when we want an iterator, we make a generator.
The easiest way to make our own iterators in Python is to create a generator.
There are two ways to make generators in Python.
Given a list of numbers, we can make a generator that will lazily provide us with all the squares of these numbers in two ways: by writing a function that yields each square, or by writing a comprehension-like expression in parentheses.
The first one is called a generator function and the second one is called a generator expression.
Both of these generator objects work the same way. They both have a type of generator and they’re both iterators that provide squares of the numbers in our numbers list.
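The two forms might be sketched as follows; the list contents and the names square_all, squares_from_function, and squares_from_expression are made up for illustration:

```python
favorite_numbers = [6, 57, 4, 7, 68, 95]   # made-up sample data


def square_all(numbers):
    """Generator function: yields one square at a time."""
    for n in numbers:
        yield n**2


squares_from_function = square_all(favorite_numbers)
squares_from_expression = (n**2 for n in favorite_numbers)

print(type(squares_from_function))    # <class 'generator'>
print(type(squares_from_expression))  # <class 'generator'>
print(next(squares_from_function), next(squares_from_expression))  # 36 36
```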
We’re going to talk about both of these approaches to making a generator, but first let’s talk about terminology.
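The word “generator” is used in quite a few ways in Python:
- A generator, also called a generator object, is an iterator whose type is generator
- A generator function is a function containing yield that returns a generator object when called
- A generator expression is a comprehension-like syntax that creates a generator object inline
With that terminology out of the way, let’s take a look at each one of these things individually. We’ll look at generator functions first.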
Generator functions are distinguished from plain old functions by the fact that they have one or more yield statements.
Normally when you call a function, its code is executed:
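A tiny illustration with a hypothetical greet function:

```python
def greet():
    print("Calling greet")
    return "hello"


result = greet()   # the body runs immediately, so "Calling greet" is printed
print(result)      # hello
```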
But if the function has a yield statement in it, it isn’t a typical function anymore. It’s now a generator function, meaning it will return a generator object when called. That generator object can be looped over to execute it until a yield statement is hit:
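A sketch with a hypothetical gen_function, showing that calling it runs nothing until the generator is advanced:

```python
def gen_function():
    print("Started running")
    yield 1
    print("Resumed after first yield")
    yield 2


generator = gen_function()   # nothing printed yet: the body has not started
first = next(generator)      # prints "Started running", then pauses at yield 1
second = next(generator)     # prints "Resumed after first yield", pauses at yield 2
print(first, second)         # 1 2
```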
The mere presence of a yield statement turns a function into a generator function. If you see a function and there’s a yield, you’re working with a different animal. It’s a bit odd, but that’s the way generator functions work.
Okay, let’s look at a real example of a generator function. We’ll make a generator function that does the same thing as our Count iterator class we made earlier.
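One way such a function might be written (the local variable name is an assumption):

```python
def count(start=0):
    """Generator function that counts upward forever."""
    num = start
    while True:
        yield num
        num += 1
```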
Just like our Count iterator class, we can manually loop over the generator we get back from calling count by passing it to next.
And we can loop over this generator object using a for loop, just like before.
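Assuming a count generator function along the lines described above, both styles of looping might look like this:

```python
def count(start=0):
    num = start
    while True:
        yield num
        num += 1


# Manual looping with next
c = count()
first, second = next(c), next(c)
print(first, second)  # 0 1

# Looping with for
looped = []
for n in count(10):
    if n > 12:     # the generator is endless, so we stop it ourselves
        break
    looped.append(n)
print(looped)  # [10, 11, 12]
```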
But this function is considerably shorter than the Count class we created before.
Generator expressions are a list comprehension-like syntax that allows us to make a generator object.
Let’s say we have a list comprehension that filters empty lines from a file and strips newlines from the end:
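A sketch of such a comprehension; io.StringIO stands in for an open file here (an assumption made to keep the example self-contained):

```python
import io

# io.StringIO stands in for an open file in this sketch
poem_file = io.StringIO("Roses are red\n\nViolets are blue\n\n")

lines = [
    line.rstrip("\n")
    for line in poem_file
    if line != "\n"
]
print(lines)  # ['Roses are red', 'Violets are blue']
```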
We could create a generator instead of a list by turning the square brackets of that comprehension into parentheses:
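The generator version of that idea might look like this, again with io.StringIO as a stand-in for an open file:

```python
import io

poem_file = io.StringIO("Roses are red\n\nViolets are blue\n\n")

lines = (
    line.rstrip("\n")
    for line in poem_file
    if line != "\n"
)
print(type(lines))   # <class 'generator'>
print(next(lines))   # Roses are red
```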
Just as our list comprehension gave us a list back, our generator expression gives us a generator object back.
Generator expressions use a shorter inline syntax compared to generator functions. They’re not as powerful though.
If you can write your generator function as a single for loop, with an optional condition, that yields one item per iteration, then you can replace it with a function that returns a generator expression.
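The two equivalent forms might be sketched as follows; is_even and both function names are hypothetical, chosen only for this illustration:

```python
def is_even(n):
    """Hypothetical condition used only for this illustration."""
    return n % 2 == 0


def evens_with_yield(numbers):
    # Generator function form: a loop, a condition, and a yield
    for n in numbers:
        if is_even(n):
            yield n


def evens_with_expression(numbers):
    # Generator expression form of the same loop
    return (
        n
        for n in numbers
        if is_even(n)
    )


print(list(evens_with_yield(range(6))))       # [0, 2, 4]
print(list(evens_with_expression(range(6))))  # [0, 2, 4]
```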
If you can’t write your generator function in that form, then you can’t create a generator expression to replace it.
Note that we’ve changed the example we’re using because we can’t use a generator expression for our previous example (our example that re-implements itertools.count).
You can think of generator expressions as the list comprehensions of the generator world.
If you’re not familiar with list comprehensions, I recommend reading my article on list comprehensions in Python. I note in that article that you can copy-paste your way from a for loop to a list comprehension.
You can also copy-paste your way from a generator function to a function that returns a generator expression.
Generator expressions are to generator functions as list comprehensions are to a simple for loop with an append and a condition.
Generator expressions are so similar to comprehensions that you might even be tempted to say generator comprehension instead of generator expression. That’s not technically the correct name, but if you say it everyone will know what you’re talking about. Ned Batchelder actually proposed that we should all start calling generator expressions generator comprehensions and I tend to agree that this would be a clearer name.
To make an iterator you could create an iterator class, a generator function, or a generator expression.Which way is the best way though?
Generator expressions are very succinct, but they’re not nearly as flexible as generator functions. Generator functions are flexible, but if you need to attach extra methods or attributes to your iterator object, you’ll probably need to switch to using an iterator class.
I’d recommend reaching for generator expressions the same way you reach for list comprehensions. If you’re doing a simple mapping or filtering operation, a generator expression is a great solution. If you’re doing something a bit more sophisticated, you’ll likely need a generator function.
I’d recommend using generator functions the same way you’d use for loops that append to a list. Everywhere you’d see an append method, you’d often see a yield statement instead.
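To illustrate that correspondence, here is a sketch with two hypothetical functions, one that appends to a list and one that yields:

```python
def squares_list(numbers):
    squares = []
    for n in numbers:
        squares.append(n**2)   # builds the whole list up front
    return squares


def squares_lazy(numbers):
    for n in numbers:
        yield n**2             # hands back one square at a time


print(squares_list([1, 2, 3]))        # [1, 4, 9]
print(list(squares_lazy([1, 2, 3])))  # [1, 4, 9]
```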
And I’d say that you should almost never create an iterator class. If you find you need an iterator class, try to write a generator function that does what you need and see how it compares to your iterator class.
You’ll see iterator classes in the wild, but there’s rarely a good opportunity to write your own.
While it’s rare to create your own iterator class, it’s not as unusual to make your own iterable class. And iterable classes require an __iter__ method which returns an iterator. Since generators are the easy way to make an iterator, we can use a generator function or a generator expression to create our __iter__ methods.
For example, here’s an iterable that provides x-y coordinates:
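A sketch of what such a Point class might look like, based on the description in this section:

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __iter__(self):
        # The yield statements make __iter__ a generator function
        yield self.x
        yield self.y
```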
Note that our Point class here creates an iterable when called (not an iterator). That means our __iter__ method must return an iterator. The easiest way to create an iterator is by making a generator function, so that’s just what we did.
We stuck yield in our __iter__ to make it into a generator function and now our Point class can be looped over, just like any other iterable.
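Assuming a Point class like the one described above, looping over it might look like this (the coordinate values are arbitrary):

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __iter__(self):
        yield self.x
        yield self.y


p = Point(1, 2)
x, y = p          # tuple unpacking relies on iteration
print(x, y)       # 1 2
print(list(p))    # [1, 2]
```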
Generator functions are a natural fit for creating __iter__ methods on your iterable classes.
Dictionaries are the typical way to make a mapping in Python. Functions are the typical way to make a callable object in Python. Likewise, generators are the typical way to make an iterator in Python.
So when you’re thinking “it sure would be nice to implement an iterable that lazily computes things as it’s looped over,” think of iterators.
And when you’re considering how to create your own iterator, think of generator functions and generator expressions.
You won’t learn new Python skills by reading; you’ll learn them by writing code.
If you’d like to practice making an iterator right now, sign up for Python Morsels using the form below and I’ll immediately give you an exercise to practice making an iterator.
Hi! My name is Trey Hunner.
I help Python teams write better Python code through Python team training.
I also help individuals level-up their Python skills with weekly Python skill-building.