Trey Hunner

I help developers level-up their Python skills


How to make an iterator in Python


I wrote an article some time ago on the iterator protocol that powers Python’s for loops. One thing I left out of that article was how to make your own iterators.

In this article I’m going to discuss why you’d want to make your own iterators and then show you how to do so.

    What is an iterator?

    First let’s quickly address what an iterator is. For a much more detailed explanation, consider watching my Loop Better talk or reading the article based on the talk.

    An iterable is anything you’re able to loop over.

    An iterator is the object that does the actual iterating.

    You can get an iterator from any iterable by calling the built-in iter function on the iterable.

        >>> favorite_numbers = [6, 57, 4, 7, 68, 95]
        >>> iter(favorite_numbers)
        <list_iterator object at 0x7fe8e5623160>

    You can use the built-in next function on an iterator to get the next item from it (you’ll get a StopIteration exception if there are no more items).

        >>> favorite_numbers = [6, 57, 4, 7, 68, 95]
        >>> my_iterator = iter(favorite_numbers)
        >>> next(my_iterator)
        6
        >>> next(my_iterator)
        57

    There’s one more rule about iterators that makes everything interesting: iterators are also iterables and their iterator is themselves. I explain the consequences of that more fully in that Loop Better talk I mentioned above.
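
    Here is a small sketch of that rule, reusing the favorite_numbers list from above. The variable names are just for illustration. It shows that calling iter on an iterator gives back the same object, and that an iterator is used up after one pass.

```python
favorite_numbers = [6, 57, 4, 7, 68, 95]
my_iterator = iter(favorite_numbers)

# Calling iter() on an iterator hands back the very same object.
same_object = iter(my_iterator) is my_iterator

# A list is an iterable but not an iterator, so iter() builds a new object.
list_is_own_iterator = iter(favorite_numbers) is favorite_numbers

# An iterator is consumed as it goes, so a second pass finds nothing left.
first_pass = list(my_iterator)
second_pass = list(my_iterator)

print(same_object)           # True
print(list_is_own_iterator)  # False
print(first_pass)            # [6, 57, 4, 7, 68, 95]
print(second_pass)           # []
```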

    Why make an iterator?

    Iterators allow you to make an iterable that computes its items as it goes. That means you can make iterables that are lazy, in that they don’t determine what their next item is until you ask them for it.

    Using an iterator instead of a list, set, or another iterable data structure can sometimes allow us to save memory. For example, we can use itertools.repeat to create an iterable that provides 100 million 4’s to us:

        >>> from itertools import repeat
        >>> lots_of_fours = repeat(4, times=100_000_000)

    This iterator takes up 56 bytes of memory on my machine:

        >>> import sys
        >>> sys.getsizeof(lots_of_fours)
        56

    An equivalent list of 100 million 4’s takes up hundreds of megabytes of memory (about 800 MB here):

        >>> lots_of_fours = [4] * 100_000_000
        >>> import sys
        >>> sys.getsizeof(lots_of_fours)
        800000064

    While iterators can save memory, they can also save time. For example, if you wanted to print out just the first line of a 10 gigabyte log file, you could do this:

        >>> print(next(open('giant_log_file.txt')))
        This is the first line in a giant file

    File objects in Python are implemented as iterators. As you loop over a file, data is read into memory one line at a time. If we instead used the readlines method to store all lines in memory, we might run out of system memory.
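
    If you don’t have a giant log file handy, here is a minimal sketch that uses a small temporary file as a stand-in. The file contents and variable names are made up for illustration. It pulls just the first line with next and checks that the file object is its own iterator.

```python
import os
import tempfile

# Write a small stand-in for the giant log file (made-up contents).
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as f:
    f.write("first line\nsecond line\nthird line\n")
    path = f.name

# next() asks for a single line; we never build a list of every line.
with open(path) as log_file:
    first_line = next(log_file)
    is_own_iterator = iter(log_file) is log_file

os.remove(path)

print(repr(first_line))   # 'first line\n'
print(is_own_iterator)    # True
```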

    So iterators can save us memory, but iterators can sometimes save us time also.

    Additionally, iterators have abilities that other iterables don’t. For example, the laziness of iterators can be used to make iterables that have an unknown length. In fact, you can even make infinitely long iterators.

    For example, the itertools.count utility will give us an iterator that will provide every number from 0 upward as we loop over it:

        >>> from itertools import count
        >>> for n in count():
        ...     print(n)
        ...
        0
        1
        2
        (this goes on forever)

    That itertools.count object is essentially an infinitely long iterable. And it’s implemented as an iterator.
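
    Since an infinite iterator never stops on its own, it helps to have a way to take just a few items from it. One option (not covered above, so treat this as a side note) is itertools.islice from the standard library, sketched here:

```python
from itertools import count, islice

# islice lazily takes a limited number of items from any iterable,
# so it is safe to use on an infinite iterator like count().
first_five = list(islice(count(), 5))
from_ten = list(islice(count(10), 3))

print(first_five)  # [0, 1, 2, 3, 4]
print(from_ten)    # [10, 11, 12]
```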

    Making an iterator: the object-oriented way

    So we’ve seen that iterators can save us memory, save us CPU time, and unlock new abilities for us.

    Let’s make our own iterators. We’ll start by re-inventing the itertools.count iterator object.

    Here’s an iterator implemented using a class:

        class Count:

            """Iterator that counts upward forever."""

            def __init__(self, start=0):
                self.num = start

            def __iter__(self):
                return self

            def __next__(self):
                num = self.num
                self.num += 1
                return num

    This class has an initializer that initializes our current number to 0 (or whatever is passed in as the start). The things that make this class usable as an iterator are the __iter__ and __next__ methods.

    When an object is passed to the str built-in function, its __str__ method is called. When an object is passed to the len built-in function, its __len__ method is called.

        >>> numbers = [1, 2, 3]
        >>> str(numbers), numbers.__str__()
        ('[1, 2, 3]', '[1, 2, 3]')
        >>> len(numbers), numbers.__len__()
        (3, 3)

    Calling the built-in iter function on an object will attempt to call its __iter__ method. Calling the built-in next function on an object will attempt to call its __next__ method.

    The iter function is supposed to return an iterator. So our __iter__ method must return an iterator. But our object is an iterator, so it should return itself. Therefore our Count object returns self from its __iter__ method because it is its own iterator.

    The next function is supposed to return the next item in our iterator or raise a StopIteration exception when there are no more items. We’re returning the current number and incrementing the number so it’ll be larger during the next __next__ call.

    We can manually loop over our Count iterator class like this:

        >>> c = Count()
        >>> next(c)
        0
        >>> next(c)
        1

    We could also loop over our Count object using a for loop, as with any other iterable:

        >>> for n in Count():
        ...     print(n)
        ...
        0
        1
        2
        (this goes on forever)
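
    Our Count class never runs out of items, so it never raises StopIteration. Here is a sketch of a bounded variant to show that side of the protocol. The CountUpTo name and its stop argument are hypothetical, invented for this example. The while loop at the end is a rough picture of what a for loop does with iter and next, not the exact implementation.

```python
class CountUpTo:
    """Iterator that counts upward from start and stops before stop."""

    def __init__(self, start, stop):
        self.num = start
        self.stop = stop

    def __iter__(self):
        return self

    def __next__(self):
        # Signal the end of iteration once we reach the stopping point.
        if self.num >= self.stop:
            raise StopIteration
        num = self.num
        self.num += 1
        return num


print(list(CountUpTo(0, 3)))  # [0, 1, 2]

# Roughly what a for loop does: call next() until StopIteration is raised.
iterator = iter(CountUpTo(5, 8))
seen = []
while True:
    try:
        item = next(iterator)
    except StopIteration:
        break
    seen.append(item)

print(seen)  # [5, 6, 7]
```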

    This object-oriented approach to making an iterator is cool, but it’s not the usual way that Python programmers make iterators. Usually when we want an iterator, we make a generator.

    Generators: the easy way to make an iterator

    The easiest way to make our own iterators in Python is to create a generator.

    There are two ways to make generators in Python.

    Given this list of numbers:

        >>> favorite_numbers = [6, 57, 4, 7, 68, 95]

    We can make a generator that will lazily provide us with all the squares of these numbers like this:

        >>> def square_all(numbers):
        ...     for n in numbers:
        ...         yield n**2
        ...
        >>> squares = square_all(favorite_numbers)

    Or we can make the same generator like this:

        >>> squares = (n**2 for n in favorite_numbers)

    The first one is called a generator function and the second one is called a generator expression.

    Both of these generator objects work the same way. They both have a type of generator and they’re both iterators that provide squares of the numbers in our numbers list.

        >>> type(squares)
        <class 'generator'>
        >>> next(squares)
        36
        >>> next(squares)
        3249
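
    Because generators are iterators, they are consumed as we pull items from them. Here is a small sketch using the same favorite_numbers list. The first, rest, and leftover names are just for illustration.

```python
favorite_numbers = [6, 57, 4, 7, 68, 95]
squares = (n**2 for n in favorite_numbers)

first = next(squares)     # pulls the first square only
rest = list(squares)      # consumes everything that remains
leftover = list(squares)  # the generator is now exhausted

print(first)     # 36
print(rest)      # [3249, 16, 49, 4624, 9025]
print(leftover)  # []
```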

    We’re going to talk about both of these approaches to making a generator, but first let’s talk about terminology.

    The word “generator” is used in quite a few ways in Python:

    • A generator, also called a generator object, is an iterator whose type is generator
    • A generator function is a special syntax that allows us to make a function which returns a generator object when we call it
    • A generator expression is a comprehension-like syntax that allows you to create a generator object inline

    With that terminology out of the way, let’s take a look at each one of these things individually. We’ll look at generator functions first.

    Generator functions

    Generator functions are distinguished from plain old functions by the fact that they have one or more yield statements.

    Normally when you call a function, its code is executed:

        >>> def gimme4_please():
        ...     print("Let me go get that number for you.")
        ...     return 4
        ...
        >>> num = gimme4_please()
        Let me go get that number for you.
        >>> num
        4

    But if the function has a yield statement in it, it isn’t a typical function anymore. It’s now a generator function, meaning it will return a generator object when called. That generator object can be looped over to execute it until a yield statement is hit:

        >>> def gimme4_later_please():
        ...     print("Let me go get that number for you.")
        ...     yield 4
        ...
        >>> get4 = gimme4_later_please()
        >>> get4
        <generator object gimme4_later_please at 0x7f78b2e7e2b0>
        >>> num = next(get4)
        Let me go get that number for you.
        >>> num
        4

    The mere presence of a yield statement turns a function into a generator function. If you see a function and there’s a yield, you’re working with a different animal. It’s a bit odd, but that’s the way generator functions work.
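
    To make that pausing behavior a bit more visible, here is a sketch with two yield statements. The two_steps function and the events list are hypothetical names invented for this example. The list records how far the function body has run each time we ask for an item.

```python
events = []


def two_steps():
    events.append("started")
    yield 1
    events.append("resumed")
    yield 2
    events.append("finished")


gen = two_steps()
events_after_call = list(events)    # nothing has run yet

first = next(gen)
events_after_first = list(events)   # ran up to the first yield

second = next(gen)
events_after_second = list(events)  # resumed, then paused at the second yield

remaining = list(gen)               # runs the rest of the body; no more items

print(events_after_call)    # []
print(events_after_first)   # ['started']
print(events_after_second)  # ['started', 'resumed']
print(remaining, events)    # [] ['started', 'resumed', 'finished']
```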

    Okay, let’s look at a real example of a generator function. We’ll make a generator function that does the same thing as our Count iterator class we made earlier.

        def count(start=0):
            num = start
            while True:
                yield num
                num += 1

    Just like our Count iterator class, we can manually loop over the generator we get back from calling count:

        >>> c = count()
        >>> next(c)
        0
        >>> next(c)
        1

    And we can loop over this generator object using a for loop, just like before:

        >>> for n in count():
        ...     print(n)
        ...
        0
        1
        2
        (this goes on forever)

    But this function is considerably shorter than our Count class we created before.
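
    A generator function doesn’t have to run forever. Here is a sketch of a bounded variant; the count_up_to name and its stop argument are hypothetical. When the function body finishes, the generator raises StopIteration for us, so we never raise it ourselves.

```python
def count_up_to(start, stop):
    """Generator function that counts from start up to (not including) stop."""
    num = start
    while num < stop:
        yield num
        num += 1


c = count_up_to(0, 2)
first = next(c)
second = next(c)

# The function body has finished, so the next call raises StopIteration.
try:
    next(c)
except StopIteration:
    exhausted = True
else:
    exhausted = False

print(first, second, exhausted)  # 0 1 True
print(list(count_up_to(5, 8)))   # [5, 6, 7]
```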

    Generator expressions

    Generator expressions are a list comprehension-like syntax that allows us to make a generator object.

    Let’s say we have a list comprehension that filters empty lines from a file and strips newlines from the end:

        lines = [
            line.rstrip('\n')
            for line in poem_file
            if line != '\n'
        ]

    We could create a generator instead of a list, by turning the square brackets of that comprehension into parentheses:

        lines = (
            line.rstrip('\n')
            for line in poem_file
            if line != '\n'
        )

    Just as our list comprehension gave us a list back, our generator expression gives us a generator object back:

        >>> type(lines)
        <class 'generator'>
        >>> next(lines)
        ' This little bag I hope will prove'
        >>> next(lines)
        'To be not vainly made--'

    Generator expressions use a shorter inline syntax compared to generator functions. They’re not as powerful though.

    If you can write your generator function in this form:

        def get_a_generator(some_iterable):
            for item in some_iterable:
                if some_condition(item):
                    yield item

    Then you can replace it with a generator expression:

        def get_a_generator(some_iterable):
            return (
                item
                for item in some_iterable
                if some_condition(item)
            )

    If you can’t write your generator function in that form, then you can’t create a generator expression to replace it.

    Note that we’ve changed the example we’re using because we can’t use a generator expression for our previous example (our example that re-implements itertools.count).
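
    As another illustration of a generator that doesn’t fit that for-and-if shape, here is a sketch of a Fibonacci generator function. The fibonacci name is made up for this example. It uses a while loop and carries two pieces of state between yields, which is awkward to express as a plain generator expression. The islice helper from itertools is used only to take a few items from the infinite generator.

```python
from itertools import islice


def fibonacci():
    """Generator function that yields Fibonacci numbers forever."""
    a, b = 0, 1
    while True:
        yield a
        # Both values are updated together before the next yield.
        a, b = b, a + b


first_eight = list(islice(fibonacci(), 8))
print(first_eight)  # [0, 1, 1, 2, 3, 5, 8, 13]
```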

    Generator expressions vs generator functions

    You can think of generator expressions as the list comprehensions of the generator world.

    If you’re not familiar with list comprehensions, I recommend reading my article on list comprehensions in Python. I note in that article that you can copy-paste your way from a for loop to a list comprehension.

    You can also copy-paste your way from a generator function to a function that returns a generator expression.

    Generator expressions are to generator functions as list comprehensions are to a simple for loop with an append and a condition.

    Generator expressions are so similar to comprehensions that you might even be tempted to say generator comprehension instead of generator expression. That’s not technically the correct name, but if you say it everyone will know what you’re talking about. Ned Batchelder actually proposed that we should all start calling generator expressions generator comprehensions and I tend to agree that this would be a clearer name.

    So what’s the best way to make an iterator?

    To make an iterator you could create an iterator class, a generator function, or a generator expression. Which way is the best way though?

    Generator expressions are very succinct, but they’re not nearly as flexible as generator functions. Generator functions are flexible, but if you need to attach extra methods or attributes to your iterator object, you’ll probably need to switch to using an iterator class.

    I’d recommend reaching for generator expressions the same way you reach for list comprehensions. If you’re doing a simple mapping or filtering operation, a generator expression is a great solution. If you’re doing something a bit more sophisticated, you’ll likely need a generator function.

    I’d recommend using generator functions the same way you’d use for loops that append to a list. Everywhere you’d see an append method, you’d often see a yield statement instead.
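
    Here is a side-by-side sketch of that idea. Both function names are hypothetical. The first builds a list with append and the second swaps the append for a yield, which makes it lazy.

```python
def squares_list(numbers):
    """Eagerly build a list of squares with a for loop and append."""
    squares = []
    for n in numbers:
        squares.append(n**2)
    return squares


def squares_lazy(numbers):
    """Same loop, but each append became a yield, so nothing is stored."""
    for n in numbers:
        yield n**2


eager = squares_list([1, 2, 3])
lazy = squares_lazy([1, 2, 3])

print(eager)       # [1, 4, 9]
print(list(lazy))  # [1, 4, 9]
```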

    And I’d say that you should almost never create an iterator class. If you find you need an iterator class, try to write a generator function that does what you need and see how it compares to your iterator class.

    Generators can help when making iterables too

    You’ll see iterator classes in the wild, but there’s rarely a good opportunity to write your own.

    While it’s rare to create your own iterator class, it’s not as unusual to make your own iterable class. And iterable classes require an __iter__ method which returns an iterator. Since generators are the easy way to make an iterator, we can use a generator function or a generator expression to create our __iter__ methods.

    For example, here’s an iterable that provides x-y coordinates:

        class Point:
            def __init__(self, x, y):
                self.x, self.y = x, y
            def __iter__(self):
                yield self.x
                yield self.y

    Note that our Point class here creates an iterable when called (not an iterator). That means our __iter__ method must return an iterator. The easiest way to create an iterator is by making a generator function, so that’s just what we did.

    We stuck yield in our __iter__ to make it into a generator function and now our Point class can be looped over, just like any other iterable.

        >>> p = Point(1, 2)
        >>> x, y = p
        >>> print(x, y)
        1 2
        >>> list(p)
        [1, 2]

    Generator functions are a natural fit for creating __iter__ methods on your iterable classes.
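
    Here is a sketch of two consequences of that design. The ScaledPoint class is hypothetical, invented to show an __iter__ method that returns a generator expression instead of using yield. The rest shows that an iterable like Point hands out a fresh iterator on every loop, so it can be looped over more than once.

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __iter__(self):
        # A generator function: each call returns a fresh generator.
        yield self.x
        yield self.y


class ScaledPoint:
    """Hypothetical iterable whose __iter__ returns a generator expression."""

    def __init__(self, x, y, factor):
        self.x, self.y, self.factor = x, y, factor

    def __iter__(self):
        return (coordinate * self.factor for coordinate in (self.x, self.y))


p = Point(1, 2)
first_loop = list(p)
second_loop = list(p)            # still works: p is an iterable, not an iterator
is_own_iterator = iter(p) is p

print(first_loop, second_loop)       # [1, 2] [1, 2]
print(is_own_iterator)               # False
print(list(ScaledPoint(1, 2, 10)))   # [10, 20]
```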

    Generators are the way to make iterators

    Dictionaries are the typical way to make a mapping in Python. Functions are the typical way to make a callable object in Python. Likewise, generators are the typical way to make an iterator in Python.

    So when you’re thinking “it sure would be nice to implement an iterable that lazily computes things as it’s looped over,” think of iterators.

    And when you’re considering how to create your own iterator, think of generator functions and generator expressions.

    Practice making an iterator right now

    You won’t learn new Python skills by reading, you’ll learn them by writing code.

    If you’d like to practice making an iterator right now, sign up for Python Morsels using the form below and I’ll immediately give you an exercise to practice making an iterator.

