
Python Enhancement Proposals

PEP 479 – Change StopIteration handling inside generators

Author:
Chris Angelico <rosuav at gmail.com>, Guido van Rossum <guido at python.org>
Status:
Final
Type:
Standards Track
Created:
15-Nov-2014
Python-Version:
3.5
Post-History:
15-Nov-2014, 19-Nov-2014, 05-Dec-2014


Abstract

This PEP proposes a change to generators: when StopIteration is raised inside a generator, it is replaced with RuntimeError. (More precisely, this happens when the exception is about to bubble out of the generator’s stack frame.) Because the change is backwards incompatible, the feature is initially introduced using a __future__ statement.

Acceptance

This PEP was accepted by the BDFL on November 22. Because of the exceptionally short period from first draft to acceptance, the main objections brought up after acceptance were carefully considered and have been reflected in the “Alternate proposals” section below. However, none of the discussion changed the BDFL’s mind and the PEP’s acceptance is now final. (Suggestions for clarifying edits are still welcome: unlike IETF RFCs, the text of a PEP is not cast in stone after its acceptance, although the core design/plan/specification should not change after acceptance.)

Rationale

The interaction of generators and StopIteration is currently somewhat surprising, and can conceal obscure bugs. An unexpected exception should not result in subtly altered behaviour, but should cause a noisy and easily-debugged traceback. Currently, StopIteration raised accidentally inside a generator function will be interpreted as the end of the iteration by the loop construct driving the generator.

The main goal of the proposal is to ease debugging in the situation where an unguarded next() call (perhaps several stack frames deep) raises StopIteration and causes the iteration controlled by the generator to terminate silently. (Whereas, when some other exception is raised, a traceback is printed pinpointing the cause of the problem.)

This is particularly pernicious in combination with the yield from construct of PEP 380, as it breaks the abstraction that a subgenerator may be factored out of a generator. That PEP notes this limitation, but notes that “use cases for these [are] rare to non-existent”. Unfortunately, while intentional use is rare, it is easy to stumble on these cases by accident:

    import contextlib

    @contextlib.contextmanager
    def transaction():
        print('begin')
        try:
            yield from do_it()
        except:
            print('rollback')
            raise
        else:
            print('commit')

    def do_it():
        print('Refactored initial setup')
        yield  # Body of with-statement is executed here
        print('Refactored finalization of successful transaction')

    def gene():
        for i in range(2):
            with transaction():
                yield i
                # return
                raise StopIteration  # This is wrong
            print('Should not be reached')

    for i in gene():
        print('main: i =', i)

Here factoring out do_it into a subgenerator has introduced a subtle bug: if the wrapped block raises StopIteration, under the current behavior this exception will be swallowed by the context manager; and, worse, the finalization is silently skipped! Similarly problematic behavior occurs when an asyncio coroutine raises StopIteration, causing it to terminate silently, or when next is used to take the first result from an iterator that unexpectedly turns out to be empty, for example:

    # using the same context manager as above
    import pathlib

    with transaction():
        print('commit file {}'.format(
            # I can never remember what the README extension is
            next(pathlib.Path('/some/dir').glob('README*'))))

In both cases, the refactoring abstraction of yield from breaks in the presence of bugs in client code.

Additionally, the proposal reduces the difference between list comprehensions and generator expressions, preventing surprises such as the one that started this discussion [2]. Henceforth, the following statements will produce the same result if either produces a result at all:

    a = list(F(x) for x in xs if P(x))
    a = [F(x) for x in xs if P(x)]

With the current state of affairs, it is possible to write a function F(x) or a predicate P(x) that causes the first form to produce a (truncated) result, while the second form raises an exception (namely, StopIteration). With the proposed change, both forms will raise an exception at this point (albeit RuntimeError in the first case and StopIteration in the second).
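The difference between the two forms can be observed with a small sketch. It assumes an interpreter on which the new semantics are active (Python 3.7 or later according to the transition plan in this PEP); F, xs, and the outcome helper are illustrative names chosen here, not taken from any real code base.

```python
def F(x):
    # Deliberately buggy: leaks StopIteration for one particular input.
    if x == 3:
        raise StopIteration
    return x * 10

xs = range(5)

def outcome(build):
    """Run build() and return its result, or the name of the exception raised."""
    try:
        return build()
    except Exception as exc:
        return type(exc).__name__

# Generator expression: the escaping StopIteration is replaced.
genexp_outcome = outcome(lambda: list(F(x) for x in xs))
# List comprehension: the StopIteration propagates unchanged.
listcomp_outcome = outcome(lambda: [F(x) for x in xs])

print(genexp_outcome, listcomp_outcome)
```

Neither form yields a truncated list here; both fail loudly, differing only in the exception type.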

Finally, the proposal also clears up the confusion about how to terminate a generator: the proper way is return, not raise StopIteration.

As an added bonus, the above changes bring generator functions much more in line with regular functions. If you wish to take a piece of code presented as a generator and turn it into something else, you can usually do this fairly simply, by replacing every yield with a call to print() or list.append(); however, if there are any bare next() calls in the code, you have to be aware of them. If the code was originally written without relying on StopIteration terminating the function, the transformation would be that much easier.

Background information

When a generator frame is (re)started as a result of a __next__() (or send() or throw()) call, one of three outcomes can occur:

  • A yield point is reached, and the yielded value is returned.
  • The frame is returned from; StopIteration is raised.
  • An exception is raised, which bubbles out.

In the latter two cases the frame is abandoned (and the generator object’s gi_frame attribute is set to None).
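The three outcomes can be illustrated with a minimal sketch. The two generator functions below are made up for this example; the only assumption is a standard CPython 3 interpreter.

```python
def yields_then_returns():
    yield "value"
    # Falling off the end here means the frame is returned from.

def yields_then_fails():
    yield "value"
    raise ValueError("boom")

g = yields_then_returns()
first = next(g)                      # outcome 1: a yield point is reached
try:
    next(g)
    second = "no exception"
except StopIteration:
    second = "StopIteration"         # outcome 2: the frame returned
frame_after_return = g.gi_frame      # frame has been abandoned

h = yields_then_fails()
next(h)
try:
    next(h)
    third = "no exception"
except ValueError:
    third = "ValueError"             # outcome 3: an exception bubbles out
frame_after_error = h.gi_frame       # frame has been abandoned
```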

Proposal

If a StopIteration is about to bubble out of a generator frame, it is replaced with RuntimeError, which causes the next() call (which invoked the generator) to fail, passing that exception out. From then on it’s just like any old exception. [3]
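As a hedged illustration of the replacement, the sketch below uses an unguarded next() on an empty iterator inside a made-up generator named leaky. It assumes the new semantics are in effect (Python 3.7 or later); on such interpreters the original StopIteration is chained to the RuntimeError as its __cause__.

```python
def leaky():
    yield 1
    next(iter([]))   # unguarded next() on an empty iterator raises StopIteration
    yield 2          # never reached

g = leaky()
first = next(g)

try:
    next(g)
    replaced = None
except RuntimeError as exc:
    # The StopIteration never reaches the caller; RuntimeError does.
    replaced = exc
```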

This affects the third outcome listed above, without altering any other effects. Furthermore, it only affects this outcome when the exception raised is StopIteration (or a subclass thereof).

Note that the proposed replacement happens at the point where the exception is about to bubble out of the frame, i.e. after any except or finally blocks that could affect it have been exited. The StopIteration raised by returning from the frame is not affected (the point being that StopIteration means that the generator terminated “normally”, i.e. it did not raise an exception).

A subtle issue is what will happen if the caller, having caught the RuntimeError, calls the generator object’s __next__() method again. The answer is that from this point on it will raise StopIteration; the behavior is the same as when any other exception was raised by the generator.

Another logical consequence of the proposal: if someone uses g.throw(StopIteration) to throw a StopIteration exception into a generator, if the generator doesn’t catch it (which it could do using a try/except around the yield), it will be transformed into RuntimeError.
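The last two points can be checked with a short sketch, again assuming the new semantics are active (Python 3.7 or later). The generator names passive and catching are hypothetical.

```python
def passive():
    yield 1

def catching():
    try:
        yield 1
    except StopIteration:
        yield "caught"

# A generator that does not catch the thrown StopIteration.
g = passive()
next(g)
try:
    g.throw(StopIteration)
    thrown_outcome = "no exception"
except RuntimeError:
    thrown_outcome = "RuntimeError"

# Calling __next__() again after the RuntimeError: the generator is finished.
try:
    next(g)
    after_outcome = "no exception"
except StopIteration:
    after_outcome = "StopIteration"

# A generator that catches the thrown StopIteration around its yield.
c = catching()
next(c)
caught_value = c.throw(StopIteration)
```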

During the transition phase, the new feature must be enabled per-module using:

    from __future__ import generator_stop

Any generator function constructed under the influence of this directive will have the REPLACE_STOPITERATION flag set on its code object, and generators with the flag set will behave according to this proposal. Once the feature becomes standard, the flag may be dropped; code should not inspect generators for it.
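One way to see how a given interpreter records this feature is to query the standard __future__ module, which describes each feature with its optional and mandatory release. This is only a sketch; it reads metadata and does not inspect any code-object flag.

```python
import __future__

feature = __future__.generator_stop

# Release in which the "from __future__ import" form first became available.
optional_in = feature.getOptionalRelease()
# Release in which the behaviour became the default.
mandatory_in = feature.getMandatoryRelease()

print(optional_in[:2], mandatory_in[:2])
```

The two version tuples should agree with the transition plan given later in this PEP.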

A proof-of-concept patch has been created to facilitate testing.[4]

Consequences for existing code

This change will affect existing code that depends on StopIteration bubbling up. The pure Python reference implementation of groupby [5] currently has comments “Exit on StopIteration” where it is expected that the exception will propagate and then be handled. This will be unusual, but not unknown, and such constructs will fail. Other examples abound, e.g. [6], [7].

(Alyssa Coghlan comments: “If you wanted to factor out a helper function that terminated the generator you’d have to do ‘return yield from helper()’ rather than just ‘helper()’.”)

There are also examples of generator expressions floating around that rely on a StopIteration raised by the expression, the target or the predicate (rather than by the __next__() call implied in the for loop proper).

Writing backwards and forwards compatible code

With the exception of hacks that raise StopIteration to exit a generator expression, it is easy to write code that works equally well under older Python versions as under the new semantics.

This is done by enclosing those places in the generator body where a StopIteration is expected (e.g. bare next() calls or in some cases helper functions that are expected to raise StopIteration) in a try/except construct that returns when StopIteration is raised. The try/except construct should appear directly in the generator function; doing this in a helper function that is not itself a generator does not work. If raise StopIteration occurs directly in a generator, simply replace it with return.
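A minimal sketch of this pattern follows. The generator pairs is a made-up example; the point is that the try/except sits directly in the generator body, wraps only the bare next() calls, and turns exhaustion into a plain return.

```python
def pairs(iterable):
    """Yield consecutive non-overlapping pairs; drop a trailing odd item."""
    it = iter(iterable)
    while True:
        try:
            a = next(it)
            b = next(it)
        except StopIteration:
            # Expected end of input: finish the generator explicitly.
            return
        yield a, b

print(list(pairs([1, 2, 3, 4, 5])))
```

Keeping the yield outside the try block means a StopIteration thrown into the generator by a caller is not mistaken for end of input.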

Examples of breakage

Generators which explicitly raise StopIteration can generally be changed to simply return instead. This will be compatible with all existing Python versions, and will not be affected by __future__. Here are some illustrations from the standard library.

Lib/ipaddress.py:

    if other == self:
        raise StopIteration

Becomes:

    if other == self:
        return

In some cases, this can be combined with yield from to simplify the code, such as Lib/difflib.py:

    if context is None:
        while True:
            yield next(line_pair_iterator)

Becomes:

    if context is None:
        yield from line_pair_iterator
        return

(The return is necessary for a strictly-equivalent translation, though in this particular file, there is no further code, and the return can be omitted.) For compatibility with pre-3.3 versions of Python, this could be written with an explicit for loop:

    if context is None:
        for line in line_pair_iterator:
            yield line
        return
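To see why the rewrite matters, the following sketch compares the two shapes on an interpreter with the new semantics (Python 3.7 or later is assumed). The function names old_style and new_style are illustrative and are not the difflib code itself.

```python
def old_style(it):
    # Relies on StopIteration from next() to end the generator.
    while True:
        yield next(it)

def new_style(it):
    # Delegates iteration; ends normally when the source is exhausted.
    yield from it
    return

new_result = list(new_style(iter([1, 2, 3])))

try:
    list(old_style(iter([1, 2, 3])))
    old_outcome = "completed"
except RuntimeError:
    old_outcome = "RuntimeError"
```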

More complicated iteration patterns will need explicit try/except constructs. For example, a hypothetical parser like this:

    def parser(f):
        while True:
            data = next(f)
            while True:
                line = next(f)
                if line == "- end -": break
                data += line
            yield data

would need to be rewritten as:

    def parser(f):
        while True:
            try:
                data = next(f)
                while True:
                    line = next(f)
                    if line == "- end -": break
                    data += line
                yield data
            except StopIteration:
                return

or possibly:

    def parser(f):
        for data in f:
            while True:
                line = next(f)
                if line == "- end -": break
                data += line
            yield data

The latter form obscures the iteration by purporting to iterate over the file with a for loop, but then also fetches more data from the same iterator during the loop body. It does, however, clearly differentiate between a “normal” termination (StopIteration instead of the initial line) and an “abnormal” termination (failing to find the end marker in the inner loop, which will now raise RuntimeError).
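The difference between the two rewrites shows up on truncated input. The sketch below renames them parser_try and parser_for for side-by-side comparison and feeds them made-up line lists; it assumes the new semantics are active (Python 3.7 or later).

```python
def parser_try(f):
    while True:
        try:
            data = next(f)
            while True:
                line = next(f)
                if line == "- end -":
                    break
                data += line
            yield data
        except StopIteration:
            return

def parser_for(f):
    for data in f:
        while True:
            line = next(f)
            if line == "- end -":
                break
            data += line
        yield data

complete = ["a", "b", "- end -", "c", "- end -"]
truncated = ["a", "b", "- end -", "c"]      # final end marker is missing

try_complete = list(parser_try(iter(complete)))
for_complete = list(parser_for(iter(complete)))

# The try/except form stops quietly when the end marker is missing.
try_truncated = list(parser_try(iter(truncated)))

# The for-loop form reports the missing end marker as RuntimeError.
try:
    list(parser_for(iter(truncated)))
    for_truncated = "completed"
except RuntimeError:
    for_truncated = "RuntimeError"
```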

This effect of StopIteration has been used to cut a generator expression short, creating a form of takewhile:

    def stop():
        raise StopIteration

    print(list(x for x in range(10) if x < 5 or stop()))
    # prints [0, 1, 2, 3, 4]

Under the current proposal, this form of non-local flow control is not supported, and would have to be rewritten in statement form:

    def gen():
        for x in range(10):
            if x >= 5: return
            yield x

    print(list(gen()))
    # prints [0, 1, 2, 3, 4]

While this is a small loss of functionality, it is functionality that often comes at the cost of readability, and just as lambda has restrictions compared to def, so does a generator expression have restrictions compared to a generator function. In many cases, the transformation to full generator function will be trivially easy, and may improve structural clarity.
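Where the intent really is "take items while a condition holds", the existing itertools.takewhile function is another option that keeps the one-line form. The sketch below shows it next to the statement form; this alternative is a suggestion made here, not something the PEP itself prescribes.

```python
import itertools

def gen():
    for x in range(10):
        if x >= 5:
            return
        yield x

via_function = list(gen())
via_takewhile = list(itertools.takewhile(lambda x: x < 5, range(10)))

print(via_function, via_takewhile)
```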

Explanation of generators, iterators, and StopIteration

The proposal does not change the relationship between generators and iterators: a generator object is still an iterator, and not all iterators are generators. Generators have additional methods that iterators don’t have, like send and throw. All this is unchanged. Nothing changes for generator users; only authors of generator functions may have to learn something new. (This includes authors of generator expressions that depend on early termination of the iteration by a StopIteration raised in a condition.)

An iterator is an object with a __next__ method. Like many other special methods, it may either return a value, or raise a specific exception (in this case, StopIteration) to signal that it has no value to return. In this, it is similar to __getattr__ (can raise AttributeError), __getitem__ (can raise KeyError), and so on. A helper function for an iterator can be written to follow the same protocol; for example:

    def helper(x, y):
        if x > y: return 1 / (x - y)
        raise StopIteration

    def __next__(self):
        if self.a: return helper(self.b, self.c)
        return helper(self.d, self.e)

Both forms of signalling are carried through: a returned value is returned, an exception bubbles up. The helper is written to match the protocol of the calling function.

A generator function is one which contains a yield expression. Each time it is (re)started, it may either yield a value, or return (including “falling off the end”). A helper function for a generator can also be written, but it must also follow generator protocol:

    def helper(x, y):
        if x > y: yield 1 / (x - y)

    def gen(self):
        if self.a: return (yield from helper(self.b, self.c))
        return (yield from helper(self.d, self.e))

In both cases, any unexpected exception will bubble up. Due to the nature of generators and iterators, an unexpected StopIteration inside a generator will be converted into RuntimeError, but beyond that, all exceptions will propagate normally.
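The two protocols can be placed side by side in a runnable sketch. The class Reciprocals and the generator reciprocals are hypothetical adaptations of the fragments above, fed with made-up input pairs. Note the behavioural difference: the iterator's helper ends the iteration at the first non-qualifying pair, while the generator's helper simply yields nothing for it.

```python
class Reciprocals:
    """Iterator whose helper follows the __next__ protocol (return or raise)."""

    def __init__(self, pairs):
        self._pairs = iter(pairs)

    def __iter__(self):
        return self

    def __next__(self):
        # StopIteration escaping from __next__ legitimately ends the iteration.
        x, y = next(self._pairs)
        return self._helper(x, y)

    @staticmethod
    def _helper(x, y):
        if x > y:
            return 1 / (x - y)
        raise StopIteration


def helper(x, y):
    """Generator helper: follows the generator protocol (yield or return)."""
    if x > y:
        yield 1 / (x - y)


def reciprocals(pairs):
    for x, y in pairs:
        yield from helper(x, y)


data = [(3, 1), (5, 1), (1, 2), (9, 1)]
from_iterator = list(Reciprocals(data))    # stops at (1, 2)
from_generator = list(reciprocals(data))   # skips (1, 2), continues
```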

Transition plan

  • Python 3.5: Enable new semantics under __future__ import; silent deprecation warning if StopIteration bubbles out of a generator not under __future__ import.
  • Python 3.6: Non-silent deprecation warning.
  • Python 3.7: Enable new semantics everywhere.

Alternate proposals

Raising something other than RuntimeError

Rather than the generic RuntimeError, it might make sense to raise a new exception type UnexpectedStopIteration. This has the downside of implicitly encouraging that it be caught; the correct action is to catch the original StopIteration, not the chained exception.

Supplying a specific exception to raise on return

Alyssa (Nick) Coghlan suggested a means of providing a specific StopIteration instance to the generator; if any other instance of StopIteration is raised, it is an error, but if that particular one is raised, the generator has properly completed. This subproposal has been withdrawn in favour of better options, but is retained for reference.

Making return-triggered StopIterations obvious

For certain situations, a simpler and fully backward-compatible solution may be sufficient: when a generator returns, instead of raising StopIteration, it raises a specific subclass of StopIteration (GeneratorReturn) which can then be detected. If it is not that subclass, it is an escaping exception rather than a return statement.

The inspiration for this alternative proposal was Alyssa’s observation [8] that if an asyncio coroutine [9] accidentally raises StopIteration, it currently terminates silently, which may present a hard-to-debug mystery to the developer. The main proposal turns such accidents into clearly distinguishable RuntimeError exceptions, but if that is rejected, this alternate proposal would enable asyncio to distinguish between a return statement and an accidentally-raised StopIteration exception.

Of the three outcomes listed above, two change:

  • If a yield point is reached, the value, obviously, would still be returned.
  • If the frame is returned from, GeneratorReturn (rather than StopIteration) is raised.
  • If an instance of GeneratorReturn would be raised, instead an instance of StopIteration would be raised. Any other exception bubbles up normally.

In the third case, the StopIteration would have the value of the original GeneratorReturn, and would reference the original exception in its __cause__. If uncaught, this would clearly show the chaining of exceptions.

This alternative does not affect the discrepancy between generator expressions and list comprehensions, but allows generator-aware code (such as the contextlib and asyncio modules) to reliably differentiate between the second and third outcomes listed above.

However, once code exists that depends on this distinction between GeneratorReturn and StopIteration, a generator that invokes another generator and relies on the latter’s StopIteration to bubble out would still be potentially wrong, depending on the use made of the distinction between the two exception types.

Converting the exception inside next()

Mark Shannon suggested [10] that the problem could be solved in next() rather than at the boundary of generator functions. By having next() catch StopIteration and raise instead ValueError, all unexpected StopIteration bubbling would be prevented; however, the backward-incompatibility concerns are far more serious than for the current proposal, as every next() call now needs to be rewritten to guard against ValueError instead of StopIteration, not to mention that there is no way to write one block of code which reliably works on multiple versions of Python. (Using a dedicated exception type, perhaps subclassing ValueError, would help this; however, all code would still need to be rewritten.)

Note that calling next(it, default) catches StopIteration and substitutes the given default value; this feature is often useful to avoid a try/except block.
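A brief sketch of the two-argument form of next() follows. The functions first_or_default and heads are made-up examples; the private sentinel object is one common way to tell "no value" apart from a legitimate None in the data.

```python
def first_or_default(iterable, default=None):
    """Return the first item, or default if the iterable is empty."""
    return next(iter(iterable), default)

def heads(iterables):
    """Yield the first item of each iterable, stopping at the first empty one."""
    sentinel = object()
    for it in iterables:
        head = next(iter(it), sentinel)
        if head is sentinel:
            return              # explicit, normal termination
        yield head

print(first_or_default([]), first_or_default([7, 8]))
print(list(heads([[1, 2], [3], [], [4]])))
```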

Sub-proposal: decorator to explicitly request current behaviour

Alyssa Coghlan suggested [11] that the situations where the current behaviour is desired could be supported by means of a decorator:

    from itertools import allow_implicit_stop

    @allow_implicit_stop
    def my_generator():
        ...
        yield next(it)
        ...

Which would be semantically equivalent to:

    def my_generator():
        try:
            ...
            yield next(it)
            ...
        except StopIteration:
            return

but be faster, as it could be implemented by simply permitting the StopIteration to bubble up directly.

Single-source Python 2/3 code would also benefit in a 3.7+ world, since libraries like six and python-future could just define their own version of “allow_implicit_stop” that referred to the new builtin in 3.5+, and was implemented as an identity function in other versions.

However, due to the implementation complexities required, the ongoing compatibility issues created, the subtlety of the decorator’s effect, and the fact that it would encourage the “quick-fix” solution of just slapping the decorator onto all generators instead of properly fixing the code in question, this sub-proposal has been rejected. [12]

Criticism

Unofficial and apocryphal statistics suggest that this is seldom, if ever, a problem. [13] Code does exist which relies on the current behaviour (e.g. [3], [6], [7]), and there is the concern that this would be unnecessary code churn to achieve little or no gain.

Steven D’Aprano started an informal survey on comp.lang.python [14]; at the time of writing only two responses have been received: one was in favor of changing list comprehensions to match generator expressions (!), the other was in favor of this PEP’s main proposal.

The existing model has been compared to the perfectly-acceptable issues inherent to every other case where an exception has special meaning. For instance, an unexpected KeyError inside a __getitem__ method will be interpreted as failure, rather than permitted to bubble up. However, there is a difference. Special methods use return to indicate normality, and raise to signal abnormality; generators yield to indicate data, and return to signal the abnormal state. This makes explicitly raising StopIteration entirely redundant, and potentially surprising. If other special methods had dedicated keywords to distinguish between their return paths, they too could turn unexpected exceptions into RuntimeError; the fact that they cannot should not preclude generators from doing so.

Why not fix all __next__() methods?

When implementing a regular __next__() method, the only way to indicate the end of the iteration is to raise StopIteration. So catching StopIteration here and converting it to RuntimeError would defeat the purpose. This is a reminder of the special status of generator functions: in a generator function, raising StopIteration is redundant since the iteration can be terminated by a simple return.

References

[2] Initial mailing list comment (https://mail.python.org/pipermail/python-ideas/2014-November/029906.html)
[3] Proposal by GvR (https://mail.python.org/pipermail/python-ideas/2014-November/029953.html)
[4] Tracker issue with Proof-of-Concept patch (http://bugs.python.org/issue22906)
[5] Pure Python implementation of groupby (https://docs.python.org/3/library/itertools.html#itertools.groupby)
[6] Split a sequence or generator using a predicate (http://code.activestate.com/recipes/578416-split-a-sequence-or-generator-using-a-predicate/)
[7] Wrap unbounded generator to restrict its output (http://code.activestate.com/recipes/66427-wrap-unbounded-generator-to-restrict-its-output/)
[8] Post from Alyssa (Nick) Coghlan mentioning asyncio (https://mail.python.org/pipermail/python-ideas/2014-November/029961.html)
[9] Coroutines in asyncio (https://docs.python.org/3/library/asyncio-task.html#coroutines)
[10] Post from Mark Shannon with alternate proposal (https://mail.python.org/pipermail/python-dev/2014-November/137129.html)
[11] Idea from Alyssa Coghlan (https://mail.python.org/pipermail/python-dev/2014-November/137201.html)
[12] Rejection of above idea by GvR (https://mail.python.org/pipermail/python-dev/2014-November/137243.html)
[13] Response by Steven D’Aprano (https://mail.python.org/pipermail/python-ideas/2014-November/029994.html)
[14] Thread on comp.lang.python started by Steven D’Aprano (https://mail.python.org/pipermail/python-list/2014-November/680757.html)

Copyright

This document has been placed in the public domain.


Source: https://github.com/python/peps/blob/main/peps/pep-0479.rst

Last modified: 2025-02-01 08:59:27 GMT

