
Python Enhancement Proposals

PEP 521 – Managing global context via ‘with’ blocks in generators and coroutines

Author:
Nathaniel J. Smith <njs at pobox.com>
Status:
Withdrawn
Type:
Standards Track
Created:
27-Apr-2015
Python-Version:
3.6
Post-History:
29-Apr-2015

Table of Contents

PEP Withdrawal

Withdrawn in favor of PEP 567.

Abstract

While we generally try to avoid global state when possible, there nonetheless exist a number of situations where it is agreed to be the best approach. In Python, a standard pattern for handling such cases is to store the global state in global or thread-local storage, and then use with blocks to limit modifications of this global state to a single dynamic scope. Examples where this pattern is used include the standard library’s warnings.catch_warnings and decimal.localcontext, NumPy’s numpy.errstate (which exposes the error-handling settings provided by the IEEE 754 floating point standard), and the handling of logging context or HTTP request context in many server application frameworks.

However, there is currently no ergonomic way to manage such local changes to global state when writing a generator or coroutine. For example, this code:

def f():
    with warnings.catch_warnings():
        for x in g():
            yield x

may or may not successfully catch warnings raised by g(), and may or may not inadvertently swallow warnings triggered elsewhere in the code. The context manager, which was intended to apply only to f and its callees, ends up having a dynamic scope that encompasses arbitrary and unpredictable parts of its callers. This problem becomes particularly acute when writing asynchronous code, where essentially all functions become coroutines.

Here, we propose to solve this problem by notifying context managers whenever execution is suspended or resumed within their scope, allowing them to restrict their effects appropriately.

Specification

Two new, optional, methods are added to the context manager protocol: __suspend__ and __resume__. If present, these methods will be called whenever a frame’s execution is suspended or resumed from within the context of the with block.

More formally, consider the following code:

with EXPR as VAR:
    PARTIAL-BLOCK-1
    f((yield foo))
    PARTIAL-BLOCK-2

Currently this is equivalent to the following code (copied from PEP 343):

mgr = (EXPR)
exit = type(mgr).__exit__  # Not calling it yet
value = type(mgr).__enter__(mgr)
exc = True
try:
    try:
        VAR = value  # Only if "as VAR" is present
        PARTIAL-BLOCK-1
        f((yield foo))
        PARTIAL-BLOCK-2
    except:
        exc = False
        if not exit(mgr, *sys.exc_info()):
            raise
finally:
    if exc:
        exit(mgr, None, None, None)

This PEP proposes to modify with block handling to instead become:

mgr = (EXPR)
exit = type(mgr).__exit__  # Not calling it yet
### --- NEW STUFF ---
if the_block_contains_yield_points:  # known statically at compile time
    suspend = getattr(type(mgr), "__suspend__", lambda mgr: None)
    resume = getattr(type(mgr), "__resume__", lambda mgr: None)
### --- END OF NEW STUFF ---
value = type(mgr).__enter__(mgr)
exc = True
try:
    try:
        VAR = value  # Only if "as VAR" is present
        PARTIAL-BLOCK-1
        ### --- NEW STUFF ---
        suspend(mgr)
        tmp = yield foo
        resume(mgr)
        f(tmp)
        ### --- END OF NEW STUFF ---
        PARTIAL-BLOCK-2
    except:
        exc = False
        if not exit(mgr, *sys.exc_info()):
            raise
finally:
    if exc:
        exit(mgr, None, None, None)

Analogous suspend/resume calls are also wrapped around the yield points embedded inside the yield from, await, async with, and async for constructs.
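Because this PEP was withdrawn, the hooks were never wired into the compiler; but the proposed semantics can be simulated today by hand-expanding the calls inside a generator, as in the following sketch. The Tracked class and the log list are names invented for this illustration, not part of any real API:

```python
# Sketch: a toy context manager implementing the proposed optional hooks,
# with the compiler's suspend/resume calls written out by hand.
log = []

class Tracked:
    def __enter__(self):
        log.append("enter")
        return self

    def __exit__(self, *exc):
        log.append("exit")
        return False

    def __suspend__(self):
        log.append("suspend")

    def __resume__(self):
        log.append("resume")

def f():
    mgr = Tracked()
    with mgr:
        # Hand-expanded form of the proposed semantics: under this PEP,
        # the compiler would emit these calls around every yield inside
        # the with block.
        mgr.__suspend__()
        yield "foo"
        mgr.__resume__()

gen = f()
next(gen)        # run up to the yield: the block is entered, then suspended
print(log)       # ['enter', 'suspend']
next(gen, None)  # resume past the yield; the block then exits normally
print(log)       # ['enter', 'suspend', 'resume', 'exit']
```

While the generator is suspended, the manager has been notified and can temporarily undo its effects; this is exactly the window in which today's context managers silently stay active.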

Nested blocks

Given this code:

def f():
    with OUTER:
        with INNER:
            yield VALUE

then we perform the following operations in the following sequence:

INNER.__suspend__()
OUTER.__suspend__()
yield VALUE
OUTER.__resume__()
INNER.__resume__()

Note that this ensures that the following is a valid refactoring:

def f():
    with OUTER:
        yield from g()

def g():
    with INNER:
        yield VALUE

Similarly, with statements with multiple context managers suspend from right to left, and resume from left to right.
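The ordering can be made concrete with the same hand-expansion trick used above (again a simulation, since nothing here is real CPython behavior; Noted is an invented name):

```python
# Sketch: simulate the proposed suspend/resume ordering for nested
# with blocks by calling the hooks explicitly around the yield.
log = []

class Noted:
    def __init__(self, name):
        self.name = name
    def __enter__(self):
        return self
    def __exit__(self, *exc):
        return False
    def __suspend__(self):
        log.append(f"{self.name}.__suspend__")
    def __resume__(self):
        log.append(f"{self.name}.__resume__")

def f():
    outer, inner = Noted("OUTER"), Noted("INNER")
    with outer:
        with inner:
            # Proposed order: suspend innermost-first (like unwinding),
            # resume outermost-first (like re-entering).
            inner.__suspend__()
            outer.__suspend__()
            log.append("yield VALUE")
            yield "VALUE"
            outer.__resume__()
            inner.__resume__()

gen = f()
next(gen)
next(gen, None)
print(log)
# ['INNER.__suspend__', 'OUTER.__suspend__', 'yield VALUE',
#  'OUTER.__resume__', 'INNER.__resume__']
```

The suspend sequence mirrors what would happen if the blocks were exited, and the resume sequence mirrors re-entering them, which is what makes the yield-from refactoring above behave identically.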

Other changes

Appropriate __suspend__ and __resume__ methods are added to warnings.catch_warnings and decimal.localcontext.
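The PEP does not spell out those implementations. One plausible sketch, invented here (not stdlib code, and glossing over details such as the warnings module's internal filter-cache invalidation), is to swap the global filter list out on suspend and back in on resume:

```python
import warnings

class SuspendableCatchWarnings(warnings.catch_warnings):
    # Hypothetical subclass sketching the proposed hooks; not part of
    # the stdlib. Simplification: it does not touch the warnings
    # module's internal filter-version bookkeeping.
    def __enter__(self):
        self._outer = warnings.filters      # caller's filter list
        return super().__enter__()

    def __suspend__(self):
        self._inner = warnings.filters      # filters set inside the block
        warnings.filters = self._outer      # restore the caller's filters

    def __resume__(self):
        warnings.filters = self._inner      # reinstate the block's filters

mgr = SuspendableCatchWarnings()
with mgr:
    warnings.simplefilter("error")
    mgr.__suspend__()                       # as if a yield happened here
    caller_sees_outer = warnings.filters is mgr._outer
    mgr.__resume__()
    block_sees_error = warnings.filters[0][0] == "error"
```

With this shape, a suspended generator's "error" filter is invisible to its caller, and reappears only when the generator is resumed.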

Rationale

In the abstract, we gave an example of plausible but incorrect code:

def f():
    with warnings.catch_warnings():
        for x in g():
            yield x
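The breakage is directly observable in current Python: while the generator is suspended inside the with block, its filter changes remain installed in the caller. A minimal demonstration (using a trivial stand-in body, since g() above is hypothetical):

```python
import warnings

def noisy():
    # Intends to ignore warnings only around its own body...
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        yield 1  # ...but suspends while the "ignore" filter is installed

before = warnings.filters                  # the caller's filter list object
gen = noisy()
next(gen)                                  # run up to the yield
leaked = warnings.filters is not before    # the generator's filters escaped
gen.close()                                # finalize: __exit__ runs
restored = warnings.filters is before      # caller's filters are back
print(leaked, restored)                    # True True
```

Until the generator is resumed or finalized, every warnings.warn() call anywhere in the program is filtered through the generator's "ignore" entry.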

To make this correct in current Python, we need to instead write something like:

def f():
    with warnings.catch_warnings():
        it = iter(g())
    while True:
        with warnings.catch_warnings():
            try:
                x = next(it)
            except StopIteration:
                break
        yield x
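This rewritten shape really does confine the state changes: each next() call gets its own with block, so by the time the generator yields, the caller's filters are already restored. A hedged check (using a trivial g and a simplefilter call as a stand-in for whatever filter changes f would make):

```python
import warnings

def g():
    yield 1
    yield 2

def f():
    with warnings.catch_warnings():
        it = iter(g())
    while True:
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")  # stand-in for f's local changes
            try:
                x = next(it)
            except StopIteration:
                break
        yield x                              # filters already restored here

before = warnings.filters
values = []
for x in f():
    # At every suspension point the caller sees its own filters.
    assert warnings.filters is before
    values.append(x)
print(values)   # [1, 2]
```

The cost is that the tidy for loop has been exploded into manual iteration just to keep every yield outside the with block, which is the ergonomic problem this PEP aims to remove.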

OTOH, if this PEP is accepted then the original code will become correct as-is. Or if this isn’t convincing, then here’s another example of broken code; fixing it requires even greater gyrations, and these are left as an exercise for the reader:

async def test_foo_emits_warning():
    with warnings.catch_warnings(record=True) as w:
        await foo()
    assert len(w) == 1
    assert "xyzzy" in str(w[0].message)

And notice that this last example isn’t artificial at all – this is exactly how you write a test that an async/await-using coroutine correctly raises a warning. Similar issues arise for pretty much any use of warnings.catch_warnings, decimal.localcontext, or numpy.errstate in async/await-using code. So there’s clearly a real problem to solve here, and the growing prominence of async code makes it increasingly urgent.

Alternative approaches

The main alternative that has been proposed is to create some kind of “task-local storage”, analogous to “thread-local storage” [1]. In essence, the idea would be that the event loop would take care to allocate a new “task namespace” for each task it schedules, and provide an API to at any given time fetch the namespace corresponding to the currently executing task. While there are many details to be worked out [2], the basic idea seems doable, and it is an especially natural way to handle the kind of global context that arises at the top-level of async application frameworks (e.g., setting up context objects in a web framework). But it also has a number of flaws:

  • It only solves the problem of managing global state for coroutines that yield back to an asynchronous event loop. But there actually isn’t anything about this problem that’s specific to asyncio – as shown in the examples above, simple generators run into exactly the same issue.
  • It creates an unnecessary coupling between event loops and code that needs to manage global state. Obviously an async web framework needs to interact with some event loop API anyway, so it’s not a big deal in that case. But it’s weird that warnings or decimal or NumPy should have to call into an async library’s API to access their internal state when they themselves involve no async code. Worse, since there are multiple event loop APIs in common use, it isn’t clear how to choose which to integrate with. (This could be somewhat mitigated by CPython providing a standard API for creating and switching “task-local domains” that asyncio, Twisted, tornado, etc. could then work with.)
  • It’s not at all clear that this can be made acceptably fast. NumPy has to check the floating point error settings on every single arithmetic operation. Checking a piece of data in thread-local storage is absurdly quick, because modern platforms have put massive resources into optimizing this case (e.g. dedicating a CPU register for this purpose); calling a method on an event loop to fetch a handle to a namespace and then doing lookup in that namespace is much slower.

    More importantly, this extra cost would be paid on every access to the global data, even for programs which are not otherwise using an event loop at all. This PEP’s proposal, by contrast, only affects code that actually mixes with blocks and yield statements, meaning that the users who experience the costs are the same users who also reap the benefits.

On the other hand, such tight integration between task context and the event loop does potentially allow other features that are beyond the scope of the current proposal. For example, an event loop could note which task namespace was in effect when a task called call_soon, and arrange that the callback when run would have access to the same task namespace. Whether this is useful, or even well-defined in the case of cross-thread calls (what does it mean to have task-local storage accessed from two threads simultaneously?), is left as a puzzle for event loop implementors to ponder – nothing in this proposal rules out such enhancements as well. It does seem though that such features would be useful primarily for state that already has a tight integration with the event loop – while we might want a request id to be preserved across call_soon, most people would not expect:

with warnings.catch_warnings():
    loop.call_soon(f)

to result in f being run with warnings disabled, which would be the result if call_soon preserved global context in general. It’s also unclear how this would even work given that the warnings context manager __exit__ would be called before f.

So this PEP takes the position that __suspend__/__resume__ and “task-local storage” are two complementary tools that are both useful in different circumstances.

Backwards compatibility

Because __suspend__ and __resume__ are optional and default to no-ops, all existing context managers continue to work exactly as before.

Speed-wise, this proposal adds additional overhead when entering a with block (where we must now check for the additional methods; failed attribute lookup in CPython is rather slow, since it involves allocating an AttributeError), and additional overhead at suspension points. Since the position of with blocks and suspension points is known statically, the compiler can straightforwardly optimize away this overhead in all cases except where one actually has a yield inside a with. Furthermore, because we only do attribute checks for __suspend__ and __resume__ once at the start of a with block, when these attributes are undefined then the per-yield overhead can be optimized down to a single C-level if (frame->needs_suspend_resume_calls) { ... }. Therefore, we expect the overall overhead to be negligible.

Interaction with PEP 492

PEP 492 added new asynchronous context managers, which are like regular context managers, but instead of having regular methods __enter__ and __exit__ they have coroutine methods __aenter__ and __aexit__.

Following this pattern, one might expect this proposal to add __asuspend__ and __aresume__ coroutine methods. But this doesn’t make much sense, since the whole point is that __suspend__ should be called before yielding our thread of execution and allowing other code to run. The only thing we accomplish by making __asuspend__ a coroutine is to make it possible for __asuspend__ itself to yield. So either we need to recursively call __asuspend__ from inside __asuspend__, or else we need to give up and allow these yields to happen without calling the suspend callback; either way it defeats the whole point.

Well, with one exception: one possible pattern for coroutine code is to call yield in order to communicate with the coroutine runner, but without actually suspending its execution (i.e., the coroutine might know that the coroutine runner will resume it immediately after processing the yielded message). An example of this is the curio.timeout_after async context manager, which yields a special set_timeout message to the curio kernel, and then the kernel immediately (synchronously) resumes the coroutine which sent the message. And from the user point of view, this timeout value acts just like the kinds of global variables that motivated this PEP. But, there is a crucial difference: this kind of async context manager is, by definition, tightly integrated with the coroutine runner. So, the coroutine runner can take over responsibility for keeping track of which timeouts apply to which coroutines without any need for this PEP at all (and this is indeed how curio.timeout_after works).

That leaves two reasonable approaches to handling async context managers:

  1. Add plain __suspend__ and __resume__ methods.
  2. Leave async context managers alone for now until we have more experience with them.

Either seems plausible, so out of laziness / YAGNI this PEP tentatively proposes to stick with option (2).

References

[1]
https://groups.google.com/forum/#!topic/python-tulip/zix5HQxtElg
https://github.com/python/asyncio/issues/165
[2]
For example, we would have to decide whether there is a single task-local namespace shared by all users (in which case we need a way for multiple third-party libraries to adjudicate access to this namespace), or else if there are multiple task-local namespaces, then we need some mechanism for each library to arrange for their task-local namespaces to be created and destroyed at appropriate moments. The preliminary patch linked from the github issue above doesn’t seem to provide any mechanism for such lifecycle management.

Copyright

This document has been placed in the public domain.


Source: https://github.com/python/peps/blob/main/peps/pep-0521.rst

Last modified: 2025-02-01 08:59:27 GMT

