Context variables provide a generic mechanism for tracking dynamic, context-local state, similar to thread-local storage but generalized to work with other kinds of thread-like contexts, such as asyncio Tasks. PEP 550 proposed a mechanism for context-local state that was also sensitive to generator context, but this was pretty complicated, so the BDFL requested it be simplified. The result was PEP 567, which is targeted for inclusion in 3.7. This PEP then extends PEP 567’s machinery to add generator context sensitivity.
This PEP is starting out in the “deferred” status, because there isn’t enough time to give it proper consideration before the 3.7 feature freeze. The only goal right now is to understand what would be required to add generator context sensitivity in 3.8, so that we can avoid shipping something in 3.7 that would rule it out by accident. (Ruling it out on purpose can wait until 3.8 ;-).)
[Currently the point of this PEP is just to understand how this would work, with discussion of whether it’s a good idea deferred until after the 3.7 feature freeze. So rationale is TBD.]
Instead of holding a single Context, the threadstate now holds a ChainMap of Contexts. ContextVar.get and ContextVar.set are backed by the ChainMap. Generators and async generators each have an associated Context that they push onto the ChainMap while they’re running to isolate their context-local changes from their callers, though this can be overridden in cases like @contextlib.contextmanager where “leaking” context changes from the generator into its caller is desirable.
Let’s start by reviewing how PEP 567 works, and then in the next section we’ll describe the differences.
In PEP 567, a Context is a Mapping from ContextVar objects to arbitrary values. In our pseudo-code here we’ll pretend that it uses a dict for backing storage. (The real implementation uses a HAMT, which is semantically equivalent to a dict but with different performance trade-offs.):
    import collections.abc

    class Context(collections.abc.Mapping):
        def __init__(self):
            self._data = {}
            self._in_use = False

        def __getitem__(self, key):
            return self._data[key]

        def __iter__(self):
            return iter(self._data)

        def __len__(self):
            return len(self._data)
At any given moment, the threadstate holds a current Context (initialized to an empty Context when the threadstate is created); we can use Context.run to temporarily switch the current Context:
    # Context.run
    def run(self, fn, *args, **kwargs):
        if self._in_use:
            raise RuntimeError("Context already in use")
        tstate = get_thread_state()
        old_context = tstate.current_context
        tstate.current_context = self
        self._in_use = True
        try:
            return fn(*args, **kwargs)
        finally:
            # Restore the caller's context, even if fn raised
            tstate.current_context = old_context
            self._in_use = False
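For comparison, the `contextvars` module as shipped in Python 3.7 shows the same two behaviors as the pseudocode above: a value set inside `Context.run` stays in that `Context`, and entering a `Context` that is already running raises `RuntimeError`. This is only a small sketch; the names `request_id`, `handler`, and `reenter` are made up for illustration.

```python
import contextvars

request_id = contextvars.ContextVar("request_id", default="none")

def handler():
    # Runs with `ctx` as the current context, so the set() lands in ctx only.
    request_id.set("abc123")
    return request_id.get()

ctx = contextvars.copy_context()
inside = ctx.run(handler)
outside = request_id.get()   # the caller's context is untouched
stored = ctx[request_id]     # the value is still recorded in ctx

def reenter():
    # ctx is already in use at this point, so entering it again must fail.
    try:
        ctx.run(lambda: None)
    except RuntimeError:
        return True
    return False

reentry_rejected = ctx.run(reenter)
```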
We can fetch a shallow copy of the current Context by calling copy_context; this is commonly used when spawning a new task, so that the child task can inherit context from its parent:
    def copy_context():
        tstate = get_thread_state()
        new_context = Context()
        new_context._data = dict(tstate.current_context)
        return new_context
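The inheritance pattern described here can be tried with the shipped `contextvars.copy_context`. In this sketch the "child task" is just a function run inside the copied context, and the names `locale_var` and `child` are hypothetical.

```python
import contextvars

locale_var = contextvars.ContextVar("locale_var", default="en")

locale_var.set("fr")                     # state in the parent
child_ctx = contextvars.copy_context()   # snapshot taken at "spawn" time

def child():
    seen = locale_var.get()   # inherited from the parent's snapshot
    locale_var.set("de")      # only affects child_ctx
    return seen

inherited = child_ctx.run(child)
parent_after = locale_var.get()
child_after = child_ctx[locale_var]
```

The child starts from the parent's values, but nothing it sets flows back to the parent.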
In practice, what end users generally work with is ContextVar objects, which also provide the only way to mutate a Context. They work with a utility class Token, which can be used to restore a ContextVar to its previous value:
    class Token:
        MISSING = sentinel_value()

        # Note: constructor is private
        def __init__(self, context, var, old_value):
            self._context = context
            self.var = var
            self.old_value = old_value

        # XX: PEP 567 currently makes this a method on ContextVar, but
        # I'm going to propose it switch to this API because it's simpler.
        def reset(self):
            # XX: should we allow token reuse?
            # XX: should we allow tokens to be used if the saved
            # context is no longer active?
            if self.old_value is self.MISSING:
                del self._context._data[self.var]
            else:
                self._context._data[self.var] = self.old_value

    # XX: the handling of defaults here uses the simplified proposal from
    # https://mail.python.org/pipermail/python-dev/2018-January/151596.html
    # This can be updated to whatever we settle on, it was just less
    # typing this way :-)
    class ContextVar:
        def __init__(self, name, *, default=None):
            self.name = name
            self.default = default

        def get(self):
            context = get_thread_state().current_context
            return context.get(self, self.default)

        def set(self, new_value):
            context = get_thread_state().current_context
            token = Token(context, self, context.get(self, Token.MISSING))
            context._data[self] = new_value
            return token
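As a point of reference, the token mechanism can be exercised with the shipped `contextvars` module. Note that the shipped API spells the restore operation `ContextVar.reset(token)` rather than the `Token.reset()` method used in the pseudocode above. The variable name `timeout` is only an example.

```python
import contextvars

timeout = contextvars.ContextVar("timeout")   # deliberately no default

token = timeout.set(30)
# The token records that the variable had no value before this set().
was_unset = token.old_value is contextvars.Token.MISSING

inner = timeout.set(5)
timeout.reset(inner)          # back to 30
after_inner_reset = timeout.get()

timeout.reset(token)          # back to "no value"
try:
    timeout.get()
    unset_again = False
except LookupError:
    unset_again = True
```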
In general, Context remains the same. However, now instead of holding a single Context object, the threadstate stores a stack of them. This stack acts just like a collections.ChainMap, so we’ll use that in our pseudocode. Context.run then becomes:
    from collections import ChainMap

    # Context.run
    def run(self, fn, *args, **kwargs):
        if self._in_use:
            raise RuntimeError("Context already in use")
        tstate = get_thread_state()
        old_context_stack = tstate.current_context_stack
        # ChainMap takes its maps as positional arguments
        tstate.current_context_stack = ChainMap(self)  # changed
        self._in_use = True
        try:
            return fn(*args, **kwargs)
        finally:
            tstate.current_context_stack = old_context_stack
            self._in_use = False
Aside from some updated variable names (e.g., tstate.current_context → tstate.current_context_stack), the only change here is on the marked line, which now wraps the context in a ChainMap before stashing it in the threadstate.
We also add a Context.push method, which is almost exactly like Context.run, except that it temporarily pushes the Context onto the existing stack, instead of temporarily replacing the whole stack:
    # Context.push
    def push(self, fn, *args, **kwargs):
        if self._in_use:
            raise RuntimeError("Context already in use")
        tstate = get_thread_state()
        tstate.current_context_stack.maps.insert(0, self)  # different from run
        self._in_use = True
        try:
            return fn(*args, **kwargs)
        finally:
            tstate.current_context_stack.maps.pop(0)  # different from run
            self._in_use = False
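The difference between replacing the stack and pushing onto it can be seen in a toy model. This sketch uses plain dicts in place of Context objects and a hypothetical `ToyThreadState` class in place of the real threadstate. It illustrates the semantics only and is not the proposed implementation.

```python
from collections import ChainMap

class ToyThreadState:
    """Stand-in for the interpreter's threadstate (illustration only)."""
    def __init__(self):
        self.current_context_stack = ChainMap({})

tstate = ToyThreadState()

def run(ctx, fn):
    # Replace the whole stack while fn runs.
    old = tstate.current_context_stack
    tstate.current_context_stack = ChainMap(ctx)
    try:
        return fn()
    finally:
        tstate.current_context_stack = old

def push(ctx, fn):
    # Put ctx on top of the existing stack while fn runs.
    tstate.current_context_stack.maps.insert(0, ctx)
    try:
        return fn()
    finally:
        tstate.current_context_stack.maps.pop(0)

def lookup_user():
    return tstate.current_context_stack.get("user")

# Set a value in the base context.
tstate.current_context_stack.maps[0]["user"] = "alice"

pushed_sees = push({}, lookup_user)   # lookup falls through to the base
run_sees = run({}, lookup_user)       # the base is hidden entirely
depth_after = len(tstate.current_context_stack.maps)
```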
In most cases, we don’t expect push to be used directly; instead, it will be used implicitly by generators. Specifically, every generator object and async generator object gains a new attribute .context. When an (async) generator object is created, this attribute is initialized to an empty Context (self.context = Context()). This is a mutable attribute; it can be changed by user code. But trying to set it to anything that isn’t a Context object or None will raise an error.
Whenever we enter a generator via __next__, send, throw, or close, or enter an async generator by calling one of those methods on its __anext__, asend, athrow, or aclose coroutines, then its .context attribute is checked, and if non-None, is automatically pushed:
    # GeneratorType.__next__
    def __next__(self):
        if self.context is not None:
            return self.context.push(self.__real_next__)
        else:
            return self.__real_next__()
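Since real generator objects have no `.context` attribute today, the behavior can only be approximated with an explicit iterator class. The sketch below uses a hypothetical `IsolatedGen` wrapper, plain dicts as contexts, and a module-level `ChainMap` as the stack; all of these names are assumptions made for illustration.

```python
from collections import ChainMap

stack = ChainMap({})   # stand-in for the threadstate's context stack

def toy_set(key, value):
    stack.maps[0][key] = value   # always mutate the top context

class IsolatedGen:
    """Wraps a generator and pushes a private context around each step."""
    def __init__(self, gen):
        self.gen = gen
        self.context = {}        # None means "do not push"

    def __iter__(self):
        return self

    def __next__(self):
        if self.context is None:
            return next(self.gen)
        stack.maps.insert(0, self.context)
        try:
            return next(self.gen)
        finally:
            stack.maps.pop(0)

def worker():
    toy_set("precision", 2)
    yield stack.get("precision")   # first step: sees its own value
    yield stack.get("precision")   # later step: the value persisted

toy_set("precision", 28)
g = IsolatedGen(worker())
first = next(g)
caller_between = stack.get("precision")   # caller is unaffected
second = next(g)

leaky = IsolatedGen(worker())
leaky.context = None                      # opt out of isolation
next(leaky)
caller_after_leak = stack.get("precision")
```

With a private context the generator's change is invisible to its caller but survives across suspensions; with `context = None` the change lands in the caller's context.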
While we don’t expect people to use Context.push often, making it a public API preserves the principle that a generator can always be rewritten as an explicit iterator class with equivalent semantics.
Also, we modify contextlib.(async)contextmanager to always set its (async) generator objects’ .context attribute to None:
    # contextlib._GeneratorContextManagerBase.__init__
    def __init__(self, func, args, kwds):
        self.gen = func(*args, **kwds)
        self.gen.context = None  # added
        ...
This makes sure that code like this continues to work as expected:
    @contextmanager
    def decimal_precision(prec):
        with decimal.localcontext() as ctx:
            ctx.prec = prec
            yield

    with decimal_precision(2):
        ...
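A runnable version of this example, with the imports it needs, shows the behavior that must be preserved: the precision set inside the generator is visible in the body of the `with` block, and the previous precision comes back afterwards. The variable names `before`, `inside`, `quotient`, and `after` are added here for illustration.

```python
import decimal
from contextlib import contextmanager

@contextmanager
def decimal_precision(prec):
    # The change made inside the generator must "leak" to the caller
    # for the duration of the with block.
    with decimal.localcontext() as ctx:
        ctx.prec = prec
        yield

before = decimal.getcontext().prec
with decimal_precision(2):
    inside = decimal.getcontext().prec
    quotient = decimal.Decimal(1) / decimal.Decimal(3)
after = decimal.getcontext().prec
```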
The general idea here is that by default, every generator object gets its own local context, but if users want to explicitly get some other behavior then they can do that.
Otherwise, things mostly work as before, except that we go through and swap everything to use the threadstate ChainMap instead of the threadstate Context. In full detail:
The copy_context function now returns a flattened copy of the “effective” context. (As an optimization, the implementation might choose to do this flattening lazily, but if so this will be made invisible to the user.) Compared to our previous implementation above, the only change here is that tstate.current_context has been replaced with tstate.current_context_stack:
    def copy_context() -> Context:
        tstate = get_thread_state()
        new_context = Context()
        new_context._data = dict(tstate.current_context_stack)
        return new_context
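What “flattened” means here can be checked directly with `collections.ChainMap` and plain dicts standing in for Contexts. The keys and values below are invented for the example.

```python
from collections import ChainMap

base = {"user": "alice", "locale": "en"}
overlay = {"locale": "fr"}           # top of the stack shadows the base
stack = ChainMap(overlay, base)

# dict() resolves every key through the chain, so the copy holds the
# "effective" value of each variable and no longer depends on the stack.
flat = dict(stack)

overlay["locale"] = "de"             # a later change in the live stack
live_locale = stack["locale"]
copied_locale = flat["locale"]
```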
Token is unchanged, and the changes to ContextVar.get are trivial:
    # ContextVar.get
    def get(self):
        context_stack = get_thread_state().current_context_stack
        return context_stack.get(self, self.default)
ContextVar.set is a little more interesting: instead of going through the ChainMap machinery like everything else, it always mutates the top Context in the stack, and – crucially! – sets up the returned Token to restore its state later. This allows us to avoid accidentally “promoting” values between different levels in the stack, as would happen if we did old = var.get(); ...; var.set(old):
    # ContextVar.set
    def set(self, new_value):
        top_context = get_thread_state().current_context_stack.maps[0]
        token = Token(top_context, self, top_context.get(self, Token.MISSING))
        top_context._data[self] = new_value
        return token
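The “promotion” problem is easy to reproduce with a two-level `ChainMap` of plain dicts. In this sketch the helper `make_stack` and the key `"mode"` are hypothetical. The first half restores by reading and re-setting through the chain, and the second half restores the top mapping's own previous state, as a Token would.

```python
from collections import ChainMap

def make_stack():
    # maps[0] is the top (e.g. a generator's context), maps[1] the caller's.
    return ChainMap({}, {"mode": "strict"})

# Naive save/restore through the whole chain.
naive = make_stack()
old = naive["mode"]             # found in the lower context
naive["mode"] = "lenient"       # ChainMap writes always go to maps[0]
naive["mode"] = old             # "restores", but leaves a copy on top
naive_top = dict(naive.maps[0])
naive.maps[1]["mode"] = "relaxed"   # the caller changes its value later...
naive_sees = naive["mode"]          # ...but the promoted copy shadows it

# Token-style restore: remember only the top context's own state.
MISSING = object()
tok = make_stack()
top = tok.maps[0]
saved = top.get("mode", MISSING)
top["mode"] = "lenient"
if saved is MISSING:
    del top["mode"]
else:
    top["mode"] = saved
token_top = dict(tok.maps[0])
```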
And finally, to allow for introspection of the full context stack, we provide a new function contextvars.get_context_stack:
    from typing import List

    def get_context_stack() -> List[Context]:
        return list(get_thread_state().current_context_stack.maps)
That’s all.
The main difference from PEP 550 is that it reified what we’re calling “contexts” and “context stacks” as two different concrete types (LocalContext and ExecutionContext respectively). This led to lots of confusion about what the differences were, and which object should be used in which places. This proposal simplifies things by only reifying the Context, which is “just a dict”, and makes the “context stack” an unnamed feature of the interpreter’s runtime state – though it is still possible to introspect it using get_context_stack, for debugging and other purposes.
Context will continue to use a HAMT-based mapping structure under the hood instead of dict, since we expect that calls to copy_context are much more common than ContextVar.set. In almost all cases, copy_context will find that there’s only one Context in the stack (because it’s rare for generators to spawn new tasks), and can simply re-use it directly; in other cases HAMTs are cheap to merge and this can be done lazily.
Rather than using an actual ChainMap object, we’ll represent the context stack using some appropriate structure – the most appropriate options are probably either a bare list with the “top” of the stack being the end of the list so we can use push/pop, or else an intrusive linked list (PyThreadState → Context → Context → …), with the “top” of the stack at the beginning of the list to allow efficient push/pop.
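The linked-list option might look roughly like the following Python model. The real structure would live in C inside the interpreter, so the `Node` class and the `push`, `pop`, and `lookup` helpers here are purely illustrative assumptions.

```python
MISSING = object()

class Node:
    """One Context in the stack; `prev` links to the context below it."""
    def __init__(self, data=None, prev=None):
        self.data = data if data is not None else {}
        self.prev = prev

def push(top, data=None):
    # O(1): the new node becomes the head of the list.
    return Node(data, prev=top)

def pop(top):
    # O(1): the head's predecessor becomes the new top.
    return top.prev

def lookup(top, key, default=None):
    # Walk from the top of the stack down to the base context.
    node = top
    while node is not None:
        value = node.data.get(key, MISSING)
        if value is not MISSING:
            return value
        node = node.prev
    return default

base = Node({"user": "alice"})
top = push(push(base), {"locale": "fr"})   # an empty context, then one with data

found_in_base = lookup(top, "user")
found_on_top = lookup(top, "locale")
not_found = lookup(top, "missing", "fallback")
back_to_base = pop(pop(top)) is base
```

Push and pop touch only the head of the list, while an uncached lookup visits one node per stacked context.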
A critical optimization in PEP 567 is the caching of values inside ContextVar. Switching from a single context to a context stack makes this a little bit more complicated, but not too much. Currently, we invalidate the cache whenever the threadstate’s current Context changes (on thread switch, and when entering/exiting Context.run). The simplest approach here would be to invalidate the cache whenever the stack changes (on thread switch, when entering/exiting Context.run, and when entering/leaving Context.push). The main effect of this is that iterating a generator will invalidate the cache. It seems unlikely that this will cause serious problems, but if it does, then I think it can be avoided with a cleverer cache key that recognizes that pushing and then popping a Context returns the threadstate to its previous state. (Idea: store the cache key for a particular stack configuration in the topmost Context.)
It seems unavoidable in this design that uncached get will be O(n), where n is the size of the context stack. However, n will generally be very small – it’s roughly the number of nested generators, so usually n=1, and it will be extremely rare to see n greater than, say, 5. At worst, n is bounded by the recursion limit. In addition, we can expect that in most cases of deep generator recursion, most of the Contexts in the stack will be empty, and thus can be skipped extremely quickly during lookup. And for repeated lookups the caching mechanism will kick in. So it’s probably possible to construct some extreme case where this causes performance problems, but ordinary code should be essentially unaffected.
This document has been placed in the public domain.
Source: https://github.com/python/peps/blob/main/peps/pep-0568.rst
Last modified: 2025-02-01 08:55:40 GMT