PEP 734 – Multiple Interpreters in the Stdlib

Author: Eric Snow <ericsnowcurrently at gmail.com>
Discussions-To:

Important

This PEP is a historical document. The up-to-date, canonical documentation can now be found at concurrent.interpreters.

See PEP 1 for how to propose changes.

Note

This PEP is essentially a continuation of PEP 554. That document had grown a lot of ancillary information across 7 years of discussion. This PEP is a reduction back to the essential information. Much of that extra information is still valid and useful, just not in the immediate context of the specific proposal here.

Note

This PEP was accepted with the provision that the module name change to concurrent.interpreters.

Abstract

This PEP proposes to add a new module, interpreters, to support inspecting, creating, and running code in multiple interpreters in the current process. This includes Interpreter objects that represent the underlying interpreters. The module will also provide a basic Queue class for communication between interpreters. Finally, we will add a new concurrent.futures.InterpreterPoolExecutor based on the interpreters module.

Introduction

Fundamentally, an “interpreter” is the collection of (essentially) all runtime state which Python threads must share. So, let’s first look at threads. Then we’ll circle back to interpreters.

Threads and Thread States

A Python process will have one or more OS threads running Python code (or otherwise interacting with the C API). Each of these threads interacts with the CPython runtime using its own thread state (PyThreadState), which holds all the runtime state unique to that thread. There is also some runtime state that is shared between multiple OS threads.

Any OS thread may switch which thread state it is currently using, as long as it isn’t one that another OS thread is already using (or has been using). This “current” thread state is stored by the runtime in a thread-local variable, and may be looked up explicitly with PyThreadState_Get(). It gets set automatically for the initial (“main”) OS thread and for threading.Thread objects. From the C API it is set (and cleared) by PyThreadState_Swap() and may be set by PyGILState_Ensure(). Most of the C API requires that there be a current thread state, either looked up implicitly or passed in as an argument.

The relationship between OS threads and thread states is one-to-many. Each thread state is associated with at most a single OS thread and records its thread ID. A thread state is never used for more than one OS thread. In the other direction, however, an OS thread may have more than one thread state associated with it, though, again, only one may be current.

When there’s more than one thread state for an OS thread, PyThreadState_Swap() is used in that OS thread to switch between them, with the requested thread state becoming the current one. Whatever was running in the thread using the old thread state is effectively paused until that thread state is swapped back in.

Interpreter States

As noted earlier, there is some runtime state that multiple OS threads share. Some of it is exposed by the sys module, though much is used internally and not exposed explicitly or only through the C API.

This shared state is called the interpreter state (PyInterpreterState). We’ll sometimes refer to it here as just “interpreter”, though that is also sometimes used to refer to the python executable, to the Python implementation, and to the bytecode interpreter (i.e. exec()/eval()).

CPython has supported multiple interpreters in the same process (AKA “subinterpreters”) since version 1.5 (1997). The feature has been available via the C API.

Interpreters and Threads

Thread states are related to interpreter states in much the same way that OS threads and processes are related (at a high level). To begin with, the relationship is one-to-many. A thread state belongs to a single interpreter (and stores a pointer to it). That thread state is never used for a different interpreter. In the other direction, however, an interpreter may have zero or more thread states associated with it. The interpreter is only considered active in OS threads where one of its thread states is current.

Interpreters are created via the C API using Py_NewInterpreterFromConfig() (or Py_NewInterpreter(), which is a light wrapper around Py_NewInterpreterFromConfig()). That function does the following:

create a new interpreter state
create a new thread state
set the thread state as current (a current tstate is needed for interpreter init)
initialize the interpreter state using that thread state
return the thread state (still current)

Note that the returned thread state may be immediately discarded. There is no requirement that an interpreter have any thread states, except as soon as the interpreter is meant to actually be used. At that point it must be made active in the current OS thread.

To make an existing interpreter active in the current OS thread, the C API user first makes sure that interpreter has a corresponding thread state. Then PyThreadState_Swap() is called like normal using that thread state. If the thread state for another interpreter was already current then it gets swapped out like normal and execution of that interpreter in the OS thread is thus effectively paused until it is swapped back in.

Once an interpreter is active in the current OS thread like that, the thread can call any of the C API, such as PyEval_EvalCode() (i.e. exec()). This works by using the current thread state as the runtime context.

The “Main” Interpreter

When a Python process starts, it creates a single interpreter state (the “main” interpreter) with a single thread state for the current OS thread. The Python runtime is then initialized using them.

After initialization, the script or module or REPL is executed using them. That execution happens in the interpreter’s __main__ module.

When the process finishes running the requested Python code or REPL, in the main OS thread, the Python runtime is finalized in that thread using the main interpreter.

Runtime finalization has only a slight, indirect effect on still-running Python threads, whether in the main interpreter or in subinterpreters. That’s because right away it waits indefinitely for all non-daemon Python threads to finish.

While the C API may be queried, there is no mechanism by which any Python thread is directly alerted that finalization has begun, other than perhaps with “atexit” functions that may have been registered using threading._register_atexit().

Any remaining subinterpreters are themselves finalized later, but at that point they aren’t current in any OS threads.

Interpreter Isolation

CPython’s interpreters are intended to be strictly isolated from each other. That means interpreters never share objects (except in very specific cases with immortal, immutable builtin objects). Each interpreter has its own modules (sys.modules), classes, functions, and variables. Even where two interpreters define the same class, each will have its own copy. The same applies to state in C, including in extension modules. The CPython C API docs explain more.

Notably, there is some process-global state that interpreters will always share, some mutable and some immutable. Sharing immutable state presents few problems, while providing some benefits (mainly performance). However, all shared mutable state requires special management, particularly for thread-safety, some of which the OS takes care of for us.

Mutable:

file descriptors
low-level env vars
process memory (though allocators are isolated)
the list of interpreters

Immutable:

builtin types (e.g. dict, bytes)
singletons (e.g. None)
underlying static module data (e.g. functions) for builtin/extension/frozen modules

Existing Execution Components

There are a number of existing parts of Python that may help with understanding how code may be run in a subinterpreter.

In CPython, each component is built around one of the following C API functions (or variants):

PyEval_EvalCode(): run the bytecode interpreter with the given code object
PyRun_String(): compile + PyEval_EvalCode()
PyRun_File(): read + compile + PyEval_EvalCode()
PyRun_InteractiveOneObject(): compile + PyEval_EvalCode()
PyObject_Call(): calls PyEval_EvalCode()

builtins.exec()

The builtin exec() may be used to execute Python code. It is essentially a wrapper around the C API functions PyRun_String() and PyEval_EvalCode().

Here are some relevant characteristics of the builtin exec():

It runs in the current OS thread and pauses whatever was running there, which resumes when exec() finishes. No other OS threads are affected. (To avoid pausing the current Python thread, run exec() in a threading.Thread.)
It may start additional threads, which don’t interrupt it.
It executes against a “globals” namespace (and a “locals” namespace). At module-level, exec() defaults to using __dict__ of the current module (i.e. globals()). exec() uses that namespace as-is and does not clear it before or after.
It propagates any uncaught exception from the code it ran. The exception is raised from the exec() call in the Python thread that originally called exec().
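The namespace and exception behavior listed above can be seen with a few lines of plain Python. This is only a minimal sketch; the names namespace and run_and_report are illustrative and do not come from the PEP.

```python
# A namespace that exec() uses as-is; it is not cleared between calls.
namespace = {}

exec("counter = 1", namespace)
exec("counter += 1", namespace)  # picks up where the previous call left off


def run_and_report(source, ns):
    """Run source with exec() and name any exception that propagates out."""
    try:
        exec(source, ns)
    except Exception as exc:
        # The exception surfaces at the exec() call site, in the calling thread.
        return type(exc).__name__
    return None


failure = run_and_report("1 / 0", namespace)
print(namespace["counter"], failure)
```

Interpreter.exec(), described later, keeps the first behavior (state persists in __main__) but changes the second, since the original exception object cannot cross the interpreter boundary.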

Command-line

The python CLI provides several ways to run Python code. In each case it maps to a corresponding C API call:

<no args>, -i - run the REPL (PyRun_InteractiveOneObject())
<filename> - run a script (PyRun_File())
-c <code> - run the given Python code (PyRun_String())
-m module - run the module as a script (PyEval_EvalCode() via runpy._run_module_as_main())

In each case it is essentially a variant of running exec() at the top-level of the __main__ module of the main interpreter.

threading.Thread

When a Python thread is started, it runs the “target” function with PyObject_Call() using a new thread state. The globals namespace comes from func.__globals__ and any uncaught exception is discarded.

Motivation

The interpreters module will provide a high-level interface to the multiple interpreter functionality. The goal is to make the existing multiple-interpreters feature of CPython more easily accessible to Python code. This is particularly relevant now that CPython has a per-interpreter GIL (PEP 684) and people are more interested in using multiple interpreters.

Without a stdlib module, users are limited to the C API, which restricts how much they can try out and take advantage of multiple interpreters.

The module will include a basic mechanism for communicating between interpreters. Without one, multiple interpreters are a much less useful feature.

Specification

The module will:

expose the existing multiple interpreter support
introduce a basic mechanism for communicating between interpreters

The module will wrap a new low-level _interpreters module (in the same way as the threading module). However, that low-level API is not intended for public use and thus not part of this proposal.

Using Interpreters

The module defines the following functions:

get_current() -> Interpreter
Returns the Interpreter object for the currently executing interpreter.
list_all() -> list[Interpreter]
Returns the Interpreter object for each existing interpreter, whether it is currently running in any OS threads or not.
create() -> Interpreter
Create a new interpreter and return the Interpreter object for it. The interpreter doesn’t do anything on its own and is not inherently tied to any OS thread. That only happens when something is actually run in the interpreter (e.g. Interpreter.exec()), and only while running. The interpreter may or may not have thread states ready to use, but that is strictly an internal implementation detail.

Interpreter Objects

An interpreters.Interpreter object represents the interpreter (PyInterpreterState) with the corresponding unique ID. There will only be one object for any given interpreter.

If the interpreter was created with interpreters.create() then it will be destroyed as soon as all Interpreter objects with its ID (across all interpreters) have been deleted.

Interpreter objects may represent other interpreters than those created by interpreters.create(). Examples include the main interpreter (created by Python’s runtime initialization) and those created via the C-API, using Py_NewInterpreter(). Such Interpreter objects will not be able to interact with their corresponding interpreters, e.g. via Interpreter.exec() (though we may relax this in the future).

Attributes and methods:

id
(read-only) A non-negative int that identifies the interpreter that this Interpreter instance represents. Conceptually, this is similar to a process ID.
__hash__()
Returns the hash of the interpreter’s id. This is the same as the hash of the ID’s integer value.
is_running() -> bool
Returns True if the interpreter is currently executing code in its __main__ module. This excludes sub-threads.
It refers only to whether there is an OS thread running a script (code) in the interpreter’s __main__ module. That basically means whether or not Interpreter.exec() is running in some OS thread. Code running in sub-threads is ignored.
prepare_main(**kwargs)
Bind one or more objects in the interpreter’s __main__ module.
The keyword argument names will be used as the attribute names. For most objects a copy will be bound in the interpreter, with pickle used in between. For some objects, like memoryview, the underlying data will be shared between the interpreters. See Shareable Objects.
prepare_main() is helpful for initializing the globals for an interpreter before running code in it.
exec(code, /)
Execute the given source code in the interpreter (in the current OS thread), using its __main__ module. It doesn’t return anything.
This is essentially equivalent to switching to this interpreter in the current OS thread and then calling the builtin exec() using this interpreter’s __main__ module’s __dict__ as the globals and locals.
The code running in the current OS thread (a different interpreter) is effectively paused until Interpreter.exec() finishes. To avoid pausing it, create a new threading.Thread and call Interpreter.exec() in it (like Interpreter.call_in_thread() does).
Interpreter.exec() does not reset the interpreter’s state nor the __main__ module, neither before nor after, so each successive call picks up where the last one left off. This can be useful for running some code to initialize an interpreter (e.g. with imports) before later performing some repeated task.
If there is an uncaught exception, it will be propagated into the calling interpreter as an ExecutionFailed. The full error display of the original exception, generated relative to the called interpreter, is preserved on the propagated ExecutionFailed. That includes the full traceback, with all the extra info like syntax error details and chained exceptions. If the ExecutionFailed is not caught then that full error display will be shown, much like it would be if the propagated exception had been raised in the main interpreter and uncaught. Having the full traceback is particularly useful when debugging.
If exception propagation is not desired then an explicit try-except should be used around the code passed to Interpreter.exec(). Likewise any error handling that depends on specific information from the exception must use an explicit try-except around the given code, since ExecutionFailed will not preserve that information.
call(callable, /)
Call the callable object in the interpreter. The return value is discarded. If the callable raises an exception then it gets propagated as an ExecutionFailed exception, in the same way as Interpreter.exec().
For now only plain functions are supported and only ones that take no arguments and have no cell vars. Free globals are resolved against the target interpreter’s __main__ module.
In the future, we can add support for arguments, closures, and a broader variety of callables, at least partly via pickle. We can also consider not discarding the return value. The initial restrictions are in place to allow us to get the basic functionality of the module out to users sooner.
call_in_thread(callable, /) -> threading.Thread
Essentially, apply Interpreter.call() in a new thread. Return values are discarded and exceptions are not propagated.
call_in_thread() is roughly equivalent to:
    def task():
        interp.call(func)
    t = threading.Thread(target=task)
    t.start()
close()
Destroy the underlying interpreter.
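A minimal sketch of how these methods might fit together follows. It assumes a Python version where the module is importable as concurrent.interpreters (3.14 or later) and falls back to doing nothing otherwise; the demo() helper is hypothetical and not part of the proposal.

```python
try:
    # Assumption: Python 3.14+, where the module ships under this name.
    from concurrent import interpreters
except ImportError:
    interpreters = None


def demo():
    """Return True if a failure propagated, or None if the module is missing."""
    if interpreters is None:
        return None
    interp = interpreters.create()
    try:
        # Bind a value in the new interpreter's __main__ module.
        interp.prepare_main(start=40)
        # State persists in __main__ between successive exec() calls.
        interp.exec("total = start + 1")
        interp.exec("total += 1")
        interp.exec("assert total == 42")
        try:
            interp.exec("1 / 0")
        except interpreters.ExecutionFailed:
            # The ZeroDivisionError arrives as a surrogate exception.
            return True
        return False
    finally:
        interp.close()


print(demo())
```

Note that each exec() call here blocks the calling thread until the code finishes, which is why long-running work is usually pushed into a separate thread.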

Communicating Between Interpreters

The module introduces a basic communication mechanism through special queues.

There are interpreters.Queue objects, but they only proxy the actual data structure: an unbounded FIFO queue that exists outside any one interpreter. These queues have special accommodations for safely passing object data between interpreters, without violating interpreter isolation. This includes thread-safety.

As with other queues in Python, for each “put” the object is added to the back and each “get” pops the next one off the front. Every added object will be popped off in the order it was pushed on.

Any object that can be pickled may be sent through an interpreters.Queue.

Note that the actual objects aren’t sent, but rather their underlying data is sent. The resulting object is strictly equivalent to the original. For most objects the underlying data is serialized (e.g. pickled). In a few cases, like with memoryview, the underlying data is sent (and shared) without serialization. See Shareable Objects.

The module defines the following functions:

create_queue(maxsize=0) -> Queue
Create a new queue. If the maxsize is zero or negative then the queue is unbounded.

Queue Objects

interpreters.Queue objects act as proxies for the underlying cross-interpreter-safe queues exposed by the interpreters module. Each Queue object represents the queue with the corresponding unique ID. There will only be one object for any given queue.

Queue implements all the methods of queue.Queue except for task_done() and join(), hence it is similar to asyncio.Queue and multiprocessing.Queue.

Attributes and methods:

id
(read-only) A non-negative int that identifies the corresponding cross-interpreter queue. Conceptually, this is similar to the file descriptor used for a pipe.
maxsize
(read-only) Number of items allowed in the queue. Zero means “unbounded”.
__hash__()
Return the hash of the queue’s id. This is the same as the hash of the ID’s integer value.
empty()
Return True if the queue is empty, False otherwise.
This is only a snapshot of the state at the time of the call. Other threads or interpreters may cause this to change.
full()
Return True if there are maxsize items in the queue.
If the queue was initialized with maxsize=0 (the default), then full() never returns True.
This is only a snapshot of the state at the time of the call. Other threads or interpreters may cause this to change.
qsize()
Return the number of items in the queue.
This is only a snapshot of the state at the time of the call. Other threads or interpreters may cause this to change.
put(obj, timeout=None)
Add the object to the queue.
If maxsize > 0 and the queue is full then this blocks until a free slot is available. If timeout is a positive number then it blocks for at most that many seconds and then raises interpreters.QueueFull. Otherwise it blocks forever.
Nearly all objects can be sent through the queue. In a few cases, like with memoryview, the underlying data is actually shared, rather than just copied. See Shareable Objects.
If an object is still in the queue, and the interpreter which put it in the queue (i.e. to which it belongs) is destroyed, then the object is immediately removed from the queue. (We may later add an option to replace the removed object in the queue with a sentinel or to raise an exception for the corresponding get() call.)
put_nowait(obj)
Like put() but effectively with an immediate timeout. Thus if the queue is full, it immediately raises interpreters.QueueFull.
get(timeout=None) -> object
Pop the next object from the queue and return it. Block while the queue is empty. If a positive timeout is provided and an object hasn’t been added to the queue in that many seconds then raise interpreters.QueueEmpty.
get_nowait() -> object
Like get(), but do not block. If the queue is not empty then return the next item. Otherwise, raise interpreters.QueueEmpty.
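Since these methods mirror queue.Queue, the blocking and timeout behavior can be previewed with the stdlib class. The sketch below uses queue.Queue purely as a stand-in; it is not interpreters.Queue, and it raises queue.Full and queue.Empty rather than the interpreters variants.

```python
import queue

# Stand-in only: a bounded stdlib queue, not a cross-interpreter queue.
q = queue.Queue(maxsize=2)
q.put("first")
q.put("second")

try:
    # The queue is full, so this blocks for the timeout and then gives up.
    q.put("third", timeout=0.05)
except queue.Full:
    put_outcome = "full"
else:
    put_outcome = "ok"

# Items come back in the order they were added (FIFO).
order = [q.get_nowait(), q.get_nowait()]

try:
    # Nothing is left, so a timed get() gives up as well.
    q.get(timeout=0.05)
except queue.Empty:
    get_outcome = "empty"
else:
    get_outcome = "ok"

print(put_outcome, order, get_outcome)
```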

Shareable Objects

A “shareable” object is one which may be passed from one interpreter to another. The object is not actually directly shared by the interpreters. However, the shared object should be treated as though it were shared directly, with caveats for mutability.

All objects that can be pickled are shareable. Thus, nearly every object is shareable. interpreters.Queue objects are also shareable.

In nearly every case where an object is sent to an interpreter, whether with interp.prepare_main() or queue.put(), the actual object is not sent. Instead, the object’s underlying data is sent. For most objects the object is pickled and the receiving interpreter unpickles it.

A notable exception is objects which implement the “buffer” protocol, like memoryview. Their underlying Py_buffer is actually shared between interpreters. interp.prepare_main() and queue.get() wrap the buffer in a new memoryview object.

For most mutable objects, when one is sent to another interpreter, it is copied. Thus any changes to the original or to the copy will never be synchronized to the other. Mutable objects shared through pickling fall into this category. However, interpreters.Queue and objects that implement the buffer protocol are notable cases where the underlying data is shared between interpreters, so objects stay synchronized.
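The difference between a pickled copy and a shared buffer can be sketched within a single interpreter. Here a pickle round trip stands in for sending an object to another interpreter, and a second memoryview over the same bytearray stands in for the view the receiving side would get. Both stand-ins are assumptions made for illustration.

```python
import pickle

# Pickled objects: the receiver gets an equal but independent copy.
original = {"values": [1, 2, 3]}
received = pickle.loads(pickle.dumps(original))
equal_on_arrival = received == original
received["values"].append(4)  # does not affect the original

# Buffer objects: both views refer to the same underlying memory.
backing = bytearray(b"abcd")
sender_view = memoryview(backing)
receiver_view = memoryview(backing)
sender_view[0] = ord("z")  # visible through the other view

print(original["values"], bytes(receiver_view))
```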

When interpreters genuinely share mutable data there is always a risk of data races. Cross-interpreter safety, including thread-safety, is a fundamental feature of interpreters.Queue.

However, the buffer protocol (i.e. Py_buffer) does not have any native accommodations against data races. Instead, the user is responsible for managing thread-safety, whether passing a token back and forth through a queue to indicate safety (see Synchronization), or by assigning sub-range exclusivity to individual interpreters.
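Sub-range exclusivity can be sketched with ordinary threads standing in for interpreters: each worker is handed a disjoint index range of one shared buffer, so no two workers ever write to the same index. The fill() helper and the chosen ranges are illustrative assumptions.

```python
import threading

# One writable buffer shared by all workers.
buf = memoryview(bytearray(8))


def fill(view, start, stop, value):
    """Write value into view[start:stop], the range owned by this worker."""
    for i in range(start, stop):
        view[i] = value


# Disjoint (start, stop, value) assignments: no index belongs to two workers.
assignments = [(0, 4, 1), (4, 8, 2)]
workers = [
    threading.Thread(target=fill, args=(buf, start, stop, value))
    for start, stop, value in assignments
]
for w in workers:
    w.start()
for w in workers:
    w.join()

result = bytes(buf)
print(result)
```

No lock is needed here only because the ranges never overlap; overlapping ranges would need a token or similar coordination.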

Most objects will be shared through queues (interpreters.Queue), as interpreters communicate information between each other. Less frequently, objects will be shared through prepare_main() to set up an interpreter prior to running code in it. However, prepare_main() is the primary way that queues are shared, to provide another interpreter with a means of further communication.

Synchronization

There are situations where two interpreters should be synchronized.That may involve sharing a resource, worker management, or preservingsequential consistency.

In threaded programming the typical synchronization primitives are types like mutexes. The threading module exposes several. However, interpreters cannot share objects which means they cannot share threading.Lock objects.

The interpreters module does not provide any such dedicated synchronization primitives. Instead, interpreters.Queue objects provide everything one might need.

For example, if there’s a shared resource that needs managed access then a queue may be used to manage it, where the interpreters pass an object around to indicate who can use the resource:

    import interpreters
    import threading
    from mymodule import load_big_data, check_data

    numworkers = 10
    control = interpreters.create_queue()
    data = memoryview(load_big_data())

    def worker():
        interp = interpreters.create()
        interp.prepare_main(control=control, data=data)
        interp.exec("""if True:
            from mymodule import edit_data
            while True:
                # Only the holder of the token may touch the data.
                token = control.get()
                edit_data(data)
                control.put(token)
            """)
    threads = [threading.Thread(target=worker) for _ in range(numworkers)]
    for t in threads:
        t.start()

    token = 'football'
    control.put(token)
    while True:
        control.get()
        if not check_data(data):
            break
        control.put(token)

Exceptions

InterpreterError
Indicates that some interpreter-related failure occurred.
This exception is a subclass of Exception.
InterpreterNotFoundError
Raised from Interpreter methods after the underlying interpreter has been destroyed, e.g. via the C-API.
This exception is a subclass of InterpreterError.
ExecutionFailed
Raised from Interpreter.exec() and Interpreter.call() when there’s an uncaught exception. The error display for this exception includes the traceback of the uncaught exception, which gets shown after the normal error display, much like happens for ExceptionGroup.
Attributes:
- type - a representation of the original exception’s class, with __name__, __module__, and __qualname__ attrs.
- msg - str(exc) of the original exception
- snapshot - a traceback.TracebackException object for the original exception
This exception is a subclass of InterpreterError.
QueueError
Indicates that some queue-related failure occurred.
This exception is a subclass of Exception.
QueueNotFoundError
Raised from interpreters.Queue methods after the underlying queue has been destroyed.
This exception is a subclass of QueueError.
QueueEmpty
Raised from Queue.get() (or get_nowait() with no default) when the queue is empty.
This exception is a subclass of both QueueError and the stdlib queue.Empty.
QueueFull
Raised from Queue.put() (with a timeout) or put_nowait() when the queue is already at its max size.
This exception is a subclass of both QueueError and the stdlib queue.Full.
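The dual inheritance of the queue exceptions means existing code written against the stdlib queue module keeps working. The sketch below defines local stand-in classes to show the effect. They are not the real interpreters classes, and it assumes QueueFull pairs with queue.Full in the same way QueueEmpty pairs with queue.Empty.

```python
import queue


class QueueError(Exception):
    """Stand-in for the base class of queue-related failures."""


class QueueEmpty(QueueError, queue.Empty):
    """Stand-in: also an instance of the stdlib queue.Empty."""


class QueueFull(QueueError, queue.Full):
    """Stand-in (assumed pairing): also an instance of queue.Full."""


def get_or_default(getter, default=None):
    """Call getter(); handler written only against the stdlib exception."""
    try:
        return getter()
    except queue.Empty:
        return default


def always_empty():
    raise QueueEmpty("nothing queued")


print(get_or_default(always_empty, "fallback"))
```

Code that wants to treat all cross-interpreter queue failures alike can instead catch QueueError.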

InterpreterPoolExecutor

Along with the new interpreters module, there will be a new concurrent.futures.InterpreterPoolExecutor. It will be a derivative of ThreadPoolExecutor, where each worker executes in its own thread, but each with its own subinterpreter.

Like the other executors, InterpreterPoolExecutor will support callables for tasks, and for the initializer. Also like the other executors, the arguments in both cases will be mostly unrestricted. The callables and arguments will typically be serialized when sent to a worker’s interpreter, e.g. with pickle, like how the ProcessPoolExecutor works. This contrasts with Interpreter.call(), which will (at least initially) be much more restricted.

Communication between workers, or between the executor (or generally its interpreter) and the workers, may still be done through interpreters.Queue objects, set with the initializer.
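Because the executor shares the Executor interface, using it might look like the sketch below. It assumes InterpreterPoolExecutor is importable from concurrent.futures and falls back to ThreadPoolExecutor when it is not; the squares() helper is hypothetical. The builtin pow is used as the task since it can be pickled by reference.

```python
try:
    # Assumption: available on Python versions that ship this executor.
    from concurrent.futures import InterpreterPoolExecutor as Executor
except ImportError:
    # Fallback with the same interface, but without interpreter isolation.
    from concurrent.futures import ThreadPoolExecutor as Executor


def squares(values):
    """Square each value in a pool of workers, preserving input order."""
    with Executor(max_workers=2) as executor:
        return list(executor.map(pow, values, [2] * len(values)))


print(squares([2, 3, 4]))
```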

sys.implementation.supports_isolated_interpreters

Python implementations are not required to support subinterpreters, though most major ones do. If an implementation does support them then sys.implementation.supports_isolated_interpreters will be set to True. Otherwise it will be False. If the feature is not supported then importing the interpreters module will raise an ImportError.
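A feature check along these lines might look like the following sketch. The helper names are hypothetical, the getattr() default covers Python versions that predate the attribute, and the import path concurrent.interpreters is an assumption based on the accepted module name.

```python
import sys


def isolated_interpreters_supported():
    """Report the implementation flag, treating a missing flag as False."""
    return bool(
        getattr(sys.implementation, "supports_isolated_interpreters", False)
    )


def import_interpreters():
    """Return the interpreters module, or None when it is unavailable."""
    if not isolated_interpreters_supported():
        return None
    try:
        from concurrent import interpreters  # assumed import path
    except ImportError:
        return None
    return interpreters


print(isolated_interpreters_supported())
```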

Examples

The following examples demonstrate practical cases where multiple interpreters may be useful.

Example 1:

There’s a stream of requests coming in that will be handled via workers in sub-threads.

each worker thread has its own interpreter
there’s one queue to send tasks to workers and another queue to return results
the results are handled in a dedicated thread
each worker keeps going until it receives a “stop” sentinel (None)
the results handler keeps going until all workers have stopped

    import interpreters
    import threading
    from mymodule import iter_requests, handle_result

    tasks = interpreters.create_queue()
    results = interpreters.create_queue()

    numworkers = 20
    threads = []

    def results_handler():
        running = numworkers
        while running:
            try:
                res = results.get(timeout=0.1)
            except interpreters.QueueEmpty:
                # No workers have finished a request since last time.
                pass
            else:
                if res is None:
                    # A worker has stopped.
                    running -= 1
                else:
                    handle_result(res)
        empty = object()
        assert results.get_nowait(empty) is empty
    threads.append(threading.Thread(target=results_handler))

    def worker():
        interp = interpreters.create()
        interp.prepare_main(tasks=tasks, results=results)
        interp.exec("""if True:
            from mymodule import handle_request, capture_exception

            while True:
                req = tasks.get()
                if req is None:
                    # Stop!
                    break
                try:
                    res = handle_request(req)
                except Exception as exc:
                    res = capture_exception(exc)
                results.put(res)
            # Notify the results handler.
            results.put(None)
            """)
    threads.extend(threading.Thread(target=worker) for _ in range(numworkers))

    for t in threads:
        t.start()

    for req in iter_requests():
        tasks.put(req)
    # Send the "stop" signal.
    for _ in range(numworkers):
        tasks.put(None)

    for t in threads:
        t.join()

Example 2:

This case is similar to the last as there are a bunch of workers in sub-threads. However, this time the code is chunking up a big array of data, where each worker processes one chunk at a time. Copying that data to each interpreter would be exceptionally inefficient, so the code takes advantage of directly sharing memoryview buffers.

all the interpreters share the buffer of the source array
each one writes its results to a second shared buffer
there’s a queue used to send tasks to workers
only one worker will ever read any given index in the source array
only one worker will ever write to any given index in the results (this is how it ensures thread-safety)

    import interpreters
    import threading
    from mymodule import read_large_data_set, use_results

    numworkers = 3
    data, chunksize = read_large_data_set()
    buf = memoryview(data)
    # Ceiling division, so a trailing partial chunk is counted.
    numchunks = (len(buf) + chunksize - 1) // chunksize
    # A bytearray gives the workers a writable results buffer.
    results = memoryview(bytearray(numchunks))

    tasks = interpreters.create_queue()

    def worker():
        interp = interpreters.create()
        interp.prepare_main(data=buf, results=results, tasks=tasks)
        interp.exec("""if True:
            from mymodule import reduce_chunk

            while True:
                req = tasks.get()
                if req is None:
                    # Stop!
                    break
                resindex, start, end = req
                chunk = data[start: end]
                res = reduce_chunk(chunk)
                results[resindex] = res
            """)
    threads = [threading.Thread(target=worker) for _ in range(numworkers)]
    for t in threads:
        t.start()

    for i in range(numchunks):
        # Assume there's at least one worker running still.
        start = i * chunksize
        end = start + chunksize
        if end > len(buf):
            end = len(buf)
        # Field order matches the unpacking in the worker.
        tasks.put((i, start, end))
    # Send the "stop" signal.
    for _ in range(numworkers):
        tasks.put(None)

    for t in threads:
        t.join()

    use_results(results)

Rationale

A Minimal API

Since the core dev team has no real experience with how users will make use of multiple interpreters in Python code, this proposal purposefully keeps the initial API as lean and minimal as possible. The objective is to provide a well-considered foundation on which further (more advanced) functionality may be added later, as appropriate.

That said, the proposed design incorporates lessons learned from existing use of subinterpreters by the community, from existing stdlib modules, and from other programming languages. It also factors in experience from using subinterpreters in the CPython test suite and using them in concurrency benchmarks.

create(), create_queue()

Typically, users call a type to create instances of the type, at which point the object’s resources get provisioned. The interpreters module takes a different approach, where users must call create() to get a new interpreter or create_queue() for a new queue. Calling interpreters.Interpreter() directly only returns a wrapper around an existing interpreter (likewise for interpreters.Queue()).

This is because interpreters (and queues) are special resources. They exist globally in the process and are not managed/owned by the current interpreter. Thus the interpreters module makes creating an interpreter (or queue) a visibly distinct operation from creating an instance of interpreters.Interpreter (or interpreters.Queue).

Interpreter.prepare_main() Sets Multiple Variables

prepare_main() may be seen as a setter function of sorts. It supports setting multiple names at once, e.g. interp.prepare_main(spam=1, eggs=2), whereas most setters set one item at a time. The main reason is for efficiency.

To set a value in the interpreter’s __main__.__dict__, the implementation must first switch the OS thread to the identified interpreter, which involves some non-negligible overhead. After setting the value it must switch back. Furthermore, there is some additional overhead to the mechanism by which it passes objects between interpreters, which can be reduced in aggregate if multiple values are set at once.

Therefore, prepare_main() supports setting multiple values at once.

Propagating Exceptions

An uncaught exception from a subinterpreter, via Interpreter.exec(), could either be (effectively) ignored, like threading.Thread() does, or propagated, like the builtin exec() does. Since Interpreter.exec() is a synchronous operation, like the builtin exec(), uncaught exceptions are propagated.

However, such exceptions are not raised directly. That’s because interpreters are isolated from each other and must not share objects, including exceptions. That could be addressed by raising a surrogate of the exception, whether a summary, a copy, or a proxy that wraps it. Any of those could preserve the traceback, which is useful for debugging. The ExecutionFailed that gets raised is such a surrogate.

There’s another concern to consider. If a propagated exception isn’t immediately caught, it will bubble up through the call stack until caught (or not). In the case that code somewhere else may catch it, it is helpful to identify that the exception came from a subinterpreter (i.e. a “remote” source), rather than from the current interpreter. That’s why Interpreter.exec() raises ExecutionFailed and why it is a plain Exception, rather than a copy or proxy with a class that matches the original exception. For example, an uncaught ValueError from a subinterpreter would never get caught in a later try: ... except ValueError: .... Instead, ExecutionFailed must be handled directly.
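The surrogate idea can be sketched in a single interpreter. RemoteFailure below is a hypothetical stand-in for ExecutionFailed, and run_remote() stands in for Interpreter.exec(). The point is only that a later except ValueError clause does not match, while the traceback snapshot is still available.

```python
import traceback


class RemoteFailure(Exception):
    """Hypothetical surrogate: a summary of an exception raised elsewhere."""

    def __init__(self, exc):
        self.type_name = type(exc).__name__
        self.msg = str(exc)
        # A snapshot of the traceback that holds no reference to exc itself.
        self.snapshot = traceback.TracebackException.from_exception(exc)
        super().__init__(f"{self.type_name}: {self.msg}")


def run_remote(func):
    """Stand-in for running code elsewhere and propagating a surrogate."""
    try:
        func()
    except Exception as exc:
        raise RemoteFailure(exc) from None


def bad():
    raise ValueError("bad input")


caught = None
try:
    run_remote(bad)
except ValueError:
    outcome = "caught ValueError"
except RemoteFailure as failure:
    caught = failure
    outcome = f"caught surrogate for {failure.type_name}"

print(outcome)
```

Handlers that need details of the original error have to read them from the surrogate, since the original exception object never reaches the caller.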

In contrast, exceptions propagated from Interpreter.call() do not involve ExecutionFailed but are raised directly, as though originating in the calling interpreter. This is because Interpreter.call() is a higher level method that uses pickle to support objects that can’t normally be passed between interpreters.

Objects vs. ID Proxies

For both interpreters and queues, the low-level module makes use of proxy objects that expose the underlying state by their corresponding process-global IDs. In both cases the state is likewise process-global and will be used by multiple interpreters. Thus they aren’t suitable to be implemented as PyObject, which is only really an option for interpreter-specific data. That’s why the interpreters module instead provides objects that are weakly associated through the ID.

Rejected Ideas

See PEP 554.

Copyright

This document is placed in the public domain or under the CC0-1.0-Universal license, whichever is more permissive.

Source: https://github.com/python/peps/blob/main/peps/pep-0734.rst

Last modified: 2025-07-06 09:38:43 GMT
