Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork34.3k
gh-146302: make Py_IsInitialized() thread-safe and reflect true init completion#146303
gh-146302: make Py_IsInitialized() thread-safe and reflect true init completion#146303gpshead wants to merge 7 commits intopython:mainfrom
Conversation
… init completionPy_IsInitialized() previously returned 1 before Py_InitializeEx() hadfinished, because the runtime flag was set before site.py was imported.This caused a race in embedders like PyO3 where a second thread couldobserve an initialized interpreter before sys.path was fully configured.Move the initialized=1 store to the end of init_interp_main(), aftersite import, lazy imports, tier 2 optimizer, and dict watcher setup.Access both `initialized` and `core_initialized` through new inlineaccessors that use acquire/release atomics, eliminating the C-standarddata race and ensuring correct visibility on weakly-orderedarchitectures.Fix PySys_AddAuditHook() to check core_initialized (not initialized)when deciding whether a thread state is available, since tstate existsafter core init completes.Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| .. versionchanged:: next | ||
| This function no longer returns true until initialization has fully | ||
| completed, including import of the :mod:`site` module. Previously it | ||
| could return true while :c:func:`Py_Initialize` was still running. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
this is technically a bug fix, we don't always call these out in the main docs. i'm happy dropping this, but it feels worthwhile to mention for embedders likely to use this API. and if we choose to backport this as a bugfix it'll include proper 3.14.x and 3.13.x attribution to indicate when people need to work around it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
It is fine to add versionchanged when an important behavior needed to be pointed out in a bugfix release so I don't mind keeping this here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
However considering what it changed I think it is best to keep it in 3.15 only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Agreed, I'm fine not backporting this. It's rare enough and PyO3 already has a workaround.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| } | ||
| Py_XDECREF(warnoptions); | ||
| interp->runtime->initialized=1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Moving this down after site and the rest of the machinery is the part I'm most interested in seeing tested. Intuitively I would not expect any code site may import to ever care about Py_IsInitialized() itself as extension module code "should" never have a reason to use that API as it shouldn't be called outside of an interpreter.
But this is the one thing in this change that might be an observable behavior change. Even if we don't agree with someone relying on the existing behavior.
The docs say audit hooks are notified "after runtime initialization".With the flag move, initialized=0 throughout init (including siteimport), so hooks are correctly not invoked during any init phase.Using core_initialized would have expanded the invocation windowbeyond the original behavior.Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use a C audit hook registered before Py_Initialize() to observe thevalue of Py_IsInitialized() when the "import" event fires for "site".With the old flag position this would have returned 1; afterpythongh-146302it correctly returns 0.Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1cf04e0 todafd62eCompareUse _Py_IsCoreInitialized() in preconfig.c and Py_IsInitialized() inPy_InitializeEx(), removing the unnecessary runtime local variable.Thanks picnixz!
dafd62e toba28b37Compare
picnixz left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
OOC there are lots of boolean flags that are still bare ints. Are they all protected against races or?
Uh oh!
There was an error while loading.Please reload this page.
gpshead commentedMar 22, 2026 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
probably not technically... I doubt the data race potential honestly exposes itself much in this area in practice as filling these in happens so much sooner than most anything that could be consuming them that all memory writes long since land. Py_IsInitialized() being a reader intended for concurrent access feels like an uncommon case. But I didn't try to analyze the others. asking Opus 4.6 to analyze their use for races (my summary): [which makes sense to me] |
bedevere-bot commentedMar 22, 2026
🤖 New build scheduled with the buildbot fleet by@gpshead for commitabae231 🤖 Results will be shown at: https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F146303%2Fmerge If you want to schedule another build, you need to add the🔨 test-with-buildbots label again. |
davidhewitt commentedMar 22, 2026 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Thanks for doing this for PyO3's sake. Does thread-safe here mean that PyO3 could in principle freely call (I appreciate what PyO3 is doing is slightly weird, because there's no way to reliably know which one of those threads becomes the "main thread". It's good enough for Rust test suites, at least!) |
davidhewitt commentedMar 22, 2026
Sorry, I see I misread the title - this is about |
gpshead commentedMar 22, 2026
No, CPython does not have a block until initialization completes API. That might make sense as a feature request on its own but I'd wonder what the practical motivation for it was and why we'd want people to do it. |
davidhewitt commentedMar 22, 2026
I think it's fine to leave that synchronization downstream in PyO3 rather than complicate the C API. I think in most embedding contexts you want to be deliberate about interpreter lifecycle; what PyO3 does with the concurrent initialization is a quirk of Rust test suites launching multiple threads with no particular need for any proper finalization. |
gpshead commentedMar 22, 2026
Agreed. We currently have a variety of C API surfaces (like Initialize and IsInterpreter) that are easier when embedding processes take a wide "hey i'm going to have a Python interpreter, let me go ahead and set that up once up front instead of having everything that wants to use it try to do that at time of first use" approach. That is not really practical for all situations. Your Rust example of using one from some test cases is a good example. There are others, but they might expand into a larger C API and one interpreter vs multiple interpreters vs subinterpreters question. Imagine two independent libraries linked into a process unaware of each-other both wanting to use an embedded Python interpreter for internal use. As independent code, they have nogood way to coordinate in a multithreaded context where they might both be trying at once. Which still ignores the question of if they need isolated interpreters or not. All of that could turn into a discuss.python.org thread... |
Uh oh!
There was an error while loading.Please reload this page.
Summary
runtime->initialized = 1store from beforesite.pyimport to the end ofinit_interp_main(), soPy_IsInitialized()only returns true after initialization has fully completedinitializedandcore_initializedthrough new inline accessors using acquire/release atomics, to also protect from data race undefined behaviorPySys_AddAuditHook()now uses the accessor, so with the flag move it correctly skips audit hook invocation during all init phases (matching the documented "after runtime initialization" behavior) ... We could argue that running these earlier would be good even if the intent was never explicitly expressed, but that'd be its own issue.Motivation
Py_IsInitialized()returned 1 whilePy_InitializeEx()was still running — specifically, beforesite.pyhad been imported. SeePyO3/pyo3#5900 where a second thread could acquire the GIL and start executing Python with an incompletesys.pathbecausesite.pyhadn't finished.The flag was also a plain
intwith no atomic operations, making concurrent reads a C-standard data race, though unlikely to manifest.Regression test:
The added test properly fails on
mainwithERROR: Py_IsInitialized() was true during site import.Py_IsInitialized()can return 1 beforePy_InitializeEx()has completed. #146302Fixes#146302
📚 Documentation preview 📚:https://cpython-previews--146303.org.readthedocs.build/