Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-99113: Make Sure the GIL is Acquired at the Right Places#104208

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged

Conversation

ericsnowcurrently
Copy link
Member

@ericsnowcurrentlyericsnowcurrently commentedMay 5, 2023
edited
Loading

This is a pre-requisite for a per-interpreter GIL. Without it this change isn't strictly necessary. However, there is no downside otherwise.

(This change is broken out fromgh-99114.)

@Eclips4
Copy link
Member

Eclips4 commentedMay 5, 2023
edited
Loading

There's a failure, I can reproduce it on Windows. (originally taken fromtest_xxsubinterpreters/test_in_thread)
Minimal reproducible example:

importthreadingimport_xxsubinterpretersasinterpretersdeffoo():t=threading.Thread(target=interpreters.create)t.start()t.join()foo()foo()

Traceback:

C:\Users\KIRILL-1\CLionProjects\cpython> ./pythonexample.pyRunningDebug|x64interpreter...FatalPythonerror:drop_gil:drop_gil:GILisnotlockedPythonruntimestate:initializedCurrentthread0x00001900 (mostrecentcallfirst):<noPythonframe>
ericsnowcurrently reacted with thumbs up emoji

@ericsnowcurrently
Copy link
MemberAuthor

@Eclips4, do you get the same failure with#99114?

@ericsnowcurrently
Copy link
MemberAuthor

@Eclips4, I've rebased this branch. Do you still get the crash?

@ericsnowcurrentlyericsnowcurrently marked this pull request as ready for reviewMay 5, 2023 22:03
@Eclips4
Copy link
Member

Eclips4 commentedMay 6, 2023
edited
Loading

Hello Eric, sorry for the wait
Tried on commit2404310dea7c14a59fba9a7ffae4f203005bbf4f.
Same error:

PSC:\Users\KIRILL-1\CLionProjects\cpython> ./pythonexample.pyRunningDebug|x64interpreter...FatalPythonerror:drop_gil:drop_gil:GILisnotlockedPythonruntimestate:initializedCurrentthread0x00003a74 (mostrecentcallfirst):<noPythonframe>
ericsnowcurrently reacted with thumbs up emoji

@Eclips4
Copy link
Member

It's kinda interesting why it fails only on Windows. Maybe try build with buildbots?

@ericsnowcurrently
Copy link
MemberAuthor

Ah, looks like I was releasing the GIL unnecessarily innew_interpreter() (in pystate.c).

@ericsnowcurrently
Copy link
MemberAuthor

FTR, there's the stack trace before my fix:

(expand)
 KernelBase.dll!00007ff911127b02()Unknown>python312_d.dll!fatal_error_exit(int status) Line 2677C python312_d.dll!fatal_error(int fd, int header, const char * prefix, const char * msg, int status) Line 2859C python312_d.dll!_Py_FatalErrorFunc(const char * func, const char * msg) Line 2875C python312_d.dll!drop_gil(_ceval_state * ceval, _ts * tstate) Line 287C python312_d.dll!_PyEval_ReleaseLock(_ts * tstate) Line 625C python312_d.dll!_PyThreadState_Swap(pyruntimestate * runtime, _ts * newts) Line 1912C python312_d.dll!_extensions_cache_set(_object * filename, _object * name, PyModuleDef * def) Line 947C python312_d.dll!fix_up_extension(_object * mod, _object * name, _object * filename) Line 1177C python312_d.dll!_PyImport_FixupBuiltin(_object * mod, const char * name, _object * modules) Line 1310C python312_d.dll!_PySys_Create(_ts * tstate, _object * * sysmod_p) Line 3471C python312_d.dll!pycore_interp_init(_ts * tstate) Line 852C python312_d.dll!new_interpreter(_ts * * tstate_p, const PyInterpreterConfig * config) Line 2066C python312_d.dll!Py_NewInterpreterFromConfig(_ts * * tstate_p, const PyInterpreterConfig * config) Line 2101C python312_d.dll!interp_create(_object * self, _object * args, _object * kwds) Line 522C python312_d.dll!cfunction_call(_object * func, _object * args, _object * kwargs) Line 537C python312_d.dll!_PyObject_Call(_ts * tstate, _object * callable, _object * args, _object * kwargs) Line 367C python312_d.dll!PyObject_Call(_object * callable, _object * args, _object * kwargs) Line 380C python312_d.dll!_PyEval_EvalFrameDefault(_ts * tstate, _PyInterpreterFrame * frame, int throwflag) Line 3125C python312_d.dll!_PyEval_Vector(_ts * tstate, PyFunctionObject * func, _object * locals, _object * const * args, unsigned __int64 argcount, _object * kwnames) Line 1576C python312_d.dll!_PyFunction_Vectorcall(_object * func, _object * const * stack, unsigned __int64 nargsf, _object * kwnames) Line 419C python312_d.dll!_PyObject_VectorcallTstate(_ts * tstate, _object * callable, _object * const * args, unsigned __int64 nargsf, _object * kwnames) Line 92C python312_d.dll!method_vectorcall(_object * method, _object * const * args, unsigned __int64 nargsf, _object * kwnames) Line 67C python312_d.dll!_PyVectorcall_Call(_ts * tstate, _object *(*)(_object *, _object * const *, unsigned __int64, _object *) func, _object * callable, _object * tuple, _object * kwargs) Line 271C python312_d.dll!_PyObject_Call(_ts * tstate, _object * callable, _object * args, _object * kwargs) Line 354C python312_d.dll!PyObject_Call(_object * callable, _object * args, _object * kwargs) Line 380C python312_d.dll!thread_run(void * boot_raw) Line 1081C python312_d.dll!bootstrap(void * call) Line 182C [External Code]

@Eclips4
Copy link
Member

Yeah, after last commit error will go away.

ericsnowcurrently reacted with heart emoji

@ericsnowcurrently
Copy link
MemberAuthor

ericsnowcurrently commentedMay 6, 2023
edited
Loading

It's kinda interesting why it fails only on Windows. Maybe try build with buildbots?

That is interesting. I figured it would be worth exploring why that unexpected situation happened. Below is my analysis.

tl;dr in addition to dropping that_PyEval_ReleaseThread() call, there was a bug in_PyEval_InitGIL() that needs fixing.

(So thanks for pointing this out.)

Analysis

(expand)

The immediate problem was that we were trying to release the GIL when it wasn't held (but did already exist). This was happening, specifically, in_PyThreadState_Swap() via thepycore_interp_init() call right afterinit_interp_create_gil() call innew_interpreter() (in pystate.c). I expected that, wheninit_interp_create_gil() returned, the current thread would hold the GIL, regardless of whether or not the GIL was released earlier (which apparently we were doing).

This may mean I broke that expectation somewhere with one of my relatively recent commits (or in this PR). Or it might be a long-standing bug that I exposed with some new branch in the code.

Regardless,the failure implies broken expectationssomewhere. Here are the relevant expectations I thought of, particularly relative to the failure and leading up to the call to_PySys_Create():

  1. ✅the "current" thread state whennew_interpreter() (in pylifecycle.c) gets called is consistent (whetherNULL or not) across platforms
  2. ✅innew_interpreter(), the "current" thread state should not be change anywhere before the_PyThreadState_SwapNoGIL()
  3. _PyThreadState_SwapNoGIL() sets the "current" thread state correctly and returns the previous one
  4. ✅after the first_PyThreadState_SwapNoGIL() call,save_tstate is consistent across platforms
  5. ✅the current thread state does not change again innew_interpreter() (except at the end if it fails)
  6. _PyEval_ReleaseLock() always releases the GIL if held (and crashes otherwise)
  7. the call to_PyConfig_Copy() does not need (or touch) the GIL
  8. ✅the call toinit_interp_settings() does not need (or touch) the GIL
  9. ✅theinit_interp_create_gil() call correctly setsinterp->ceval.gil
  10. ✅it makes sure the GIL is held
  11. ✅whenpycore_interp_init() is called, the GIL is held
  12. pycore_init_global_objects() does not release the GIL
  13. _PyGC_Init() does not release the GIL
  14. _Py_Deepfreeze_Init() does not release the GIL
  15. pycore_init_types() does not release the GIL
  16. _PyWarnings_InitState() does not release the GIL
  17. _PyAtExit_Init() does not release the GIL

I checked each of these in turn, comparing on linux (where I develop) and Windows (where the failure happened), and the expectations were valid. However, in two places the GIL changed state unexpectedly (noted above without a check mark). This only happened on Windows.

I'm pretty sure that the following happened:

  • during the_PyConfig_Copy() call, the main thread re-acquired the GIL (since we had just released it in the sub-thread)
  • during the_Py_Deepfreeze_Init() call, the main interpreter released the GIL again when it started up the second sub-thread

Why did this only happen on Windows. I expect (but have not verified) that the semantics of releasing and re-acquiring the GIL in the eval loop are different on Windows and linux.

Conclusion

All that demonstrates two issues (one of which I've already addressed):

  1. the_PyEval_ReleaseLock() call wasn't necessary (but did reveal the actual problem)
  2. in_PyEval_InitGIL(), we skip acquiring the GIL if it's already held, but we don't make sure it's held by the current thread

The solution for_PyEval_InitGIL() is to checkmain_interp->ceval.gil->last_holder. I'll do that.

Eclips4 reacted with thumbs up emoji

@ericsnowcurrentlyericsnowcurrently merged commit92d8bff intopython:mainMay 6, 2023
@ericsnowcurrentlyericsnowcurrently deleted the fix-acquiring-gil branchMay 6, 2023 21:59
jbower-fb pushed a commit to jbower-fb/cpython that referenced this pull requestMay 8, 2023
…thongh-104208)This is a pre-requisite for a per-interpreter GIL.  Without it this change isn't strictly necessary.  However, there is no real downside otherwise.
@asottile
Copy link
Contributor

@ericsnowcurrently a bit of a long shot and way late to the party -- but I've recently bisected uwsgi (which deadlocks on python 3.12 only after a reload) and the bisection claims this commit as the culprit -- you can find more analysis inunbit/uwsgi#2659

my guess without debugging much yet islast_holder isn't forksafe? (need to poke around in gdb some more though)

@asottile
Copy link
Contributor

I believe there's an oversight in this patch which changes the behaviour ofPyThreadState_Swap -- previously it did not attempt to acquire the GIL at the end but now it does.

this patch "fixes" my problem:

diff --git a/Python/pystate.c b/Python/pystate.cindex f14934361da..218c08a8528 100644--- a/Python/pystate.c+++ b/Python/pystate.c@@ -1919,7 +1919,7 @@ _PyThreadState_Swap(_PyRuntimeState *runtime, PyThreadState *newts) PyThreadState * PyThreadState_Swap(PyThreadState *newts) {-    return _PyThreadState_Swap(&_PyRuntime, newts);+    return _PyThreadState_SwapNoGIL(newts); }

asottile added a commit to asottile/cpython that referenced this pull requestAug 16, 2024
inpython#104208 it was inadvertently (?) changed to acquire the GIL
@asottile
Copy link
Contributor

I opened#123079 with the potential fix

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@erlend-aaslanderlend-aaslanderlend-aasland left review comments

Assignees
No one assigned
Labels
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

4 participants
@ericsnowcurrently@Eclips4@asottile@erlend-aasland

[8]ページ先頭

©2009-2025 Movatter.jp