gh-87135: Hang non-main threads that attempt to acquire the GIL during finalization #28525


Closed
jbms wants to merge 11 commits into python:main from jbms:fix-issue-42969

Conversation

@jbms (Contributor) commented Sep 22, 2021 (edited by AlexWaygood)

@the-knights-who-say-ni

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept this contribution by verifying everyone involved has signed the PSF contributor agreement (CLA).

CLA Missing

Our records indicate the following people have not signed the CLA:

@jbms

For legal reasons we need all the people listed to sign the CLA before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

If you have recently signed the CLA, please wait at least one business day before our records are updated.

You can check yourself to see if the CLA has been received.

Thanks again for the contribution, we look forward to reviewing it!

@jbms (Contributor, Author)

Can this possibly go into a point release? I think it would be unfortunate for this bug to remain until 3.11, even though presumably it has been around since forever.

@gpshead (Member) commented Sep 23, 2021 (edited)

> Can this possibly go into a point release? I think it would be unfortunate for this bug to remain until 3.11, even though presumably it has been around since forever.

The API deprecation must say 3.11, but the bugfix itself could be backported. We should let this one bake on the main branch for a bit first.

@gpshead added the type-bug (An unexpected behavior, bug, or error) label Sep 23, 2021
@gpshead (Member)

PS please do the CLA signing dance (see go/patching#cla internally, it's approved).

@jbms (Contributor, Author)

I already signed the CLA earlier today; it may take until tomorrow to register.

@jbms (Contributor, Author)

I'm working on adding a mechanism that extension code can use to safely avoid threads hanging, and updating the documentation. I'll update this PR shortly once that is done.

@vstinner (Member)

There are legitimate use cases for pthread_exit(). If PyThread_exit_thread() is deprecated, does it mean that developers using it should call pthread_exit(0) and _endthreadex(0) directly?

@vstinner (Member) left a comment:

I dislike this PR. When Python is embedded in an application, it means that Python leaks hung threads in some cases, whereas currently these threads exit once they attempt to acquire the GIL.

I get that there are some use cases where calling pthread_exit() is bad, but in other cases, calling pthread_exit() sounds less bad than hanging these threads.

Would it be possible to make it possible to choose the behavior, and keep the current behavior by default?

@jbms force-pushed the fix-issue-42969 branch 3 times, most recently from 099dd01 to 8aab6c0 on September 24, 2021 00:51
@vstinner (Member) left a comment:

This PR keeps accumulating problems. I'm not sure whether it will fix more problems than it introduces.

I don't understand why Python should now start handling Windows GUI messages. Why only Windows GUI messages? Do you expect that other operating systems will also need special code in "hang threads"?

```diff
@@ -1615,6 +1616,47 @@ PyGILState_Release(PyGILState_STATE oldstate)
     PyEval_SaveThread();
 }

 int
 PyThread_TryAcquireFinalizeBlock()
```
Member:

Please keep this API internal.

jbms (Contributor, Author):

I am specifically intending to add this as a public API that may be used to safely attempt to acquire the GIL without the risk of hanging the thread (or, if we continued to call pthread_exit, crashing the program).

The intended use case is as follows:

Previously, you might have had code like the following:

```c
// some non-Python created thread that wants to send Python an async notification
PyGILState_STATE state = PyGILState_Ensure();  // may kill thread with pthread_exit
// call `call_soon_threadsafe` on some event loop object
PyGILState_Release(state);
```

With this new API:

```c
// some non-Python created thread that wants to send Python an async notification
int acquired = PyThread_TryAcquireFinalizeBlock();
if (!acquired) {
    // skip sending notification since python is exiting
    return;
}
PyGILState_STATE state = PyGILState_Ensure();  // safe now
// call `call_soon_threadsafe` on some event loop object
PyGILState_Release(state);
PyThread_ReleaseFinalizeBlock();
```

Or with the convenience interface:

```c
// some non-Python created thread that wants to send Python an async notification
PyGILState_TRY_STATE state = PyGILState_TryAcquireFinalizeBlockAndGil();
if (!state) {
    // skip sending notification since python is exiting
    return;
}
// call `call_soon_threadsafe` on some event loop object
PyGILState_ReleaseGilAndFinalizeBlock(state);
```

Member:

This write-up with examples actually makes some sense to be included in your "Cautions regarding runtime finalization" documentation section introducing the new APIs.

Why do we need both APIs? Why not only offer the "convenience" API as a public API?

Also realize that extension authors and code generators will likely need conditionally compiled code based on the Python version, to use their existing PyGILState_Ensure and Release calls on <= 3.10 while calling the new convenience API on 3.11+. When introducing something new, including that in the example should help (though maybe that specific transitional example belongs in the What's New doc for 3.11).

jbms (Contributor, Author):

In some sense PyThread_TryAcquireFinalizeBlock is logically separate from acquiring the GIL. Yes, it is hard to imagine why you would want to acquire a finalize block if you don't also want to acquire the GIL, but you may well release the GIL while holding the finalize block (and in fact if the GIL were guaranteed not to be released, then a separate finalize block wouldn't be needed at all). However, that is a minor point.

Another issue is that the documentation for PyGILState_Ensure indicates it should not be used when using sub-interpreters. I don't really know much about how sub-interpreters work, but as this convenience API is based on PyGILState_Ensure, it would have the same limitation.

There is the additional issue of what to do when called from the finalization thread. Currently these APIs will fail, since they don't make an exception for the finalization thread. But since you can safely acquire the GIL on the finalization thread, in some cases you might want to skip acquiring the finalize block, but proceed with acquiring the GIL, if running on the finalization thread.

jbms (Contributor, Author):

I added this to the documentation, and also added an #if PY_VERSION_HEX snippet to the example as well.

@gpshead (Member) left a comment:

Leaving a few more review comments here. I'll follow up on the bug with larger-picture PR-existential strategy comments.


@jbms (Contributor, Author)

I have updated the documentation to document the new functions, and changed to use SleepEx as suggested by @eryksun.

I have not yet made the change suggested by @gpshead to block signals in the pthread version --- I can do that if necessary, but I am not sure what the benefit is. Signal handlers already can't safely attempt to acquire the GIL or use any Python APIs that require the GIL, and therefore I would assume there is no harm in letting them run.

Other than that, what are the next steps for this pull request?

@github-actions (bot)

This PR is stale because it has been open for 30 days with no activity.

@github-actions bot added the stale (Stale PR or inactive for long period of time) label Oct 31, 2021
@Skylion007

ping @eryksun @vstinner @gpshead

@gpshead (Member) left a comment:

from your docs:

> To avoid non-Python threads becoming blocked, or Python-created threads becoming
> blocked while executing C extension code, you can use
> :c:func:`PyThread_TryAcquireFinalizeBlock` and
> :c:func:`PyThread_ReleaseFinalizeBlock`.

I think this phrasing undersells the magnitude of the problem and who can and should "fix" it... The "you can use" wording implies that the application user who debugged a hang is capable of resolving the problem by adjusting code they own.

But I suspect the reality is that until all Python C API using code in the world is updated to use these APIs (read: never/8-years)... hangs are going to be most frequently attributed to the huge pile of third party code that everybody's application uses. (Probably including a lot within the CPython internals itself).

There isn't really a solution for this. The existing GIL APIs and thus all existing Python C API using code in the world are not structured to check for and handle failure.

So it's mostly a matter of documentation semantics. We should seek to set expectations of who and when these new APIs should explicitly be used: In C code calling back into Python from potentially non-CPython created threads.

(I'd ignore the daemon thread aspect in our advice - anything could be one of those - we ultimately want to get rid of the daemon concept entirely and separately should document it as a "do not use / fundamentally broken / unfixable" feature in the threading module)


Other than the messaging, I think this overall PR looks good and want @Yhg1s to answer whether we backport to 3.12 beta or not upon merge.

gpshead and others added 2 commits May 25, 2023 15:02
```
The finalize blocks are now protected by the main interpreter's GIL.
currently held. Once bit 0 is set to 1, the number of finalize blocks is
not allowed to increase.

Protected by `ceval.gil.mutex`, and `ceval.gil.cond` must be broadcast
```
Member:

We have a per-interpreter GIL now, but I think what needed to happen is to switch to using the main interpreter's GIL as protection for this. So that's what I've committed and pushed.

Alternatively this finalize_blocks could use a lock of its own?

Reply:

This PR seems like it should apply to each interpreter, rather than just to the whole global runtime (i.e. main interpreter).

@jbms (Contributor, Author)

> from your docs:
>
> > To avoid non-Python threads becoming blocked, or Python-created threads becoming blocked while executing C extension code, you can use :c:func:`PyThread_TryAcquireFinalizeBlock` and :c:func:`PyThread_ReleaseFinalizeBlock`.
>
> I think this phrasing undersells the magnitude of the problem and who can and should "fix" it... The "you can use" wording implies that the application user who debugged a hang is capable of resolving the problem by adjusting code they own.
>
> But I suspect the reality is that until all Python C API using code in the world is updated to use these APIs (read: never/8-years)... hangs are going to be most frequently attributed to the huge pile of third party code that everybody's application uses. (Probably including a lot within the CPython internals itself.)
>
> There isn't really a solution for this. The existing GIL APIs and thus all existing Python C API using code in the world are not structured to check for and handle failure.
>
> So it's mostly a matter of documentation semantics. We should seek to set expectations of who and when these new APIs should explicitly be used: in C code calling back into Python from potentially non-CPython created threads.
>
> (I'd ignore the daemon thread aspect in our advice - anything could be one of those - we ultimately want to get rid of the daemon concept entirely and separately should document it as a "do not use / fundamentally broken / unfixable" feature in the threading module)
>
> Other than the messaging, I think this overall PR looks good and want @Yhg1s to answer whether we backport to 3.12 beta or not upon merge.

Not sure exactly what the best change to the docs is --- but as I see it, the key message around the hangs is:

"In most cases this is harmless"

I think it is a rare case indeed where the hang will not be harmless. Aside from the use case of wanting to cleanly finalize and initialize multiple times in a program, cases where hanging is a problem would likely result in a hang or crash without this PR as well.

(gpshead reacted with 👍)

@gpshead added the needs backport to 3.12 (only security fixes) label May 26, 2023
@Yhg1s (Member) left a comment:

FWIW, I think we should get this into 3.12 (preferably before beta 2, which is scheduled next week). I think it's safe to do so because only an internal struct changes, and that struct is not used from any macros.

(gpshead reacted with 👍)
```c
    PyGILState_Release(state);
#if PY_VERSION_HEX >= 0x030c0000 // API added in Python 3.12
    PyThread_ReleaseFinalizeBlock();
#endif // PY_VERSION_HEX
```
Member:

I'm not sure I understand the point about PyGILState_Release() being confusing. I think the main question is whether there's a reasonable use-case for the finalize block outside of acquiring the GIL. For the use-case of making sure it's safe to acquire the GIL (and preventing finalization from starting while the GILState is held), I think the PyGILState API makes more sense. At that point it isn't exposed as a separate lock but as a new error condition (and a new safety guarantee when calling TryEnsure).

Framing it as "PyGILState_TryEnsure() is like PyGILState_Ensure() except it also prevents finalization from starting while the GILState is held" makes sense to me. I don't think most users of the C API care if their threads are blocked forever when finalizing, so they can just keep using PyGILState_Ensure. I don't think that's more confusing.

@jbms (Contributor, Author)

> I'm not sure I understand the point about PyGILState_Release() being confusing. I think the main question is whether there's a reasonable use-case for the finalize block outside of acquiring the GIL. For the use-case of making sure it's safe to acquire the GIL (and preventing finalization from starting while the GILState is held), I think the PyGILState API makes more sense. At that point it isn't exposed as a separate lock but as a new error condition (and a new safety guarantee when calling TryEnsure).
>
> Framing it as "PyGILState_TryEnsure() is like PyGILState_Ensure() except it also prevents finalization from starting while the GILState is held" makes sense to me. I don't think most users of the C API care if their threads are blocked forever when finalizing, so they can just keep using PyGILState_Ensure. I don't think that's more confusing.

One example where it may be useful to acquire the finalize block separately from the GIL:

  1. Acquire finalize block
  2. Acquire a lock on some internal mutex owned by third-party code; the finalize block ensures the thread doesn't hang while holding the mutex.
  3. Do various things with the Python API, which may involve repeatedly acquiring and releasing the GIL
  4. Release internal mutex
  5. Release finalize block

In any case having a single enum and PyGILState_Release seems reasonable. I don't think TryEnsure is a great name to replace PyGILState_AcquireFinalizeBlockAndGIL because it doesn't clearly indicate that a finalize block is being acquired, and users should not acquire finalize blocks casually (e.g. while doing any sort of blocking operation), as that may unexpectedly prevent Python from exiting.

@ericsnowcurrently (Member) left a comment:

There are a few things I haven't looked through (e.g. the details of how the "blocks" are implemented). Hopefully this is enough review to get things rolling though.

```diff
@@ -1469,6 +1586,32 @@ All of the following functions must be called after :c:func:`Py_Initialize`.
    :c:func:`PyEval_SaveThread` or :c:func:`PyEval_ReleaseThread`
    instead.

 .. c:function:: int PyThread_AcquireFinalizeBlock()
```


AFAICT, this API is interpreter-specific, rather than tied to a thread or the global runtime. So Py_AcquireFinalizeBlock() might be more consistent with similar existing API (e.g. Py_NewInterpreterFromConfig(), Py_EndInterpreter()).

That said, the proposed function relies on knowing the interpreter associated with the current thread (e.g. PyInterpreterState_Get()). I'd say we're trending away from that approach generally, and, ideally, we would not introduce new C-API that relies on that implicit knowledge. Instead, it may make more sense to add a variant: PyInterpreterState_AcquireFinalizeBlock().

The caller would explicitly provide the interpreter that should be blocked:

```c
PyInterpreterState *interp = PyInterpreterState_Get();
if (!PyInterpreterState_AcquireFinalizeBlock(interp)) {
    end_early_because_we_are_finalizing_unexpectedly();
    return;
}
do_the_normal_work();
PyInterpreterState_ReleaseFinalizeBlock(interp);
return;
```


Given what I've said about "finalize block", consider alternate names:

  • PyInterpreterState_PreventFini() (and PyInterpreterState_AllowFini())
    • similarly, Py_PreventInterpreterFini() (and Py_AllowInterpreterFini())
  • PyInterpreterState_BlockFini() (and PyInterpreterState_ReleaseFini())

jbms (Contributor, Author):

While it is fairly straightforward to make these finalize blocks interpreter-specific, it is not clear to me, with my limited understanding of sub-interpreters, whether that is actually useful.

  • It isn't clear to me how Py_FinalizeEx interacts with multiple interpreters. It only seems to finalize the current interpreter.

  • The documentation for Py_EndInterpreter states that the interpreter must "have no other threads". In fact it only does this check after calling the AtExit functions for the interpreter, so it seems it would be sufficient to ensure that all other thread states are destroyed before the AtExit functions finish. But there is also the question of what happens if we try to create a thread state while Py_EndInterpreter is still in progress. Py_EndInterpreter doesn't seem to check for other thread states while holding the HEAD_LOCK, but that is not an issue as long as the check does not fail.

  • In general, given the "no other threads" constraint for Py_EndInterpreter, it seems that if other non-Python-created or daemon threads hold references to the PyInterpreterState, then some external synchronization mechanism will be needed to ensure that they don't attempt to access the PyInterpreterState once the "no other threads" check completes.

As an example, suppose we have a C extension that provides a Python API that allows Python callbacks to be passed in, and then later calls those Python functions on its own non-Python-created thread pool. If this extension is to support sub-interpreters, then either during multi-phase module initialization, or when it receives the Python callback, it must record the PyInterpreterState associated with the callback. Then, in order to invoke the callback on a thread from its thread pool, it must obtain a PyThreadState for the (thread, interpreter) combination, creating one if one does not already exist. To ensure the PyInterpreterState pointers that it holds remain valid, it would need to register an AtExit function for the interpreter that ensures the PyInterpreterState won't be used. This AtExit function would likely need to essentially implement its own version of the "finalize block" mechanism introduced here.

Given the need for external synchronization of threads when calling Py_EndInterpreter, it seems to me that the finalize block mechanism defined by this PR is only useful for the main interpreter.

Reply:

Before I dig into responding: one key point to consider is that Python users (especially extension authors, via the C-API) only interact directly with the global runtime via a few API functions. In nearly every case they are instead interacting with the Python thread (state) or interpreter associated with the current OS thread.

Another key point is that, as of 3.12, each interpreter has its own GIL.

Finally, it is certainly possible that I've misunderstood either the problem you're trying to solve or the way you're trying to solve it or both. I'm completely willing to learn and adjust. Then again, I might be completely right too!

(Sorry for the length of this post. I genuinely want to understand and to be sure we're taking the right approach. I appreciate the work you've done and your willingness to converse.)


Now, on to responses:

> It isn't clear to me how Py_FinalizeEx interacts with multiple interpreters. It only seems to finalize the current interpreter.

Py_FinalizeEx() finalizes the main interpreter and the global runtime. At the point it is called, no other interpreters should exist.

Py_EndInterpreter() finalizes any other interpreter, almost entirely in the same way we finalize the main interpreter.

> The documentation for Py_EndInterpreter states that the interpreter must "have no other threads".

Please clarify. The only thing I see is: "All thread states associated with this interpreter are destroyed."

The behavior should be the same for all interpreters, whether via Py_EndInterpreter() or the main interpreter via Py_FinalizeEx():

  1. check the current thread: it must hold the GIL and must not still be running in the eval loop (ensures no reentrancy)
  2. internally mark the interpreter as finalizing
  3. wait for all non-daemon threads to finish
  4. call all of the interpreter's atexit callbacks, if any
  5. externally mark the interpreter as finalizing (this causes daemon threads to die, or, now, pause)
  6. finalize imports
  7. clear the interpreter state (including all its remaining thread states)
  8. delete the interpreter state

Caveats:

Py_FinalizeEx() does a few extra things at various places, but that should not relate to interpreter lifecycle.

I do see that it calls _PyThreadState_DeleteExcept() right after step (5), which Py_EndInterpreter() does not do. However, that's unexpected and should probably be resolved.

There are a few things we don't do in either that we probably should, probably before or right after step (5), e.g. disable the import system, disallow new threads (thread states).

Also, step (3) only applies to threads created by the threading module. We might want to extend that to all other thread states (i.e. created via PyThreadState_New() or PyGILState_Ensure()).

> In fact it only does this check after calling the AtExit functions for the interpreter,

Looking at Py_EndInterpreter():

  • calls wait_for_thread_shutdown() right before _PyAtExit_Call()
  • publicly marks itself as finalizing right after _PyAtExit_Call()
  • destroys its remaining thread states at the end during finalize_interp_clear()

So I'm not sure what you mean specifically.

FWIW, Py_FinalizeEx() is exactly the same, except currently it does that last part a little earlier with _PyThreadState_DeleteExcept().

> so it seems it would be sufficient to ensure that all other thread states are destroyed before the AtExit functions finish.

Only daemon threads (and, for now, thread states created via the C-API) would still be running at that point, and only until step (5) above.

So are we talking about both the following?

  • move step (5) before step (4)
  • apply steps (3) and (5) to thread states that were not created by the threading module

Just to be clear, here are the ways thread states get created:

  • Py_Initialize*()
  • Py_NewInterpreter*()
  • PyThreadState_New()
  • PyGILState_Ensure()
  • _thread.start_new_thread() (via threading.Thread.start())

At the moment, it's mostly only with that last one that we are careful during runtime/interp finalization.

It occurs to me that this PR is mostly about addressing that: dealing with other thread states in the same way we currently do threads created via the threading module. Does that sound right?

> But there is also the question of what happens if we try to create a thread state while Py_EndInterpreter is still in progress. Py_EndInterpreter doesn't seem to check for other thread states while holding the HEAD_LOCK, but that is not an issue as long as the check does not fail.

Yeah, we should probably be more deliberate about disallowing that sort of thing during finalization.

> In general, given the "no other threads" constraint for Py_EndInterpreter, it seems that if other non-Python-created or daemon threads hold references to the PyInterpreterState, then some external synchronization mechanism will be needed to ensure that they don't attempt to access the PyInterpreterState once the "no other threads" check completes.

That's what the proposed change in the PR is, AFAICS. The API you're adding must be specific to each interpreter, not to the global runtime. The resources that the proposed change protects are per-interpreter resources, not global ones. So I would not expect there to be any additional API or synchronization mechanism other than what you've already proposed (except applied to each interpreter instead of the main interpreter). Otherwise users of multiple interpreters will still be subject to the problem you're trying to solve.

> As an example, suppose we have a C extension that provides a Python API that allows Python callbacks to be passed in, and then later calls those Python functions on its own non-Python-created thread pool. If this extension is to support sub-interpreters, then either during multi-phase module initialization, or when it receives the Python callback, it must record the PyInterpreterState associated with the callback. Then, in order to invoke the callback on a thread from its thread pool, it must obtain a PyThreadState for the (thread, interpreter) combination, creating one if one does not already exist.

That's literally what PyGILState_Ensure() is for and does. 😄

> To ensure the PyInterpreterState pointers that it holds remain valid, it would need to register an AtExit function for the interpreter that ensures the PyInterpreterState won't be used. This AtExit function would likely need to essentially implement its own version of the "finalize block" mechanism introduced here.

Why wouldn't we just exclusively use the mechanism you've proposed here? Why would each interpreter have to have an additional duplicate? Again, the resources we're trying to protect here are specific to each interpreter, not to the global runtime, no?

> Given the need for external synchronization of threads when calling Py_EndInterpreter, it seems to me that the finalize block mechanism defined by this PR is only useful for the main interpreter.

Hmm, I didn't catch what external synchronization of threads you are talking about. Sorry if I missed it or misunderstood. Please restate what you mean specifically. Thanks!

jbms (Contributor, Author):

I don't have any experience with implementing extensions that work with multiple interpreters, but I'm trying to think how that would be done safely.

Let's say this extension lets the user schedule a Python callback to be invoked at a specific time, on a global thread pool not created by Python.

With a single interpreter, the extension may either keep a cached PyThreadState per thread in the pool for the main interpreter, or create it on demand. I haven't checked exactly what happens when trying to create a new PyThreadState while Py_FinalizeEx is running, but I think there is no problem there. The core problem is that the extension could attempt to dispatch a callback just as Py_FinalizeEx is running. Using the existing finalize block mechanism in this PR, the extension can ensure that finalization does not start while a callback is being dispatched, in order to ensure threads in the thread pool won't hang trying to acquire the GIL.

With multiple interpreters, we need a separate PyThreadState per thread in the pool per interpreter, and for each callback that has been scheduled, we also need to store the associated PyInterpreterState*. However, we also need a way to know that the interpreter is exiting, and cancel any scheduled callbacks, so that we don't attempt to use a dangling PyInterpreterState pointer. If PyThreadState objects have been cached for the thread pool threads, we would also need to destroy those PyThreadState objects, to avoid violating the constraint of Py_EndInterpreter. This cancellation mechanism is what I mean by an external synchronization mechanism. Given this external synchronization mechanism, I don't think such an extension would need to use the "finalize block" mechanism. We can't use PyGILState_Ensure because that does not take a PyInterpreterState*, and even if it did, we would need a way to ensure that our PyInterpreterState* is not dangling.


I think I get it now. The "finalize block" API, as proposed, relies on the global runtime state, which is guaranteed to exist in the process, whereas a given interpreter state pointer may have already been freed.

That said, do we still have the guarantees we need relative to the global runtime state? At a certain point we will have freed some of the state the proposed API would need. I suppose if we can rely on some final already-finalized flag then we'd be okay.

As to interpreters, even if the target one has been finalized already, we can still know that. Interpreters may be looked up by ID, rather than referenced by pointer. It's an O(n) operation, of course, but I'm not sure that would be a huge obstacle. Likewise the pointer can be checked against the list of alive interpreters to check for validity. Would we need more than that?

@jbms (Contributor, Author) commented:

I think it is safe to assume that there may be numerous race conditions still remaining in the finalization logic. In particular I'd assume that calling Py_Initialize again after Py_FinalizeEx could definitely result in a lot of problems. I had put that in the category of things not to be done in production code. At least for the single-interpreter case, with a single call to Py_Initialize, I think such bugs could likely be fixed without further API changes, though.

I am unclear on how sub-interpreters should be handled. Checking if the PyInterpreterState* is valid by checking if it is in the list of alive interpreters could fail if the interpreter is freed and then another is allocated again at the same address, unless something is done to prevent that. In addition, if a given C/C++ extension only finds out afterwards that an interpreter has been destroyed, it is too late for it to free any PyObjects it has, so it would likely end up leaking memory. Therefore an atexit callback would seem to be more appropriate.

For the single-interpreter case we don't care about leaking memory because the program is about to exit anyway.

@jbms (Contributor, Author) commented:

Since there are some API questions to resolve here, one option may be to split out the change to make threads hang rather than terminate, which can go in right away, and I expect will be sufficient for almost all single-interpreter use cases. The finalize block API, or other API changes to safely support multi-threading without leaks in the presence of interpreters stopping, could then be added later without blocking the fix for the common case.

A core developer (Member) commented:

Agreed @jbms, can you make a PR splitting that part out?

@jbms (Contributor, Author) commented:

Done, see #105805

Comment on lines +40 to +49
#ifndef Py_LIMITED_API
/* Hangs the thread indefinitely without exiting it.
*
* bpo-42969: There is no safe way to exit a thread other than returning
* normally from its start function. This is used during finalization in lieu
* of actually exiting the thread. Since the program is expected to terminate
* soon anyway, it does not matter if the thread stack stays around until then.
*/
PyAPI_FUNC(void) _Py_NO_RETURN _PyThread_hang_thread(void);
#endif /* !Py_LIMITED_API */


FWIW, this should probably go in Include/cpython/pythread.h.

Also, why the leading underscore? If it's not meant for public use then put it in Include/internal/pycore_pythread.h. Otherwise either drop the leading underscore or add the "PyUnstable_" prefix. (See https://devguide.python.org/developer-workflow/c-api/#c-api, AKA PEP 689.)

@jbms (Contributor, Author) commented:

Done in #105805

@@ -1777,6 +1811,8 @@ Py_FinalizeEx(void)
_PyAtExit_Call(tstate->interp);
PyUnstable_PerfMapState_Fini();

wait_for_exit_blocks_to_be_released();


This should apply to each interpreter (see Py_EndInterpreter()), not just the main interpreter at global runtime finalization, no?

*
* This is only to be called from `Py_FinalizeEx`.
*/
static void wait_for_exit_blocks_to_be_released(void)


Suggested change:
- static void wait_for_exit_blocks_to_be_released(void)
+ static void wait_for_exit_blocks_to_be_released(PyInterpreterState *interp)

Comment on lines +1758 to +1768
/* Note: It is possible that there is another concurrent call to
`Py_FinalizeEx` in a different thread. In that case, the LSB of
`runtime->finalize_blocks` may have already been set to 1. Finalization
will still work correctly, though, because only one thread will
successfully acquire the GIL from `Py_END_ALLOW_THREADS` below. That
thread will then initiate finalization, and the other thread will then
hang in `Py_END_ALLOW_THREADS` until the process exits.

Calling `Py_FinalizeEx` recursively, e.g. by an atexit handler, is *not*
allowed and will not work correctly.
*/


It shouldn't be possible, but that's not where things are at yet. 😢

Comment on lines +309 to +347
/* Convenience interfaces that terminate the program in case of an error. */
#define MUTEX_INIT(mut) \
if (PyMUTEX_INIT(&(mut))) { \
Py_FatalError("PyMUTEX_INIT(" #mut ") failed"); };
#define MUTEX_FINI(mut) \
if (PyMUTEX_FINI(&(mut))) { \
Py_FatalError("PyMUTEX_FINI(" #mut ") failed"); };
#define MUTEX_LOCK(mut) \
if (PyMUTEX_LOCK(&(mut))) { \
Py_FatalError("PyMUTEX_LOCK(" #mut ") failed"); };
#define MUTEX_UNLOCK(mut) \
if (PyMUTEX_UNLOCK(&(mut))) { \
Py_FatalError("PyMUTEX_UNLOCK(" #mut ") failed"); };

#define COND_INIT(cond) \
if (PyCOND_INIT(&(cond))) { \
Py_FatalError("PyCOND_INIT(" #cond ") failed"); };
#define COND_FINI(cond) \
if (PyCOND_FINI(&(cond))) { \
Py_FatalError("PyCOND_FINI(" #cond ") failed"); };
#define COND_SIGNAL(cond) \
if (PyCOND_SIGNAL(&(cond))) { \
Py_FatalError("PyCOND_SIGNAL(" #cond ") failed"); };
#define COND_BROADCAST(cond) \
if (PyCOND_BROADCAST(&(cond))) { \
Py_FatalError("PyCOND_BROADCAST(" #cond ") failed"); };
#define COND_WAIT(cond, mut) \
if (PyCOND_WAIT(&(cond), &(mut))) { \
Py_FatalError("PyCOND_WAIT(" #cond ") failed"); };
#define COND_TIMED_WAIT(cond, mut, microseconds, timeout_result) \
{ \
int r = PyCOND_TIMEDWAIT(&(cond), &(mut), (microseconds)); \
if (r < 0) \
Py_FatalError("PyCOND_WAIT(" #cond ") failed"); \
if (r) /* 1 == timeout, 2 == impl. can't say, so assume timeout */ \
timeout_result = 1; \
else \
timeout_result = 0; \
} \


Is there a good reason to put this in the public header? I'd expect it to go in Include/internal/pycore_condvar.h.

@bedevere-bot

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase "I have made the requested changes; please review again". I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

Co-authored-by: Eric Snow <ericsnowcurrently@gmail.com>
@AlexWaygood changed the title from "bpo-42969: Hang non-main threads that attempt to acquire the GIL during finalization" to "gh-87135: Hang non-main threads that attempt to acquire the GIL during finalization" on Jun 9, 2023
@jbms (Contributor, Author) commented:

I have made the requested changes; please review again

This PR can now be considered "on hold" or abandoned given the API questions regarding multiple interpreters. Please see the separate PR #105805, which contains just the minimal change to hang threads without adding any new public APIs.

@bedevere-bot

Thanks for making the requested changes!

@gpshead, @ericsnowcurrently: please review the changes made to this pull request.

@ericsnowcurrently (Member) commented:

It looks like this PR was replaced by #105805, which has been merged now. Is there anything to be done here?

@jbms closed this on Oct 24, 2024
Reviewers: gst, vstinner, eryksun, Yhg1s left review comments; MaxwellDupre approved these changes; review was requested from ericsnowcurrently (code owner) and gpshead.

Assignees: gpshead, Yhg1s

Labels: awaiting change review; needs backport to 3.12 (only security fixes); needs backport to 3.13 (bugs and security fixes); stale (Stale PR or inactive for long period of time); type-bug (An unexpected behavior, bug, or error)