Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-114271: Fix race inThread.join()#114839

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
pitrou merged 34 commits intopython:mainfrommpage:gh-114271-remove-tstate_lock
Mar 16, 2024

Conversation

@mpage
Copy link
Contributor

@mpagempage commentedFeb 1, 2024
edited
Loading

This PR makes thread joining thread-safe in free-threaded builds andfixesgh-116372, which affects both default and free-threaded builds.

The combination of_tstate_lock and_ThreadHandle is replaced by_ThreadHandle and the supporting infrastructure for_tstate_lock is removed. Joining a thread now joins its_ThreadHandle. The_ThreadHandles for all threads that we want to join on shutdown (non-daemon threads that were created by the threading module) are tracked in a linked-list attached to the_thread module. At shutdown we iterate over this list and join the handles.

There is one wrinkle with this approach: the_MainThread instance. The_MainThread instance is joinable. Previously, it represented whatever thread imported thethreading module. Typically this is the main thread of the main interpreter, but in some rare circumstances (embedding scenarios?) it might not be. The previous approach allowed theMainThread instance to be joined even if it wasn't the main thread of the main interpreter, but this isn't possible with the current approach. To work around this we change the_MainThread to always represent the main thread of the runtime, also fixinggh-83223.

stonebig reacted with heart emoji
There is a race between when `Thread._tstate_lock` is released[^1] in `Thread._wait_for_tstate_lock()`and when `Thread._stop()` asserts[^2] that it is unlocked. Consider the following executioninvolving threads A, B, and C:1. A starts.2. B joins A, blocking on its `_tstate_lock`.3. C joins A, blocking on its `_tstate_lock`.4. A finishes and releases its `_tstate_lock`.5. B acquires A's `_tstate_lock` in `_wait_for_tstate_lock()`, releases it, but is swapped   out before calling `_stop()`.6. C is scheduled, acquires A's `_tstate_lock` in `_wait_for_tstate_lock()` but is swapped   out before releasing it.7. B is scheduled, calls `_stop()`, which asserts that A's `_tstate_lock` is not held.   However, C holds it, so the assertion fails.The race can be reproduced[^3] by inserting sleeps at the appropriate points inthe threading code. To do so, run the `repro_join_race.py` from the linked repo.There are two main parts to this PR:1. `_tstate_lock` is replaced with an event that is attached to `PyThreadState`.   The event is set by the runtime prior to the thread being cleared (in the same   place that `_tstate_lock` was released). `Thread.join()` blocks waiting for the   event to be set.2. `_PyInterpreterState_WaitForThreads()` provides the ability to wait for all   non-daemon threads to exit. To do so, an `is_daemon` predicate was added to   `PyThreadState`. This field is set each time a thread is created. `threading._shutdown()`   now calls into `_PyInterpreterState_WaitForThreads()` instead of waiting on   `_tstate_lock`s.[^1]:https://github.com/python/cpython/blob/441affc9e7f419ef0b68f734505fa2f79fe653c7/Lib/threading.py#L1201[^2]:https://github.com/python/cpython/blob/441affc9e7f419ef0b68f734505fa2f79fe653c7/Lib/threading.py#L1115[^3]:8194653
@mpage
Copy link
ContributorAuthor

Just a heads up - please don't look at this until it's marked as ready for review. There are a couple of failing CI jobs that need to be addressed.

When we have `typedef struct _PyEventRc _PyEventRc` in `pstate.h`the C analyzer can't decide between that and the definition in `pycore_lock.h`.
@mpagempage marked this pull request as ready for reviewFebruary 5, 2024 22:49
@mpage
Copy link
ContributorAuthor

@colesbury - This is good to go. Would you please have a look?

@mpagempage marked this pull request as draftFebruary 6, 2024 19:52
@mpage
Copy link
ContributorAuthor

Putting this back in draft. Going to see if I can move the join logic intoThreadHandle.

@mpagempage marked this pull request as ready for reviewMarch 5, 2024 20:30
@mpage
Copy link
ContributorAuthor

@colesbury - Could you take a look at this when you get a chance, please?

colesbury reacted with thumbs up emoji

Copy link
Contributor

@colesburycolesbury left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Overall, the approach looks nice!

I'm still looking this over, but I have some questions aboutthread._handle. Currently,thread._handle starts out asNone and is then set to aThreadHandle after the thread is started. ThisNone ->ThreadHandle transition seems like it adds complexity and I think there may be a few bugs related to it (see comments below).

Would it be possible to always have a non-Noneself._handle by pushing a bit more state intoThreadHandle?

@ericsnowcurrently
Copy link
Member

I'd be glad to follow-through on the changes I posted ingh-116776, if that would help you.

@mpage
Copy link
ContributorAuthor

I'd be glad to follow-through on the changes I posted in#116776, if that would help you.

@ericsnowcurrently - Thanks! Is that something you think you can get done quickly (i.e. in the next few days)? If so, I'd be happy to stack on top of those changes once you're done.

@ericsnowcurrently
Copy link
Member

Yeah, I'll aim to wrap it up tomorrow.

mpage reacted with heart emoji

Copy link
Member

@pitroupitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

+1. I really like the simplification here!

mpage reacted with heart emoji
@ericsnowcurrently
Copy link
Member

I'll defer to@pitrou's assessment on this. I'd rather this land sooner than later and don't want my pace to hold things up. Again, thanks for working on this.

(I'll follow up separately on any cleanup, about the "main" thread, and about our approach to managing thread lifetimes. I actually appreciate not feeling so rushed to sort those out. 😄)

mpage reacted with thumbs up emoji

@ericsnowcurrently
Copy link
Member

FWIW, this change feels a bit closer to what I had in mind last year:#110848 (comment).

@mpage
Copy link
ContributorAuthor

mpage commentedMar 15, 2024
edited
Loading

TheCIFuzz (memory) failure is unrelated to this PR.

pitrou reacted with thumbs up emoji

@pitrou
Copy link
Member

pitrou commentedMar 15, 2024
edited
Loading

Do we have a Thread Sanitizer CI build somewhere? In my experience it's quite good at finding synchronization bugs.
@gpshead

Edit: oh, I see this is being proposed as#116872. Thank you@corona10 !

corona10 reacted with heart emoji

@pitrou
Copy link
Member

@gpshead Would you like to take a look at this?

@pitrou
Copy link
Member

Ok, TSAN tests passed after merging from main (though there are too few of them currently).

@pitrou
Copy link
Member

Updated again to get test_threading in the set of TSAN tests.

@pitroupitrou merged commit33da0e8 intopython:mainMar 16, 2024
vstinner pushed a commit to vstinner/cpython that referenced this pull requestMar 20, 2024
There is a race between when `Thread._tstate_lock` is released[^1] in `Thread._wait_for_tstate_lock()`and when `Thread._stop()` asserts[^2] that it is unlocked. Consider the following executioninvolving threads A, B, and C:1. A starts.2. B joins A, blocking on its `_tstate_lock`.3. C joins A, blocking on its `_tstate_lock`.4. A finishes and releases its `_tstate_lock`.5. B acquires A's `_tstate_lock` in `_wait_for_tstate_lock()`, releases it, but is swapped   out before calling `_stop()`.6. C is scheduled, acquires A's `_tstate_lock` in `_wait_for_tstate_lock()` but is swapped   out before releasing it.7. B is scheduled, calls `_stop()`, which asserts that A's `_tstate_lock` is not held.   However, C holds it, so the assertion fails.The race can be reproduced[^3] by inserting sleeps at the appropriate points inthe threading code. To do so, run the `repro_join_race.py` from the linked repo.There are two main parts to this PR:1. `_tstate_lock` is replaced with an event that is attached to `PyThreadState`.   The event is set by the runtime prior to the thread being cleared (in the same   place that `_tstate_lock` was released). `Thread.join()` blocks waiting for the   event to be set.2. `_PyInterpreterState_WaitForThreads()` provides the ability to wait for all   non-daemon threads to exit. To do so, an `is_daemon` predicate was added to   `PyThreadState`. This field is set each time a thread is created. `threading._shutdown()`   now calls into `_PyInterpreterState_WaitForThreads()` instead of waiting on   `_tstate_lock`s.[^1]:https://github.com/python/cpython/blob/441affc9e7f419ef0b68f734505fa2f79fe653c7/Lib/threading.py#L1201[^2]:https://github.com/python/cpython/blob/441affc9e7f419ef0b68f734505fa2f79fe653c7/Lib/threading.py#L1115[^3]:mpage@8194653---------Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>Co-authored-by: Antoine Pitrou <antoine@python.org>
adorilson pushed a commit to adorilson/cpython that referenced this pull requestMar 25, 2024
There is a race between when `Thread._tstate_lock` is released[^1] in `Thread._wait_for_tstate_lock()`and when `Thread._stop()` asserts[^2] that it is unlocked. Consider the following executioninvolving threads A, B, and C:1. A starts.2. B joins A, blocking on its `_tstate_lock`.3. C joins A, blocking on its `_tstate_lock`.4. A finishes and releases its `_tstate_lock`.5. B acquires A's `_tstate_lock` in `_wait_for_tstate_lock()`, releases it, but is swapped   out before calling `_stop()`.6. C is scheduled, acquires A's `_tstate_lock` in `_wait_for_tstate_lock()` but is swapped   out before releasing it.7. B is scheduled, calls `_stop()`, which asserts that A's `_tstate_lock` is not held.   However, C holds it, so the assertion fails.The race can be reproduced[^3] by inserting sleeps at the appropriate points inthe threading code. To do so, run the `repro_join_race.py` from the linked repo.There are two main parts to this PR:1. `_tstate_lock` is replaced with an event that is attached to `PyThreadState`.   The event is set by the runtime prior to the thread being cleared (in the same   place that `_tstate_lock` was released). `Thread.join()` blocks waiting for the   event to be set.2. `_PyInterpreterState_WaitForThreads()` provides the ability to wait for all   non-daemon threads to exit. To do so, an `is_daemon` predicate was added to   `PyThreadState`. This field is set each time a thread is created. `threading._shutdown()`   now calls into `_PyInterpreterState_WaitForThreads()` instead of waiting on   `_tstate_lock`s.[^1]:https://github.com/python/cpython/blob/441affc9e7f419ef0b68f734505fa2f79fe653c7/Lib/threading.py#L1201[^2]:https://github.com/python/cpython/blob/441affc9e7f419ef0b68f734505fa2f79fe653c7/Lib/threading.py#L1115[^3]:mpage@8194653---------Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>Co-authored-by: Antoine Pitrou <antoine@python.org>
diegorusso pushed a commit to diegorusso/cpython that referenced this pull requestApr 17, 2024
There is a race between when `Thread._tstate_lock` is released[^1] in `Thread._wait_for_tstate_lock()`and when `Thread._stop()` asserts[^2] that it is unlocked. Consider the following executioninvolving threads A, B, and C:1. A starts.2. B joins A, blocking on its `_tstate_lock`.3. C joins A, blocking on its `_tstate_lock`.4. A finishes and releases its `_tstate_lock`.5. B acquires A's `_tstate_lock` in `_wait_for_tstate_lock()`, releases it, but is swapped   out before calling `_stop()`.6. C is scheduled, acquires A's `_tstate_lock` in `_wait_for_tstate_lock()` but is swapped   out before releasing it.7. B is scheduled, calls `_stop()`, which asserts that A's `_tstate_lock` is not held.   However, C holds it, so the assertion fails.The race can be reproduced[^3] by inserting sleeps at the appropriate points inthe threading code. To do so, run the `repro_join_race.py` from the linked repo.There are two main parts to this PR:1. `_tstate_lock` is replaced with an event that is attached to `PyThreadState`.   The event is set by the runtime prior to the thread being cleared (in the same   place that `_tstate_lock` was released). `Thread.join()` blocks waiting for the   event to be set.2. `_PyInterpreterState_WaitForThreads()` provides the ability to wait for all   non-daemon threads to exit. To do so, an `is_daemon` predicate was added to   `PyThreadState`. This field is set each time a thread is created. `threading._shutdown()`   now calls into `_PyInterpreterState_WaitForThreads()` instead of waiting on   `_tstate_lock`s.[^1]:https://github.com/python/cpython/blob/441affc9e7f419ef0b68f734505fa2f79fe653c7/Lib/threading.py#L1201[^2]:https://github.com/python/cpython/blob/441affc9e7f419ef0b68f734505fa2f79fe653c7/Lib/threading.py#L1115[^3]:mpage@8194653---------Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>Co-authored-by: Antoine Pitrou <antoine@python.org>
intellij-monorepo-bot pushed a commit to JetBrains/intellij-community that referenced this pull requestJun 21, 2024
…hon 3.13`Thread._is_stopped` was removed inpython/cpython#114839 in Python 3.13(cherry picked from commit d568bc288aed00268ffeef137b9b901f480964ef)IJ-MR-137093GitOrigin-RevId: aa8b27b09db85f69b44d6fbbd333f8ec32489d33
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@colesburycolesburycolesbury approved these changes

@ericsnowcurrentlyericsnowcurrentlyericsnowcurrently requested changes

@pitroupitroupitrou approved these changes

@gpsheadgpsheadAwaiting requested review from gpshead

Assignees

No one assigned

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

Race inThread.join()

5 participants

@mpage@ericsnowcurrently@pitrou@colesbury@Eclips4

[8]ページ先頭

©2009-2025 Movatter.jp