gh-132942: Fix races in type lookup cache #133032

Merged

nascheme merged 4 commits into python:main from nascheme:gh-132942-tp-lookup-race

Apr 28, 2025

Member

nascheme commented Apr 27, 2025 (edited)

This PR fixes two races in the type lookup cache that occur in the free-threaded build. They caused test_opcache to fail intermittently (along with other hard-to-reproduce failures).

The first problem is that `find_name_in_mro()` can block on some mutex and then release critical sections. If that happens, the type version used for the cache entry can be wrong (too new). Assigning the version before doing the find fixes this issue. If the lookup does race with a type modification, the entry that gets added uses an out-of-date version, so it will not match the type's current version.
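The effect of snapshotting the version before the lookup can be sketched with a single-threaded simulation. This is only an illustration: `toy_type`, `toy_cache_entry`, `toy_fill()` and the other `toy_*` names are hypothetical and are not CPython's structures or functions, and the concurrent type modification is simulated by a plain function call in the middle of the lookup.

```c
/* Hypothetical stand-ins for a type and a cache entry (not CPython's structs). */
typedef struct { unsigned version; int attr; } toy_type;
typedef struct { unsigned version; int value; } toy_cache_entry;

/* Simulates another thread changing the type while the lookup is blocked. */
static void toy_concurrent_modify(toy_type *t)
{
    t->attr += 1;
    t->version += 1;
}

/* Builds a cache entry. snapshot_first != 0 reads the version before the
 * lookup (the fixed order); 0 reads it afterwards (the racy order). */
static toy_cache_entry toy_fill(toy_type *t, int snapshot_first)
{
    toy_cache_entry e;
    unsigned before = t->version;   /* version snapshot taken up front */
    int found = t->attr;            /* stands in for the MRO search */
    toy_concurrent_modify(t);       /* simulated race during the search */
    e.version = snapshot_first ? before : t->version;
    e.value = found;
    return e;
}

/* An entry is wrong if it claims to be current but holds an outdated result. */
static int toy_entry_is_wrong(const toy_type *t, toy_cache_entry e)
{
    return e.version == t->version && e.value != t->attr;
}
```

In this model, reading the version after the lookup pairs the pre-modification result with the post-modification version, so a later lookup would trust it. Reading the version first yields an entry that is merely stale and never matches.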

The second problem was much harder to track down. There is a hard-to-trigger race between `update_cache()`, which writes to the cache, and `_PyType_LookupStackRefAndVersion()`, which reads from it. We use a sequence lock to avoid races. However, if the reader sees the old entry value together with the new entry version, it will call `_Py_TryXGetStackRef()` on a stale cache entry value. If that value has been deallocated, `PyStackRef_XCLOSE()` will crash. This was possible before the fix because the version was written first and the new value second.

The fix is simply to write the entry value first and the version after. That way, the reader always sees a value at least as new as the version.
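As a rough sketch of that write order, here is a simplified entry using C11 atomics with a release store on the version and an acquire load in the reader. The names `toy_entry`, `toy_update()` and `toy_lookup()` are assumptions made for illustration. The real cache also uses a sequence lock and reference counting, both omitted here, and CPython uses its own atomic wrappers rather than `<stdatomic.h>`.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified cache entry (not CPython's real layout). */
typedef struct {
    _Atomic(uint32_t) version;  /* type version the value belongs to */
    _Atomic(void *) value;      /* cached lookup result */
} toy_entry;

/* Writer: store the value first, then publish the version with release
 * ordering, so the value store cannot be reordered after the version store. */
static void toy_update(toy_entry *e, uint32_t new_version, void *new_value)
{
    atomic_store_explicit(&e->value, new_value, memory_order_relaxed);
    atomic_store_explicit(&e->version, new_version, memory_order_release);
}

/* Reader: load the version with acquire ordering. If it matches, the value
 * loaded afterwards is at least as new as that version. */
static void *toy_lookup(toy_entry *e, uint32_t want_version)
{
    uint32_t seen = atomic_load_explicit(&e->version, memory_order_acquire);
    if (seen != want_version) {
        return NULL;  /* cache miss: entry belongs to another version */
    }
    return atomic_load_explicit(&e->value, memory_order_relaxed);
}
```

The release/acquire pairing is what turns "written first" in program order into "visible first" to another thread. With relaxed stores alone, the compiler or hardware could still reorder the two writes.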

Possible scenarios for the reader of the cache entry, as it is being written to concurrently:

| Entry version | Entry value | Outcome |
|---|---|---|
| old | old | Okay, type version will not match |
| old | new | Okay, incref/decref works, seq check fails |
| new | old | Bad, incref/decref on old value might crash |
| new | new | Okay, incref/decref works, seq check fails |
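The scenarios above can be cross-checked with a small model that enumerates where a reader's two loads can fall relative to the writer's two stores. It assumes the reader loads the version before the value and that those loads are not reordered. `toy_bad_pair_reachable()` is a hypothetical helper written for this illustration, not part of CPython.

```c
#include <stdbool.h>

/* Writer progress: 0 = nothing written yet, 1 = first field written,
 * 2 = both fields written.  The reader loads the version at time t1 and
 * the value at time t2, with t2 >= t1.
 *
 * Returns true if some interleaving lets the reader observe the
 * "new version, old value" combination. */
static bool toy_bad_pair_reachable(bool value_written_first)
{
    for (int t1 = 0; t1 <= 2; t1++) {
        for (int t2 = t1; t2 <= 2; t2++) {
            /* The field written first is new from step 1 onward;
             * the field written second is new only at step 2. */
            bool version_new = value_written_first ? (t1 >= 2) : (t1 >= 1);
            bool value_new = value_written_first ? (t2 >= 1) : (t2 >= 2);
            if (version_new && !value_new) {
                return true;
            }
        }
    }
    return false;
}
```

Under these assumptions, the dangerous combination is reachable when the version is written first and unreachable when the value is written first.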

Issue: test_opcache fails randomly (failure or crash) #132942

pythongh-132942: Fix races in type lookup cache

90a7d35

Two races related to the type lookup cache, when used in the free-threaded build. This caused test_opcache to sometimes fail (as well as other hard to reproduce failures).

nascheme added the type-crash and topic-free-threading labels

Apr 27, 2025

bedevere-app bot mentioned this pull request

Apr 27, 2025

test_opcache fails randomly (failure or crash) #132942

Open

nascheme added the skip news label

Apr 27, 2025

Add NEWS.

5cd03e8

nascheme requested a review from colesbury

April 27, 2025 00:52

nascheme marked this pull request as ready for review

April 27, 2025 01:38

nascheme requested a review from markshannon as a code owner

April 27, 2025 01:38

bedevere-app bot added the awaiting core review label

Apr 27, 2025

nascheme removed the skip news label

Apr 27, 2025

colesbury reviewed

Apr 28, 2025

Objects/typeobject.c (outdated, resolved)

Member, Author

nascheme commented Apr 28, 2025

Here is a script that triggers the crash. It can take a while, especially if running under "rr".

crash_mro_lookup.py.txt

nascheme added 2 commits

April 28, 2025 11:13

Use release/acquire pair for entry->version.

6d30841

Merge 'origin/main' into pythongh-132942-tp-lookup-race

3e01796

colesbury approved these changes

Apr 28, 2025

Contributor

colesbury left a comment

LGTM

bedevere-app bot added the awaiting merge label and removed the awaiting core review label

Apr 28, 2025

nascheme merged commit 31d1342 into python:main

Apr 28, 2025

46 checks passed

bedevere-app bot removed the awaiting merge label

Apr 28, 2025

nascheme added the needs backport to 3.13 label

Apr 28, 2025

miss-islington-app bot commented Apr 28, 2025

Thanks @nascheme for the PR 🌮🎉. I'm working now to backport this PR to: 3.13.
🐍🍒⛏🤖

miss-islington-app bot commented Apr 28, 2025

Sorry, @nascheme, I could not cleanly backport this to 3.13 due to a conflict.
Please backport using cherry_picker on the command line.

cherry_picker 31d1342de9489f95384dbc748130c2ae6f092e84 3.13

miss-islington-app bot assigned nascheme

Apr 28, 2025

hugovk removed the needs backport to 3.13 label

May 22, 2025

Labels: topic-free-threading, type-crash

3 participants
