Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-132641: fix race inlru_cache under free-threading#133787

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
kumaraditya303 merged 1 commit intopython:mainfromhawkinsp:lrucache
May 13, 2025

Conversation

hawkinsp
Copy link
Contributor

@hawkinsphawkinsp commentedMay 9, 2025
edited by picnixz
Loading

The bounded_lru_cache code was using a critical section on the lru cache object to protect dictionary accesses in some code paths, but using the critical section on the dictionary itself to protect accesses in other code paths. This led to races since not all threads agreed on which lock they needed to be holding.

Instead, always use a critical section on the underlying dictionary, rather than the lru cache object itself.

Fixes#132641

I do not have a small Python reproducer for this problem, but a test in the JAX test suite fails under 3.14 with a TSAN warning 27 out of 50 runs without this fix and 0 out of 50 runs with it.

@hawkinsp
Copy link
ContributorAuthor

@colesbury

@serhiy-storchaka
Copy link
Member

@tom-pytel

@picnixzpicnixz changed the titlegh-132641: Fix race in lru_cache due to inconsistent critical section…gh-132641: Fix race inlru_cacheMay 10, 2025
@picnixzpicnixz changed the titlegh-132641: Fix race inlru_cachegh-132641: fix race inlru_cache under free-threadingMay 10, 2025
Copy link
Member

@serhiy-storchakaserhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

LGTM.

@tom-pytel
Copy link
Contributor

tom-pytel commentedMay 10, 2025
edited
Loading

One of the LRU functions is chosen at LRU cache creation,infinite_lru_cache_wrapper,uncached_lru_cache_wrapper orbounded_lru_cache_wrapper. The uncached doesn't use critical section, theinfinite uses the dictionary critical section and thebounded uses a critical section on the cache object itself and as far as I can see it is never changed. Point being there shouldn't be confusion about lock as those functions shouldn't mix.

I do notice something though:

if (_PyDict_SetItem_KnownHash(self->cache,key, (PyObject*)link,

Maybe it should be_PyDict_SetItem_KnownHash_LockHeld, otherwise I think it might try to get the lock on the dict, removing the lock on the cache object temporarily and allowing race. Try that change and see what you get (the other one further down as well).

This would also explain why taking the lock on the dictionary object fixes it (which BTW is fine to do anyway, but maybe confirm source of original error).

@serhiy-storchaka
Copy link
Member

Ifbounded uses any non-_LockHeldPyDict API, there may be an issue.

@tom-pytel
Copy link
Contributor

tom-pytel commentedMay 10, 2025
edited
Loading

Ifbounded uses any non-_LockHeldPyDict API, there may be an issue.

#131758 (comment)

Holding the lock on the dict makes sense in general and should use the_PyDict_SetItem_KnownHash_LockHeld as well.

@hawkinsp
Copy link
ContributorAuthor

Yes, that was my point in#132641 (comment)
It's just simpler to always use the critical section on the dict, I think. You can then safely use the_LockHeld functions if you like, but it is not mandatory, since after this change the dictionary critical section will always be held when reading or writing the dictionary.

tom-pytel reacted with thumbs up emoji

Copy link
Contributor

@kumaraditya303kumaraditya303 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This isn't correct, functions likelru_cache_append_link modify thelru_cache_object which isn't protected if critical section is only acquired on the underlying dict.

@bedevere-app
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phraseI have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@hawkinsp
Copy link
ContributorAuthor

This isn't correct, functions likelru_cache_append_link modify thelru_cache_object which isn't protected if critical section is only acquired on the underlying dict.

I don't understand or agree with this comment.

It does not matter onwhich object we hold a critical section, so long as we consistently hold thesame critical section whenever we are reading or writing fields of the object. After this PR, we arealways using the dict's critical section to protect the lru_cache object as well. The critical section on the lru_cache object itself would never be used.

@tom-pytel
Copy link
Contributor

It does not matter onwhich object we hold a critical section, so long as we consistently hold thesame critical section whenever we are reading or writing fields of the object.

Correct, the lru_cache fields wind up being protected by the dictionary lock. The only thing I would ask is to change the_PyDict_SetItem_KnownHash to_PyDict_SetItem_KnownHash_LockHeld for consistency, even if its not needed.

@kumaraditya303
Copy link
Contributor

The following code triggers TSAN warnings on this PR:

fromfunctoolsimportlru_cachefromthreadingimportThread,Barrierfromrandomimportrandintb=Barrier(100)@lru_cache(maxsize=128)deffunc(arg=0):func.cache_clear()returnobject()defthread_func():b.wait()foriinrange(10000):func(randint(0,10000))threads= []foriinrange(100):t=Thread(target=thread_func)threads.append(t)t.start()fortinthreads:t.join()

TSAN race:

==================WARNING: ThreadSanitizer: data race (pid=26580)  Write of size 8 at 0x7f250e4f7be8 by thread T5:    #0 lru_cache_unlink_list /home/realkumaraditya/cpython/./Modules/_functoolsmodule.c:1589:16 (python+0x5bc97a) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #1 _functools__lru_cache_wrapper_cache_clear_impl /home/realkumaraditya/cpython/./Modules/_functoolsmodule.c:1689:27 (python+0x5bc97a)    #2 _functools__lru_cache_wrapper_cache_clear /home/realkumaraditya/cpython/./Modules/clinic/_functoolsmodule.c.h:190:20 (python+0x5bc97a)    #3 method_vectorcall_NOARGS /home/realkumaraditya/cpython/Objects/descrobject.c:448:24 (python+0x20ef99) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #4 _PyObject_VectorcallTstate /home/realkumaraditya/cpython/./Include/internal/pycore_call.h:169:11 (python+0x1f61ba) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #5 PyObject_Vectorcall /home/realkumaraditya/cpython/Objects/call.c:327:12 (python+0x1f61ba)    #6 _PyEval_EvalFrameDefault /home/realkumaraditya/cpython/Python/generated_cases.c.h:1619:35 (python+0x40429e) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #7 _PyEval_EvalFrame /home/realkumaraditya/cpython/./Include/internal/pycore_ceval.h:119:16 (python+0x3ff170) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #8 _PyEval_Vector /home/realkumaraditya/cpython/Python/ceval.c:1964:12 (python+0x3ff170)    #9 _PyFunction_Vectorcall /home/realkumaraditya/cpython/Objects/call.c (python+0x1f681f) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #10 _PyVectorcall_Call /home/realkumaraditya/cpython/Objects/call.c:273:16 (python+0x1f64af) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #11 _PyObject_Call /home/realkumaraditya/cpython/Objects/call.c:348:16 (python+0x1f64af)    #12 PyObject_Call /home/realkumaraditya/cpython/Objects/call.c:373:12 (python+0x1f6515) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #13 bounded_lru_cache_wrapper /home/realkumaraditya/cpython/./Modules/_functoolsmodule.c:1493:14 (python+0x5bd50e) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #14 lru_cache_call /home/realkumaraditya/cpython/./Modules/_functoolsmodule.c:1638:14 (python+0x5bbbad) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #15 _PyObject_MakeTpCall /home/realkumaraditya/cpython/Objects/call.c:242:18 (python+0x1f5658) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #16 _PyObject_VectorcallTstate /home/realkumaraditya/cpython/./Include/internal/pycore_call.h:167:16 (python+0x1f6278) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #17 PyObject_Vectorcall /home/realkumaraditya/cpython/Objects/call.c:327:12 (python+0x1f6278)    #18 _PyEval_EvalFrameDefault /home/realkumaraditya/cpython/Python/generated_cases.c.h:1619:35 (python+0x40429e) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #19 _PyEval_EvalFrame /home/realkumaraditya/cpython/./Include/internal/pycore_ceval.h:119:16 (python+0x3ff170) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #20 _PyEval_Vector /home/realkumaraditya/cpython/Python/ceval.c:1964:12 (python+0x3ff170)    #21 _PyFunction_Vectorcall /home/realkumaraditya/cpython/Objects/call.c (python+0x1f681f) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #22 _PyObject_VectorcallTstate /home/realkumaraditya/cpython/./Include/internal/pycore_call.h:169:11 (python+0x1fb13f) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #23 method_vectorcall /home/realkumaraditya/cpython/Objects/classobject.c:72:20 (python+0x1fb13f)    #24 _PyObject_VectorcallTstate /home/realkumaraditya/cpython/./Include/internal/pycore_call.h:169:11 (python+0x4584e7) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #25 context_run /home/realkumaraditya/cpython/Python/context.c:728:29 (python+0x4584e7)    #26 method_vectorcall_FASTCALL_KEYWORDS /home/realkumaraditya/cpython/Objects/descrobject.c:421:24 (python+0x20ee99) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #27 _PyObject_VectorcallTstate /home/realkumaraditya/cpython/./Include/internal/pycore_call.h:169:11 (python+0x1f61ba) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #28 PyObject_Vectorcall /home/realkumaraditya/cpython/Objects/call.c:327:12 (python+0x1f61ba)    #29 _PyEval_EvalFrameDefault /home/realkumaraditya/cpython/Python/generated_cases.c.h:1619:35 (python+0x40429e) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #30 _PyEval_EvalFrame /home/realkumaraditya/cpython/./Include/internal/pycore_ceval.h:119:16 (python+0x3ff170) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #31 _PyEval_Vector /home/realkumaraditya/cpython/Python/ceval.c:1964:12 (python+0x3ff170)    #32 _PyFunction_Vectorcall /home/realkumaraditya/cpython/Objects/call.c (python+0x1f681f) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #33 _PyObject_VectorcallTstate /home/realkumaraditya/cpython/./Include/internal/pycore_call.h:169:11 (python+0x1fb13f) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #34 method_vectorcall /home/realkumaraditya/cpython/Objects/classobject.c:72:20 (python+0x1fb13f)    #35 _PyVectorcall_Call /home/realkumaraditya/cpython/Objects/call.c:273:16 (python+0x1f64af) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #36 _PyObject_Call /home/realkumaraditya/cpython/Objects/call.c:348:16 (python+0x1f64af)    #37 PyObject_Call /home/realkumaraditya/cpython/Objects/call.c:373:12 (python+0x1f6515) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #38 thread_run /home/realkumaraditya/cpython/./Modules/_threadmodule.c:353:21 (python+0x5acdd2) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #39 pythread_wrapper /home/realkumaraditya/cpython/Python/thread_pthread.h:242:5 (python+0x504887) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)  Previous write of size 8 at 0x7f250e4f7be8 by thread T4:    #0 lru_cache_append_link /home/realkumaraditya/cpython/./Modules/_functoolsmodule.c:1248:16 (python+0x5bda15) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #1 bounded_lru_cache_update_lock_held /home/realkumaraditya/cpython/./Modules/_functoolsmodule.c:1391:9 (python+0x5bda15)    #2 bounded_lru_cache_wrapper /home/realkumaraditya/cpython/./Modules/_functoolsmodule.c:1500:14 (python+0x5bda15)    #3 lru_cache_call /home/realkumaraditya/cpython/./Modules/_functoolsmodule.c:1638:14 (python+0x5bbbad) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #4 _PyObject_MakeTpCall /home/realkumaraditya/cpython/Objects/call.c:242:18 (python+0x1f5658) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #5 _PyObject_VectorcallTstate /home/realkumaraditya/cpython/./Include/internal/pycore_call.h:167:16 (python+0x1f6278) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #6 PyObject_Vectorcall /home/realkumaraditya/cpython/Objects/call.c:327:12 (python+0x1f6278)    #7 _PyEval_EvalFrameDefault /home/realkumaraditya/cpython/Python/generated_cases.c.h:1619:35 (python+0x40429e) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #8 _PyEval_EvalFrame /home/realkumaraditya/cpython/./Include/internal/pycore_ceval.h:119:16 (python+0x3ff170) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #9 _PyEval_Vector /home/realkumaraditya/cpython/Python/ceval.c:1964:12 (python+0x3ff170)    #10 _PyFunction_Vectorcall /home/realkumaraditya/cpython/Objects/call.c (python+0x1f681f) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #11 _PyObject_VectorcallTstate /home/realkumaraditya/cpython/./Include/internal/pycore_call.h:169:11 (python+0x1fb13f) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #12 method_vectorcall /home/realkumaraditya/cpython/Objects/classobject.c:72:20 (python+0x1fb13f)    #13 _PyObject_VectorcallTstate /home/realkumaraditya/cpython/./Include/internal/pycore_call.h:169:11 (python+0x4584e7) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #14 context_run /home/realkumaraditya/cpython/Python/context.c:728:29 (python+0x4584e7)    #15 method_vectorcall_FASTCALL_KEYWORDS /home/realkumaraditya/cpython/Objects/descrobject.c:421:24 (python+0x20ee99) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #16 _PyObject_VectorcallTstate /home/realkumaraditya/cpython/./Include/internal/pycore_call.h:169:11 (python+0x1f61ba) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #17 PyObject_Vectorcall /home/realkumaraditya/cpython/Objects/call.c:327:12 (python+0x1f61ba)    #18 _PyEval_EvalFrameDefault /home/realkumaraditya/cpython/Python/generated_cases.c.h:1619:35 (python+0x40429e) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #19 _PyEval_EvalFrame /home/realkumaraditya/cpython/./Include/internal/pycore_ceval.h:119:16 (python+0x3ff170) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #20 _PyEval_Vector /home/realkumaraditya/cpython/Python/ceval.c:1964:12 (python+0x3ff170)    #21 _PyFunction_Vectorcall /home/realkumaraditya/cpython/Objects/call.c (python+0x1f681f) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #22 _PyObject_VectorcallTstate /home/realkumaraditya/cpython/./Include/internal/pycore_call.h:169:11 (python+0x1fb13f) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #23 method_vectorcall /home/realkumaraditya/cpython/Objects/classobject.c:72:20 (python+0x1fb13f)    #24 _PyVectorcall_Call /home/realkumaraditya/cpython/Objects/call.c:273:16 (python+0x1f64af) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #25 _PyObject_Call /home/realkumaraditya/cpython/Objects/call.c:348:16 (python+0x1f64af)    #26 PyObject_Call /home/realkumaraditya/cpython/Objects/call.c:373:12 (python+0x1f6515) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #27 thread_run /home/realkumaraditya/cpython/./Modules/_threadmodule.c:353:21 (python+0x5acdd2) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #28 pythread_wrapper /home/realkumaraditya/cpython/Python/thread_pthread.h:242:5 (python+0x504887) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)  Thread T5 'Thread-5 (threa' (tid=26589, running) created by main thread at:    #0 pthread_create <null> (python+0xe21df) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #1 do_start_joinable_thread /home/realkumaraditya/cpython/Python/thread_pthread.h:289:14 (python+0x503718) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #2 PyThread_start_joinable_thread /home/realkumaraditya/cpython/Python/thread_pthread.h:331:9 (python+0x50353a) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #3 ThreadHandle_start /home/realkumaraditya/cpython/./Modules/_threadmodule.c:439:9 (python+0x5ac967) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #4 do_start_new_thread /home/realkumaraditya/cpython/./Modules/_threadmodule.c:1849:9 (python+0x5ac967)    #5 thread_PyThread_start_joinable_thread /home/realkumaraditya/cpython/./Modules/_threadmodule.c:1964:14 (python+0x5ab731) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #6 cfunction_call /home/realkumaraditya/cpython/Objects/methodobject.c:565:18 (python+0x2a2fa7) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #7 _PyObject_MakeTpCall /home/realkumaraditya/cpython/Objects/call.c:242:18 (python+0x1f5658) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #8 _PyObject_VectorcallTstate /home/realkumaraditya/cpython/./Include/internal/pycore_call.h:167:16 (python+0x1f6278) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #9 PyObject_Vectorcall /home/realkumaraditya/cpython/Objects/call.c:327:12 (python+0x1f6278)    #10 _PyEval_EvalFrameDefault /home/realkumaraditya/cpython/Python/generated_cases.c.h:3234:35 (python+0x40a577) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #11 _PyEval_EvalFrame /home/realkumaraditya/cpython/./Include/internal/pycore_ceval.h:119:16 (python+0x3fecaf) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #12 _PyEval_Vector /home/realkumaraditya/cpython/Python/ceval.c:1964:12 (python+0x3fecaf)    #13 PyEval_EvalCode /home/realkumaraditya/cpython/Python/ceval.c:856:21 (python+0x3fecaf)    #14 run_eval_code_obj /home/realkumaraditya/cpython/Python/pythonrun.c:1365:12 (python+0x4e1041) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #15 run_mod /home/realkumaraditya/cpython/Python/pythonrun.c:1436:19 (python+0x4e1041)    #16 pyrun_file /home/realkumaraditya/cpython/Python/pythonrun.c:1293:15 (python+0x4dc530) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #17 _PyRun_SimpleFileObject /home/realkumaraditya/cpython/Python/pythonrun.c:521:13 (python+0x4dc530)    #18 _PyRun_AnyFileObject /home/realkumaraditya/cpython/Python/pythonrun.c:81:15 (python+0x4dbc88) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #19 pymain_run_file_obj /home/realkumaraditya/cpython/Modules/main.c:402:15 (python+0x52199f) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #20 pymain_run_file /home/realkumaraditya/cpython/Modules/main.c:421:15 (python+0x52199f)    #21 pymain_run_python /home/realkumaraditya/cpython/Modules/main.c:686:21 (python+0x520ccf) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #22 Py_RunMain /home/realkumaraditya/cpython/Modules/main.c:767:5 (python+0x520ccf)    #23 pymain_main /home/realkumaraditya/cpython/Modules/main.c:797:12 (python+0x521238) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #24 Py_BytesMain /home/realkumaraditya/cpython/Modules/main.c:821:12 (python+0x5212bb) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #25 main /home/realkumaraditya/cpython/./Programs/python.c:15:12 (python+0x1607eb) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)  Thread T4 'Thread-4 (threa' (tid=26588, running) created by main thread at:    #0 pthread_create <null> (python+0xe21df) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #1 do_start_joinable_thread /home/realkumaraditya/cpython/Python/thread_pthread.h:289:14 (python+0x503718) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #2 PyThread_start_joinable_thread /home/realkumaraditya/cpython/Python/thread_pthread.h:331:9 (python+0x50353a) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #3 ThreadHandle_start /home/realkumaraditya/cpython/./Modules/_threadmodule.c:439:9 (python+0x5ac967) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #4 do_start_new_thread /home/realkumaraditya/cpython/./Modules/_threadmodule.c:1849:9 (python+0x5ac967)    #5 thread_PyThread_start_joinable_thread /home/realkumaraditya/cpython/./Modules/_threadmodule.c:1964:14 (python+0x5ab731) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #6 cfunction_call /home/realkumaraditya/cpython/Objects/methodobject.c:565:18 (python+0x2a2fa7) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #7 _PyObject_MakeTpCall /home/realkumaraditya/cpython/Objects/call.c:242:18 (python+0x1f5658) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #8 _PyObject_VectorcallTstate /home/realkumaraditya/cpython/./Include/internal/pycore_call.h:167:16 (python+0x1f6278) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #9 PyObject_Vectorcall /home/realkumaraditya/cpython/Objects/call.c:327:12 (python+0x1f6278)    #10 _PyEval_EvalFrameDefault /home/realkumaraditya/cpython/Python/generated_cases.c.h:3234:35 (python+0x40a577) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #11 _PyEval_EvalFrame /home/realkumaraditya/cpython/./Include/internal/pycore_ceval.h:119:16 (python+0x3fecaf) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #12 _PyEval_Vector /home/realkumaraditya/cpython/Python/ceval.c:1964:12 (python+0x3fecaf)    #13 PyEval_EvalCode /home/realkumaraditya/cpython/Python/ceval.c:856:21 (python+0x3fecaf)    #14 run_eval_code_obj /home/realkumaraditya/cpython/Python/pythonrun.c:1365:12 (python+0x4e1041) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #15 run_mod /home/realkumaraditya/cpython/Python/pythonrun.c:1436:19 (python+0x4e1041)    #16 pyrun_file /home/realkumaraditya/cpython/Python/pythonrun.c:1293:15 (python+0x4dc530) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #17 _PyRun_SimpleFileObject /home/realkumaraditya/cpython/Python/pythonrun.c:521:13 (python+0x4dc530)    #18 _PyRun_AnyFileObject /home/realkumaraditya/cpython/Python/pythonrun.c:81:15 (python+0x4dbc88) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #19 pymain_run_file_obj /home/realkumaraditya/cpython/Modules/main.c:402:15 (python+0x52199f) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #20 pymain_run_file /home/realkumaraditya/cpython/Modules/main.c:421:15 (python+0x52199f)    #21 pymain_run_python /home/realkumaraditya/cpython/Modules/main.c:686:21 (python+0x520ccf) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #22 Py_RunMain /home/realkumaraditya/cpython/Modules/main.c:767:5 (python+0x520ccf)    #23 pymain_main /home/realkumaraditya/cpython/Modules/main.c:797:12 (python+0x521238) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #24 Py_BytesMain /home/realkumaraditya/cpython/Modules/main.c:821:12 (python+0x5212bb) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)    #25 main /home/realkumaraditya/cpython/./Programs/python.c:15:12 (python+0x1607eb) (BuildId: 6a926f53d2222342de04cdb51d70c5f1b710c0bd)SUMMARY: ThreadSanitizer: data race /home/realkumaraditya/cpython/./Modules/_functoolsmodule.c:1589:16 in lru_cache_unlink_list==================

@tom-pytel
Copy link
Contributor

The following code triggers TSAN warnings on this PR:

The@critical_sections on_functools__lru_cache_wrapper_cache_info_impl and_functools__lru_cache_wrapper_cache_clear_impl would need to be changed to the dict lock, so that's a point for keeping the lock on the lru_cache object.

@kumaraditya303
Copy link
Contributor

It does not matter on which object we hold a critical section, so long as we consistently hold the same critical section whenever we are reading or writing fields of the object. After this PR, we are always using the dict's critical section to protect the lru_cache object as well. The critical section on the lru_cache object itself would never be used.

Acquiring critical section on dict only provides exclusive access to the dict not the lru cache object, as you can see there are new data races introduced by this PR because it doesn't lock the lru cache object.

See critical section section athttps://peps.python.org/pep-0703/#python-critical-sections

@hawkinsp
Copy link
ContributorAuthor

hawkinsp commentedMay 11, 2025
edited
Loading

It does not matter on which object we hold a critical section, so long as we consistently hold the same critical section whenever we are reading or writing fields of the object. After this PR, we are always using the dict's critical section to protect the lru_cache object as well. The critical section on the lru_cache object itself would never be used.

Acquiring critical section on dict only provides exclusive access to the dict not the lru cache object, as you can see there are new data races introduced by this PR because it doesn't lock the lru cache object.

See critical section section athttps://peps.python.org/pep-0703/#python-critical-sections

Again, it does not matter which lock we use to provide exclusive access, so long as we are consistent. Just because a mutex is attached to a particular object, it does not mean it has to protect that object alone.

The dict lock will work fine, so long as it is used consistently. Not being a frequent CPython contributor, I did not know that@critical_section was a thing and that's why I didn't update it in this PR. I can fix that, if we like.

You could use the lru cache critical section, but then you have be carefulnever to mix the unlocked and locked dict APIs. e.g., I think callingPyDict_Clear() here is unsafe

PyDict_Clear(_self->cache);

because it will acquire the dict's critical section, and that can release the critical section on the lru cache, leading to the sort of race I originally reported: some callers will hold the dict lock when updating the dict and not the lru cache lock, whereas others will hold the lru cache lock but not the dict lock. Similarly you'd have the same problem for:
https://github.com/python/cpython/blob/0eb448cae5e9008f815204d8b46bfd7cd641a152/Modules/_functoolsmodule.c#L1416C11-L1416C32

It would be fine if all callers used thelocked dictionary APIs, of course. But as far as I can tell there is no_PyDict_Clear_LockHeld, for example, so we'd have to add one. If you are happy for me to go and add_LockHeld variants of everyPyDict API that the LRU cache calls, we could use that lock instead.

I admit I am slightly vague on under exactly what condition acquiring an inner critical section will release the outer critical section. However, it certainly can be triggered in practice, see the original bug report.

@tom-pytel
Copy link
Contributor

Similarly you'd have the same problem for:
https://github.com/python/cpython/blob/0eb448cae5e9008f815204d8b46bfd7cd641a152/Modules/_functoolsmodule.c#L1416C11-L1416C32

In this case no because this function assumes lock, should really be called_PyDict_Pop_KnownHash_LockHeld:

ASSERT_DICT_LOCKED(mp);

But you are right aboutPyDict_Clear(), it would drop the lock on lru_cache as soon as it took its own, so that's kind of a brick wall. You could lock both simultaneously there with aPy_BEGIN_CRITICAL_SECTION2(lru_cache, dict) maybe, not sure if that would drop the lru_cache lock when dict took its own already-locked. But if locking the dict then why bother locking the lru_cache at all? Only to keep the all-objects-use-own-lock paradigm? Is that a something we are trying to enforce?

@hawkinsp
Copy link
ContributorAuthor

Please add some tests like the one I posted above totest_free_threading

Done.

@hawkinsp
Copy link
ContributorAuthor

Done.

@hawkinsp
Copy link
ContributorAuthor

Done. Incidentally, with that fix, this test now reliably triggers#129748 with high probability run under tsan. However, that is not a new problem in this PR.

…ection use.The bounded_lru_cache code was using a critical section on the lru cacheobject to protect dictionary accesses in some code paths, but using thecritical section on the dictionary itself to protect accesses in othercode paths. This led to races since not all threads agreed on which lockthey needed to be holding.Instead, always use a critical section on the underlying dictionary,rather than the lru cache object itself.Fixespython#132641
Copy link
Contributor

@kumaraditya303kumaraditya303 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

LGTM, thanks!

@kumaraditya303kumaraditya303enabled auto-merge (squash)May 13, 2025 17:32
@kumaraditya303kumaraditya303 merged commit9ad0c7b intopython:mainMay 13, 2025
46 of 47 checks passed
@miss-islington-app
Copy link

Thanks@hawkinsp for the PR, and@kumaraditya303 for merging it 🌮🎉.. I'm working now to backport this PR to: 3.14.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull requestMay 13, 2025
…GH-133787)Fix race in `lru_cache` by acquiring critical section on the cache object itself and call the lock held variant of dict functions to modify the underlying dict.(cherry picked from commit9ad0c7b)Co-authored-by: Peter Hawkins <phawkins@google.com>
@bedevere-app
Copy link

GH-133979 is a backport of this pull request to the3.14 branch.

@bedevere-appbedevere-appbot removed the needs backport to 3.14bugs and security fixes labelMay 13, 2025
kumaraditya303 pushed a commit that referenced this pull requestMay 14, 2025
…3787) (#133979)gh-132641: fix race in `lru_cache` under free-threading (GH-133787)Fix race in `lru_cache` by acquiring critical section on the cache object itself and call the lock held variant of dict functions to modify the underlying dict.(cherry picked from commit9ad0c7b)Co-authored-by: Peter Hawkins <phawkins@google.com>
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@eendebakpteendebakpteendebakpt left review comments

@serhiy-storchakaserhiy-storchakaserhiy-storchaka approved these changes

@kumaraditya303kumaraditya303kumaraditya303 approved these changes

@rhettingerrhettingerAwaiting requested review from rhettingerrhettinger is a code owner

@methanemethaneAwaiting requested review from methanemethane is a code owner

@markshannonmarkshannonAwaiting requested review from markshannonmarkshannon is a code owner

Assignees

@kumaraditya303kumaraditya303

Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

Data race between compare_generic and insert_combined_dict under free-threading
5 participants
@hawkinsp@serhiy-storchaka@tom-pytel@kumaraditya303@eendebakpt

[8]ページ先頭

©2009-2025 Movatter.jp