Closed
Description
Bug report
Bug description:
I don't have a succinct reproducer for this bug yet, but I saw the following race in JAX CI:
WARNING: ThreadSanitizer: data race (pid=208275)
  Write of size 8 at 0x555555d43b60 by thread T121:
    #0 grow_thread_array /__w/jax/jax/cpython/Python/qsbr.c:101:19 (python3.13+0x4a3905) (BuildId: 8f8869b5f3143bd14dda26aa2bf37336b4902370)
    #1 _Py_qsbr_reserve /__w/jax/jax/cpython/Python/qsbr.c:203:13 (python3.13+0x4a3905)
    #2 new_threadstate /__w/jax/jax/cpython/Python/pystate.c:1569:27 (python3.13+0x497df2) (BuildId: 8f8869b5f3143bd14dda26aa2bf37336b4902370)
    #3 PyGILState_Ensure /__w/jax/jax/cpython/Python/pystate.c:2766:16 (python3.13+0x49af78) (BuildId: 8f8869b5f3143bd14dda26aa2bf37336b4902370)
    #4 nanobind::gil_scoped_acquire::gil_scoped_acquire() /proc/self/cwd/external/nanobind/include/nanobind/nb_misc.h:15:43 (xla_extension.so+0xa4fe551) (BuildId: 32eac14928efa68545d22a6013f16aa63a686fef)
    #5 xla::CpuCallback::PrepareAndCall(void*, void**) /proc/self/cwd/external/xla/xla/python/callback.cc:67:26 (xla_extension.so+0xa4fe551)
    #6 xla::XlaPythonCpuCallback(void*, void**, XlaCustomCallStatus_*) /proc/self/cwd/external/xla/xla/python/callback.cc:177:22 (xla_extension.so+0xa500c9a) (BuildId: 32eac14928efa68545d22a6013f16aa63a686fef)
    ...
  Previous read of size 8 at 0x555555d43b60 by thread T124:
    #0 _Py_qsbr_reserve /__w/jax/jax/cpython/Python/qsbr.c:216:47 (python3.13+0x4a3ad7) (BuildId: 8f8869b5f3143bd14dda26aa2bf37336b4902370)
    #1 new_threadstate /__w/jax/jax/cpython/Python/pystate.c:1569:27 (python3.13+0x497df2) (BuildId: 8f8869b5f3143bd14dda26aa2bf37336b4902370)
    #2 PyGILState_Ensure /__w/jax/jax/cpython/Python/pystate.c:2766:16 (python3.13+0x49af78) (BuildId: 8f8869b5f3143bd14dda26aa2bf37336b4902370)
    #3 nanobind::gil_scoped_acquire::gil_scoped_acquire() /proc/self/cwd/external/nanobind/include/nanobind/nb_misc.h:15:43 (xla_extension.so+0xa4fe551) (BuildId: 32eac14928efa68545d22a6013f16aa63a686fef)
    #4 xla::CpuCallback::PrepareAndCall(void*, void**) /proc/self/cwd/external/xla/xla/python/callback.cc:67:26 (xla_extension.so+0xa4fe551)
    #5 xla::XlaPythonCpuCallback(void*, void**, XlaCustomCallStatus_*) /proc/self/cwd/external/xla/xla/python/callback.cc:177:22 (xla_extension.so+0xa500c9a) (BuildId: 32eac14928efa68545d22a6013f16aa63a686fef)
    ...

I think what's happening here is that two threads that were not created by Python are calling PyGILState_Ensure concurrently so that they can call into CPython APIs.
This appears to be an unlocked access on shared->array, and it would probably be sufficient to move that read under the mutex in _Py_qsbr_reserve.
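
To illustrate the kind of change I have in mind, here is a minimal standalone sketch of the locking pattern using plain pthreads. It is not the code in qsbr.c: `shared_slots`, `grow_locked`, `reserve_slot`, `worker`, and `run_demo` are hypothetical names made up for this example, and the struct layout is an assumption, not CPython's. The only point is that the array pointer is dereferenced while the mutex is still held, so a concurrent reallocation in another thread cannot race with it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the shared state; NOT CPython's real layout. */
typedef struct {
    pthread_mutex_t mutex;
    int *array;   /* per-thread slots, reallocated when full */
    size_t size;  /* capacity of array */
    size_t used;  /* number of slots handed out so far */
} shared_slots;

/* Grow the array. The caller must hold s->mutex. */
static int grow_locked(shared_slots *s)
{
    size_t new_size = s->size ? s->size * 2 : 4;
    int *new_array = realloc(s->array, new_size * sizeof(int));
    if (new_array == NULL) {
        return -1;
    }
    memset(new_array + s->size, 0, (new_size - s->size) * sizeof(int));
    s->array = new_array;  /* write to the shared pointer, under the mutex */
    s->size = new_size;
    return 0;
}

/* Reserve one slot; return its index, or -1 on allocation failure. */
static long reserve_slot(shared_slots *s)
{
    pthread_mutex_lock(&s->mutex);
    if (s->used == s->size && grow_locked(s) < 0) {
        pthread_mutex_unlock(&s->mutex);
        return -1;
    }
    size_t index = s->used++;
    /* Read s->array BEFORE unlocking, so it cannot race with the
       reallocation performed by grow_locked() in another thread. */
    s->array[index] = 1;
    pthread_mutex_unlock(&s->mutex);
    return (long)index;
}

/* Each worker reserves 100 slots; returns non-NULL on failure. */
static void *worker(void *arg)
{
    shared_slots *s = arg;
    for (int i = 0; i < 100; i++) {
        if (reserve_slot(s) < 0) {
            return (void *)1;
        }
    }
    return NULL;
}

/* Run up to 16 workers concurrently.
   Returns the total number of reserved slots, or -1 on failure. */
static long run_demo(int nthreads)
{
    shared_slots s = { .array = NULL, .size = 0, .used = 0 };
    pthread_t threads[16];
    int started = 0;
    int failed = 0;

    if (nthreads > 16) {
        nthreads = 16;
    }
    pthread_mutex_init(&s.mutex, NULL);
    for (; started < nthreads; started++) {
        if (pthread_create(&threads[started], NULL, worker, &s) != 0) {
            failed = 1;
            break;
        }
    }
    for (int i = 0; i < started; i++) {
        void *result = NULL;
        pthread_join(threads[i], &result);
        if (result != NULL) {
            failed = 1;
        }
    }
    long total = failed ? -1 : (long)s.used;
    free(s.array);
    pthread_mutex_destroy(&s.mutex);
    return total;
}
```

Calling `run_demo(8)` from a `main` built with `-fsanitize=thread` should not produce a report for this pattern. Moving the `s->array[index] = 1` line below the unlock gives the same shape of unlocked access as in the trace above.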
CPython versions tested on:
3.13
Operating systems tested on:
Linux