Movatterモバイル変換


[0]ホーム

URL:


homepage

Issue36402

This issue trackerhas been migrated toGitHub, and is currentlyread-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

classification
Title:threading._shutdown() race condition: test_threading test_threads_join_2() fails randomly
Type:Stage:resolved
Components:Library (Lib)Versions:Python 3.9, Python 3.8, Python 3.7
process
Status:closedResolution:fixed
Dependencies:Superseder:
Assigned To:Nosy List: eric.snow, koobs, kristjan.jonsson, pablogsal, pitrou, vstinner
Priority:normalKeywords:patch

Created on2019-03-22 23:18 byvstinner, last changed2022-04-11 14:59 byadmin. This issue is nowclosed.

Files
File nameUploadedDescriptionEdit
py_finalize.patchvstinner,2019-06-10 23:10
Pull Requests
URLStatusLinkedEdit
PR 13889closedvstinner,2019-06-07 10:53
PR 13946closedvstinner,2019-06-10 20:45
PR 13948mergedvstinner,2019-06-10 22:39
PR 13949closedvstinner,2019-06-10 23:04
PR 14033closedmiss-islington,2019-06-12 23:30
PR 14034closedmiss-islington,2019-06-12 23:30
PR 14047mergedvstinner,2019-06-13 08:58
PR 14050mergedvstinner,2019-06-13 10:11
PR 14054mergedmiss-islington,2019-06-13 11:45
Messages (16)
msg338622 -(view)Author: STINNER Victor (vstinner)*(Python committer)Date: 2019-03-22 23:18
https://buildbot.python.org/all/#/builders/168/builds/8010:23:17 load avg: 5.00 [334/420/1] test_threading crashed (Exit code -6) -- running: test_tools (8 min 42 sec), test_multiprocessing_spawn (5 min 41 sec), test_zipfile (30 sec 787 ms)Fatal Python error: Py_EndInterpreter: not the last threadCurrent thread 0x0000000800acd000 (most recent call first):  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/test/support/__init__.py", line 2778 in run_in_subinterp  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/test/test_threading.py", line 917 in test_threads_join_2  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/unittest/case.py", line 642 in run  ...The test crashed once, but then passed when run again in verbose mode ("Re-running test 'test_threading' in verbose mode").
msg338623 -(view)Author: STINNER Victor (vstinner)*(Python committer)Date: 2019-03-22 23:20
As far as I remember, test_threads_join_2() was already unstable.I created this issue to try to track if it's a regression or not.If it's a regression, I would suggest to have a look at Eric Snow's recent commits.At this point, I simply have no idea if the test fails exactly one in the lifetime of the buildbot worker, or if it started to fail frequently on this FreeBSD buildbot.
msg344611 -(view)Author: STINNER Victor (vstinner)*(Python committer)Date: 2019-06-04 17:36
New failure:AMD64 FreeBSD CURRENT Shared 3.x:https://buildbot.python.org/all/#/builders/168/builds/1154...test_thread_leak (test.test_threading.PyRLockTests) ... oktest_timeout (test.test_threading.PyRLockTests) ... oktest_try_acquire (test.test_threading.PyRLockTests) ... oktest_try_acquire_contended (test.test_threading.PyRLockTests) ... oktest_weakref_deleted (test.test_threading.PyRLockTests) ... oktest_weakref_exists (test.test_threading.PyRLockTests) ... oktest_with (test.test_threading.PyRLockTests) ... oktest_acquire (test.test_threading.SemaphoreTests) ... oktest_acquire_contended (test.test_threading.SemaphoreTests) ... oktest_acquire_destroy (test.test_threading.SemaphoreTests) ... oktest_acquire_timeout (test.test_threading.SemaphoreTests) ... oktest_constructor (test.test_threading.SemaphoreTests) ... oktest_default_value (test.test_threading.SemaphoreTests) ... oktest_release_unacquired (test.test_threading.SemaphoreTests) ... oktest_try_acquire (test.test_threading.SemaphoreTests) ... oktest_try_acquire_contended (test.test_threading.SemaphoreTests) ... oktest_with (test.test_threading.SemaphoreTests) ... oktest_daemon_threads_fatal_error (test.test_threading.SubinterpThreadingTests) ... oktest_threads_join (test.test_threading.SubinterpThreadingTests) ... okFatal Python error: Py_EndInterpreter: not the last threadCurrent thread 0x0000000800ac3000 (most recent call first):  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/test/support/__init__.py", line 2794 in run_in_subinterp  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/test/test_threading.py", line 923 in test_threads_join_2test_threads_join_2 (test.test_threading.SubinterpThreadingTests) ... *** Signal 6Stop.make: stopped in /usr/home/buildbot/python/3.x.koobs-freebsd-current/build  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/unittest/case.py", line 628 in _callTestMethod  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/unittest/case.py", line 671 in run  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/unittest/case.py", line 731 in __call__  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/unittest/suite.py", line 122 in run  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/unittest/suite.py", line 84 in __call__  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/unittest/suite.py", line 122 in run  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/unittest/suite.py", line 84 in __call__  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/unittest/suite.py", line 122 in run  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/unittest/suite.py", line 84 in __call__  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/unittest/runner.py", line 176 in run  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/test/support/__init__.py", line 1984 in _run_suite  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/test/support/__init__.py", line 2080 in run_unittest  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/test/libregrtest/runtest.py", line 203 in _test_module  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/test/libregrtest/runtest.py", line 228 in _runtest_inner2  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/test/libregrtest/runtest.py", line 264 in _runtest_inner  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/test/libregrtest/runtest.py", line 148 in _runtest  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/test/libregrtest/runtest.py", line 187 in runtest  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/test/libregrtest/main.py", line 291 in rerun_failed_tests  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/test/libregrtest/main.py", line 646 in _main  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/test/libregrtest/main.py", line 588 in main  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/test/libregrtest/main.py", line 663 in main  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/test/__main__.py", line 2 in <module>  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/runpy.py", line 85 in _run_code  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/runpy.py", line 192 in _run_module_as_mainprogram finished with exit code 1elapsedTime=3523.834729
msg344613 -(view)Author: STINNER Victor (vstinner)*(Python committer)Date: 2019-06-04 17:39
In the same build:https://buildbot.python.org/all/#builders/168/builds/11540:28:57 load avg: 12.93 [208/423/1] test_threading crashed (Exit code 1) -- running: test_lib2to3 (7 min 9 sec), test_multiprocessing_spawn (1 min 36 sec)Timeout (0:25:00)!Thread 0x0000000800ac3000 (most recent call first):  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/threading.py", line 1015 in _wait_for_tstate_lock  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/threading.py", line 999 in join  File "/usr/home/buildbot/python/3.x.koobs-freebsd-current/build/Lib/test/test_threading.py", line 1287 in test_interrupt_main_subthread  (...)
msg344922 -(view)Author: STINNER Victor (vstinner)*(Python committer)Date: 2019-06-07 11:00
I wrotePR 13889: with this change, I can easily reproduce the crash on Linux:$ ./python -m test test_threading -m test_threads_join_2 -FRun tests sequentially0:00:00 load avg: 0.51 [  1] test_threadingFatal Python error: Py_EndInterpreter: not the last threadCurrent thread 0x00007f84ad74d740 (most recent call first):  File "/home/vstinner/prog/python/master/Lib/test/support/__init__.py", line 2794 in run_in_subinterp  File "/home/vstinner/prog/python/master/Lib/test/test_threading.py", line 923 in test_threads_join_2...Py_EndInterpreter() calls wait_for_thread_shutdown() to wait until threading._shutdown() completes.When the assertion fails, threading.enumerate() only contains the main thread: the spawned thread is already gone. But the assertion fails, which means that the Python thread state of the thread (which looks to be completed) is still around.This unit test comes frombpo-18808:commit7b4769937fb612d576b6829c3b834f3dd31752f1Author: Antoine Pitrou <solipsis@pitrou.net>Date:   Sat Sep 7 23:38:37 2013 +0200    Issue#18808: Thread.join() now waits for the underlying thread state to be destroyed before returning.    This prevents unpredictable aborts in Py_EndInterpreter() when some non-daemon threads are still running.
msg345143 -(view)Author: STINNER Victor (vstinner)*(Python committer)Date: 2019-06-10 20:59
Oh. UsingPR 13889, I'm able to reproduce the bug up to Python 3.4. Example at commit commite76cbc781044ee01b059f3702c580e66266b84c5 (tag: v3.4.10):$ wget 'https://github.com/python/cpython/pull/13889.patch'$ git apply 13889.patch$ ./python -m test  -F -m test_threads_join_2  test_threading[  1] test_threading[  2] test_threading(...)[ 10] test_threading[ 11] test_threadingFatal Python error: Py_EndInterpreter: not the last threadCurrent thread 0x00007f418b3280c0 (most recent call first):  File "/home/vstinner/prog/python/3.4/Lib/test/support/__init__.py", line 2304 in run_in_subinterp  File "/home/vstinner/prog/python/3.4/Lib/test/test_threading.py", line 877 in test_threads_join_2  (...)Aborted (core dumped)
msg345146 -(view)Author: STINNER Victor (vstinner)*(Python committer)Date: 2019-06-10 21:57
threading._shutdown() uses threading.enumerate() which iterations on threading._active.threading.Thread registers itself into threading._active using its _bootstrap_inner() method. It unregisters itself when _bootstrap_inner() completes, whereas its is_alive() method still returns true: since the underlying native thread still runs and the Python thread state still exists._thread._set_sentinel() creates a lock and registers a tstate->on_delete callback to release this lock. It's called by threading.Thread._set_tstate_lock() to set threading.Thread._tstate_lock. This lock is used by threading.Thread.join() to wait until the thread completes._thread.start_new_thread() calls the C function t_bootstrap() which ends with:    tstate->interp->num_threads--;    PyThreadState_Clear(tstate);    PyThreadState_DeleteCurrent();    PyThread_exit_thread();_PyThreadState_DeleteCurrent() calls tstate->on_delete() which releases threading.Thread._tstate_lock lock.In test_threads_join_2() test, PyThreadState_Clear() blocks on clearing thread variables: the Sleeper destructor of the Sleeper instance sleeps.The race condition is that:* threading._shutdown() rely on threading._alive* Py_EndInterpreter() rely on the interpreter linked list of Python thread states: interp->tstate_head.
msg345149 -(view)Author: STINNER Victor (vstinner)*(Python committer)Date: 2019-06-10 22:52
Other references to test_threads_join_2() failures:*bpo-36989: fails once AIX*bpo-27791: 2 failures in 2016 in "AMD64 FreeBSD 10 Shared 3.x" buildbots (I closed the issue with: "Sadly, such bug is hard to reproduce. I didn't look at buildbots recently, and I don't have access to FreeBSD, so I just close the issue :-/")*bpo-28084: once failure on "AMD64 FreeBSD 9.x 3.x" buildbot (closed as a duplicate ofbpo-27791)See alsobpo-18808: "Thread.join returns before PyThreadState is destroyed" (issue which added the test).
msg345150 -(view)Author: STINNER Victor (vstinner)*(Python committer)Date: 2019-06-10 23:10
test_threading: test_threads_join_2() was added by commit 7b47699 in 2013, but the test failed randomly since it was added. It's just that failures were ignored until I createdhttps://bugs.python.org/issue36402 last March.In fact, when the test failed randomly on buildbot (with tests run in parallel), it was fine since test_threading was re-run alone and then the test passed. The buildbot build was seen overall as a success. Previous issues were closed (see my previous comment).The test shows the bug using subinterpreters (Py_EndInterpreter), but the bug also exists in Py_Finalize() which hash the same race condition (it also calls threading._shutdown()). It's just that Py_EndInterpreter() is stricter, it contains this assertion:    if (tstate != interp->tstate_head || tstate->next != NULL)        Py_FatalError("Py_EndInterpreter: not the last thread");Attached py_finalize.patch adds the same assertion to Py_Finalize.I added test_threading.test_finalization_shutdown() toPR 13948. If you run test_finalization_shutdown() with py_finalize.patch, Py_Finalize() fails with a similar assertion error.But py_finalize.patch is incompatible with the principle of daemon threads and so cannot be commited.
msg345429 -(view)Author: STINNER Victor (vstinner)*(Python committer)Date: 2019-06-12 23:30
New changeset468e5fec8a2f534f1685d59da3ca4fad425c38dd by Victor Stinner in branch 'master':bpo-36402: Fix threading._shutdown() race condition (GH-13948)https://github.com/python/cpython/commit/468e5fec8a2f534f1685d59da3ca4fad425c38dd
msg345492 -(view)Author: STINNER Victor (vstinner)*(Python committer)Date: 2019-06-13 10:06
New changeset6f75c873752a16a7ad8f35855b1e29f59d048e84 by Victor Stinner in branch 'master':tbpo-36402: Fix threading.Thread._stop() (GH-14047)https://github.com/python/cpython/commit/6f75c873752a16a7ad8f35855b1e29f59d048e84
msg345502 -(view)Author: STINNER Victor (vstinner)*(Python committer)Date: 2019-06-13 11:44
New changesete40a97a721d46307dfdc2b0322028ccded6eb571 by Victor Stinner in branch '3.8':[3.8]bpo-36402: Fix threading._shutdown() race condition (GH-13948) (GH-14050)https://github.com/python/cpython/commit/e40a97a721d46307dfdc2b0322028ccded6eb571
msg345508 -(view)Author: STINNER Victor (vstinner)*(Python committer)Date: 2019-06-13 12:22
New changeset6eb2878e42152e9c45d7ee5e6f889532d753e67c by Victor Stinner (Miss Islington (bot)) in branch '3.7':bpo-36402: Fix threading._shutdown() race condition (GH-13948) (GH-14050) (GH-14054)https://github.com/python/cpython/commit/6eb2878e42152e9c45d7ee5e6f889532d753e67c
msg345509 -(view)Author: STINNER Victor (vstinner)*(Python committer)Date: 2019-06-13 12:27
Thebpo-18808 "Thread.join returns before PyThreadState is destroyed" was not fixed in Python 2.7: threading.Thread has no _tstate_lock attribute. I'm not comfortable to backportbpo-18808 "feature" or "bugfix" to Python 2.7, not to backport this change.Python 2.7 works as it is, and it's going to reach its end of life at the end of the year. I guess that people learnt how to work around Python 2.7 limitation likebpo-18808.
msg345510 -(view)Author: STINNER Victor (vstinner)*(Python committer)Date: 2019-06-13 12:28
Ok, the root issue (threading._shutdown() race condition) has been fixed in Python 3.7, 3.8 and master branches. I close the issue. Thanks for the reviews!
msg349281 -(view)Author: Kristján Valur Jónsson (kristjan.jonsson)*(Python committer)Date: 2019-08-09 11:08
Please note that this fix appears to be the cause of#37788
History
DateUserActionArgs
2022-04-11 14:59:12adminsetgithub: 80583
2020-03-27 00:26:31vstinnerlinkissue36989 superseder
2019-08-09 11:08:52kristjan.jonssonsetnosy: +kristjan.jonsson
messages: +msg349281
2019-06-13 12:28:09vstinnersetstatus: open -> closed
messages: +msg345510

components: + Library (Lib), - Tests
resolution: fixed
stage: patch review -> resolved
2019-06-13 12:27:13vstinnersetmessages: +msg345509
2019-06-13 12:22:25vstinnersetmessages: +msg345508
2019-06-13 11:45:00miss-islingtonsetpull_requests: +pull_request13916
2019-06-13 11:44:29vstinnersetmessages: +msg345502
2019-06-13 10:11:05vstinnersetpull_requests: +pull_request13912
2019-06-13 10:06:30vstinnersetmessages: +msg345492
2019-06-13 08:58:12vstinnersetpull_requests: +pull_request13909
2019-06-12 23:31:59vstinnersetversions: + Python 3.7, Python 3.9
2019-06-12 23:30:37miss-islingtonsetpull_requests: +pull_request13898
2019-06-12 23:30:28miss-islingtonsetpull_requests: +pull_request13897
2019-06-12 23:30:20vstinnersetmessages: +msg345429
2019-06-11 01:06:28koobssetnosy: +koobs
2019-06-10 23:12:29vstinnersettitle: test_threading: test_threads_join_2() failed with "Fatal Python error: Py_EndInterpreter: not the last thread" on AMD64 FreeBSD CURRENT Shared 3.x -> threading._shutdown() race condition: test_threading test_threads_join_2() fails randomly
2019-06-10 23:10:50vstinnersetfiles: +py_finalize.patch

messages: +msg345150
2019-06-10 23:04:33vstinnersetpull_requests: +pull_request13817
2019-06-10 22:52:15vstinnersetmessages: +msg345149
2019-06-10 22:39:36vstinnersetpull_requests: +pull_request13816
2019-06-10 21:57:27vstinnersetmessages: +msg345146
2019-06-10 20:59:51vstinnersetmessages: +msg345143
2019-06-10 20:45:58vstinnersetpull_requests: +pull_request13813
2019-06-07 11:00:36vstinnersetnosy: +pitrou,eric.snow
2019-06-07 11:00:19vstinnersetmessages: +msg344922
2019-06-07 10:53:05vstinnersetkeywords: +patch
stage: patch review
pull_requests: +pull_request13763
2019-06-04 17:39:53vstinnersetmessages: +msg344613
2019-06-04 17:36:29vstinnersetmessages: +msg344611
2019-03-22 23:20:28vstinnersetnosy: +pablogsal
messages: +msg338623
2019-03-22 23:18:24vstinnercreate
Supported byThe Python Software Foundation,
Powered byRoundup
Copyright © 1990-2022,Python Software Foundation
Legal Statements

[8]ページ先頭

©2009-2026 Movatter.jp