Movatterモバイル変換

Issue36560

➜

This issue trackerhas been migrated toGitHub, and is currentlyread-only.
For more information, see the GitHub FAQs in the Python's Developer Guide.

This issue has been migrated to GitHub: https://github.com/python/cpython/issues/80741

classification

Title:	test_functools leaks randomly 1 memory block
Type:		Stage:	resolved
Components:	Tests	Versions:	Python 3.8

process

Status:	closed	Resolution:	out of date
Dependencies:		Superseder:
Assigned To:		Nosy List:	kj, levkivskyi, methane, serhiy.storchaka, vstinner
Priority:	normal	Keywords:	patch

Created on2019-04-08 14:22 byvstinner, last changed2022-04-11 14:59 byadmin. This issue is nowclosed.

Files
File name	Uploaded	Description	Edit
test_functools.py	vstinner,2019-04-08 15:33
test_functools2.py	vstinner,2019-04-08 17:11

Pull Requests
URL	Status	Linked	Edit
PR 12743	closed	vstinner,2019-04-09 10:55
PR 12744	merged	vstinner,2019-04-09 12:01
PR 12745	merged	vstinner,2019-04-09 12:42
PR 12747	merged	vstinner,2019-04-09 16:01
PR 12749	merged	vstinner,2019-04-09 16:31

Messages (21)
msg339643 -(view)	Author: STINNER Victor (vstinner)*	Date: 2019-04-08 14:22
Sometimes, each run of test_functools leaks exactly 1 memory block, even when the whole test is "re-run in verbose mode". Sometimes, it doesn't leak.https://buildbot.python.org/all/#/builders/80/builds/550test_functools leaked [1, 1, 1] memory blocks, sum=3Re-running test 'test_functools' in verbose modetest_functools leaked [1, 1, 1] memory blocks, sum=3Maybe the problem comes from Example on Linux:$ ./python -m test -F -r -j1 -R 3:3 test_functools Using random seed 3891892Run tests in parallel using 1 child processes0:00:01 load avg: 2.38 [ 1] test_functools passedbeginning 6 repetitions123456......(...)0:00:06 load avg: 2.27 [ 6] test_functools passedbeginning 6 repetitions123456......0:00:07 load avg: 2.27 [ 7/1] test_functools failedbeginning 6 repetitions123456......test_functools leaked [1, 2, 1] memory blocks, sum=40:00:08 load avg: 2.27 [ 8/1] test_functools passedbeginning 6 repetitions123456......== Tests result: FAILURE ==7 tests OK.1 test failed: test_functoolsTotal duration: 8 sec 333 msTests result: FAILURE
msg339651 -(view)	Author: STINNER Victor (vstinner)*	Date: 2019-04-08 15:33
Current bisection progress: I'm able to reproduce a failure with attached test_functools.py (211 lines) using:./python -m test -F -j5 -R 3:3 test_functools It takes between 4 and 200 runs to reproduce the failure, I don't understand what triggers the bug.I failed to identify if the bug is more likely when the system is idle or when the system is busy. It looks purely random.
msg339662 -(view)	Author: STINNER Victor (vstinner)*	Date: 2019-04-08 17:11
Some progress.I'm now able to reproduce the issue with attached test_functools2.py (103 lines, a single test method):$ cp (...)test_functools2.pyLib/test/$ ./configure -C --with-pydebug CFLAGS=-O0$ make$ ./python -m test -F -j5 -R 3:3 test_functools2(...)0:00:02 load avg: 0.95 [ 19/1] test_functools failedbeginning 6 repetitions123456......test_functools leaked [1, 1, 1] memory blocks, sum=3(...)Tests result: FAILUREI disabled the C accelerator _functools.I'm now trying to disable the C accelerator of abc, but I had a small issue:bpo-36565.
msg339663 -(view)	Author: STINNER Victor (vstinner)*	Date: 2019-04-08 17:14
The issue comes maybe from the complex code inLib/test/libregrtest/refleak.py to handle the ABC cache: abcs = {} for abc in [getattr(collections.abc, a) for a in collections.abc.__all__]: if not isabstract(abc): continue for obj in abc.__subclasses__() + [abc]: abcs[obj] = _get_dump(obj)[0]... # clear type cache sys._clear_type_cache() # Clear ABC registries, restoring previously saved ABC registries. abs_classes = [getattr(collections.abc, a) for a in collections.abc.__all__] abs_classes = filter(isabstract, abs_classes) for abc in abs_classes: for obj in abc.__subclasses__() + [abc]: for ref in abcs.get(obj, set()): if ref() is not None: obj.register(ref()) obj._abc_caches_clear() clear_caches()
msg339731 -(view)	Author: STINNER Victor (vstinner)*	Date: 2019-04-09 11:20
This issue isn't a real memory leak: if I use -R 3:10 instead of -R 3:3, the test doesn't fail anymore.But the issue is still annoying since it makes Refleaks buildbot workers fail randomly :-/This issue remembers me the unstable multiprocessing tests:bpo-33735: test_multiprocessing_spawn leaked [1, 2, 1] memory blocks on AMD64 Windows8.1 Refleaks 3.7bpo-33984: test_multiprocessing_forkserver leaked [1, 2, 1] memory blocks on x86 Gentoo Refleaks 3.xPatch to always display memory allocations differences:diff --git a/Lib/test/libregrtest/refleak.py b/Lib/test/libregrtest/refleak.pyindexd68ea63b5b..997be819fa 100644--- a/Lib/test/libregrtest/refleak.py+++ b/Lib/test/libregrtest/refleak.py@@ -118,6 +118,8 @@ def dash_R(the_module, test, indirect_test, huntrleaks): print(msg, file=refrep) refrep.flush() failed = True+ if not failed:+ print(alloc_deltas[nwarmup:]) return failed Truncated output with the patch:vstinner@apu$ ./python -m test -F -r -j1 -R 3:10 test_functoolsUsing random seed 4308771Run tests in parallel using 1 child processes0:00:04 load avg: 0.91 [ 1] test_functools passed[0, 1, 2, 0, 0, 0, 0, 0, 0, 0]...0:00:13 load avg: 0.92 [ 3] test_functools passed[2, 1, 0, 0, 0, 0, 0, 0, 0, 0]...0:00:17 load avg: 0.93 [ 4] test_functools passed[0, 3, 0, 0, 0, 0, 0, 0, 0, 0]...0:00:21 load avg: 0.93 [ 5] test_functools passed[0, 1, 0, 0, 2, 0, 0, 0, 0, 0]...0:00:26 load avg: 0.93 [ 6] test_functools passed[0, 4, 0, 0, 0, 0, 0, 0, 0, 0]...0:00:34 load avg: 0.87 [ 8] test_functools passed[0, 1, 0, 2, 0, 0, 0, 0, 0, 0]...0:01:06 load avg: 1.15 [ 15] test_functools passed[0, 1, 0, 2, 0, -1, 1, 0, 0, 0]...0:01:10 load avg: 1.46 [ 16] test_functools passed[0, 4, 0, 0, 0, 0, 0, 0, -1, 1]...The maximum sum() of these list is around 5 on 10 runs: not every run leaks a memory block. It looks more like a internal cache which is "unstable" if you look at the number of allocated memory blocks.
msg339732 -(view)	Author: STINNER Victor (vstinner)*	Date: 2019-04-09 11:22
I wrotePR 12743 to modify _abc: use weakref.WeakSet rather than using directly a set of weak references. With this PR, calling _get_dump() should clear all broken weak references. I'm not sure if it's really useful for this specific issue, since dash_R_cleanup() calls obj._abc_caches_clear() on all abstract base classes (ABC): all caches must be removed before reading the total number of allocated memory blocks (sys.getallocatedblocks()).
msg339733 -(view)	Author: Serhiy Storchaka (serhiy.storchaka)*	Date: 2019-04-09 11:28
_abc was written for optimization. Your PR makes it using slower Python code. Could you please test how much this hits the performance of abc?
msg339735 -(view)	Author: STINNER Victor (vstinner)*	Date: 2019-04-09 12:05
If I modify libregrtest with the following patch:diff --git a/Lib/test/libregrtest/refleak.py b/Lib/test/libregrtest/refleak.pyindex0bb8a0a2bf..f0225a9768 100644--- a/Lib/test/libregrtest/refleak.py+++ b/Lib/test/libregrtest/refleak.py@@ -128,7 +128,7 @@ def dash_R(ns, the_module, test_name, test_func): failed = False for deltas, item_name, checker in [ (rc_deltas, 'references', check_rc_deltas),- (alloc_deltas, 'memory blocks', check_rc_deltas),+ (alloc_deltas, 'memory blocks', check_fd_deltas), (fd_deltas, 'file descriptors', check_fd_deltas) ]: # ignore warmup runsAnd I add the following fileLib/test/test_noop.py:import unittestclass NoopTests(unittest.TestCase): def test_noop(self): passregrtest detects a "leak":$ ./python -m test -R 3:3 test_noopRun tests sequentially0:00:00 load avg: 0.55 [1/1] test_noopbeginning 6 repetitions123456......test_noop leaked [0, 1, 0] memory blocks, sum=1test_noop failed== Tests result: FAILURE ==1 test failed: test_noopTotal duration: 113 msTests result: FAILUREThe issue comes from this look inLib/test/libregrtest/refleak.py: for i in range(repcount): indirect_test() alloc_after, rc_after, fd_after = dash_R_cleanup(fs, ps, pic, zdc, abcs) print('.', end='', file=sys.stderr, flush=True) if i >= nwarmup: rc_deltas[i] = get_pooled_int(rc_after - rc_before) alloc_deltas[i] = get_pooled_int(alloc_after - alloc_before) fd_deltas[i] = get_pooled_int(fd_after - fd_before) alloc_before = alloc_after rc_before = rc_after fd_before = fd_afterBecause of "if i >= nwarmup:", get_pooled_int() isn't call during "warmup", whereas the purpose of the warmup is to warmup everything.Maybe get_pooled_int() allocates one frame object and keeps it alive in its "zombi frame". Maybe something else is allocated and kept alive.Anything, removing "if i >= nwarmup:" to always compute deltas fix this specific issue.AttachedPR 12744 fix this bug.
msg339738 -(view)	Author: STINNER Victor (vstinner)*	Date: 2019-04-09 12:23
New changeset5aaac94eeb44697e92b0951385cd557bc27e0f6a by Victor Stinner in branch 'master':bpo-36560: Fix reference leak hunting in regrtest (GH-12744)https://github.com/python/cpython/commit/5aaac94eeb44697e92b0951385cd557bc27e0f6a
msg339768 -(view)	Author: STINNER Victor (vstinner)*	Date: 2019-04-09 16:01
New changeset9c14061a2c2df9a9b84d0aab190a50c24a0d52f4 by Victor Stinner in branch '2.7':bpo-36560: Fix reference leak hunting in regrtest (GH-12744) (GH-12745)https://github.com/python/cpython/commit/9c14061a2c2df9a9b84d0aab190a50c24a0d52f4
msg339772 -(view)	Author: STINNER Victor (vstinner)*	Date: 2019-04-09 16:26
New changesetbb4447897a5f141eecf42987a1191a3330c5d7ed by Victor Stinner in branch 'master':bpo-36560: regrtest: don't collect the GC twice (GH-12747)https://github.com/python/cpython/commit/bb4447897a5f141eecf42987a1191a3330c5d7ed
msg339777 -(view)	Author: STINNER Victor (vstinner)*	Date: 2019-04-09 16:43
> _abc was written for optimization. Your PR makes it using slower Python code. Could you please test how much this hits the performance of abc?I mostly wrotePR 12743 to debug this issue. Please see the discussion directly on the PR ;-)
msg339781 -(view)	Author: STINNER Victor (vstinner)*	Date: 2019-04-09 16:56
New changeset86f0354fcb815312295b923c55e39364d85d0388 by Victor Stinner in branch '3.7':[3.7]bpo-36560: regrtest: don't collect the GC twice (GH-12747) (GH-12749)https://github.com/python/cpython/commit/86f0354fcb815312295b923c55e39364d85d0388
msg339965 -(view)	Author: STINNER Victor (vstinner)*	Date: 2019-04-11 11:21
Patch making check on memory block leaks stricter:diff --git a/Lib/test/libregrtest/refleak.py b/Lib/test/libregrtest/refleak.pyindex235d6bfd3a..dfadabdef6 100644--- a/Lib/test/libregrtest/refleak.py+++ b/Lib/test/libregrtest/refleak.py@@ -130,7 +130,7 @@ def dash_R(ns, the_module, test_name, test_func): failed = False for deltas, item_name, checker in [ (rc_deltas, 'references', check_rc_deltas),- (alloc_deltas, 'memory blocks', check_rc_deltas),+ (alloc_deltas, 'memory blocks', check_fd_deltas), (fd_deltas, 'file descriptors', check_fd_deltas) ]: # ignore warmup runsUsing this patch, at least the following tests fail:* test_asyncio* test_code* test_collections* test_contextlib* test_contextlib_async* test_ctypes* test_functools* test_multiprocessing_forkserver* test_multiprocessing_spawn* test_regrtest* test_statistics* test_typing* test_xml_etree_cI didn't analyze why yet. I guess that they are not real memory leaks, but more minor issue in the code checking for memory leaks. Sadly, it seems like such small glitch can cause a whole Refleak buildbot worker to fail :-(
msg340386 -(view)	Author: STINNER Victor (vstinner)*	Date: 2019-04-17 10:16
Another example:AMD64 Fedora Rawhide Refleaks 3.6https://buildbot.python.org/all/#/builders/193/builds/5test_contextlib leaked [8, 1, 1] memory blocks, sum=10I cannot reproduce the issue on my ("idle") laptop. It sounds like a random failure (race condition).
msg353071 -(view)	Author: STINNER Victor (vstinner)*	Date: 2019-09-24 10:45
Recent fail on AMD64 Fedora Rawhide Refleaks 3.x:https://buildbot.python.org/all/#builders/189/builds/162test_functools leaked [1, 2, 1] memory blocks, sum=4
msg371286 -(view)	Author: STINNER Victor (vstinner)*	Date: 2020-06-11 14:23
AMD64 Fedora Rawhide Refleaks 3.8:https://buildbot.python.org/all/#/builders/79/builds/196test_functools leaked [1, 1, 1] memory blocks, sum=3
msg394485 -(view)	Author: STINNER Victor (vstinner)*	Date: 2021-05-26 22:40
I didn't see this failure recently. I close the issue.
msg400739 -(view)	Author: STINNER Victor (vstinner)*	Date: 2021-08-31 14:53
Recent failure on AMD64 Fedora Rawhide Refleaks 3.x:https://buildbot.python.org/all/#/builders/565/builds/131test_typing leaked [1, 1, 1] memory blocks, sum=3(...)0:46:40 load avg: 0.00 Re-running test_typing in verbose modetest_typing leaked [1, 1, 1] memory blocks, sum=3I don't reopen the issue since I failed to reproduce it.
msg400742 -(view)	Author: STINNER Victor (vstinner)*	Date: 2021-08-31 15:05
I can see the test_typing issue with this patch:diff --git a/Lib/test/libregrtest/refleak.py b/Lib/test/libregrtest/refleak.pyindexb94826a5da..49e5f03414 100644--- a/Lib/test/libregrtest/refleak.py+++ b/Lib/test/libregrtest/refleak.py@@ -124,7 +124,7 @@ def check_rc_deltas(deltas): # # [5, 5, 6] # [10, 1, 1]- return all(delta >= 1 for delta in deltas)+ return any(delta >= 1 for delta in deltas) def check_fd_deltas(deltas): return any(deltas)Examples:$ ./python -m test test_typing -R 3:20(...)beginning 23 repetitions12345678901234567890123.......................test_typing leaked [1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] memory blocks, sum=3test_typing failed (reference leak)$ ./python -m test test_typing -R 1:20WARNING: Running tests with --huntrleaks/-R and less than 3 warmup repetitions can give false positives!(...)beginning 21 repetitions123456789012345678901.....................test_typing leaked [6, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] references, sum=6test_typing leaked [4, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] memory blocks, sum=7test_typing failed (reference leak)(...)
msg400754 -(view)	Author: Ken Jin (kj)*	Date: 2021-08-31 16:05
@Victor, with your patch applied, for test_typing (locally, Windows x64):$ ./python -m test test_typing -R 3:20Different result on every run (race condition maybe?)$ ./python -m test test_typing -R 1:20Same result as yours on every run.

History
Date	User	Action	Args
2022-04-11 14:59:13	admin	set	github: 80741
2021-08-31 16:05:56	kj	set	nosy: +kj messages: +msg400754
2021-08-31 15:05:41	vstinner	set	messages: +msg400742
2021-08-31 14:53:03	vstinner	set	messages: +msg400739
2021-05-26 22:40:21	vstinner	set	status: open -> closed resolution: out of date messages: +msg394485 stage: patch review -> resolved
2020-06-11 14:23:37	vstinner	set	messages: +msg371286
2019-09-24 10:45:21	vstinner	set	messages: +msg353071
2019-04-17 10:16:39	vstinner	set	messages: +msg340386
2019-04-11 11:21:06	vstinner	set	messages: +msg339965
2019-04-09 16:56:07	vstinner	set	messages: +msg339781
2019-04-09 16:43:06	vstinner	set	messages: +msg339777
2019-04-09 16:31:39	vstinner	set	pull_requests: +pull_request12676
2019-04-09 16:26:21	vstinner	set	messages: +msg339772
2019-04-09 16:01:27	vstinner	set	pull_requests: +pull_request12674
2019-04-09 16:01:19	vstinner	set	messages: +msg339768
2019-04-09 12:42:33	vstinner	set	pull_requests: +pull_request12668
2019-04-09 12:23:51	vstinner	set	messages: +msg339738
2019-04-09 12:05:09	vstinner	set	messages: +msg339735
2019-04-09 12:01:30	vstinner	set	pull_requests: +pull_request12667
2019-04-09 11:28:48	serhiy.storchaka	set	nosy: +methane,levkivskyi messages: +msg339733
2019-04-09 11:22:53	vstinner	set	messages: +msg339732
2019-04-09 11:20:13	vstinner	set	messages: +msg339731
2019-04-09 10:55:15	vstinner	set	keywords: +patch stage: patch review pull_requests: +pull_request12666
2019-04-08 17:23:43	serhiy.storchaka	set	nosy: +serhiy.storchaka
2019-04-08 17:14:54	vstinner	set	messages: +msg339663
2019-04-08 17:11:59	vstinner	set	files: +test_functools2.py messages: +msg339662
2019-04-08 15:33:24	vstinner	set	files: +test_functools.py messages: +msg339651
2019-04-08 14:22:16	vstinner	create

Supported byThe Python Software Foundation,
Powered byRoundup