Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork32.4k
gh-135552: Make the GC clear weakrefs later.#136189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
base:main
Are you sure you want to change the base?
Conversation
Clear the weakrefs to unreachable objects after finalizers are called.
I can confirm this PR fixes thegh-132413 issue as well. |
I think this fixes (or mostly fixes)gh-91636 as well. |
bedevere-bot commentedJul 1, 2025
🤖 New build scheduled with the buildbot fleet by@nascheme for commit12f0b5c 🤖 Results will be shown at: https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F136189%2Fmerge If you want to schedule another build, you need to add the🔨 test-with-buildbots label again. |
This introduces refleaks, it seems. One of the leaking tests: |
This is the smallest leaking example I could make so far. Something with
|
neonene commentedJul 2, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Other leaking examples (on Windows): 1. test_logging:importloggingimportlogging.configimportlogging.handlersfrommultiprocessingimportQueue,ManagerclassConfigDictTest(unittest.TestCase):deftest_multiprocessing_queues_XXX(self):config= {'version':1,'handlers' : {'spam' : {'class':'logging.handlers.QueueHandler','queue':Manager().Queue() ,# Leak# 'queue': Manager().JoinableQueue() # Leak# 'queue': Queue(), # No leak }, },'root': {'handlers': ['spam']} }logger=logging.getLogger()logging.config.dictConfig(config)whilelogger.handlers:h=logger.handlers[0]logger.removeHandler(h)h.close() 2. test_interpreters.test_api:importcontextlibimportthreadingimporttypesfromconcurrentimportinterpretersdeffunc():raiseException('spam!')@contextlib.contextmanagerdefcaptured_thread_exception():ctx=types.SimpleNamespace(caught=None)defexcepthook(args):ctx.caught=argsorig_excepthook=threading.excepthookthreading.excepthook=excepthooktry:yieldctxfinally:threading.excepthook=orig_excepthookclassTestInterpreterCall(unittest.TestCase):deftest_call_in_thread_XXX(self):interp=interpreters.create()call= (interp._call,interp.call)[1]# 0: No leak, 1: Leakwithcaptured_thread_exception()as_:t=threading.Thread(target=call,args=(interp,func, (), {}))t.start()t.join() UPDATE: importweakref,_interpreterswd=weakref.WeakValueDictionary()classTestInterpreterCall(unittest.TestCase):deftest_call_in_thread_XXX(self):id=_interpreters.create()wd[id]=type("", (), {})_interpreters.destroy(id) |
The majority (maybe all) of these leaks are caused by the |
This avoids breaking tests while fixing the issue with tp_subclasses. Inthe long term, it would be better to defer the clear of all weakrefs,not just the ones referring to classes. However, that is a moredistruptive change and would seem to have a higher chance of breakinguser code. So, it would not be something to do in a bugfix release.
Nope, that doesn't fix all the leaks. And having to explicitly clean the weakrefs from the WeakValueDictionary really shouldn't be needed, I think. The For the purposes of having a fix that we can backport (should probably be backported to all maintained Python versions), a less disruptive fix would be better. To that end, I've changed this PR to only defer clearing weakrefs to class objects. That fixes the |
bedevere-bot commentedJul 3, 2025
🤖 New build scheduled with the buildbot fleet by@nascheme for commit2f3daba 🤖 Results will be shown at: https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F136189%2Fmerge If you want to schedule another build, you need to add the🔨 test-with-buildbots label again. |
We need to clear those before executing the callback. Since thisensures they can't run a second time, we don't need_PyGC_SET_FINALIZED(). Revise comments.
Ah, the I revised this PR to be something that is potentially suitable for backporting. To minimize the behaviour change, I'm only deferring the clear of weakrefs that refer to types (in order to allow tp_subclasses to work) or with callbacks. I still have an extra clearing pass that gets done after the finalizers are called. That avoids bugs like#91636. If this gets merged, I think we should consider deferring clearing all weakrefs (without callbacks) until after finalizers are executed. I think that's more correct since it gives the finalizers a better opportunity to do their job. |
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
* *not* be cleared so that caches based on the type version are correctly | ||
* invalidated (see GH-135552 as a bug caused by this). | ||
*/ | ||
clear_weakrefs(&final_unreachable); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Ough! Nice trick withfinal_unreachable
!
This is a bit trickly because the wrlist can be changing as weiterate over it.
IIUC, there maybe cases when IIUC, becausePEP-442 landed we can remove weakrefs here Line 105 in0c3e3da
and there cpython/Lib/importlib/_bootstrap.py Line 65 in0c3e3da
and use a strong reference that will outlive the KeyedRefs in the dictionaries. I replaced the weakref with a strong reference in the _bootstrap.py and checked with refleak buildbot (#136345). No leaks were found. @nascheme WDYT? |
Weakrefwith a callback needs to be cleared early (as-is), accounting for the 3rd-party applications that have similar logic to |
I'm not against this. But IIUC (feel free to correct me) we can face a situation in which weakref to the |
The experiment#136345 seems currently invalid, which is based on main and not effective on this PR where I omitted calling |
If we use a strong reference, then it doesn't matter if we omit |
Before discussing, have you checked on this PR? I can see the leaks. |
sergey-miryanov commentedJul 7, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Results
|
Let's move on to#136345. (The result above looks long here.) |
GH-136401 was merged. I think it should be backported as well since it should be pretty low risk. The main branch has been merged into this PR. I've updated the PR to defer clearing all weakrefs without callbacks (not just weakrefs to classes). This seems more correct to me (rather than special casing weakrefs to classes) and seems the best way to fixGH-132413. I thinkGH-132413 is actually more likely to be encountered in real code. For example, if you have some logging function in a |
Uh oh!
There was an error while loading.Please reload this page.
Looks good to me. Thanks for detailed comments! |
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
@pablogsal are you planning to take another look at this? If so, no problem, there is no great rush. If not, I think I'll do another review pass myself and then merge. |
pablogsal commentedJul 14, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Yeah i have it pending for this week (sorry fixing a lot of bugs for 3.14 currently) but if you feel I am taking too much time I am happy if you go ahead I don't want to block it on this |
@nascheme I see you updated comment for Would you mind also removing another one from Lines 572 to 576 in9363703
|
Do we want to merge this PR w/o tests on subclasses destruction from#135552? |
Sergey has tests in another PR and I'll merge that right after I merge this one. |
self.assertIs(wr(), None) | ||
# This used to be None because weakrefs were cleared before | ||
# calling finalizers. Now they are cleared after. | ||
self.assertIsNot(wr(), None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
self.assertIsNot(wr(),None) | |
self.assertIsNotNone(wr()) |
Uh oh!
There was an error while loading.Please reload this page.
Fix two bugs related to the interaction of weakrefs and the garbage
collector. The weakrefs in the
tp_subclasses
dictionary are needed inorder to correctly invalidate type caches (for example, by calling
PyType_Modified()
). Clearing weakrefs before calling finalizers causesthe caches to not be correctly invalidated. That can cause crashes since the
caches can refer to invalid objects. This is fixed by deferring the
clearing of weakrefs to classes and without callbacks until after finalizers
are executed.
The second bug is caused by weakrefs created while running finalizers. Those
weakrefs can be outside the set of unreachable garbage and therefore survive
the
delete_garbage()
phase (wheretp_clear()
is called on objects).Those weakrefs can expose to Python-level code objects that have had
tp_clear()
called on them. SeeGH-91636 as an example of this kind ofbug. This is fixed be clearing all weakrefs to unreachable objects after
finalizers have been executed.
datetime.timedelta
(possibly from datetime's Cdelta_new
) #132413