FIX: do not try to help CPython with garbage collection #23712
Matplotlib has a large number of circular references (between figure and manager, between axes and figure, axes and artist, figure and canvas, and ...), so when the user drops their last reference to a `Figure` (and clears it from pyplot's state), the objects will not be deleted immediately.

To account for this we have long had a `gc.collect()` in the close logic in order to promptly clean up after ourselves. This goes back to e34a333, the "reorganize code" commit in 2004, which is the end of history for much of the code.

However, unconditionally calling `gc.collect` can be a major performance issue (see matplotlib#3044 and matplotlib#3045), because if there are a large number of long-lived user objects, Python will spend a lot of time checking objects that are never going away.

Instead of doing a full collection we switched to clearing out the lowest two generations. However, this both fails to do what we want (as most of our objects will actually survive) and, because it clears out the first generation, opens us up to unbounded memory usage.

In cases with a very tight loop between creating the figure and destroying it (e.g. `plt.figure(); plt.close()`), the first generation will never grow large enough for Python to consider running the collection on the higher generations. This leads to unbounded memory usage: the long-lived objects are never reconsidered when looking for reference cycles, and hence are never deleted, because their reference counts never go to zero.

closes matplotlib#23701
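To make the reference-cycle point concrete, here is a minimal sketch that does not use Matplotlib at all. `FakeFigure` and `FakeCanvas` are hypothetical stand-ins for the real classes, assumed only to hold references to each other the way the figure and canvas are described as doing above. It shows that dropping the last user reference does not free a cycle, and that only the cyclic collector does.

```python
import gc
import weakref


class FakeCanvas:
    """Hypothetical stand-in for a canvas that points back at its figure."""

    def __init__(self, figure):
        self.figure = figure  # back-reference that closes the cycle


class FakeFigure:
    """Hypothetical stand-in for a figure that owns a canvas."""

    def __init__(self):
        self.canvas = FakeCanvas(self)


def make_and_drop():
    # The only strong "user" reference lives in this local variable.
    fig = FakeFigure()
    return weakref.ref(fig)


gc.disable()  # keep the automatic collector from running mid-demo
try:
    ref = make_and_drop()
    # The cycle keeps both reference counts above zero, so nothing is freed.
    alive_after_drop = ref() is not None
    gc.collect()  # a full collection finds and frees the cycle
    alive_after_collect = ref() is not None
finally:
    gc.enable()

print(alive_after_drop, alive_after_collect)
```

In CPython the first flag is true and the second is false, which is the situation the close logic was trying to shortcut with an explicit `gc.collect()`.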
This might also fix #22448.
Not to completely nerd-snipe this issue, but I had wondered some time ago (basically when thinking about this gc.collect call, although I certainly didn't go through all the analysis you did!) whether CPython could have an API like
I'm also not sure. I do not think there is currently enough information to do that but am not clear on how much more you would have to track to get there. My instinct is that this would add more overhead in cases where it is not used than you would ever get back in the cases when it is used. I think a Python exposed function for "run the full collection, but only if you would have otherwise" is what we really want here.

The other thing that just connected in my brain is that dictionaries (which I assume includes instance dictionaries) are only swept in the oldest generation (which is why our objects all survived the I had a thought about being aggressive about calling `fig, ax = plt.subplots()` `ax.plot(...)` `plt.show(block=True)` `fig.savefig(...)`

Similarly, any way I can think of putting in weak references is going to break someone terribly.
Perhaps we could call
That is reasonable. It will be a bunch of work, but something like
might be a deprecation path to that? That said, it leaves a weird inconsistency between doing
Doesn't the inline backend auto-close figures? I'm pretty sure many people use
Probably target 3.6 on this and not 3.5.4.
…712-on-v3.6.x
Backport PR #23712 on branch v3.6.x (FIX: do not try to help CPython with garbage collection)
I'm not sure how to test this; I do not want to put in a test that might exhaust memory. There might be something that can be done using gc's logging / debugging / callback hooks.
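As a rough illustration of the callback idea, here is a sketch built on the standard library's `gc.callbacks` hook, which CPython invokes with a phase (`"start"` or `"stop"`) and an info dict containing `generation`, `collected` and `uncollectable`. The `Node` class and the `record_collection` helper are hypothetical names made up for this example. This is not a proposed test for this PR, only a way to observe which generations get collected without having to exhaust memory.

```python
import gc

collections_seen = []  # (generation, objects collected) per finished pass


def record_collection(phase, info):
    # Only record once a pass has finished and its counts are known.
    if phase == "stop":
        collections_seen.append((info["generation"], info["collected"]))


class Node:
    """Hypothetical object that forms a one-element reference cycle."""

    def __init__(self):
        self.me = self


gc.callbacks.append(record_collection)
try:
    for _ in range(100):
        Node()  # immediately unreachable, but kept alive by its own cycle
    gc.collect()  # explicit full collection
finally:
    gc.callbacks.remove(record_collection)

total_collected = sum(count for _, count in collections_seen)
saw_full_collection = any(gen == 2 for gen, _ in collections_seen)
print(collections_seen)
```

A test along these lines could assert that a full collection still happens during a create/close loop, though whether that is robust across CPython versions is an open question.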