FIX: do not try to help CPython with garbage collection #23712

Merged
greglucas merged 1 commit into matplotlib:main from tacaswell:fix_no_gc_collect
Aug 25, 2022

Conversation

@tacaswell (Member) commented Aug 23, 2022

Matplotlib has a large number of circular references (between figure and
manager, between axes and figure, axes and artist, figure and canvas, and ...)
so when the user drops their last reference to a `Figure` (and clears it from
pyplot's state), the objects will not be deleted immediately.
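
As a minimal sketch of why this happens (the `Node` class is a toy stand-in, not Matplotlib code, and the behavior shown assumes CPython's reference counting plus cycle collector): two objects that point at each other stay alive after the last outside reference is dropped, until the cycle detector runs.

```python
import gc
import weakref


class Node:
    """Toy stand-in for objects that reference each other (not Matplotlib code)."""

    def __init__(self):
        self.other = None


def make_cycle():
    a, b = Node(), Node()
    a.other, b.other = b, a   # a <-> b reference cycle
    return weakref.ref(a)     # a weak reference does not keep `a` alive


gc.disable()                  # keep the automatic collector out of the demo
try:
    ref = make_cycle()        # the only strong references are inside the cycle
    alive_after_drop = ref() is not None      # refcounts never reach zero
    gc.collect()              # the cycle detector finds and frees the pair
    alive_after_collect = ref() is not None
finally:
    gc.enable()

print(alive_after_drop, alive_after_collect)
```

On CPython the pair should still be alive after the references are dropped and gone after the explicit collection.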

To account for this we have long had a `gc.collect()` in the close logic in
order to promptly clean up after ourselves (this goes back to
e34a333, the "reorganize code" commit in 2004,
which is the end of history for much of the code).

However, unconditionally calling `gc.collect` can be a major performance
issue (see #3044 and #3045) because if there are a
large number of long-lived user objects Python will spend a lot of time
checking objects that are not going away and are never going away.

Instead of doing a full collection we switched to clearing out the lowest two
generations. However, this both fails to do what we want (as most of our objects
will actually survive) and, because it clears out the first generation, opens us
up to unbounded memory usage.

In cases with a very tight loop between creating the figure and destroying
it (e.g. `plt.figure(); plt.close()`) the first generation will never grow
large enough for Python to consider running the collection on the higher
generations. This will lead to unbounded memory usage as the long-lived
objects are never re-considered to look for reference cycles and hence are
never deleted because their reference counts will never go to zero.
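
A rough way to see the generational behavior described above, again with a toy `Node` class rather than Matplotlib objects. The sketch lets a cycle survive full collections while it is still referenced, drops the references, and then compares a partial `gc.collect(1)` with a full `gc.collect()`. Whether the pair survives the partial collection depends on the CPython version's collector, so that result is printed rather than assumed.

```python
import gc
import weakref


class Node:
    """Toy stand-in for a long-lived object that sits in a reference cycle."""

    def __init__(self):
        self.other = None


a, b = Node(), Node()
a.other, b.other = b, a
ref = weakref.ref(a)

# Survive collections while still referenced, so the pair is moved out of
# the youngest generation.
gc.collect()
gc.collect()

del a, b                      # only the cycle keeps the pair alive now
gc.collect(1)                 # partial collection of the younger generations
survived_partial = ref() is not None   # version-dependent
gc.collect()                  # full collection
survived_full = ref() is not None

print("thresholds:", gc.get_threshold())
print("alive after gc.collect(1):", survived_partial)
print("alive after gc.collect():", survived_full)
```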

closes #23701

I'm not sure how to test this; I do not want to add a test that might exhaust memory. There might be something that can be done using gc's logging / debugging / callback hooks.
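
One possible shape for such a check, offered only as a sketch: the `gc.callbacks` list is part of the standard library, but the loop body below uses a toy `Node` cycle in place of `plt.figure(); plt.close()`, and what a real test would assert about the counts is left open.

```python
import gc
from collections import Counter

runs_per_generation = Counter()


def count_collections(phase, info):
    # gc calls each callback with phase "start" or "stop"; count finished runs.
    if phase == "stop":
        runs_per_generation[info["generation"]] += 1


class Node:
    """Toy stand-in for a figure and its cyclic references."""

    def __init__(self):
        self.other = None


gc.callbacks.append(count_collections)
try:
    for _ in range(10_000):
        a, b = Node(), Node()     # stand-in for "create a figure"
        a.other, b.other = b, a
        del a, b                  # stand-in for "close it"
    gc.collect()                  # one explicit full collection at the end
finally:
    gc.callbacks.remove(count_collections)

print(dict(runs_per_generation))
```

The printed mapping shows how many collections of each generation ran during the loop, which is the kind of signal a test could inspect without trying to exhaust memory.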

tacaswell added this to the v3.5.4 milestone Aug 23, 2022
tacaswell marked this pull request as ready for review August 23, 2022 01:40
@nschloe (Contributor)

This might also fix #22448.

@anntzer (Contributor)

Not to completely nerd-snipe this issue, but I had wondered some time ago (basically when thinking about this `gc.collect` call, although I certainly didn't go through all the analysis you did!) whether CPython could have an API like `gc.try_collect_at_frame_exit(obj)` which says: at the end of this frame, after all decrefs for frame locals have been done, check whether `obj` could be gc'ed (i.e. whether it is only kept alive by an isolated refcycle, and if so, gc that cycle).

@tacaswell (Member, Author)

> whether CPython could have an API like...

I'm also not sure. I do not think there is currently enough information to do that but am not clear on how much more you would have to track to get there. My instinct is that this would add more overhead in cases where it is not used than you would ever get back in the cases when it is used.

I think a Python exposed function for "run the full collection, but only if you would have otherwise" is what we really want here.

The other thing that just connected in my brain is that dictionaries (which I assume includes instance dictionaries) are only swept in the oldest generation (which is why our objects all survived the `gc.collect(1)`).

I had a thought about being aggressive about calling `fig.cla()` when we destroy the manager, but I think that would break the workflow of

    fig, ax = plt.subplots()
    ax.plot(...)
    plt.show(block=True)
    fig.savefig(...)

Similarly, any way I can think of putting in weak references is going to break someone terribly.

tacaswell modified the milestones: v3.5.4, v3.6.0 Aug 25, 2022
tacaswell added the "Release critical" label (for bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions) Aug 25, 2022
@anntzer (Contributor)

Perhaps we could call `cla()` (and recursively detach artists from the parent figure) when `plt.close(fig)` is called, but let pressing-on-the-X-to-close just unregister the manager and not bother to call `cla()`? I think a user that programmatically closes the figure is more likely to be running a loop where the memleak may matter, whereas interactive users are more likely to be OK with a delayed GC?

@tacaswell (Member, Author)

That is reasonable. It will be a bunch of work, but something like

  • 3.7
    • add an "and recursively destroy" flag to `plt.close`
    • in `plt.close`, if the flag is not set by the user, mark the `Figure` with a `fig._zombie` flag
    • wrap all of the figure methods to warn if the zombie flag is set
  • 3.8 flip the flag to default to True and remove the zombie warnings

might be a deprecation path to that?
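
To make the 3.7 step concrete, here is a small sketch of the "zombie" idea using a toy class. Everything in it is hypothetical: `ToyFigure`, `toy_close`, the `destroy` parameter name, and the `warn_if_zombie` decorator are illustrations only and are not Matplotlib API. Only the `_zombie` attribute name comes from the list above.

```python
import functools
import warnings


def warn_if_zombie(method):
    """Hypothetical wrapper: warn when a method is used on a closed figure."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        if getattr(self, "_zombie", False):
            warnings.warn(
                f"{method.__name__}() called on a closed figure",
                DeprecationWarning,
                stacklevel=2,
            )
        return method(self, *args, **kwargs)
    return wrapper


class ToyFigure:
    """Hypothetical stand-in; not matplotlib.figure.Figure."""
    _zombie = False

    @warn_if_zombie
    def savefig(self, name):
        return f"saved {name}"


def toy_close(fig, destroy=False):
    """Hypothetical close(): without the flag, only mark the figure."""
    if not destroy:
        fig._zombie = True
    # A real implementation would recursively tear the figure down here
    # when destroy=True; that part is omitted from this sketch.


fig = ToyFigure()
toy_close(fig)
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = fig.savefig("out.png")   # still works, but emits a warning

print(result, [str(w.message) for w in caught])
```

The method keeps working during the deprecation period, so the show-then-save workflow mentioned earlier would warn rather than break.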

That said, it leaves a weird inconsistency between doing `plt.close()` and hitting the 'x' if you have done `plt.ion()` and are working interactively.

@QuLogic (Member)

Doesn't the inline backend auto-close figures? I'm pretty sure many people use `plt.savefig` after that, so we can't really break that.

@greglucas (Contributor) left a comment

Probably target 3.6 on this and not 3.5.4.

greglucas merged commit 87b801b into matplotlib:main Aug 25, 2022
meeseeksmachine pushed a commit to meeseeksmachine/matplotlib that referenced this pull request Aug 25, 2022
tacaswell deleted the fix_no_gc_collect branch August 25, 2022 19:41
tacaswell added a commit that referenced this pull request Aug 25, 2022
…712-on-v3.6.x Backport PR #23712 on branch v3.6.x (FIX: do not try to help CPython with garbage collection)
QuLogic mentioned this pull request Sep 9, 2022
Reviewers

efiring approved these changes

greglucas approved these changes

QuLogic: awaiting requested review

Assignees
No one assigned
Labels
Performance, Release critical (for bugs that make the library unusable (segfaults, incorrect plots, etc) and major regressions)
Projects
None yet
Milestone
v3.6.0
Development

Successfully merging this pull request may close these issues.

[Bug]: plt.figure(), plt.close() leaks memory
6 participants
@tacaswell @nschloe @anntzer @QuLogic @efiring @greglucas
