Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork33.7k
gh-132657: Add maybe_enable_deferred_ref_count().#142843
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
base:main
Are you sure you want to change the base?
Conversation
If we are specializing to LOAD_GLOBAL_MODULE, set deferred referencecounting for the value, if it meets criteria. For now, it's only donefor frozenset objects.
Use it for LOAD_ATTR_MODULE in addition to LOAD_GLOBAL_MODULE. Don'tenable deferred ref counts if the object is owned by the current thread.Specialized bytecode is per-thread so this works. Enable forfrozensets, tuples and type objects.
colesbury left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I think this makes sense and is worth doing. Let's also add a scaling test to ftscalingbench.py
Python/specialize.c Outdated
| PyObject*op; | ||
| if (PyDict_GetItemRef(dict,name,&op)!=1) { | ||
| return; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Let's combine this lookup with the earlier_PyDict_LookupIndex._PyDict_LookupIndex already gets the value internally and throws it away.
Python/specialize.c Outdated
| if (_Py_IsOwnedByCurrentThread(op)|| | ||
| !PyType_IS_GC(Py_TYPE(op))|| | ||
| _PyObject_HasDeferredRefcount(op)) { | ||
| Py_DECREF(op); | ||
| return; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
The!PyType_IS_GC and_PyObject_HasDeferredRefcount checks aren't really necessary becausePyUnstable_Object_EnableDeferredRefcount handles those cases internally.
Python/specialize.c Outdated
| if (PyFrozenSet_Check(op)||PyTuple_Check(op)||PyType_Check(op)) { | ||
| PyUnstable_Object_EnableDeferredRefcount(op); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I think we should do this for all objects, not just a few types. Otherwise, I think we will keep run into scaling bottlenecks with global variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Hm... That's going to push a lot of objects to the GC for deallocation in programs that do a lot of work at the global level (but do have a few functions that are called enough that use those globals.) I'm thinking of naively written "scripts". Then again, it would only do that for objects that participate in GC (i.e. not strings or integers), so it'sprobably fine. I mean, nobody's too surprised when things aren't deallocated quite as quickly as they expect... right?
Yeah, I think this should be fine for main. I definitely wouldn't backport it to 3.14.
colesburyDec 17, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Definitely agree regarding not backporting to 3.14.
I think the above check that excludes owned by the current thread should help with most scripts -- it'll only affect global variables accessed by multiple threads.
Another thing that may help is that I think thePyUnstable_Object_EnableDeferredRefcount call only happens at specialization time, not every time a variable is updated or accessed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Yeah, I'm a bit worried about increased memory usage. Since it's only for objects shared between threads and only for hot LOAD_GLOBAL/LOAD_ATTR instructions, maybe it's okay. Perhaps we should consider changinggc_should_collect() to check not only the young object count but also the process memory size increase as well. The condition withcount < gcstate->long_lived_total / 4 could be done with an "or" condition with the process size check.
Also, avoid PyDict_GetItemRef() call by returning the value when theindex is looked up (as previously discarded).
This is taken from the PRpythonGH-132658.
Uh oh!
There was an error while loading.Please reload this page.
If we are specializing to LOAD_GLOBAL_MODULE, set deferred reference counting for the value, if it meets criteria. For now, it's only done for frozenset objects.