Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork33.3k
gh-128807: Add marking phase for free-threaded cyclic GC#128808
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
ghost commentedJan 14, 2025 • edited by ghost
Loading Uh oh!
There was an error while loading.Please reload this page.
edited by ghost
Uh oh!
There was an error while loading.Please reload this page.
If the object is already marked as reachable, we shouldn't traverse itagain.
These are not strictly needed, simplify PR.
More code cleanup (better names, comments, simplify error handling).Fix bug in that "alive" bit must be checked in mark_alive_stack_push()to avoid visiting already visited objects.
Make sure we still do this optimization. There is also a unit test thatchecks for this.
Reduces duplicate code.
ccc5a11 to680f80fCompareThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Looks great! A few comments below
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
Use gc_abort_mark_alive() helper in case of OOM. In addition tofreeing the stack, we need to ensure that no object has the alive bitset on it. This also adding missing error handling in the case thatpropagate_alive_bits() fails.
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
LGTM
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
080f444 intopython:mainUh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
This is conceptually similar to the phase that was added to the non-free-threaded GC. Start with a set of known root objects, like sysdict and mark all objects reachable from those (revealed by the tp_traverse method) as "alive". We know anything marked alive cannot be garbage and can be excluded from the regular cyclic GC process. For most programs, this saves a moderate amount of computation since the marking pass is relatively cheaper per object.
If
gc.freeze()is used, it's unlikely that this marking phase will be a win since it's expected that the majority of objects will be frozen. Disable the marking phase if freeze is used.Seegh-126491 for the non-free-threaded version of this technique.
pyperformanceresults vs merge base. I suspect the slowdown on some benchmarks is not real. For example, regex_v8 should not be slower.
Here are the pyperformance results for a bare-metal AMD Ryzen machine. It does not show a slowdown on regex_v8, for example.
To better show the expected improvement, I ran a "sphinx build" benchmark, like ingh-124567. Results are: