Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork34k
GH-118036: Fix a bug with CALL_STAT_INC#117933
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
We were under-counting calls in `_PyEvalFramePushAndInit`because the `CALL_STAT_INC` macro was redefined to a no-opfor the Tier 2 interpreter. The fix is a little convoluted.This ought to result in ~37% more "Frames pushed" reportedunder "Call stats". The new count is the correct one(I presume).
mdboom left a comment• edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I understand how this is broken and why this change fixes it, so I'm marking as approved, but I agree it's "weird" / more convoluted than it needs to be. Moving the order of functions inceval.c would result in less weirdness (probably just moving_PyEval_EvalFrameDefault to the bottom since that's where all the macro updates happen) but that would create a lot of churn.
Alternatively, what if you renameREAL_CALL_STAT_INC toCALL_STAT_INC_ALWAYS and then just useCALL_STAT_INC_ALWAYS directly from_PyEvalFramePushAndInit (which I think is the only call site impacted by this change). That would get rid of the weird dance of#undef and restoring justCALL_STAT_INC (but admittedly that would just replace it with another subtlety in_PyEvalFramePushAndInit).
Let's try a different fix instead.
This should fix the issue with CALL_STAT_INC in a cleaner way (even if the diff is much larger).
gvanrossum commentedApr 16, 2024 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
Here's another version, where I moved the interpreter loop (and a small assortment of related stuff) to the end of the file. I ran Pystats on a single loop of the Richards benchmark, and found that most stats are approximately the same (not completely) except that "Frames pushed" is about 10% larger, which indicates that the fix works. (The diff is much more annoying to review, but I promise I just moved the stuff from the original lines 606 - 1120 to the bottom of the file, except I had to move |
gvanrossum commentedApr 17, 2024
Benchmark says speed and memory unchanged. |
markshannon commentedApr 17, 2024
I think the correct fix for In executor_cases.c.h |
gvanrossum commentedApr 17, 2024
Okay, I'll try that next. |
I'm going to try yet another approach.
gvanrossum commentedApr 17, 2024
The proof will be in the pudding. I'll fire off two benchmark runs, with pystats, one plain, one with Tier 2. (The JIT pystats ought to be similar but I don't want to wait.) |
gvanrossum commentedApr 17, 2024
Benchmark using Tier 1 only shows 36.8% more frames pushed, which is very close to what I measured with the first version of this PR, so I think that suggests this fixes that issue. Everything else I looked at is basically unchanged, suggesting I'm not breaking anything. Still waiting for the Tier 2 benchmark, will update when I see those numbers. |
gvanrossum commentedApr 17, 2024
Benchmark with Tier 2 poses a bit of a mystery, at least the pystats diff. Go toCall stats and open the details box.
@markshannon,@mdboom,@brandtbucher -- could there be some kind of double counting going on? Or is this an expected result? The only change from main is now that wedon't undefine |
markshannon commentedApr 18, 2024 • edited by mdboom
Loading Uh oh!
There was an error while loading.Please reload this page.
edited by mdboom
Uh oh!
There was an error while loading.Please reload this page.
I think the change is correct, even though the stats are probably still wrong. |
Uh oh!
There was an error while loading.Please reload this page.
We were under-counting calls in
_PyEvalFramePushAndInitbecause theCALL_STAT_INCmacro was redefined to a no-op for the Tier 2 interpreter. The fix is a little convoluted (I had wanted to move the code around, but that would require moving something else around, and in the end I figured it was easier to tweak the macros@markshannon might disagree though?). This ought to result in ~37% more "Frames pushed" reported under "Call stats". The new count is the correct one (I presume).@mdboom can you review? This isone commit from my experiment about removing Tier 2 entirely (gh-117908).
To see the effect, look atthese pystat diffs.