Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork32k
GH-130415: Improve the JIT's unneeded uop removal pass#132333
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Ithink this should be fine? optimize_pop_top_again should reach a fixpoint (there are only a finite number of last we can change to _NOP).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I have a few questions.
The overall design looks good.
@@ -1283,7 +1283,7 @@ class Bar: | |||
load_attr_top = opnames.index("_LOAD_ATTR_NONDESCRIPTOR_WITH_VALUES", 0, call) | |||
load_attr_bottom = opnames.index("_LOAD_ATTR_NONDESCRIPTOR_WITH_VALUES", call) | |||
self.assertEqual(opnames[:load_attr_top].count("_GUARD_TYPE_VERSION"), 1) | |||
self.assertEqual(opnames[call:load_attr_bottom].count("_CHECK_VALIDITY"),1) | |||
self.assertEqual(opnames[call:load_attr_bottom].count("_CHECK_VALIDITY"),2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Why do we have two_CHECK_VALIDITY
s now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
There used to be one validity check, after thesetattr
call. This adds an additional one after the_POP_TOP
following the call, sincein theory the function could return an object with a destructor that invalidates the JIT code when it's popped.
Because of the old structure of this optimization pass, we weren't actually treating_POP_TOP
as escaping, even though it is (this is handled in thedefault
case, which we fall through to now).
Python/optimizer_analysis.c Outdated
switch (last->opcode) { | ||
case _POP_TWO_LOAD_CONST_INLINE_BORROW: | ||
last->opcode = _POP_TOP; | ||
goto optimize_pop_top_again; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Do we want to optimize again here?
If the previous instruction is_POP_TOP
I see no further optimization to apply here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Nice catch. We might want to combine smaller ops into larger ones later (for example, turn_POP_TOP; _POP_TOP_LOAD_CONST_INLINE_BORROW
into_NOP; _POP_TWO_LOAD_CONST_INLINE_BORROW
), which is where I think I was going with this. But that can come later.
last->opcode = _NOP; | ||
buffer[pc].opcode = _NOP; | ||
} | ||
if (last->opcode == _REPLACE_WITH_TRUE) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Are we not optimizing this any more? I would have thought it would go in the same case as_POP_TOP_LOAD_CONST_INLINE_BORROW
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Nice catch. Though these should probably just be turned into_POP_TOP_LOAD_CONST_INLINE_BORROW
by the optimizer instead.
I'll leave it this way and open up another PR for that. Having the hardcoded constant makes things more difficult, since we'd need to start writing operands instead of operating purely on opcodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Actually, I just added that change as part of this PR.
4f7f72c
intopython:mainUh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
This improves our ability to remove unused constant and local variable loads during the
remove_unneeded_uops
pass in the JIT's optimizer.It now handles all combinations of:
_COPY
_LOAD_CONST_INLINE
_LOAD_CONST_INLINE_BORROW
_LOAD_FAST
_LOAD_FAST_BORROW
_LOAD_SMALL_INT
...followed by...
_POP_TOP
_POP_TOP_LOAD_CONST_INLINE
_POP_TOP_LOAD_CONST_INLINE_BORROW
_POP_TWO_LOAD_CONST_INLINE_BORROW
(It also fixes an issue where
CHECK_VALIDITY
wasn't being added after these latter four instructions, due to how the oldswitch
-case
was structured.)According to thestats, this results in:
_LOAD_FAST
_POP_TOP_LOAD_CONST_INLINE_BORROW
_POP_TOP_LOAD_CONST_INLINE
@savannahostrowski: We should see even more impact once you start using
_POP_TWO_LOAD_CONST_INLINE_BORROW
in more places. I'm currently working on turning basically all cached class attribute/method loads into constants too, which should help here as well.