Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork32k
gh-131798: JIT: SplitCALL_ISINSTANCE
into several uops#133339
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Thanks, I just see an opportunity for further cleanup!
Python/bytecodes.c Outdated
op(_GUARD_CALLABLE_ISINSTANCE_NULL, (callable,null, unused[oparg] --callable, null, unused[oparg])) { | ||
DEOPT_IF(!PyStackRef_IsNull(null)); | ||
} |
brandtbucherMay 3, 2025 • edited
Loading Uh oh!
There was an error while loading.Please reload this page.
edited
Uh oh!
There was an error while loading.Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Since now we know that the oparg is two, we can split out the two args and get rid of the array. Also, let's give this a better name (since it's part of the same logical family as_GUARD_TOS_NULL
and_GUARD_NOS_NULL
).
op(_GUARD_CALLABLE_ISINSTANCE_NULL, (callable,null,unused[oparg]--callable,null,unused[oparg])) { | |
DEOPT_IF(!PyStackRef_IsNull(null)); | |
} | |
op(_GUARD_THIRD_NULL, (null,unused,unused--null,unused,unused)) { | |
DEOPT_IF(!PyStackRef_IsNull(null)); | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Nice! I agree it makes sense to make it reusable. I also moved it just under_GUARD_NOS_NULL
to help with discoverability
Uh oh!
There was an error while loading.Please reload this page.
Python/bytecodes.c Outdated
PyInterpreterState *interp = tstate->interp; | ||
DEOPT_IF(callable_o != interp->callable_cache.isinstance); | ||
} | ||
op(_CALL_ISINSTANCE, (callable, null, args[oparg] -- res)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
And here:
op(_CALL_ISINSTANCE, (callable,null,args[oparg]--res)) { | |
op(_CALL_ISINSTANCE, (callable,null,inst,cls--res)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
inst
is a reserved keyword so I went withinst_
:)
Uh oh!
There was an error while loading.Please reload this page.
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase |
Python/bytecodes.c Outdated
if (retval < 0) { | ||
ERROR_NO_POP(); | ||
} | ||
(void)null; // Silence compiler warnings about unused variables | ||
PyStackRef_CLOSE(cls); | ||
PyStackRef_CLOSE(inst_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Interesting that with named stackrefs it won't let me do this:
DEAD(null);PyStackRef_CLOSE(cls);PyStackRef_CLOSE(inst_);
(the error isSyntaxError: Input 'null' is not live, but 'inst_' is
)
While with the previousargs
version this was fine:
DEAD(null);PyStackRef_CLOSE(args[0]);PyStackRef_CLOSE(args[1]);
I guess the cases generator can't reason about arrays? Another reason to use named stackrefs instead :)
CI looks good so: I have made the requested changes; please review again :) |
Thanks for making the requested changes! @brandtbucher: please review the changes made to this pull request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
One fix to the test:
Uh oh!
There was an error while loading.Please reload this page.
Python/bytecodes.c Outdated
PyInterpreterState *interp = tstate->interp; | ||
DEOPT_IF(callable_o != interp->callable_cache.isinstance); | ||
} | ||
op(_CALL_ISINSTANCE, (callable, null, inst_, cls -- res)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I don't love the trailing underscore. Not a huge deal, but maybe just rename toinstance
orobj
or something ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I renamed it toinstance
inbe50e24
CALL_ISINSTANCE
into severeal uopsCALL_ISINSTANCE
into several uops
You could do this if you want. Mark wanted to do this, but no one has time to implement it proper. The steps to do this would be the following:
|
Ok I'll give it a try! Seems like a nice project for the weekend (or for PyCon if I can't get it working by then 😄) Thanks for writing down the individual steps for me, it's super helpful :) |
c492ac7
intopython:mainUh oh!
There was an error while loading.Please reload this page.
@Fidget-Spinner, I noticed that the optimizer can already be turned off by setting Lines 1280 to 1288 in98e2c3a
Rather than exposing a new internal api, I think we could simply toggle this variable in the tests. I'm not sure if this is what Mark intended as this would still test more than just the optimizer proper. Though I like that these tests are more end-to-end. Anyway, here's what it'd look like. Let me know if this is something we want to pursue, otherwise I'll go back to thinking how to expose just the optimizer :) diff --git a/Lib/test/test_capi/test_opt.py b/Lib/test/test_capi/test_opt.pyindex 651148336f7..b82b36fa2f5 100644--- a/Lib/test/test_capi/test_opt.py+++ b/Lib/test/test_capi/test_opt.py@@ -11,6 +11,7 @@ from test.support import (script_helper, requires_specialization, import_helper, Py_GIL_DISABLED, requires_jit_enabled, reset_code)+from test.support.os_helper import EnvironmentVarGuard _testinternalcapi = import_helper.import_module("_testinternalcapi")@@ -458,6 +459,12 @@ def _run_with_optimizer(self, testfunc, arg): ex = get_first_executor(testfunc) return res, ex+ def _run_without_optimizer(self, testfunc, arg):+ with EnvironmentVarGuard() as env:+ env["PYTHON_UOPS_OPTIMIZE"] = "0"+ res = testfunc(arg)+ ex = get_first_executor(testfunc)+ return res, ex def test_int_type_propagation(self): def testfunc(loops):@@ -1951,6 +1958,17 @@ def testfunc(n): x += 1 return x+ res, ex = self._run_without_optimizer(testfunc, TIER2_THRESHOLD)+ self.assertEqual(res, TIER2_THRESHOLD)+ self.assertIsNotNone(ex)+ uops = get_opnames(ex)+ self.assertIn("_CALL_ISINSTANCE", uops)+ self.assertIn("_GUARD_THIRD_NULL", uops)+ self.assertIn("_GUARD_CALLABLE_ISINSTANCE", uops)++ # Invalidate the executor to force a reoptimization+ _testinternalcapi.invalidate_executors(testfunc.__code__)+ res, ex = self._run_with_optimizer(testfunc, TIER2_THRESHOLD) self.assertEqual(res, TIER2_THRESHOLD) self.assertIsNotNone(ex) |
@tomasr8 yeah I'm not advocating for removing the end-to-end tests altogether. Rather, the optimizer tests should have a mix of both end-to-end and unit. For example, it would be nice if we could generate an optimized trace without having to even wrap it in a for loop and run till JIT threshold. This actually makes the tests really slow because JIT threshold is quite high right now. |
Got it! Yeah, being able to simplify the tests and make them run faster is definitely worth it :) |
Uh oh!
There was an error while loading.Please reload this page.
Split
CALL_ISINSTANCE
into two guards and the uop itself.It will be easier to implement#133172 with this in place.