Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork32k
gh-128045: Mark unknown opcodes as deopting to themselves#128044
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
Uh oh!
There was an error while loading.Please reload this page.
Conversation
for name, deopt in sorted(deopts): | ||
out.emit(f"[{name}] = {deopt},\n") | ||
defined = set(analysis.opmap.values()) | ||
for i in range(256): | ||
if i not in defined: | ||
out.emit(f"[{i}] = {i},\n") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Since we're not testing this at all in cpython, I'd suggest we at least add a couple of assertions to make sure this is correctly covering the range of opcodes:
forname,deoptinsorted(deopts): | |
out.emit(f"[{name}] ={deopt},\n") | |
defined=set(analysis.opmap.values()) | |
foriinrange(256): | |
ifinotindefined: | |
out.emit(f"[{i}] ={i},\n") | |
defined=set(analysis.opmap.values()) | |
foriinrange(256): | |
ifinotindefined: | |
deopts.append((f'{i}',f'{i}')) | |
assertlen(deopts)==256 | |
assertlen(set(x[0]forxindeopts))==256 | |
forname,deoptinsorted(deopts): | |
out.emit(f"[{name}] ={deopt},\n") |
Summary:When CPython hands out the bytecode it will first do a de-opt on it:https://www.internalfb.com/code/fbsource/[241e0d0760d4c107a3c0ac2b4914524120b0c909]/third-party/python/3.12/Objects/codeobject.c?lines=1540&reveal=1540This uses the de-opt table which only defines the known opcodes, meaning unknown opcodes get turned into 0's. We need CPython to at least define unknown opcodes to at least de-opt to themselves.Upstream PR:python/cpython#128044Reviewed By: jbower-fbDifferential Revision: D67350914fbshipit-source-id: 0073efab52da1be775272e7dd9ae5a46468ccb10
I think we regard undefined instructions an error. For example, one thing I have considered is storing bytecodes in a compact format on disk: Instructions without an oparg would take 1 byte, those with an oparg would take 2. We would then combine the unmarshalling and quickening steps to create the full in-memory form in a single pass. Custom instructions would not survive this process. |
Took me a while to get back to this but I'm finally back :) I applied the changes suggested by@iritkatriel.
I think as long as we could still have some way to construct a code object ourselves that would be fine, we'd just may need to implement our own custom unmarshaling logic that could support our opcodes. Obviously that doesn't cover all the ways that things could change in the future but we can try and figure out how to adapt to other potential changes :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Looks good
cc9add6
intopython:mainUh oh!
There was an error while loading.Please reload this page.
Thanks@DinoV for the PR 🌮🎉.. I'm working now to backport this PR to: 3.14. |
…onGH-128044)* Mark unknown opcodes as deopting to themselves(cherry picked from commitcc9add6)Co-authored-by: Dino Viehland <dinoviehland@meta.com>
GH-134228 is a backport of this pull request to the3.14 branch. |
Uh oh!
There was an error while loading.Please reload this page.
When accessing
co_code
on a code object we'll first run through to do the de-opthttps://github.com/python/cpython/blob/main/Objects/codeobject.c#L1706This removes any unknown opcodes. Instead the deopt table should just recognize unknown opcodes as deopting to themselves, allowing the extensible interpreter loop to consume unknown opcodes.
co_code
#128045