Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-128045: Mark unknown opcodes as deopting to themselves#128044

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
DinoV merged 2 commits intopython:mainfromDinoV:deopt_unknown_ops
May 19, 2025

Conversation

DinoV
Copy link
Contributor

@DinoVDinoV commentedDec 17, 2024
edited by bedevere-appbot
Loading

When accessingco_code on a code object we'll first run through to do the de-opthttps://github.com/python/cpython/blob/main/Objects/codeobject.c#L1706

This removes any unknown opcodes. Instead the deopt table should just recognize unknown opcodes as deopting to themselves, allowing the extensible interpreter loop to consume unknown opcodes.

@DinoVDinoV changed the titleMark unknown opcodes as deopting to themselvesgh-128045: Mark unknown opcodes as deopting to themselvesDec 17, 2024
@DinoVDinoV marked this pull request as ready for reviewDecember 17, 2024 20:12
Comment on lines 251 to 256
for name, deopt in sorted(deopts):
out.emit(f"[{name}] = {deopt},\n")
defined = set(analysis.opmap.values())
for i in range(256):
if i not in defined:
out.emit(f"[{i}] = {i},\n")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Since we're not testing this at all in cpython, I'd suggest we at least add a couple of assertions to make sure this is correctly covering the range of opcodes:

Suggested change
forname,deoptinsorted(deopts):
out.emit(f"[{name}] ={deopt},\n")
defined=set(analysis.opmap.values())
foriinrange(256):
ifinotindefined:
out.emit(f"[{i}] ={i},\n")
defined=set(analysis.opmap.values())
foriinrange(256):
ifinotindefined:
deopts.append((f'{i}',f'{i}'))
assertlen(deopts)==256
assertlen(set(x[0]forxindeopts))==256
forname,deoptinsorted(deopts):
out.emit(f"[{name}] ={deopt},\n")

DinoV reacted with thumbs up emoji
facebook-github-bot pushed a commit to facebookincubator/cinder that referenced this pull requestDec 18, 2024
Summary:When CPython hands out the bytecode it will first do a de-opt on it:https://www.internalfb.com/code/fbsource/[241e0d0760d4c107a3c0ac2b4914524120b0c909]/third-party/python/3.12/Objects/codeobject.c?lines=1540&reveal=1540This uses the de-opt table which only defines the known opcodes, meaning unknown opcodes get turned into 0's. We need CPython to at least define unknown opcodes to at least de-opt to themselves.Upstream PR:python/cpython#128044Reviewed By: jbower-fbDifferential Revision: D67350914fbshipit-source-id: 0073efab52da1be775272e7dd9ae5a46468ccb10
@markshannon
Copy link
Member

I think we regard undefined instructions an error.
If you want to pass custom bytecodes to your custom interpreter, a more robust approach might be needed.
We can do the deopt thing for now, but it seems fragile.

For example, one thing I have considered is storing bytecodes in a compact format on disk: Instructions without an oparg would take 1 byte, those with an oparg would take 2. We would then combine the unmarshalling and quickening steps to create the full in-memory form in a single pass. Custom instructions would not survive this process.

@DinoV
Copy link
ContributorAuthor

Took me a while to get back to this but I'm finally back :) I applied the changes suggested by@iritkatriel.

I think we regard undefined instructions an error. If you want to pass custom bytecodes to your custom interpreter, a more robust approach might be needed. We can do the deopt thing for now, but it seems fragile.

For example, one thing I have considered is storing bytecodes in a compact format on disk: Instructions without an oparg would take 1 byte, those with an oparg would take 2. We would then combine the unmarshalling and quickening steps to create the full in-memory form in a single pass. Custom instructions would not survive this process.

I think as long as we could still have some way to construct a code object ourselves that would be fine, we'd just may need to implement our own custom unmarshaling logic that could support our opcodes. Obviously that doesn't cover all the ways that things could change in the future but we can try and figure out how to adapt to other potential changes :)

Copy link
Member

@markshannonmarkshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Looks good

@DinoVDinoV merged commitcc9add6 intopython:mainMay 19, 2025
59 checks passed
@miss-islington-app
Copy link

Thanks@DinoV for the PR 🌮🎉.. I'm working now to backport this PR to: 3.14.
🐍🍒⛏🤖

miss-islington pushed a commit to miss-islington/cpython that referenced this pull requestMay 19, 2025
…onGH-128044)* Mark unknown opcodes as deopting to themselves(cherry picked from commitcc9add6)Co-authored-by: Dino Viehland <dinoviehland@meta.com>
@bedevere-app
Copy link

GH-134228 is a backport of this pull request to the3.14 branch.

@bedevere-appbedevere-appbot removed the needs backport to 3.14bugs and security fixes labelMay 19, 2025
DinoV pushed a commit that referenced this pull requestMay 19, 2025
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@iritkatrieliritkatrieliritkatriel left review comments

@markshannonmarkshannonmarkshannon approved these changes

Assignees
No one assigned
Labels
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

3 participants
@DinoV@markshannon@iritkatriel

[8]ページ先頭

©2009-2025 Movatter.jp