Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

GH-135904: Optimize the JIT's assembly control flow#135905

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Open
brandtbucher wants to merge20 commits intopython:main
base:main
Choose a base branch
Loading
frombrandtbucher:justin-hot

Conversation

brandtbucher
Copy link
Member

@brandtbucherbrandtbucher commentedJun 25, 2025
edited by bedevere-appbot
Loading

This adds a pass to the JIT build step that optimizes the control flow of the assembly for each template before compiling it to machine code. It replaces our current zero-length jump removal, and does a bit more:

  • It allows the assembler to resolve and efficiently encode jumps to_JIT_CONTINUE by just sticking a label in the assembly itself.
  • It inverts certain branches where branching is the common case and falling through is the uncommon case, increasing our ability to maintain a straight-line sequence of hot code.

Another benefit of this approach is that the machine code in the comments ofjit_stencils-*.hactually represents the real code emitted at runtime (currently our jump-removal and nop padding change the code, but not the comment).

The resulting code is over1% faster. For an idea of how it impacts the stencils themselves, here's adiff.

Note that this mostly punts on AArch64 support... all I've done is implement the same zero-length jump removal that we already had (but@diegorusso is going to take care of the rest).

Later, we'll do proper hot-cold splitting, but that's for another PR.

_re_return: typing.ClassVar[re.Pattern[str]] = _RE_NEVER_MATCH

def __post_init__(self) -> None:
# Split the code into a linked list of basic blocks. A basic block is an

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

So I'm trying to reason about what happens if we miss something, say a branch instruction that we did not include in our branch table?

Or is everything in the x64 spec already included in the table above?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

If we miss it, it won't be optimised and I don't think it will break the logic anyway.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Everything is already included (but one was misspelled, thanks for making me double-check).

If we miss one, we will accidentally create a superblock instead of a basic block, and will miss one outgoing edge. This could cause_invert_hot_branches to miscount the number of predecessors for the jump after the branch, and perform an invalid optimization. It could also make our hot-cold splitting incorrect.

So bad things could happen, but those would just be bugs that need fixing.

(Not detectingany branches is a special case, since_invert_hot_branches will never run, so everything is fine. That's why AArch64 works fine now, even though we haven't taught its optimizer about branches yet.)

_RE_NEVER_MATCH = re.compile(r"(?!)")
# Dictionary mapping branch instructions to their inverted branch instructions.
# If a branch cannot be inverted, the value is None:
_X86_BRANCHES = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

is it worth having a link to the source of these instructions?

brandtbucher reacted with thumbs up emoji
"jrxz": None,
"js": "jns",
"jz": "jnz",
"loop": None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

The ones with None as a key won't be optmised. Do we really need to include them?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

We still need to know that they're branches (in order to form our graph of blocks), even if they won't be optimized.

class _Block:
label: str | None = None
# Non-instruction lines like labels, directives, and comments:
noise: list[str] = dataclasses.field(default_factory=list)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

We have already discuss this, but can we find a better name for this? Noise implies something irrelevant or random data that can be discarded, eliminated or filtered out. We don't do this here, on the contrary we include all these lines otherwise we end up with a broken file.
I would vote for something like:

  • non_instructions: this is very explicit and clear
  • metadata: still good enough
  • other: this is fairly generic
    Metadata,

Fidget-Spinner reacted with thumbs up emoji
Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I've renamed tononinstructions.

@diegorusso
Copy link
Contributor

This is great BTW, I love it.

brandtbucher and Fidget-Spinner reacted with heart emojiFidget-Spinner reacted with rocket emoji

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@diegorussodiegorussodiegorusso left review comments

@Fidget-SpinnerFidget-SpinnerFidget-Spinner left review comments

@savannahostrowskisavannahostrowskiAwaiting requested review from savannahostrowskisavannahostrowski is a code owner

Assignees

@brandtbucherbrandtbucher

Labels
awaiting core reviewinterpreter-core(Objects, Python, Grammar, and Parser dirs)performancePerformance or resource usagetopic-JIT
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

3 participants
@brandtbucher@diegorusso@Fidget-Spinner

[8]ページ先頭

©2009-2025 Movatter.jp