Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

GH-135904: Optimize the JIT's assembly control flow#135905

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
brandtbucher merged 20 commits intopython:mainfrombrandtbucher:justin-hot
Jun 27, 2025
Merged
Changes from1 commit
Commits
Show all changes
20 commits
Select commitHold shift + click to select a range
5606078
Do some textual assembly magic
brandtbucherJun 12, 2025
858624a
Switch to a linked list
brandtbucherJun 12, 2025
77886d0
Handle prefixes
brandtbucherJun 12, 2025
da72079
Rework optimizer and break it out into its own module
brandtbucherJun 19, 2025
9ebb32c
Add AArch64 stub
brandtbucherJun 20, 2025
4f85fab
Shake out some bugs on Windows
brandtbucherJun 20, 2025
d2c9ae9
Disable dead code removal for now
brandtbucherJun 20, 2025
ee97e87
More cleanup, don't add jumps
brandtbucherJun 20, 2025
b436751
Fix Windows label detection (and cleanup)
brandtbucherJun 20, 2025
fbb859d
fixup
brandtbucherJun 20, 2025
4cfabf5
Add comment and jump detection for AArch64
brandtbucherJun 20, 2025
77a8ba1
Fix bug in predecessor detection
brandtbucherJun 23, 2025
a987be7
formatting
brandtbucherJun 23, 2025
5ec6022
Commenting and cleanup
brandtbucherJun 24, 2025
a577c36
More comments
brandtbucherJun 24, 2025
807a359
fixup
brandtbucherJun 24, 2025
3541ef7
fixup
brandtbucherJun 24, 2025
51456c3
blurb add
brandtbucherJun 24, 2025
6ddeaaf
Add comments, noise -> noninstructions, jrxz -> jrcxz, predecessor ->…
brandtbucherJun 25, 2025
b6da4e6
fixup
brandtbucherJun 25, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
PrevPrevious commit
NextNext commit
More comments
  • Loading branch information
@brandtbucher
brandtbucher committedJun 24, 2025
commita577c364fdbabddf4f897e2e2f9315ad46831474
73 changes: 59 additions & 14 deletionsTools/jit/_optimizers.py
View file
Open in desktop
Original file line numberDiff line numberDiff line change
Expand Up@@ -41,11 +41,17 @@
@dataclasses.dataclass
class _Block:
label: str | None = None
header: list[str] = dataclasses.field(default_factory=list)
# Non-instruction lines like labels, directives, and comments:
noise: list[str] = dataclasses.field(default_factory=list)
Copy link
Contributor

@diegorussodiegorussoJun 25, 2025
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

We have already discuss this, but can we find a better name for this? Noise implies something irrelevant or random data that can be discarded, eliminated or filtered out. We don't do this here, on the contrary we include all these lines otherwise we end up with a broken file.
I would vote for something like:

  • non_instructions: this is very explicit and clear
  • metadata: still good enough
  • other: this is fairly generic

Fidget-Spinner reacted with thumbs up emoji
Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I've renamed tononinstructions.

diegorusso reacted with thumbs up emoji
# Instruction lines:
instructions: list[str] = dataclasses.field(default_factory=list)
# If this block ends in a jump, where to?
target: typing.Self | None = None
# The next block in the linked list:
link: typing.Self | None = None
# Whether control flow can fall through to the linked block above:
fallthrough: bool = True
# Whether this block can eventually reach the next uop (_JIT_CONTINUE):
hot: bool = False

def resolve(self) -> typing.Self:
Expand All@@ -68,7 +74,7 @@ class Optimizer:
_root: _Block = dataclasses.field(init=False, default_factory=_Block)
_labels: dict[str, _Block] = dataclasses.field(init=False, default_factory=dict)
# No groups:
_re_header: typing.ClassVar[re.Pattern[str]] = re.compile(r"\s*(?:\.|#|//|$)")
_re_noise: typing.ClassVar[re.Pattern[str]] = re.compile(r"\s*(?:\.|#|//|$)")
# One group (label):
_re_label: typing.ClassVar[re.Pattern[str]] = re.compile(
r'\s*(?P<label>[\w."$?@]+):'
Expand All@@ -84,32 +90,44 @@ class Optimizer:
_re_return: typing.ClassVar[re.Pattern[str]] = _RE_NEVER_MATCH

def __post_init__(self) -> None:
# Split the code into a linked list of basic blocks. A basic block is an

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

So I'm trying to reason about what happens if we miss something, say a branch instruction that we did not include in our branch table?

Or is everything in the x64 spec already included in the table above?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

If we miss it, it won't be optimised and I don't think it will break the logic anyway.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Everything is already included (but one was misspelled, thanks for making me double-check).

If we miss one, we will accidentally create a superblock instead of a basic block, and will miss one outgoing edge. This could cause_invert_hot_branches to miscount the number of predecessors for the jump after the branch, and perform an invalid optimization. It could also make our hot-cold splitting incorrect.

So bad things could happen, but those would just be bugs that need fixing.

(Not detectingany branches is a special case, since_invert_hot_branches will never run, so everything is fine. That's why AArch64 works fine now, even though we haven't taught its optimizer about branches yet.)

diegorusso reacted with thumbs up emoji
# optional label, followed by zero or more non-instruction ("noise")
# lines, followed by one or more instruction lines (only the last of
# which may be a branch, jump, or return).
text = self._preprocess(self.path.read_text())
block = self._root
for line in text.splitlines():
# See if we need to start a new block:
if match := self._re_label.match(line):
# Label. New block:
block.link = block = self._lookup_label(match["label"])
block.header.append(line)
block.noise.append(line)
continue
if self._re_header.match(line):
if self._re_noise.match(line):
if block.instructions:
# Noise lines. New block:
block.link = block = _Block()
block.header.append(line)
block.noise.append(line)
continue
if block.target or not block.fallthrough:
# Current block ends with a branch, jump, or return. New block:
block.link = block = _Block()
block.instructions.append(line)
if match := self._re_branch.match(line):
# A block ending in a branch has a target, and fallthrough:
block.target = self._lookup_label(match["target"])
assert block.fallthrough
elif match := self._re_jump.match(line):
# A block ending in a jump has a target, and no fallthrough:
block.target = self._lookup_label(match["target"])
block.fallthrough = False
elif self._re_return.match(line):
# A block ending in a return has no target, and fallthrough:
assert not block.target
block.fallthrough = False

def _preprocess(self, text: str) -> str:
# Override this method to do preprocessing of the textual assembly:
return text

@classmethod
Expand All@@ -120,13 +138,21 @@ def _invert_branch(cls, line: str, target: str) -> str | None:
if not inverted:
return None
(a, b), (c, d) = match.span("instruction"), match.span("target")
# Before:
# je FOO
# After:
# jne BAR
return "".join([line[:a], inverted, line[b:c], target, line[d:]])

@classmethod
def _update_jump(cls, line: str, target: str) -> str:
match = cls._re_jump.match(line)
assert match
a, b = match.span("target")
# Before:
# jmp FOO
# After:
# jmp BAR
return "".join([line[:a], target, line[b:]])

def _lookup_label(self, label: str) -> _Block:
Expand All@@ -146,30 +172,41 @@ def _body(self) -> str:
for block in self._blocks():
if hot != block.hot:
hot = block.hot
# Make it easy to tell at a glance where cold code is:
lines.append(f"# JIT: {'HOT' if hot else 'COLD'} ".ljust(80, "#"))
lines.extend(block.header)
lines.extend(block.noise)
lines.extend(block.instructions)
return "\n".join(lines)

def _predecessors(self, block: _Block) -> typing.Generator[_Block, None, None]:
# This is inefficient, but it's never wrong:
for predecessor in self._blocks():
if predecessor.target is block or (
predecessor.fallthrough and predecessor.link is block
):
yield predecessor

def _insert_continue_label(self) -> None:
# Find the block with the last instruction:
for end in reversed(list(self._blocks())):
if end.instructions:
break
# Before:
# jmp FOO
# After:
# jmp FOO
# .balign 8
# _JIT_CONTINUE:
align = _Block()
align.header.append(f"\t.balign\t{self._alignment}")
align.noise.append(f"\t.balign\t{self._alignment}")
continuation = self._lookup_label(f"{self.prefix}_JIT_CONTINUE")
assert continuation.label
continuation.header.append(f"{continuation.label}:")
continuation.noise.append(f"{continuation.label}:")
end.link, align.link, continuation.link = align, continuation, end.link

def _mark_hot_blocks(self) -> None:
# Start with the last block, and perform a DFS to find all blocks that
# can eventually reach it:
todo = list(self._blocks())[-1:]
while todo:
block = todo.pop()
Expand All@@ -181,17 +218,17 @@ def _mark_hot_blocks(self) -> None:
)

def _invert_hot_branches(self) -> None:
# Before:
# branch <hot>
# jump <cold>
# After:
# opposite-branch <cold>
# jump <hot>
for branch in self._blocks():
link = branch.link
if link is None:
continue
jump = link.resolve()
# Before:
# je HOT
# jmp COLD
# After:
# jne COLD
# jmp HOT
if (
# block ends with a branch to hot code...
branch.target
Expand All@@ -209,6 +246,7 @@ def _invert_hot_branches(self) -> None:
inverted = self._invert_branch(
branch.instructions[-1], jump.target.label
)
# Check to see if the branch can even be inverted:
if inverted is None:
continue
branch.instructions[-1] = inverted
Expand All@@ -219,7 +257,14 @@ def _invert_hot_branches(self) -> None:
jump.hot = True

def _remove_redundant_jumps(self) -> None:
# Zero-length jumps can be introduced by _insert_continue_label and
# _invert_hot_branches:
for block in self._blocks():
# Before:
# jmp FOO
# FOO:
# After:
# FOO:
if (
block.target
and block.link
Expand Down

[8]ページ先頭

©2009-2025 Movatter.jp