Uh oh!
There was an error while loading.Please reload this page.
- Notifications
You must be signed in to change notification settings - Fork32.2k
GH-135904: Optimize the JIT's assembly control flow#135905
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.
Already on GitHub?Sign in to your account
base:main
Are you sure you want to change the base?
Conversation
_re_return: typing.ClassVar[re.Pattern[str]] = _RE_NEVER_MATCH | ||
def __post_init__(self) -> None: | ||
# Split the code into a linked list of basic blocks. A basic block is an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
So I'm trying to reason about what happens if we miss something, say a branch instruction that we did not include in our branch table?
Or is everything in the x64 spec already included in the table above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
If we miss it, it won't be optimised and I don't think it will break the logic anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
Everything is already included (but one was misspelled, thanks for making me double-check).
If we miss one, we will accidentally create a superblock instead of a basic block, and will miss one outgoing edge. This could cause_invert_hot_branches
to miscount the number of predecessors for the jump after the branch, and perform an invalid optimization. It could also make our hot-cold splitting incorrect.
So bad things could happen, but those would just be bugs that need fixing.
(Not detectingany branches is a special case, since_invert_hot_branches
will never run, so everything is fine. That's why AArch64 works fine now, even though we haven't taught its optimizer about branches yet.)
Uh oh!
There was an error while loading.Please reload this page.
_RE_NEVER_MATCH = re.compile(r"(?!)") | ||
# Dictionary mapping branch instructions to their inverted branch instructions. | ||
# If a branch cannot be inverted, the value is None: | ||
_X86_BRANCHES = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
is it worth having a link to the source of these instructions?
"jrxz": None, | ||
"js": "jns", | ||
"jz": "jnz", | ||
"loop": None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
The ones with None as a key won't be optmised. Do we really need to include them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
We still need to know that they're branches (in order to form our graph of blocks), even if they won't be optimized.
Tools/jit/_optimizers.py Outdated
class _Block: | ||
label: str | None = None | ||
# Non-instruction lines like labels, directives, and comments: | ||
noise: list[str] = dataclasses.field(default_factory=list) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
We have already discuss this, but can we find a better name for this? Noise implies something irrelevant or random data that can be discarded, eliminated or filtered out. We don't do this here, on the contrary we include all these lines otherwise we end up with a broken file.
I would vote for something like:
- non_instructions: this is very explicit and clear
- metadata: still good enough
- other: this is fairly generic
Metadata,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others.Learn more.
I've renamed tononinstructions
.
This is great BTW, I love it. |
Uh oh!
There was an error while loading.Please reload this page.
This adds a pass to the JIT build step that optimizes the control flow of the assembly for each template before compiling it to machine code. It replaces our current zero-length jump removal, and does a bit more:
_JIT_CONTINUE
by just sticking a label in the assembly itself.Another benefit of this approach is that the machine code in the comments of
jit_stencils-*.h
actually represents the real code emitted at runtime (currently our jump-removal and nop padding change the code, but not the comment).The resulting code is over1% faster. For an idea of how it impacts the stencils themselves, here's adiff.
Note that this mostly punts on AArch64 support... all I've done is implement the same zero-length jump removal that we already had (but@diegorusso is going to take care of the rest).
Later, we'll do proper hot-cold splitting, but that's for another PR.