Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-131798: JIT: Optimize_POP_CALL_TWO_LOAD_CONST_INLINE_BORROW#134369

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
brandtbucher merged 9 commits intopython:mainfromtomasr8:jit-optimize-pop-call-two
May 22, 2025

Conversation

tomasr8
Copy link
Member

@tomasr8tomasr8 commentedMay 20, 2025
edited by bedevere-appbot
Loading

Optimize_POP_CALL_TWO_LOAD_CONST_INLINE_BORROW

This requires adding some additional ops:

  • _POP_CALL_ONE_LOAD_CONST_INLINE_BORROW
  • _POP_CALL_LOAD_CONST_INLINE_BORROW
  • _POP_THREE
  • _POP_TWO

We can now optimize:

_LOAD_CONST_INLINE_BORROW_LOAD_CONST_INLINE_BORROW_POP_CALL_TWO_LOAD_CONST_INLINE_BORROW

into just

_POP_CALL_LOAD_CONST_INLINE_BORROW

i.e. we save two stack pushes and two stack pops.

PyStackRef_CLOSE(pop1);
}

tier2 pure op(_POP_CALL_LOAD_CONST_INLINE_BORROW, (ptr/4, callable, null -- value)) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Can you remove thepure annotations from these please? They don't get optimized to anything so there's not really a use for them.

tomasr8 reacted with thumbs up emoji
Copy link
Member

@Fidget-SpinnerFidget-Spinner left a comment
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Thanks for the PR. I think@brandtbucher brought this up before, but I think we should be making the uop buffer add-to rather than in-place. What I mean is that instead of adding a pop variant for every optimization we want to do:

..._POP_X_LOAD_CONST_INLINE_BORROW

We should add:

..._POP_TOP_POP_TOP..._LOAD_CONST_INLINE_BORROW

This way:

  1. We don't have to add a new special pop uop for every single op. This saves on mental overhead, because a special pop variant of the uop for every uop we want to optimize is really tough to reason about.
  2. In the optimizer, we can just generalize it toPOP_TOP X times, which is easier to write code to optimize for, rather than add a new entry toremove_unneeded_uops. As you probably noticed already, it's really tedious to do this.

This would require allowing us to add to the uop buffer, rather than overwriting it in-place. It might be a bigger change, but if you're up for it, it would greatly simplify all future optimizations, and I'd be grateful for someone cleaning that part up. It would best be a separate PR.

@bedevere-app
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phraseI have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@tomasr8
Copy link
MemberAuthor

Thanks for the PR. I think@brandtbucher brought this up before, but I think we should be making the uop buffer add-to rather than in-place.

We actually discussed this w/ Brandt yesterday and I totally agree with that. Brandt also mentioned that it is ok to add this for now but I don't mind closing this and focusing on changing the uop buffer right away.

Note that Brandt greatly simplifiedremove_unneeded_uops yesterday in#134373. Once I rebase this PR, it should end up being much simpler.

cc@brandtbucher

Fidget-Spinner reacted with thumbs up emoji

@brandtbucher
Copy link
Member

Yeah, 90% of the reason why I felt the urgency to move away from in-place modification was because the logic in the pass that removes matching pushes and pops was getting completely out-of-hand. That's not a concern anymore, so I think we can keep the momentum going on the current in-place optimizer work for the time being.

Fidget-Spinner reacted with thumbs up emoji

@Fidget-Spinner
Copy link
Member

Fidget-Spinner commentedMay 21, 2025
edited
Loading

Yeah, 90% of the reason why I felt the urgency to move away from in-place modification was because the logic in the pass that removes matching pushes and pops was getting completely out-of-hand. That's not a concern anymore, so I think we can keep the momentum going on the current in-place optimizer work for the time being.

Ok. Can we please make sure to move away from the in-place optimizer after this PR for any new pop tops load const inline X instructions? The pushes and pops are no longer an issue, but the number of new uops variants we're adding is getting out of hand. I have no clue what some of these uops do unless I specifically look up their emission in the optimizer. We don't need this many uops and fewer uops is generally a better thing (less maintainer burden, less human error ...). If we can generalize to just a fewreplicate(7) POP_TOP_X instructions, that would be awesome.

brandtbucher and tomasr8 reacted with thumbs up emoji

@brandtbucher
Copy link
Member

Yep, I agree. It also allows us to specialize individual pops for specific known types, immortal values, non-escaping pops, etc. After the sprint ends tomorrow, we can start with larger cleanup efforts (which I agree we need, in the tracer too).

tomasr8 reacted with thumbs up emoji

@Fidget-Spinner
Copy link
Member

Great! Feel free to dismiss my review then in the GH UI.

@tomasr8tomasr8force-pushed thejit-optimize-pop-call-two branch 4 times, most recently from3d71a8e to47995d0CompareMay 22, 2025 15:47
@tomasr8tomasr8force-pushed thejit-optimize-pop-call-two branch from47995d0 to90d38d7CompareMay 22, 2025 15:48
@tomasr8tomasr8 marked this pull request as ready for reviewMay 22, 2025 15:50
@brandtbucher
Copy link
Member

When I get back, I'll start working on making the optimizer operate on a copy of the stream. This is good for now though. Really cool to see entire calls start disappearing.

tomasr8 reacted with rocket emoji

YvesDup pushed a commit to YvesDup/cpython that referenced this pull requestMay 22, 2025
@brandtbucherbrandtbucher merged commitd1ea8ed intopython:mainMay 22, 2025
64 checks passed
@brandtbucher
Copy link
Member

Looks great.@tomasr8: natural follow-ups would be support for tuples of types inisinstance calls, as well as extending this optimization to the other calls we specialize (and seeing if we should possibly specialize more of them).

@tomasr8tomasr8 deleted the jit-optimize-pop-call-two branchMay 22, 2025 18:06
@tomasr8
Copy link
MemberAuthor

Looks great.@tomasr8: natural follow-ups would be support for tuples of types inisinstance calls, as well as extending this optimization to the other calls we specialize (and seeing if we should possibly specialize more of them).

on it!

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@Fidget-SpinnerFidget-SpinnerFidget-Spinner requested changes

@brandtbucherbrandtbucherbrandtbucher approved these changes

@markshannonmarkshannonAwaiting requested review from markshannonmarkshannon is a code owner

Assignees
No one assigned
Labels
None yet
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

3 participants
@tomasr8@brandtbucher@Fidget-Spinner

[8]ページ先頭

©2009-2025 Movatter.jp