NotificationsYou must be signed in to change notification settings
Fork32.3k
Star67.7k

gh-131798: JIT: Optimize`_POP_CALL_TWO_LOAD_CONST_INLINE_BORROW`#134369

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Jump to bottom

Merged

brandtbucher merged 9 commits intopython:mainfromtomasr8:jit-optimize-pop-call-two

May 22, 2025

Merged

gh-131798: JIT: Optimize`_POP_CALL_TWO_LOAD_CONST_INLINE_BORROW`#134369

brandtbucher merged 9 commits intopython:mainfromtomasr8:jit-optimize-pop-call-two

May 22, 2025

Conversation

Copy link

Member

tomasr8 commentedMay 20, 2025•
edited by bedevere-appbot
Loading

Optimize_POP_CALL_TWO_LOAD_CONST_INLINE_BORROW

This requires adding some additional ops:

_POP_CALL_ONE_LOAD_CONST_INLINE_BORROW
_POP_CALL_LOAD_CONST_INLINE_BORROW
_POP_THREE
_POP_TWO

We can now optimize:

_LOAD_CONST_INLINE_BORROW_LOAD_CONST_INLINE_BORROW_POP_CALL_TWO_LOAD_CONST_INLINE_BORROW

into just

_POP_CALL_LOAD_CONST_INLINE_BORROW

i.e. we save two stack pushes and two stack pops.

Issue:Better uop coverage in the JIT optimizer #131798

tomasr8 requested a review frombrandtbucher

May 20, 2025 17:47

tomasr8 requested review fromFidget-Spinner andmarkshannon ascode owners

May 20, 2025 17:47

bedevere-appbot mentioned this pull request

May 20, 2025

Better uop coverage in the JIT optimizer#131798

Open

bedevere-appbot added the awaiting review label

May 20, 2025

tomasr8 force-pushed thejit-optimize-pop-call-two branch fromf37b9c3 to82e0a17Compare

May 20, 2025 18:10

tomasr8 marked this pull request as draft

May 20, 2025 20:34

bedevere-appbot removed the awaiting review label

May 20, 2025

Fidget-Spinner reviewed

May 21, 2025

View reviewed changes

Python/bytecodes.c Outdated

		PyStackRef_CLOSE(pop1);
		}

		tier2 pure op(_POP_CALL_LOAD_CONST_INLINE_BORROW, (ptr/4, callable, null -- value)) {

Copy link

Member

Fidget-SpinnerMay 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Can you remove thepure annotations from these please? They don't get optimized to anything so there's not really a use for them.

Fidget-Spinner requested changes

May 21, 2025

View reviewed changes

Copy link

Member

Fidget-Spinner left a comment•
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Thanks for the PR. I think@brandtbucher brought this up before, but I think we should be making the uop buffer add-to rather than in-place. What I mean is that instead of adding a pop variant for every optimization we want to do:

..._POP_X_LOAD_CONST_INLINE_BORROW

We should add:

..._POP_TOP_POP_TOP..._LOAD_CONST_INLINE_BORROW

This way:

We don't have to add a new special pop uop for every single op. This saves on mental overhead, because a special pop variant of the uop for every uop we want to optimize is really tough to reason about.
In the optimizer, we can just generalize it toPOP_TOP X times, which is easier to write code to optimize for, rather than add a new entry toremove_unneeded_uops. As you probably noticed already, it's really tedious to do this.

This would require allowing us to add to the uop buffer, rather than overwriting it in-place. It might be a bigger change, but if you're up for it, it would greatly simplify all future optimizations, and I'd be grateful for someone cleaning that part up. It would best be a separate PR.

Copy link

bedevere-appbot commentedMay 21, 2025

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phraseI have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

bedevere-appbot added the awaiting changes label

May 21, 2025

Copy link

MemberAuthor

tomasr8 commentedMay 21, 2025

Thanks for the PR. I think@brandtbucher brought this up before, but I think we should be making the uop buffer add-to rather than in-place.

We actually discussed this w/ Brandt yesterday and I totally agree with that. Brandt also mentioned that it is ok to add this for now but I don't mind closing this and focusing on changing the uop buffer right away.

Note that Brandt greatly simplifiedremove_unneeded_uops yesterday in#134373. Once I rebase this PR, it should end up being much simpler.

cc@brandtbucher

Copy link

Member

brandtbucher commentedMay 21, 2025

Yeah, 90% of the reason why I felt the urgency to move away from in-place modification was because the logic in the pass that removes matching pushes and pops was getting completely out-of-hand. That's not a concern anymore, so I think we can keep the momentum going on the current in-place optimizer work for the time being.

Copy link

Member

Fidget-Spinner commentedMay 21, 2025•
edited
Loading

Yeah, 90% of the reason why I felt the urgency to move away from in-place modification was because the logic in the pass that removes matching pushes and pops was getting completely out-of-hand. That's not a concern anymore, so I think we can keep the momentum going on the current in-place optimizer work for the time being.

Ok. Can we please make sure to move away from the in-place optimizer after this PR for any new pop tops load const inline X instructions? The pushes and pops are no longer an issue, but the number of new uops variants we're adding is getting out of hand. I have no clue what some of these uops do unless I specifically look up their emission in the optimizer. We don't need this many uops and fewer uops is generally a better thing (less maintainer burden, less human error ...). If we can generalize to just a fewreplicate(7) POP_TOP_X instructions, that would be awesome.

Copy link

Member

brandtbucher commentedMay 21, 2025

Yep, I agree. It also allows us to specialize individual pops for specific known types, immortal values, non-escaping pops, etc. After the sprint ends tomorrow, we can start with larger cleanup efforts (which I agree we need, in the tracer too).

Copy link

Member

Fidget-Spinner commentedMay 21, 2025

Great! Feel free to dismiss my review then in the GH UI.

tomasr8 force-pushed thejit-optimize-pop-call-two branch 4 times, most recently from3d71a8e to47995d0Compare

May 22, 2025 15:47

Optimize _POP_CALL_TWO_LOAD_CONST_INLINE_BORROW

90d38d7

tomasr8 force-pushed thejit-optimize-pop-call-two branch from47995d0 to90d38d7Compare

May 22, 2025 15:48

Add news entry

0e66db4

tomasr8 marked this pull request as ready for review

May 22, 2025 15:50

bedevere-appbot added awaiting review and removed awaiting changes labels

May 22, 2025

tomasr8 added3 commits

May 22, 2025 17:54

Trailing whitespace

0e1030b

Simplify logic

af88646

Wrap line

3ed47d2

tomasr8 added4 commits

May 22, 2025 18:06

PEP7

bf27867

PEP7 vol. 2

2f90609

Simplify logic

5db29d8

Style fixes

efbded1

brandtbucher approved these changes

May 22, 2025

View reviewed changes

bedevere-appbot added awaiting merge and removed awaiting review labels

May 22, 2025

Copy link

Member

brandtbucher commentedMay 22, 2025

When I get back, I'll start working on making the optimizer operate on a copy of the stream. This is good for now though. Really cool to see entire calls start disappearing.

YvesDup pushed a commit to YvesDup/cpython that referenced this pull request

May 22, 2025

pythonGH-131798: Optimize away isinstance calls in the JIT (pythonGH-…

484e003

…134369)

brandtbucher merged commitd1ea8ed intopython:main

May 22, 2025

64 checks passed

bedevere-appbot removed the awaiting merge label

May 22, 2025

Copy link

Member

brandtbucher commentedMay 22, 2025

Looks great.@tomasr8: natural follow-ups would be support for tuples of types inisinstance calls, as well as extending this optimization to the other calls we specialize (and seeing if we should possibly specialize more of them).

tomasr8 deleted the jit-optimize-pop-call-two branch

May 22, 2025 18:06

Copy link

MemberAuthor

tomasr8 commentedMay 22, 2025

Looks great.@tomasr8: natural follow-ups would be support for tuples of types inisinstance calls, as well as extending this optimization to the other calls we specialize (and seeing if we should possibly specialize more of them).

on it!