Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-106581: SplitCALL_PY_EXACT_ARGS into uops#107760

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
gvanrossum merged 20 commits intopython:mainfromgvanrossum:call-uops
Aug 16, 2023
Merged
Changes from1 commit
Commits
Show all changes
20 commits
Select commitHold shift + click to select a range
56133bb
Split `CALL_PY_EXACT_ARGS` into uops
gvanrossumAug 5, 2023
907ff95
Fix merge so it works again (I think)
gvanrossumAug 9, 2023
2c6be6d
Split into finer-grained uops
gvanrossumAug 9, 2023
6d78ff2
Fix type error in stacking.py
gvanrossumAug 10, 2023
0d8e66c
Add test
gvanrossumAug 10, 2023
b75f30e
Add comment explaining _PUSH_FRAME's unused output effect
gvanrossumAug 10, 2023
61c2822
Make PUSH_FRAME special case a little less myterious
gvanrossumAug 10, 2023
f73ea90
Rename Instruction.write to write_case_body
gvanrossumAug 10, 2023
12910fc
Move next_instr update to a more logical place
gvanrossumAug 10, 2023
2fafa2c
Don't recompute macro cache offset
gvanrossumAug 10, 2023
2717b07
Fold and refactor long line in stacking.py
gvanrossumAug 10, 2023
e487908
Fold long lines in generate_cases.py
gvanrossumAug 10, 2023
1d549af
Don't emit static assert to executor cases
gvanrossumAug 10, 2023
f40fb1f
Factor away write_case_body (formerly Instruction.write)
gvanrossumAug 10, 2023
4f6f8f8
Fold long lines
gvanrossumAug 11, 2023
6facc8d
Make less of a special case of _PUSH_FRAME
gvanrossumAug 11, 2023
94630d4
Stop special-casing _PUSH_FRAME altogether
gvanrossumAug 11, 2023
cf8e2c0
Call _Py_EnterRecursivePy in _FRAME_PUSH
gvanrossumAug 15, 2023
1e62876
Merge remote-tracking branch 'upstream/main' into call-uops
gvanrossumAug 15, 2023
05af848
Introduce SAVE_CURRENT_IP uop per Mark's proposal
gvanrossumAug 16, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
PrevPrevious commit
NextNext commit
Add comment explaining _PUSH_FRAME's unused output effect
  • Loading branch information
@gvanrossum
gvanrossum committedAug 10, 2023
commitb75f30eb37c8ac68a014a7a3318e549e9e4d4ea3
3 changes: 3 additions & 0 deletionsPython/bytecodes.c
View file
Open in desktop
Original file line numberDiff line numberDiff line change
Expand Up@@ -2977,6 +2977,9 @@ dummy_func(
}
}

// The 'unused' output effect represents the return value
// (which will be pushed when the frame returns).
// It is needed so CALL_PY_EXACT_ARGS matches its family.
op(_PUSH_FRAME, (new_frame: _PyInterpreterFrame* -- unused)) {
Copy link
Member

@markshannonmarkshannonAug 9, 2023
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Since_PUSH_FRAME is justframe->return_offset = 0; DISPATCH_INLINED(new_frame);, it would make sense to spell outDISPATCH_INLINED to clarify which operations that need to be different for tier 1 and tier 2.
Something like:

op(_PUSH_FRAME, (new_frame:_PyInterpreterFrame*--unused)) {SAVE_FRAME_STATE();// Equivalent to frame->prev_instr = next_instr - 1; _PyFrame_SetStackPointer(frame, stack_pointer);frame->return_offset=0;new_frame->previous=frame;frame=cframe.current_frame=new_frame;CALL_STAT_INC(inlined_py_calls);if (_Py_EnterRecursivePy(tstate)) {        gotoexit_unwind;    }START_FRAME();// Equivalent to next_instr = frame->prev_instr  + 1; stack_pointer =stack_pointer=_PyFrame_GetStackPointer(frame);}

Copy link
Member

@markshannonmarkshannonAug 9, 2023
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

For example:2b3e6f2

In which case the code generators needs to know to push the temporary stack values to the real stack beforeSAVE_FRAME_STATE()

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I have to study this more. A problem is that the Tier 1 and Tier 2 versions of_PUSH_FRAME are so different. I am working on a mechanism to be able to say

#if TIER_ONE<code for Tier 1>#else<code for Tier 2>#endif

I'm not sure yet what you mean with your last remark about pushing temp stack values.

Copy link
MemberAuthor

@gvanrossumgvanrossumAug 9, 2023
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Comparing carefully the two versions ofDISPATCH_INLINED (addingframe->return_offset = 0 which precedes it in both cases):

In Tier 1:

        frame->return_offset = 0;        assert(tstate->interp->eval_frame == NULL);        _PyFrame_SetStackPointer(frame, stack_pointer);        frame->prev_instr = next_instr - 1;        (NEW_FRAME)->previous = frame;        frame = cframe.current_frame = (NEW_FRAME);        CALL_STAT_INC(inlined_py_calls);        goto start_frame;

In Tier 2:

        frame->return_offset = 0;        assert(tstate->interp->eval_frame == NULL);        _PyFrame_SetStackPointer(frame, stack_pointer);        frame->prev_instr -= 1;        (NEW_FRAME)->previous = frame;        frame = tstate->cframe->current_frame = (NEW_FRAME);        CALL_STAT_INC(inlined_py_calls);        stack_pointer = _PyFrame_GetStackPointer(frame);        ip_offset = (_Py_CODEUNIT *)_PyFrame_GetCode(frame)->co_code_adaptive;

Diff:

@@ -1,8 +1,9 @@         frame->return_offset = 0;         assert(tstate->interp->eval_frame == NULL);         _PyFrame_SetStackPointer(frame, stack_pointer);-        frame->prev_instr = next_instr - 1;+        frame->prev_instr -= 1;         (NEW_FRAME)->previous = frame;-        frame = cframe.current_frame = (NEW_FRAME);+        frame = tstate->cframe->current_frame = (NEW_FRAME);         CALL_STAT_INC(inlined_py_calls);-        goto start_frame;+        stack_pointer = _PyFrame_GetStackPointer(frame);+        ip_offset = (_Py_CODEUNIT *)_PyFrame_GetCode(frame)->co_code_adaptive;

Note that the Tier 2 version must be preceded by aSAVE_IP, which does the equivalent offrame->prev_instr = next_instr. If we had a Tier 1 version ofSAVE_IP we could include it in the macro definition:

        macro(CALL_PY_EXACT_ARGS) =            unused/1 + // Skip over the counter            _CHECK_PEP_523 +            _CHECK_FUNCTION_EXACT_ARGS +            _CHECK_STACK_SPACE +            _INIT_CALL_PY_EXACT_ARGS +            SAVE_IP +      // <-------------- added            _PUSH_FRAME;

which would reduce the special-casing in the code generator a bit (it would still need to do something special forSAVE_IP to ensure that itsoparg has the right value, different from theoparg of the macro (which is the argument count). This would take care of the first diff chunk (what to assign toframe->prev_inst), but it would still be pretty fragile. (Like my current version, it would entice the optimizer to incorrectly try to remove theSAVE_IP uop.)

The second diff chunk relates to how we setcframe.current_frame -- in Tier 2 we must access this through thetstate.

The third and final diff chunk relates to really start using the new frame. In Tier 1, this must actually do the following:

  • Check recursion
  • Loadstack_pointer
  • Loadnext_instr
  • Dispatch to the next opcode.

This is done by the code atstart_frame.

In Tier 2 there is nostart_frame label (the only uop that can go to a label isEXIT_TRACE, and of courseDEOPT_IF andERROR_IF also jump). So we loadstack_frame here. There is no direct equivalent tonext_instr, but we have to setip_offset, whichSAVE_IP adds to itsoparg to get theprev_instr value. (This variable is a cache forframe->code->co_code_adaptive, to save some memory loads, so wheneverframe changes we must update it.)

(More later.)

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

There's another thing though, and I think that is what Mark meant. In Tier 1 the code generation for macros is special-cased for_PUSH_FRAME so that both the stack adjustmentand thenext_instr adjustment are emittedbefore the_PUSH_FRAME opcode. This is done so that the flushing of these variables to the frame in theDISPATCH_INLINED macro flush the correct values.

But this is really ugly and unprincipled, and the logic is much hairier than the other special cases for_PUSH_FRAME. One of Mark's ideas here is to make this special case look for uops using theSAVE_FRAME_STATE macro rather than for the specific uop_PUSH_FRAME. But detecting when to trigger the special case is only part of the problem -- IMO the worse problem is that the special case itself is so ugly:

        dispatch_inlined_special_case = False        if mgr is managers[-1] and mgr.instr.always_exits.startswith("DISPATCH_INLINED") and mgr.instr.name == "_PUSH_FRAME":            dispatch_inlined_special_case = True            temp = mgr.final_offset.clone()            temp.deeper(StackEffect(UNUSED))  # Hack            out.stack_adjust(temp.deep, temp.high)            # Use clone() since adjust_inverse() mutates final_offset            mgr.adjust_inverse(mgr.final_offset.clone())            if cache_adjust:                out.emit(f"next_instr += {cache_adjust};")

The last 4 lines here, starting with# Use clone(), occur further down too, for the normal case (after the final uop). I don't even recall why thetemp.deeper() call is needed!

I'll mull this over some more.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I think I have addressed this.@markshannon Please have another look. Assuming the tests look okay I'll un-draft this.

frame->return_offset = 0;
DISPATCH_INLINED(new_frame);
Expand Down

[8]ページ先頭

©2009-2025 Movatter.jp