NotificationsYou must be signed in to change notification settings
Fork33.3k
Star69.7k

gh-104584: Support most jumping instructions#106393

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Jump to bottom

Closed

gvanrossum wants to merge9 commits intopython:mainfromgvanrossum:jump-uops

Closed

gh-104584: Support most jumping instructions#106393

gvanrossum wants to merge9 commits intopython:mainfromgvanrossum:jump-uops

Conversation

Copy link

Member

gvanrossum commentedJul 4, 2023•
edited
Loading

Supports all jumping opcodes exceptSEND
IncludesFOR_ITER and its specializations, exceptFOR_ITER_GEN
"See through"ENTER_EXECUTOR
ChangePOP_JUMP_IF_TRUE/FALSE to only jump when needed
JUMP_BACKWARD to top of trace becomes a specialJUMP_TO_TOP uop

A big weakness (?) is that this always assumes that branches are rarely taken. Any time a branch or jump (other than to the top of the trace) is taken, we leave the trace.

Issue:Pluggable optimizer API #104584

gvanrossum added the skip news label

Jul 4, 2023

bedevere-bot mentioned this pull request

Jul 4, 2023

Pluggable optimizer API#104584

Closed

gvanrossum marked this pull request as ready for review

July 4, 2023 02:18

gvanrossum requested a review frommarkshannon as acode owner

July 4, 2023 02:18

bedevere-bot added the awaiting core review label

Jul 4, 2023

gvanrossum requested a review frombrandtbucher

July 4, 2023 02:18

gvanrossum mentioned this pull request

Jul 4, 2023

GH-106008: Make implicit boolean conversions explicit#106003

Merged

Copy link

MemberAuthor

gvanrossum commentedJul 4, 2023

FWIW the Tier 2 interpreter now supports 109 bytecodes and 12 uops (the Tier 1 interpreter supports 202 bytecodes).

Cases in Python/executor_cases.c.h:

        case NOP: {        case LOAD_FAST_CHECK: {        case LOAD_FAST: {        case LOAD_FAST_AND_CLEAR: {        case LOAD_CONST: {        case STORE_FAST: {        case POP_TOP: {        case PUSH_NULL: {        case END_SEND: {        case UNARY_NEGATIVE: {        case UNARY_NOT: {        case TO_BOOL: {        case TO_BOOL_BOOL: {        case TO_BOOL_INT: {        case TO_BOOL_LIST: {        case TO_BOOL_NONE: {        case TO_BOOL_STR: {        case TO_BOOL_ALWAYS_TRUE: {        case UNARY_INVERT: {        case _GUARD_BOTH_INT: {        case _BINARY_OP_MULTIPLY_INT: {        case _BINARY_OP_ADD_INT: {        case _BINARY_OP_SUBTRACT_INT: {        case _GUARD_BOTH_FLOAT: {        case _BINARY_OP_MULTIPLY_FLOAT: {        case _BINARY_OP_ADD_FLOAT: {        case _BINARY_OP_SUBTRACT_FLOAT: {        case _GUARD_BOTH_UNICODE: {        case _BINARY_OP_ADD_UNICODE: {        case BINARY_SUBSCR: {        case BINARY_SLICE: {        case STORE_SLICE: {        case BINARY_SUBSCR_LIST_INT: {        case BINARY_SUBSCR_TUPLE_INT: {        case BINARY_SUBSCR_DICT: {        case LIST_APPEND: {        case SET_ADD: {        case STORE_SUBSCR: {        case STORE_SUBSCR_LIST_INT: {        case STORE_SUBSCR_DICT: {        case DELETE_SUBSCR: {        case CALL_INTRINSIC_1: {        case CALL_INTRINSIC_2: {        case GET_AITER: {        case GET_ANEXT: {        case GET_AWAITABLE: {        case POP_EXCEPT: {        case LOAD_ASSERTION_ERROR: {        case LOAD_BUILD_CLASS: {        case STORE_NAME: {        case DELETE_NAME: {        case UNPACK_SEQUENCE: {        case UNPACK_SEQUENCE_TWO_TUPLE: {        case UNPACK_SEQUENCE_TUPLE: {        case UNPACK_SEQUENCE_LIST: {        case UNPACK_EX: {        case DELETE_ATTR: {        case STORE_GLOBAL: {        case DELETE_GLOBAL: {        case _LOAD_LOCALS: {        case _LOAD_FROM_DICT_OR_GLOBALS: {        case LOAD_GLOBAL: {        case DELETE_FAST: {        case DELETE_DEREF: {        case LOAD_FROM_DICT_OR_DEREF: {        case LOAD_DEREF: {        case STORE_DEREF: {        case COPY_FREE_VARS: {        case BUILD_STRING: {        case BUILD_TUPLE: {        case BUILD_LIST: {        case LIST_EXTEND: {        case SET_UPDATE: {        case BUILD_SET: {        case BUILD_MAP: {        case SETUP_ANNOTATIONS: {        case BUILD_CONST_KEY_MAP: {        case DICT_UPDATE: {        case DICT_MERGE: {        case MAP_ADD: {        case LOAD_SUPER_ATTR_ATTR: {        case LOAD_SUPER_ATTR_METHOD: {        case LOAD_ATTR: {        case COMPARE_OP: {        case COMPARE_OP_FLOAT: {        case COMPARE_OP_INT: {        case COMPARE_OP_STR: {        case IS_OP: {        case CONTAINS_OP: {        case CHECK_EG_MATCH: {        case CHECK_EXC_MATCH: {        case JUMP_FORWARD: {        case JUMP_BACKWARD: {        case POP_JUMP_IF_FALSE: {        case POP_JUMP_IF_TRUE: {        case POP_JUMP_IF_NOT_NONE: {        case POP_JUMP_IF_NONE: {        case JUMP_BACKWARD_NO_INTERRUPT: {        case GET_LEN: {        case MATCH_CLASS: {        case MATCH_MAPPING: {        case MATCH_SEQUENCE: {        case MATCH_KEYS: {        case GET_ITER: {        case GET_YIELD_FROM_ITER: {        case FOR_ITER: {        case FOR_ITER_LIST: {        case FOR_ITER_TUPLE: {        case FOR_ITER_RANGE: {        case WITH_EXCEPT_START: {        case PUSH_EXC_INFO: {        case EXIT_INIT_CHECK: {        case MAKE_FUNCTION: {        case SET_FUNCTION_ATTRIBUTE: {        case BUILD_SLICE: {        case CONVERT_VALUE: {        case FORMAT_SIMPLE: {        case FORMAT_WITH_SPEC: {        case COPY: {        case BINARY_OP: {        case SWAP: {

Copy link

MemberAuthor

gvanrossum commentedJul 4, 2023

FWIW the Tier 2 interpreter now supports 109 bytecodes and 12 uops (the Tier 1 interpreter supports 202 bytecodes).

It looks like if we can manage to arrange that aKW_NAMES instruction and the followingCALL instructionorCALL specialization are always either part of the same trace or neither included in a trace, we could add over a dozen new bytecodes to the list (in particular, many of the simplerCALL specializations).

markshannon reviewed

Jul 4, 2023

View reviewed changes

Copy link

Member

markshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

We want to support specialized instructions, like LOAD_ATTR_INSTANCE_VALUE, and simple instructions like LOAD_FAST, but we specifically do not want to support unspecialized instructions like LOAD_ATTR.

If those instructions show up, either we are projecting the superblock too early or, more likely, need better specialization.

Regarding jumps:

Unconditional jumps should be nops, simply relying onSAVE_IP to update the externally visible state
Conditional jumps should be converted to conditional exits and unconditional jumps.

Python/bytecodes.c


		inst(JUMP_FORWARD, (--)) {
		JUMPBY(oparg);
		JUMP_POP_DISPATCH(oparg,0);

Copy link

Member

markshannonJul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Can we leave dispatch to the code generator?
The saved instruction pointer (frame->prev_instr) is part of the VM state like any other, so shouldn't need special casing.Unless the necessary information is not otherwise present. All the information necessary is inJUMPBY(oparg).

Copy link

MemberAuthor

gvanrossumJul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Okay, so you're saying that for Tier 2 the code generator should just replaceJUMPBY(<expr>) with something Tier-2-appropriate. That could actually work. There are three places where we shouldn't do that,SEND,JUMP_BACKWARD andENTER_EXECUTOR, we can exclude those by forbidding something they use.

Copy link

Member

markshannonJul 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

SEND will probably need to be broken up into two micro ops, one that does the send, and another that does the jump. TheSEND jump should be a more-or-less normal jump. We also need to track where we are sending from, to handle the matchingYIELD_VALUE, so I'd "forbid"SEND for now.

I think you already handleJUMP_BACKWARD andENTER_EXECUTOR correctly. They complete the loop, or exit.

Python/optimizer.c Outdated

		ADD_TO_TRACE(SAVE_IP, (int)(instr- (_Py_CODEUNIT*)code->co_code_adaptive));
		intopcode=instr->op.code;
		uint64_toperand=instr->op.arg;
		// TODO: EXTENDED_ARG handling

Copy link

Member

markshannonJul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Setoparg = 0, loop over EXTENDED_ARGS, shifting as you go then shift in the final oparg.
https://github.com/python/cpython/blob/main/Python/instrumentation.c#L1332

Copy link

MemberAuthor

gvanrossumJul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Will do.

Copy link

MemberAuthor

gvanrossumJul 5, 2023•
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This turns out more complicated than I expected. The big question is:what IP shouldSAVE_IP save?

To make my current deopt strategy work (re-execute the specialized bytecode, which will fail the same guard and then deopt to the generic bytecode), the saved IP should point to theEXTENDED_ARG instruction. But the jump offsets are calculated relative to the jumping opcode. I suppose I could add the number ofEXTENDED_ARG opcodes to the jump offsets (or subtract, for backward jumps).

But I worry that the error handling code might also need an adjustment -- I don't know ifEXTENDED_ARG is included in exception handling ranges or if the error handling looks at the last executed bytecode.

I guess for now I can just treatEXTENDED_ARG as an unsupported opcode and kick this can of worms down the road...

Copy link

Member

markshannonJul 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Saving the IP of the start of theinstruction (not codeunit) is the correct thing, I think.
On entering the tier 1 interpreter, it will then get the correctoparg as it will execute all theEXTENDED_ARGs.

EXTENDED_ARG can never cause an error, so it doesn't matter whether they are included in the jump tables, although I think they are anyway.

Python/optimizer.c Outdated

		_PyExecutorObjectexecutor= (_PyExecutorObject)code->co_executors->executors[operand&255];
		opcode=executor->vm_data.opcode;
		DPRINTF(2," * ENTER_EXECUTOR -> %s\n",_PyOpcode_OpName[opcode]);
		operand=executor->vm_data.oparg;// TODO: EXTENDED_ARG handling

Copy link

Member

markshannonJul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

operand = (operand & 0xffffff00) | executor->vm_data.oparg

Python/executor_cases.c.h Outdated

		}

		caseTO_BOOL: {
		static_assert(INLINE_CACHE_ENTRIES_TO_BOOL==3,"incorrect cache size");

Copy link

Member

markshannonJul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

We don't want unspecialized instructions in the superblocks. I'd treatENABLE_SPECIALIZATION as one of the forbidden words.
The fix is to improve our specialization, not to handle unspecialized instructions.

Copy link

MemberAuthor

gvanrossumJul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

That seems a bit harsh. It would mean that if we have a sequence of innocent bytecodes like

LOAD_FAST iLOAD_CONST jBINARY_OP (/)STORE_FAST k...

we would not be able to include theBINARY_OP andSTORE_FAST instructions in a superblock, because the unspecializedBINARY_OP ends the superblock generation.

Copy link

Member

markshannonJul 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Harsh, but fair.BINARY_OP needs to work harder at being specialized 🙂

Seriously though, there is no merit in longer superblocks if we can't optimize them well, and the unspecialized forms do not optimize well. They are much slower, and force us to discard a lot of type information.

It is far more profitable to improve the specializer, so that unspecialized instructions are rare.

Python/optimizer.c

		operand=executor->vm_data.oparg;// TODO: EXTENDED_ARG handling
		}
		switch (opcode) {
		caseLOAD_FAST_LOAD_FAST:

Copy link

Member

markshannonJul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Rather than special case individual instructions, can we test on flags?
That way, this will continue to work if we addLOAD_FAST_LOAD_CONST.
Something likeif (_Py_OpcodeMetadata[opcode].is_super_instruction)

Copy link

MemberAuthor

gvanrossumJul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

We could do something where the code generator encodes this in thesize field of the expansions, in the switch (currently) at line 405 below. E.g.-1 meansoperand >> 4 and-2 meansoperand & 0xF. Should probably switch to an enum. The generator would have to work a little harder but that seems doable (e.g. it could detect the use ofoparg1 andoparg2 and then parse the instruction name into two pieces).

Copy link

Member

markshannonJul 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

That should work, although I was thinking of the dumber approach of just annotatingLOAD_FAST_LOAD_FAST, etc as a superinstruction in bytecodes.c. Something like adding/* superinstruction */

Python/ceval.c

		///////////////////// Experimental UOp Interpreter /////////////////////

		#undef JUMP_POP_DISPATCH
		#defineJUMP_POP_DISPATCH(x,n) \

Copy link

Member

markshannonJul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This seems unnecessary.

The superblock generator should be converting each conditional jump into a conditional exit, followed by an unconditional jump.
Such that:

POP_JUMP_IF_TRUE label

becomes:

False branch more likely

EXIT_IF_TRUE

True branch more likely

EXIT_IF_FALSEJUMP_FORWARD label

As for which branch is more likely, we will need to record which way branches go.

Copy link

MemberAuthor

gvanrossumJul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Okay, just so I understand correctly, the idea is thatEXIT_IF_TRUE/FALSE doesn't pop, and continues in the bytecode at the originalPOP_JUMP_IF_TRUE instruction, which will pop. So the "False branch more likely" (i.e., branch not likely) version will have to actually translate toEXIT_IF_TRUE; POP_TOP. We could special-case this in the translator: it would have put such a list of uops in the expansion forPOP_JUMP_IF_TRUE and friends, and we'd have to add hand-written cases to the uop executor forEXIT_IF_TRUE etc.

In the other case (branch likely) theJUMP_FORWARD label could just beSAVE_IP label; superblock generation would then continue from the bytecode atlabel. That's not something I currently do -- a simplistic approach could do theSAVE_IP followed byEXIT_TRACE, and we can iterate later on continuing fromlabel.

But recording which way branches go is something for a future PR; again the most simplistic thing to do for now is to assume that branching is less likely than not branching (i.e., always generateEXIT_IF_TRUE; POP_TOP for now).

Honestly, for now I think I'll stick to just making the generator explicitly translateJUMPBY(n) into the correct sequence ofJUMPBY,STACK_SHRINK andDISPATCH. Nothing should followJUMPBY then.

Copy link

Member

markshannonJul 5, 2023•
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I think it makes more sense to consume the condition before exiting

These two instructions,POP_JUMP_IF_TRUE andPOP_JUMP_IF_FALSE (theNone variants can be broken down intoLOAD_CONST None; IS_OP; POP_JUMP...) are special, so don't worry too much about fitting them into the more general framework.

There are a number of approaches, for exits. Here are three:

Leave the condition on the stack, and exit to thePOP_JUMP... instruction.
I don't like this approach as it is inefficient and will make trace stitching tricky
Pop the condition and exit to the target, or successor, instruction.
This is better, but means that theEXIT uop needs to handle a jump and the exit.
Convert the superblock into a tree, jumping to an exit branch which makes an unconditional exit.
Definitely my preferred option. Means that there is only oneEXIT uop and that it just exits.

Option 3 makes the superblock creation more complex, but simplifies the tier2 interpreter.
We will want to push some operations onto exit branches in the specializer and partial evaluator, so we will need to convert the superblock to a tree at some point.

If option 3 is too complex for this PR, option 2 should work as a temporary step.

So for option 3, using the example above ofPOP_JUMP_IF_TRUE label we get:

False branch more likely

    POP_JUMP_IF_TRUE_UOP uop_label:    ...uop_label:    SAVE_IP label    EXIT

True branch more likely

    POP_JUMP_IF_FALSE_UOP uop_label:    SAVE_IP label    ...uop_label:    EXIT

gvanrossum commented

Jul 4, 2023

View reviewed changes

Copy link

MemberAuthor

gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Thanks, this gave me food for thought.

Python/bytecodes.c


		inst(JUMP_FORWARD, (--)) {
		JUMPBY(oparg);
		JUMP_POP_DISPATCH(oparg,0);

Copy link

MemberAuthor

gvanrossumJul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Python/ceval.c

		///////////////////// Experimental UOp Interpreter /////////////////////

		#undef JUMP_POP_DISPATCH
		#defineJUMP_POP_DISPATCH(x,n) \

Copy link

MemberAuthor

gvanrossumJul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Python/executor_cases.c.h Outdated

		}

		caseTO_BOOL: {
		static_assert(INLINE_CACHE_ENTRIES_TO_BOOL==3,"incorrect cache size");

Copy link

MemberAuthor

gvanrossumJul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

That seems a bit harsh. It would mean that if we have a sequence of innocent bytecodes like

LOAD_FAST iLOAD_CONST jBINARY_OP (/)STORE_FAST k...

we would not be able to include theBINARY_OP andSTORE_FAST instructions in a superblock, because the unspecializedBINARY_OP ends the superblock generation.

Python/executor_cases.c.h Outdated

Comment on lines 126 to 152

		#line127 "Python/executor_cases.c.h"
		#line153 "Python/executor_cases.c.h"

Copy link

MemberAuthor

gvanrossumJul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Useless diff chunks like this make it harder to review the generated output. I would love to drop the#line directives. (Unsurprisingly, I get a lot of value out of seeing the diff of the generated code when I am fiddling with the code generator.) For debugging you could still generate line numbers using

python3 Tools/cases_generator/generate_cases.py -lPy_BUILD_CORE

Python/optimizer.c

		operand=executor->vm_data.oparg;// TODO: EXTENDED_ARG handling
		}
		switch (opcode) {
		caseLOAD_FAST_LOAD_FAST:

Copy link

MemberAuthor

gvanrossumJul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Python/optimizer.c Outdated

		ADD_TO_TRACE(SAVE_IP, (int)(instr- (_Py_CODEUNIT*)code->co_code_adaptive));
		intopcode=instr->op.code;
		uint64_toperand=instr->op.arg;
		// TODO: EXTENDED_ARG handling

Copy link

MemberAuthor

gvanrossumJul 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Will do.

markshannon reviewed

Jul 5, 2023

View reviewed changes

Copy link

Member

markshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Handling branches is a bit tricky, so it's OK to take a bit of time to get this right.

Python/bytecodes.c


		inst(JUMP_FORWARD, (--)) {
		JUMPBY(oparg);
		JUMP_POP_DISPATCH(oparg,0);

Copy link

Member

markshannonJul 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I think you already handleJUMP_BACKWARD andENTER_EXECUTOR correctly. They complete the loop, or exit.

Python/ceval.c

		///////////////////// Experimental UOp Interpreter /////////////////////

		#undef JUMP_POP_DISPATCH
		#defineJUMP_POP_DISPATCH(x,n) \

Copy link

Member

markshannonJul 5, 2023•
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I think it makes more sense to consume the condition before exiting

There are a number of approaches, for exits. Here are three:

Leave the condition on the stack, and exit to thePOP_JUMP... instruction.
I don't like this approach as it is inefficient and will make trace stitching tricky
Pop the condition and exit to the target, or successor, instruction.
This is better, but means that theEXIT uop needs to handle a jump and the exit.
Convert the superblock into a tree, jumping to an exit branch which makes an unconditional exit.
Definitely my preferred option. Means that there is only oneEXIT uop and that it just exits.

If option 3 is too complex for this PR, option 2 should work as a temporary step.

So for option 3, using the example above ofPOP_JUMP_IF_TRUE label we get:

False branch more likely

    POP_JUMP_IF_TRUE_UOP uop_label:    ...uop_label:    SAVE_IP label    EXIT

True branch more likely

    POP_JUMP_IF_FALSE_UOP uop_label:    SAVE_IP label    ...uop_label:    EXIT

Python/optimizer.c

		operand=executor->vm_data.oparg;// TODO: EXTENDED_ARG handling
		}
		switch (opcode) {
		caseLOAD_FAST_LOAD_FAST:

Copy link

Member

markshannonJul 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

That should work, although I was thinking of the dumber approach of just annotatingLOAD_FAST_LOAD_FAST, etc as a superinstruction in bytecodes.c. Something like adding/* superinstruction */

gvanrossum marked this pull request as draft

July 5, 2023 21:19

bedevere-bot removed the awaiting core review label

Jul 5, 2023

Copy link

MemberAuthor

gvanrossum commentedJul 6, 2023•
edited
Loading

Drat, the super-instruction refactor is broken. But1868f91 works great!

I'll debug later.

Copy link

Member

markshannon commentedJul 6, 2023

I think it might be worth breaking up this PR into 3:

The fix in handling errors in_PyOptimizer_BackEdge
Handling of superinstructions
Jumps.

The handling of jumps has some subtleties, so we might as well get the other changes in while we refine the design of jump handling.

Copy link

MemberAuthor

gvanrossum commentedJul 6, 2023

I think it might be worth breaking up this PR into 3:
The fix in handling errors in_PyOptimizer_BackEdge
Handling of superinstructions
Jumps.
The handling of jumps has some subtleties, so we might as well get the other changes in while we refine the design of jump handling.

I might do that, though I suspect there are mute lurking bugs that we will only find by attempting to implement jumps. Also, jumps will be an iterative design anyway.

And where do you see handling EXTENDED_ARG fitting in?

Copy link

Member

markshannon commentedJul 6, 2023

And where do you see handling EXTENDED_ARG fitting in?

In another PR as well. I much prefer lots of small PRs. Large, draft PRs tend to block other progress.

gvanrossum mentioned this pull request

Jul 6, 2023

gh-104584: Fix error handling from backedge optimization#106484

Merged

Copy link

MemberAuthor

gvanrossum commentedJul 6, 2023

In another PR as well. I much prefer lots of small PRs. Large, draft PRs tend to block other progress.

Will do. I wonder though -- is a draft PR more of a problem than a private branch? If you'd rather have me do more of the latter I'm fine with that, but draft PRs have the advantage that they run CI.

gvanrossum added a commit that referenced this pull request

Jul 6, 2023

gh-104584: Fix error handling from backedge optimization (#106484)

003ba71

When `_PyOptimizer_BackEdge` returns `NULL`, we should restore `next_instr` (and `stack_pointer`). To accomplish this we should jump to `resume_with_error` instead of just `error`.The problem this causes is subtle -- the only repro I have is in PRgh-106393, at commitd7df54b. But the fix is real (as shown later in that PR).While we're at it, also improve the debug output: the offsets at which traces are identified are now measured in bytes, and always show the start offset. This makes it easier to correlate executor calls with optimizer calls, and either with `dis` output.<!-- gh-issue-number:gh-104584 -->* Issue:gh-104584<!-- /gh-issue-number -->

gvanrossum force-pushed thejump-uops branch from3c46ae9 to0f21e09Compare

July 6, 2023 23:42

Copy link

MemberAuthor

gvanrossum commentedJul 6, 2023

I pushed a leaner version, but without any of the significant changes to the branch strategy, so it's still a draft. Everything else has been split out or is being split out.

gvanrossum force-pushed thejump-uops branch 2 times, most recently from8d07517 toc87f9d6Compare

July 7, 2023 04:07

gvanrossum added5 commits

July 7, 2023 10:19

Hacky way to implement jump ops

d31d003

Introduce a new macro, JUMP_POP_DISPATCH(x, n).This does JUMPBY(x), STACK_SHRINK(n), DISPATCH().Most JUMP opcodes can use this.The exceptions are SEND, JUMP_BACKWARD, and JUMP_BACKWARD_NO_INTERRUPT.For JUMP_BACKWARD, I have to research whether CHECK_EVAL_BREAKER() and JUMPBY() commute.I think I'll just punt on SEND (it's too complex anyways).

Make JUMP_BACKWARD[_NO_INTERRUPT] viable

b886ca6

This involves some subtle rearrangement of code in JUMP_BACKWARD.

Replace ENTER_EXECUTOR with the original in trace projection

28b3951

Make POP_JUMP_IF_FALSE/TRUE conditionally jump

0825696

This may destroy the symmetry or slow things down,but (for now) it's needed so that the executor canat least avoid bailing when the jump is not taken.(The original code was doing a jump 0 in that case.)

Implement JUMP_TO_TOP special uop

1df7b84

If JUMP_BACKWARD jumps to the start of the trace, add this.It contains an eval breaker check.

gvanrossum added4 commits

July 7, 2023 10:20

Cleanup some commented-out lines

d6235ee

Tweak FOR_ITER so it's viable

6cd5719

Check room available for JUMP_TO_TOP

b33a1a2

A stray JUMP_BACKWARD also ends the trace

f5e18d8

gvanrossum force-pushed thejump-uops branch fromc87f9d6 tof5e18d8Compare

July 7, 2023 17:21

Copy link

MemberAuthor

gvanrossum commentedJul 7, 2023

Okay, I'll close this. I'll create a new issue describing how I think branches could be handled.

gvanrossum closed this

Jul 7, 2023

This was referencedJul 7, 2023

gh-104584: Replace ENTER_EXECUTOR with the original in trace projection#106526

Merged

Branching design for Tier 2 (uops) interpreter#106529

Open

gh-106529: Support FOR_ITER specializations as uops#106542

Closed

gvanrossum deleted the jump-uops branch

July 12, 2023 17:27

Labels

skip news

Movatterモバイル変換

Uh oh!

gh-104584: Support most jumping instructions#106393

gh-104584: Support most jumping instructions#106393

Uh oh!

Conversation

gvanrossum commentedJul 4, 2023• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Uh oh!

gvanrossum commentedJul 4, 2023

Uh oh!

gvanrossum commentedJul 4, 2023

Uh oh!

markshannon left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

gvanrossumJul 5, 2023• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

False branch more likely

True branch more likely

Uh oh!

Choose a reason for hiding this comment

Uh oh!

markshannonJul 5, 2023• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

Choose a reason for hiding this comment

False branch more likely

True branch more likely

Uh oh!

gvanrossum left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

markshannon left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

markshannonJul 5, 2023• editedLoading Uh oh!There was an error while loading.Please reload this page.

Uh oh!

gvanrossum commentedJul 4, 2023•
edited
Loading

gvanrossumJul 5, 2023•
edited
Loading

markshannonJul 5, 2023•
edited
Loading

markshannonJul 5, 2023•
edited
Loading

gvanrossum commentedJul 6, 2023•
edited
Loading