Adds a new Storage class which combines the state of the stack and the local variables within a micro-op.
Adds copying and merging functionality to that class, and the underlying stack and locals, to support tracking state across divergent flow.
Splits and merges the storage in if statements.
Tracks the state of the conceptual stack and local variables, ensuring the necessary values are written to memory when needed.
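
As a rough illustration of the copy-and-merge idea described above, here is a minimal Python sketch. It is not the code from this PR: the `Local` and `Storage` classes, their fields, and the merge rule (a value counts as "in memory" only if every path wrote it) are simplifying assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Local:
    """Hypothetical model of a value that may or may not be written to memory."""
    name: str
    in_memory: bool = False

    def copy(self) -> "Local":
        return Local(self.name, self.in_memory)


@dataclass
class Storage:
    """Toy stand-in: the conceptual stack plus the output locals of one micro-op."""
    stack: list[Local] = field(default_factory=list)
    outputs: list[Local] = field(default_factory=list)

    def copy(self) -> "Storage":
        # Deep-copy so each branch of an `if` can be tracked independently.
        return Storage([v.copy() for v in self.stack],
                       [v.copy() for v in self.outputs])

    def merge(self, other: "Storage") -> None:
        """Reconcile this state with the state reached on another path."""
        if [v.name for v in self.stack] != [v.name for v in other.stack]:
            raise ValueError("stacks differ at merge point")
        for mine, theirs in zip(self.stack, other.stack):
            # Assumption: a value is known to be in memory only if both paths wrote it.
            mine.in_memory = mine.in_memory and theirs.in_memory


# Split at an `if`, change one branch, then merge at the join point.
before = Storage(stack=[Local("left"), Local("right")])
then_branch = before.copy()
then_branch.stack[0].in_memory = True  # only this path spilled `left`
else_branch = before.copy()
then_branch.merge(else_branch)
```

After the merge, `left` is treated as not yet in memory, because the else path never wrote it.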

The code generator needs to tell when inputs are dead, in order to know when the stack_pointer should be reduced when a call escapes. We can track explicit PyStackRef_CLOSE and DECREF_INPUTS, but many cases are implicit.
For these cases we add the DEAD macro to mark the variable as dead.
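
The liveness tracking described above might be sketched as follows. This is a toy model under stated assumptions, not the generator's algorithm: it scans statement strings with a regular expression, treats the last input as the top of the stack, and the `consume` call is a hypothetical function that takes ownership of its argument.

```python
import re

# Spellings taken from the description above; the scanning approach is an assumption.
KILL = re.compile(r"\b(?:PyStackRef_CLOSE|DEAD)\((\w+)\)")


def dead_inputs(inputs: list[str], body: list[str]) -> set[str]:
    """Return the inputs that the statements in `body` mark as dead."""
    dead: set[str] = set()
    for stmt in body:
        if "DECREF_INPUTS()" in stmt:
            dead.update(inputs)
        for name in KILL.findall(stmt):
            if name in inputs:
                dead.add(name)
    return dead


def poppable(inputs: list[str], dead: set[str]) -> int:
    """Count the dead inputs at the top of the stack (last input is the top)."""
    count = 0
    for name in reversed(inputs):
        if name not in dead:
            break
        count += 1
    return count


inputs = ["callable", "arg"]
body = [
    "res = consume(arg);",  # hypothetical call that consumes the reference
    "DEAD(arg);",           # implicit case, so the death is stated explicitly
]
dead = dead_inputs(inputs, body)
print(sorted(dead), poppable(inputs, dead))
```

In this model, only `arg` is dead, so the stack pointer could be lowered by one slot before a later escaping call.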

To simplify parsing, the code generator also enforces PEP 7 rules for braces.
A few of the changes in bytecodes.c are a result of this change.
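
To give a feel for what enforcing braces buys the parser, here is a small Python sketch of such a check. It is an illustration only: the function name and error text are made up, it works on raw text rather than tokens, and it does not handle `do ... while` loops, comments, or string literals.

```python
import re

KEYWORD = re.compile(r"\b(if|for|while|else)\b")


def missing_braces(source: str) -> list[str]:
    """Report control statements whose body does not start with '{'."""
    problems = []
    for match in KEYWORD.finditer(source):
        keyword = match.group(1)
        pos = match.end()
        if keyword != "else":
            # Skip the parenthesised condition, tracking nesting depth.
            pos = source.index("(", pos)
            depth = 0
            while True:
                if source[pos] == "(":
                    depth += 1
                elif source[pos] == ")":
                    depth -= 1
                pos += 1
                if depth == 0:
                    break
        rest = source[pos:].lstrip()
        if keyword == "else" and rest.startswith("if"):
            continue  # `else if`: the following `if` is checked on its own
        if not rest.startswith("{"):
            problems.append(f"PEP 7: expected '{{' after '{keyword}'")
    return problems


print(missing_braces("if (f(x)) { y(); }"))
print(missing_braces("if (x) y();"))
```

With every body wrapped in braces, the end of each branch is just the matching `}`, which is much easier to find than the end of an arbitrary C statement.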

Both interpreter performance and JIT performance show no slowdown.

Replaces #123397

Issue: Spill the stack pointer across calls in the interpreter. #119866

markshannon added 30 commits

August 7, 2024 16:13

Add copy and == support to Stack class

75bda28

Merge branch 'main' into stack-copy-and-merge

c7e9102

blacken stack.py

d331371

Cases generator: Track reachability and divergent stacks in if statements.

4673d8a

Fix type errors and rename ahead to look_ahead

132df06

Track state of output variables

b7f71d4

Handle stack and output locals togather as Storage class

b97dea9

Track locals as well as stack on differing paths

9838a05

Use 'PEP 7' in syntax error, to make it clear that this is not a C syntax error, just a tool limitation.

1f829be

Push peeks back to stack in optimizer code gen

d2e5f12

Merge branch 'main' into stack-copy-and-merge

2753014

Update test

1754fc4

Cleanup whitespace

98f9720

Add tests for PEP 7 parsing and escaping call in condition

0284b3f

Remove merge artifact

03bea71

Remove test for escaping calls in conditions

ca2f457

Spill before escaping calls. Initial attempt

57de61f

Clean up asserts

a2e430a

Update test

95408d3

Flush locals as well on error

eb3a645

Spill stack on escaping calls. Preparatory work

00f5265

Spill stack contents on escaping calls

7de1e60

Find end of statement when anlyzing escaping calls

3bfed1b

Spill stack pointer as well. Work in progress

d7e1c82

Don't allow escaping calls in ERROR_IF or DEOPT_IF

2cc4f64

Improve tracking of stack and locals in conditional flow

8e258ad

Handle ERROR_NO_POP correctly

6da3fc6

Fix up handling of liveness

60ee3e9

Allow assignments of new refs to input as well as output variables.

9f2f3bb

Merge branch 'main' into spill-before-escaping-calls-2

87b5561

Insert extra braces to handle 'else if' without an 'else' correctly.

de3b86c

Member, Author

markshannon commented Oct 3, 2024

I've addressed all the issues.

I've increased the number of functions that we treat as non-escaping, but not every function mentioned above.

I think it is safer to do this than to try to mark everything that is non-escaping and risk making a mistake, or to risk any of those functions changing in the future such that it does escape.

Since the cost of spilling is low, spilling around such maybe-non-escaping calls has no measurable performance impact.
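
The conservative policy described in this comment could be modelled like this in Python. This is a sketch, not the generator's code: the allow-list contents are placeholders, the regex-based call detection is a simplification, and the save and reload statements are assumed spellings used for illustration.

```python
import re

# Hypothetical allow-list; the real generator maintains its own.
NON_ESCAPING = {"PyStackRef_FromPyObjectSteal", "Py_INCREF"}
C_KEYWORDS = {"if", "for", "while", "switch", "return", "sizeof"}

CALL = re.compile(r"\b([A-Za-z_]\w*)\s*\(")


def emit(statements: list[str]) -> list[str]:
    """Wrap every statement containing a possibly-escaping call in a spill/reload pair."""
    out = []
    for stmt in statements:
        called = [name for name in CALL.findall(stmt) if name not in C_KEYWORDS]
        # Anything not explicitly known to be safe is treated as escaping.
        escapes = any(name not in NON_ESCAPING for name in called)
        if escapes:
            out.append("_PyFrame_SetStackPointer(frame, stack_pointer);")
        out.append(stmt)
        if escapes:
            out.append("stack_pointer = _PyFrame_GetStackPointer(frame);")
    return out


for line in emit(["Py_INCREF(x);", "res = PyNumber_Add(a, b);"]):
    print(line)
```

Defaulting to "escaping" means a function missing from the allow-list costs one extra spill rather than a correctness bug.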

markshannon added 2 commits

October 3, 2024 07:36

Merge branch 'main' into spill-before-escaping-calls-2

2db84df

Fix test. gc.get_referrers can now see values on generator stacks

a1e55d4

Fidget-Spinner approved these changes

Oct 4, 2024

Member

Fidget-Spinner left a comment

I'm going to approve this on the condition that usage of DEAD is properly documented in a follow-up PR. I don't think it's so clear when to use it correctly and when it might be misused.

bedevere-app bot added awaiting merge and removed awaiting core review labels

Oct 4, 2024

Member

Fidget-Spinner commented Oct 4, 2024

Alternatively, you can add the documentation in, since you need to fix the failing test cases anyways.

markshannon requested review from 1st1, asvetlov, gvanrossum, kumaraditya303 and willingc as code owners

October 4, 2024 14:22

Member, Author

markshannon commented Oct 4, 2024

I will leave the documentation of DEAD to another PR, as ERROR_IF and other macros are not documented either.
It will be easier to do them all together.

kumaraditya303 reviewed

Oct 4, 2024

Lib/test/test_asyncio/test_streams.py

		self.assertListEqual(gc.get_referrers(exc), [])

		asyncio.run(main())
		self.assertListEqual(gc.get_referrers(exc), [main_coro])

Contributor

kumaraditya303 Oct 4, 2024

Why this change?

Member, Author

markshannon Oct 4, 2024

Because we are now spilling the stack pointer, the GC can see the locals of the generator.

gvanrossum reviewed

Oct 4, 2024

Member

gvanrossum left a comment

The changes to asyncio look fine to me, but maybe a comment (along the lines of what you explained to Kumar) would be helpful.

brandtbucher approved these changes

Oct 5, 2024

Member

brandtbucher left a comment

I'm far from an expert on the cases generator, but I didn't see any obvious issues there. A few notes:

It should be an error if somebody uses a name after it has been killed by either DEAD(...) or INPUTS_DEAD().
It should be an error to kill a name twice in the same path.
Can you remind me why DEAD(...)/INPUTS_DEAD() needs to be explicit in the code, and can't just be inferred from the location of the last use? I vaguely remember discussing this at the sprint, but I forget why.
Can we handle scalars-under-arrays in the analyzer, so we don't have these awkward length-one arrays?
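
The first two notes above (use after kill, and killing twice on one path) could be checked along these lines. This Python sketch is an illustration of the suggested checks, not something the cases generator is known to do: it handles straight-line code only, and it assumes INPUTS_DEAD() kills just the inputs that are still live.

```python
import re

KILL = re.compile(r"\b(?:DEAD|PyStackRef_CLOSE)\((\w+)\)")
NAME = re.compile(r"\b[A-Za-z_]\w*\b")


def check(inputs: list[str], body: list[str]) -> list[str]:
    """Return error messages for use-after-kill and double kills on one path."""
    dead: set[str] = set()
    errors = []
    for lineno, stmt in enumerate(body, start=1):
        killed = KILL.findall(stmt)
        # Uses are the names left after removing the kill expressions themselves.
        for name in NAME.findall(KILL.sub("", stmt)):
            if name in dead:
                errors.append(f"line {lineno}: '{name}' used after being killed")
        for name in killed:
            if name in dead:
                errors.append(f"line {lineno}: '{name}' killed twice")
            dead.add(name)
        if "INPUTS_DEAD()" in stmt:
            # Assumption: only kills inputs that are still live.
            dead.update(inputs)
    return errors


print(check(["a", "b"], ["DEAD(a);", "use(a);"]))
print(check(["a", "b"], ["DEAD(a);", "DEAD(a);"]))
```

Extending this to branching code would need the per-path state to be copied at each `if` and compared at the join point.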

I'm approving this PR because I think it probably needs to happen and I trust that this is the best way of doing this. But I think in general, we might need to take a step back and consider the huge amount of complexity that has accumulated in bytecodes.c, and the thousands of lines of code in the cases generator that process it. The DSL was originally introduced to reduce the boilerplate and complexity in the interpreter loop... I think it still does a good job of this, as evidenced by this PR. But the mental load when reading and editing this file has crept up incrementally with time, and I'm worried that rate is accelerating. It's beyond the scope of this issue, but we should probably consider if the DSL should be reworked to better support the needs of the cases generator.

Member

brandtbucher commented Oct 5, 2024

Thanks for tackling this, by the way. It was a big project and it's really cool to see it working correctly (the newly-failing test was neat to see).

Member, Author

markshannon commented Oct 7, 2024

Can you remind me why DEAD(...)/INPUTS_DEAD() needs to be explicit in the code, and can't just be inferred from the location of the last use? I vaguely remember discussing this at the sprint, but I forget why.

Because variables hold references to objects, the last use doesn't kill the variable. PyStackRef_CLOSE() closes the reference and kills the variable.
However, for calls that consume references, and for immortal objects, the code generator needs to be told that the reference is dead with DEAD.

Member, Author

markshannon commented Oct 7, 2024

Can we handle scalars-under-arrays in the analyzer, so we don't have these awkward length-one arrays?

Theoretically yes, but mixing scalars and arrays is awkward. We want to be able to move scalars into registers, but having gaps in the in-memory stack makes things complicated.

Member, Author

markshannon commented Oct 7, 2024

#125046 for the other issues.

markshannon merged commit da071fa into python:main

Oct 7, 2024

62 of 63 checks passed

bedevere-app bot removed the awaiting merge label

Oct 7, 2024

mdboom mentioned this pull request

Oct 11, 2024

Compiler crash on MSVC when building with JIT and PGO #125217

Closed

graingert added a commit to graingert/cpython that referenced this pull request

Oct 14, 2024

fix incompatability with gh-124392

79606d6

1st1 pushed a commit that referenced this pull request

Oct 14, 2024

gh-124958: Revert "gh-125472: Revert "gh-124958: fix asyncio.TaskGrou…

0b28ea4

…p and _PyFuture refcycles ... (#125486)
* Revert "gh-125472: Revert "gh-124958: fix asyncio.TaskGroup and _PyFuture refcycles (#12… (#125476)"
This reverts commit e99650b.
* fix incompatability with gh-124392

This was referenced Oct 16, 2024

[3.12] gh-124958: fix asyncio.TaskGroup and _PyFuture refcycles (#124959) #125466

Merged

[3.14] change in behaviour in gc.get_referrers(some_local) #125603

Open

This was referenced Nov 6, 2024

Mark all objects reachable from roots as live before doing main cyclic GC pass #126491

Open

JIT error stubs don't account for peeks #126222

Closed