Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-107557: Setup abstract interpretation#107847

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
Fidget-Spinner merged 54 commits intopython:mainfromFidget-Spinner:partition_algo
Aug 15, 2023

Conversation

@Fidget-Spinner
Copy link
Member

@Fidget-SpinnerFidget-Spinner commentedAug 10, 2023
edited
Loading

This is joint work by@JuliaPoo and me, supported by the NUS TEST Lab.

This implements a subset of partial evaluation (specifically, constant propagation), using a technique described for our type propagation in ourTier 2 interpreter report and abstract interpretation over uops.

The main goal of upstreaming this PR is to set up the infrastructure for optimization passes of CPython uops. With constant propagation, turning global loads or attribute loads to constants at the region formation phase will open up further optimizations in subsequent passes. This PR also makes CPython uops ready for type propagation, thus allowing wide-scale typed operations.

Features:

  • An automatically generated abstract interpreter from the interpreter DSL.
  • Abstract interpretation of uops.
  • Partitioning of values on the stack and locals into static/dynamic using the concept of "partitions of nodes".
  • Introduces the concept of "pure" operations, and what can be ascertained about them.
  • Constant propagation purely via abstract interpretation of bytecode, with no requirement for SSA or AST.
  • An optimization pass to remove redundantSAVE_IPs after partial evaluation.
  • Jump target/instruction numbering, which allows us to freely relocate jump targets/instructions, as a final pass will fix them up.

Example:
We have the following test case.

        def testfunc(loops):            num = 0            while num < loops:                x = 0                y = 1                z = 2                a = x + y + z + x + y + z + x + y + z                num += 1            return a

This is now essentially simplified down to:

        def testfunc(loops):            num = 0            while num < loops:                x = 0                y = 1                z = 2                a = 9                num += 1            return a

Which roughly halves the trace length.

TODO:

  • Uselltrace convention commonly seen in rest of the uops codebase.
  • Cleanup.

Fidget-Spinnerand others added30 commitsAugust 2, 2023 17:52
Co-Authored-By: Guido van Rossum <gvanrossum@users.noreply.github.com>
Co-Authored-By: Jules <57632293+juliapoo@users.noreply.github.com>
Co-Authored-By: Jules <57632293+juliapoo@users.noreply.github.com>
Co-Authored-By: Jules <57632293+juliapoo@users.noreply.github.com>
Co-Authored-By: Jules <57632293+juliapoo@users.noreply.github.com>
* Fix+Refactor: Handling of root nodes in special-cased type prop* Style: Removed trailing space
…WRITE` and mis-port of `PARTITIONNODE_OVERWRITE` (#41)* Fix: Inconsistent AbstractInterpContext used in PARTITIONNODE_OVERWRITE and typo in PARTITIONNODE_OVERWRITE* Style: Removed whitespace
Copy link
Member

@gvanrossumgvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Here are some nits for everything except optimizer_analysis.c

Comment on lines 14 to 15
int32_topcode;
int32_toparg;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I'm curious about this change. Are you using negative opcodes or opargs? IIUC oparg really is treated as a 32-bit unsigned int in ceval.c and bytecodes.c.

Copy link
MemberAuthor

@Fidget-SpinnerFidget-SpinnerAug 14, 2023
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I'm using both negative ints and opargs, but only during the analysis phase. They are converted back to unsigned after the last pass.

For all intents and purposes outside of the optimizer pass, they can be treated as unsigned.

ifinstr.is_viable_uop()andinstr.namenotinSPECIALLY_HANDLED_ABSTRACT_INSTR:
self.out.emit("")
withself.out.block(f"case{thing.name}:"):
instr.write(self.out,tier=TIER_TWO)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Note that ingh-107760 I'm removingInstruction.write altogether.

Comment on lines +707 to +708
a.write_abstract_interpreter_instructions(args.abstract_interpreter_cases,
args.emit_line_directives)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Heh, I noticed this too. :-)

Copy link
Member

@gvanrossumgvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Looking at optimizer_analysis.c I am wondering if that code is ready for prime time. Maybe there's a way that we can add most of theinfrastructure code (that is likely relatively stable) without adding any of the code that's still under heavy development? E.g. I'm fine with the extra optimizer argument and generating a new .c.h file, and even with having a dummy optimizer_analysis.c file, but I worry that the latter will undergo many serious changes that will cause a lot of churn.

It is also by far the largest amount of code, and of a highly algorithmic nature that I won't be able to review meaningfully until you and I have sat down for a review of the algorithm. Compare this to most of the Tier 2 work so far, which is mostly just engineering -- e.g. splitting bytecodes into guard and action uops that use at most one cache entry feels more like a refactoring, and the superblock generator and interpreter are also mostly just hard work, not deep thinking.

@Fidget-Spinner
Copy link
MemberAuthor

Fidget-Spinner commentedAug 14, 2023
edited
Loading

Yeah I'm concerned with the changes in the analysis file as well. The other problem I also thought of is that new uops (especially calls) would need modifications to the analysis file and that requires deep understanding of the algorithm.

Would you be amenable if I guarded the optimization pass behind a define that is by default off? That way we can still experiment with things but not impede other efforts.

@gvanrossum
Copy link
Member

That would totally work. Can you update the PR?

Fidget-Spinner reacted with thumbs up emoji

@Fidget-Spinner
Copy link
MemberAuthor

I've decided to use an env var instead, because it's easier for testing. With a define, we would not be able to run any tests except with a custom branch of cpython.

// Inserts TOS at position specified by oparg
PyObject*tos=TOP();
for (inti=1;i<oparg+1;i++) {
stack_pointer[i]=stack_pointer[i-1];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Hm, doesn't this repeat the first stack element over and over? Maybememmove() would do what you want? (It is supposed to be good at overlaps, unlikememcpy().)

Copy link
MemberAuthor

@Fidget-SpinnerFidget-SpinnerAug 15, 2023
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Woops yeah thanks for catching this! Forgot to negate the indexes.

gvanrossum reacted with laugh emoji
Comment on lines 645 to 650
caseparsing.InstDef():
instr=AbstractInstruction(self.instrs[thing.name].inst)
ifinstr.is_viable_uop()andinstr.namenotinSPECIALLY_HANDLED_ABSTRACT_INSTR:
self.out.emit("")
withself.out.block(f"case{thing.name}:"):
instr.write(self.out,tier=TIER_TWO)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I wonder if you even need theAbstractInstruction class. Maybe you could just call a different method onInstruction?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I was thinking of how we might need to expand on it more in the future, so a separate class might be better.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

When that happens in the future you can refactor the code. Until then, I recommend less code.

As you may have noticed this code gets refactored a lot. :-) It's easy because the generator is not a public API -- all we care about is whether it generates a handful of files correctly from bytecodes.c at build time. But I worry about copying and pasting code, because that's harder to refactor.

@Fidget-Spinner
Copy link
MemberAuthor

Guido, thanks for the multiple rounds of reviews. It's a gigantic PR, and I truly appreciate your time on this!

gvanrossum reacted with heart emoji

Copy link
Member

@gvanrossumgvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I'm still concerned that we have that 1000+ line experimental file. I think requiring you to have a branch with just the changes to that file in it makes sense until it all works reliably. (For comparison, Brandt's copy-and-patch is also still a branch.)

Fidget-Spinner reacted with thumbs up emoji
caseINSERT: {
PyObject*top;
PyObject**stuff;
PyObject**stuff1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Hm, the warnings are kind of annoying -- let's see if we can rid of those.

@Fidget-Spinner
Copy link
MemberAuthor

I'm still concerned that we have that 1000+ line experimental file. I think requiring you to have a branch with just the changes to that file in it makes sense until it all works reliably. (For comparison, Brandt's copy-and-patch is also still a branch.)

Got it. I removed the experimental optimizer part and pushed it to another branch.

@Fidget-SpinnerFidget-Spinner changed the titlegh-107557: Limited partial evaluation via abstract interpretationgh-107557: Setup abstract interpretationAug 15, 2023
Copy link
Member

@gvanrossumgvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Great. Do you think this is in the state you'd like to merge now?

@Fidget-Spinner
Copy link
MemberAuthor

Yes I'm happy merging this!

Copy link
Member

@gvanrossumgvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Go for it!

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@gvanrossumgvanrossumgvanrossum approved these changes

@iritkatrieliritkatrielAwaiting requested review from iritkatriel

@markshannonmarkshannonAwaiting requested review from markshannonmarkshannon is a code owner

@brandtbucherbrandtbucherAwaiting requested review from brandtbucher

Assignees

No one assigned

Labels

interpreter-core(Objects, Python, Grammar, and Parser dirs)

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

6 participants

@Fidget-Spinner@gvanrossum@ambv@bedevere-bot@Eclips4@JuliaPoo

[8]ページ先頭

©2009-2025 Movatter.jp