NotificationsYou must be signed in to change notification settings
Fork33.7k
Star70.4k

gh-104635: Expand optimization for dead store elimination#106571

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Jump to bottom

Closed

corona10 wants to merge9 commits intopython:mainfromcorona10:gh-104635-extend

Closed

gh-104635: Expand optimization for dead store elimination#106571

corona10 wants to merge9 commits intopython:mainfromcorona10:gh-104635-extend

Conversation

Copy link

Member

corona10 commentedJul 9, 2023•
edited
Loading

Explain

We might also want to expand this so it can look more than one instruction ahead for another write to the same location, skipping over an allowlist of instructions that we know don't read that same local, don't execute arbitrary code, and can't raise an exception. E.g. LOAD_FAST of a different local, POP_TOP, NOP, probably others.

This PR is motivated by@carljm 's comment from#105040 (comment)
The limitation of#105320 is that the optimization only covers the next redundant opcode, but with this patch, the optimization can be applied to more than one instruction ahead for another write to the same location.

Benchmark

importpyperfrunner=pyperf.Runner()runner.timeit(name="bench dead_store",stmt="""_, a, b, c, _ = 1, 2, 3, 4, 5""")

Mean +- std dev: [base] 13.1 ns +- 0.2 ns -> [opt] 12.6 ns +- 0.2 ns: 1.04x faster

TODO (Done)

Fix test_dis.py
Add testcase for this optimization.

Issue:dead store elimination in the compiler #104635

pythongh-104635: Extend optimization for dead store elimination

f3b8d7e

corona10 requested a review fromcarljm

July 9, 2023 14:50

bedevere-bot mentioned this pull request

Jul 9, 2023

dead store elimination in the compiler#104635

Open

nit

20e17d2

corona10 added the skip news label

Jul 9, 2023

corona10 changed the title~~gh-104635: Extend optimization for dead store elimination~~[WIP] gh-104635: Extend optimization for dead store elimination

Jul 9, 2023

Copy link

MemberAuthor

corona10 commentedJul 9, 2023•
edited
Loading

./_bootstrap_python ./Programs/_freeze_module.py runpy ./Lib/runpy.py Python/frozen_modules/runpy.hAssertion failed: (OPCODE_HAS_ARG(opcode) || HAS_TARGET(opcode) || oparg == 0), function instr_sequence_addop, file compile.c, line 243.make: *** [Python/frozen_modules/runpy.h] Abort trap: 6make: *** Waiting for unfinished jobs....

Interesting crash; this crash was not triggered at my local machine. I will take a look.

-> Done

fix crash

c624371

corona10 changed the title~~[WIP] gh-104635: Extend optimization for dead store elimination~~gh-104635: Extend optimization for dead store elimination

Jul 9, 2023

nit

d5ce21b

corona10 marked this pull request as ready for review

July 9, 2023 16:06

corona10 requested review fromiritkatriel andmarkshannon ascode owners

July 9, 2023 16:06

bedevere-bot added the awaiting core review label

Jul 9, 2023

corona10 marked this pull request as draft

July 9, 2023 16:07

bedevere-bot removed the awaiting core review label

Jul 9, 2023

Update test_dis.py

bf580d0

corona10 force-pushed thegh-104635-extend branch from471bae6 tobf580d0Compare

July 9, 2023 17:28

Add test case

21e0378

corona10 marked this pull request as ready for review

July 9, 2023 17:56

corona10 requested a review frombrandtbucher as acode owner

July 9, 2023 17:56

bedevere-bot added the awaiting core review label

Jul 9, 2023

Copy link

MemberAuthor

corona10 commentedJul 9, 2023

@carljm Please take a look :)

corona10 changed the title~~gh-104635: Extend optimization for dead store elimination~~gh-104635: Expand optimization for dead store elimination

Jul 9, 2023

Copy link

MemberAuthor

corona10 commentedJul 10, 2023•
edited
Loading

~~Umm, this patch is missing the following case. (same line, multi variable multi target case)~~
~~I will improve for such case, do not merge yet :)~~

a, a, b, b = 1, 2, 3, 4

corona10 added the DO-NOT-MERGE label

Jul 10, 2023

Improve coverage of optimizaiton

8c5d3fb

corona10 removed the DO-NOT-MERGE label

Jul 10, 2023

Copy link

MemberAuthor

corona10 commentedJul 10, 2023•
edited
Loading

@carljm Ready to review, I wanted the linear time complexity for the first time, but it will require more.

Add comment

f419221

Add more comment

661629f

corona10 force-pushed thegh-104635-extend branch from30ffbf1 to661629fCompare

July 10, 2023 04:25

Copy link

MemberAuthor

corona10 commentedJul 10, 2023

>>> def f():...     a = 42; b = a + 54; a = 54...     return a, b...>>> f()Traceback (most recent call last):  File "<stdin>", line 1, in <module>  File "<stdin>", line 2, in fUnboundLocalError: cannot access local variable 'a' where it is not associated with a value

Found more edge case, I will close the PR. We may need some kinds of use-def analysis

corona10 closed this

Jul 10, 2023

corona10 mentioned this pull request

Jul 10, 2023

gh-104635: Add a test case for variables that have a dependency.#106583

Merged

Copy link

Member

markshannon commentedJul 10, 2023

I'm still unconvinced that this can do anything that a superoptimizer could not do.

_, a, b, c, _ = 1, 2, 3, 4, 5 is obviously equivalent toa, b, c, _ = 2, 3, 4, 5. A superoptimizer would produce an optimal sequence of code for this.

If you are only looking at linear sequences of instructions that are on a single line then, almost by definition, a superoptimizer will do at least as well.

Copy link

MemberAuthor

corona10 commentedJul 10, 2023

If you are only looking at linear sequences of instructions that are on a single line then, almost by definition, a superoptimizer will do at least as well.

Yeah, I agree if the super-optimizer is added, I expect that most of the cases will be covered, but I just tried it because the super-optimizer infra does not exist at this moment.

More digging will be more expensive than I expect, so I am not going to cover those cases at this moment.

(At least, we may need to add use-def analysis, but it should be compared to adding a super-optimizer process)