Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-104635: Expand optimization for dead store elimination#106571

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Closed
corona10 wants to merge9 commits intopython:mainfromcorona10:gh-104635-extend

Conversation

@corona10
Copy link
Member

@corona10corona10 commentedJul 9, 2023
edited
Loading

Explain

We might also want to expand this so it can look more than one instruction ahead for another write to the same location, skipping over an allowlist of instructions that we know don't read that same local, don't execute arbitrary code, and can't raise an exception. E.g. LOAD_FAST of a different local, POP_TOP, NOP, probably others.

This PR is motivated by@carljm 's comment from#105040 (comment)
The limitation of#105320 is that the optimization only covers the next redundant opcode, but with this patch, the optimization can be applied to more than one instruction ahead for another write to the same location.

Benchmark

importpyperfrunner=pyperf.Runner()runner.timeit(name="bench dead_store",stmt="""_, a, b, c, _ = 1, 2, 3, 4, 5""")
Mean +- std dev: [base] 13.1 ns +- 0.2 ns -> [opt] 12.6 ns +- 0.2 ns: 1.04x faster

TODO (Done)

  • Fix test_dis.py
  • Add testcase for this optimization.

@corona10corona10 changed the titlegh-104635: Extend optimization for dead store elimination[WIP] gh-104635: Extend optimization for dead store eliminationJul 9, 2023
@corona10
Copy link
MemberAuthor

corona10 commentedJul 9, 2023
edited
Loading

./_bootstrap_python ./Programs/_freeze_module.py runpy ./Lib/runpy.py Python/frozen_modules/runpy.hAssertion failed: (OPCODE_HAS_ARG(opcode) || HAS_TARGET(opcode) || oparg == 0), function instr_sequence_addop, file compile.c, line 243.make: *** [Python/frozen_modules/runpy.h] Abort trap: 6make: *** Waiting for unfinished jobs....

Interesting crash; this crash was not triggered at my local machine. I will take a look.

-> Done

@corona10corona10 changed the title[WIP] gh-104635: Extend optimization for dead store eliminationgh-104635: Extend optimization for dead store eliminationJul 9, 2023
@corona10corona10 marked this pull request as ready for reviewJuly 9, 2023 16:06
@corona10corona10 marked this pull request as draftJuly 9, 2023 16:07
@corona10corona10 marked this pull request as ready for reviewJuly 9, 2023 17:56
@corona10
Copy link
MemberAuthor

@carljm Please take a look :)

@corona10corona10 changed the titlegh-104635: Extend optimization for dead store eliminationgh-104635: Expand optimization for dead store eliminationJul 9, 2023
@corona10
Copy link
MemberAuthor

corona10 commentedJul 10, 2023
edited
Loading

Umm, this patch is missing the following case. (same line, multi variable multi target case)
I will improve for such case, do not merge yet :)

a, a, b, b = 1, 2, 3, 4

@corona10
Copy link
MemberAuthor

corona10 commentedJul 10, 2023
edited
Loading

@carljm Ready to review, I wanted the linear time complexity for the first time, but it will require more.

@corona10
Copy link
MemberAuthor

>>> def f():...     a = 42; b = a + 54; a = 54...     return a, b...>>> f()Traceback (most recent call last):  File "<stdin>", line 1, in <module>  File "<stdin>", line 2, in fUnboundLocalError: cannot access local variable 'a' where it is not associated with a value

Found more edge case, I will close the PR. We may need some kinds of use-def analysis

@markshannon
Copy link
Member

I'm still unconvinced that this can do anything that a superoptimizer could not do.

_, a, b, c, _ = 1, 2, 3, 4, 5 is obviously equivalent toa, b, c, _ = 2, 3, 4, 5. A superoptimizer would produce an optimal sequence of code for this.

If you are only looking at linear sequences of instructions that are on a single line then, almost by definition, a superoptimizer will do at least as well.

@corona10
Copy link
MemberAuthor

If you are only looking at linear sequences of instructions that are on a single line then, almost by definition, a superoptimizer will do at least as well.

Yeah, I agree if the super-optimizer is added, I expect that most of the cases will be covered, but I just tried it because the super-optimizer infra does not exist at this moment.

More digging will be more expensive than I expect, so I am not going to cover those cases at this moment.

(At least, we may need to add use-def analysis, but it should be compared to adding a super-optimizer process)

carljm reacted with thumbs up emoji

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@carljmcarljmAwaiting requested review from carljm

@markshannonmarkshannonAwaiting requested review from markshannonmarkshannon is a code owner

@iritkatrieliritkatrielAwaiting requested review from iritkatrieliritkatriel is a code owner

@brandtbucherbrandtbucherAwaiting requested review from brandtbucher

Assignees

No one assigned

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

3 participants

@corona10@markshannon@bedevere-bot

[8]ページ先頭

©2009-2025 Movatter.jp