Add return const instruction #101632

Closed
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs), performance (Performance or resource usage)
@penguin-wwy

Description


From the pystats doc (pystats-2023-02-05-python-5a2b984.md), I find that LOAD_CONST + RETURN_VALUE is a very high-frequency pair (because the default return value of a function is None).
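To see where the pair comes from, here is a minimal sketch using the standard `dis` module. The helper names (`opcode_pairs`, `return_kinds`, `implicit_none`) are made up for illustration. The exact opcodes depend on the interpreter version: builds without a fused return instruction are expected to show LOAD_CONST followed by RETURN_VALUE, while builds that have one would show RETURN_CONST instead.

```python
import dis
from collections import Counter


def implicit_none():
    # No explicit return statement: the compiler appends "return None".
    pass


def opcode_pairs(func):
    """Count (opname, next_opname) pairs in a function's bytecode."""
    names = [ins.opname for ins in dis.get_instructions(func)]
    return Counter(zip(names, names[1:]))


def return_kinds(func):
    """Collect the names of the RETURN_* opcodes used by a function."""
    return {
        ins.opname
        for ins in dis.get_instructions(func)
        if ins.opname.startswith("RETURN_")
    }


if __name__ == "__main__":
    # Output varies by Python version, so none is shown here.
    print(opcode_pairs(implicit_none))
    print(return_kinds(implicit_none))
```

Running this on two interpreter builds side by side is a quick way to check which form the compiler emits.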

Successors for LOAD_CONST

| Successors | Count | Percentage |
| --- | --- | --- |
| RETURN_VALUE | 969,173,651 | 21.8% |
| BINARY_OP_ADD_INT | 418,647,997 | 9.4% |
| LOAD_CONST | 403,185,774 | 9.1% |
| COMPARE_AND_BRANCH_INT | 314,633,792 | 7.1% |
| STORE_FAST | 295,563,626 | 6.6% |

And predecessors for RETURN_VALUE

| Predecessors | Count | Percentage |
| --- | --- | --- |
| LOAD_CONST | 969,173,651 | 29.9% |
| LOAD_FAST | 505,933,343 | 15.6% |
| RETURN_VALUE | 382,698,373 | 11.8% |
| BUILD_TUPLE | 328,532,240 | 10.1% |
| COMPARE_OP | 107,210,803 | 3.3% |

This means that if we add a RETURN_CONST instruction, we can reduce the number of RETURN_VALUE instructions executed by about 30% and the number of LOAD_CONST instructions by about 20%.
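The arithmetic behind those two figures can be sketched as follows. The pair count and the two percentages are taken from the tables. The totals are only implied values, derived here from rounded percentages, so they are approximate and are not numbers reported by pystats.

```python
# LOAD_CONST immediately followed by RETURN_VALUE (same count in both tables).
PAIR_COUNT = 969_173_651

# Shares as reported in the tables (rounded to one decimal place).
SHARE_OF_LOAD_CONST = 0.218    # pair count / all LOAD_CONST executions
SHARE_OF_RETURN_VALUE = 0.299  # pair count / all RETURN_VALUE executions


def implied_total(count, share):
    """Estimate the total executions of an opcode from a count and its share."""
    return round(count / share)


load_const_total = implied_total(PAIR_COUNT, SHARE_OF_LOAD_CONST)
return_value_total = implied_total(PAIR_COUNT, SHARE_OF_RETURN_VALUE)

# Fusing each pair into one instruction removes one dispatch per pair.
dispatches_saved = PAIR_COUNT

if __name__ == "__main__":
    print(f"implied LOAD_CONST total:   {load_const_total:,}")
    print(f"implied RETURN_VALUE total: {return_value_total:,}")
    print(f"dispatches saved by fusing: {dispatches_saved:,}")
```

Under these assumptions, roughly one in five LOAD_CONST executions and three in ten RETURN_VALUE executions would be replaced by the fused instruction.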

./bin/python3 -m pyperf timeit -w 3 --compare-to ../python-3.12/bin/python3 -s "def test():    return 10000" "test()"
/python-3.12/bin/python3: ..................... 27.0 ns +- 0.3 ns
/cpython/bin/python3: ..................... 25.0 ns +- 0.5 ns
Mean +- std dev: [/python-3.12/bin/python3] 27.0 ns +- 0.3 ns -> [/cpython/bin/python3] 25.0 ns +- 0.5 ns: 1.08x faster

./bin/python3 -m pyperf timeit -w 3 --compare-to ../python-3.12/bin/python3 -s "def test():    return None" "test()"
/python-3.12/bin/python3: ..................... 27.2 ns +- 1.3 ns
/cpython/bin/python3: ..................... 25.1 ns +- 0.6 ns
Mean +- std dev: [/python-3.12/bin/python3] 27.2 ns +- 1.3 ns -> [/cpython/bin/python3] 25.1 ns +- 0.6 ns: 1.08x faster

The microbenchmark shows a 1.08x speedup in both cases. Since the measurement also includes the cost of the function call itself, I think the improvement on the return path is around 10%. That is not very high, but it should be an optimization without adverse effects.
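For readers without pyperf installed, a rough equivalent can be sketched with the standard `timeit` module. This is not the measurement used above: it lacks pyperf's warmups and process isolation, so its numbers are noisier, and the helper name `ns_per_call` is made up for illustration. To compare two builds, run the same script under each interpreter.

```python
import timeit


def test():
    return None


def ns_per_call(func, number=200_000, repeat=5):
    """Return the best observed time per call, in nanoseconds."""
    # Taking the minimum over several repeats reduces scheduling noise.
    best = min(timeit.repeat(func, number=number, repeat=repeat))
    return best / number * 1e9


if __name__ == "__main__":
    print(f"{ns_per_call(test):.1f} ns per call")
```

The reported time includes the call overhead of `test()`, which is why the share attributable to the return instruction alone is smaller than the total.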
