gh-101525: Disable peephole optimization process of BOLT #103187
Merged
corona10 merged 1 commit into python:main from corona10:gh-101525
Apr 5, 2023

@corona10 (Member) commented Apr 2, 2023 (edited)

Background

Experimental optimization with LLVM-BOLT was introduced in CPython 3.12, but before it can be offered as an official feature, we must ensure that the optimized binary does not change the expected behavior.

It has been confirmed that the peephole optimization provided by LLVM-BOLT reintroduces issue #53093, and that the unit tests pass normally when this optimization is disabled.

A detailed explanation of LLVM-BOLT will be shared in a lightning talk at this year's Python Language Summit, and the materials will be published after the presentation.
(I have also already shared the draft material on the Python core team's Discord.)
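As a rough illustration of what disabling the peephole pass can look like at the flag level, here is a minimal sketch. It assumes the build passes llvm-bolt an option of the form `-peepholes=<mode>` and that `none` turns the pass off; the helper name `disable_peepholes` and the example flag string are hypothetical and are not taken from this PR's diff.

```python
import shlex


def disable_peepholes(flags: str) -> str:
    """Return `flags` with any peepholes=<mode> option rewritten to 'none'.

    Hypothetical helper: it only rewrites an option that is already
    present and leaves every other token untouched.
    """
    rewritten = []
    for token in shlex.split(flags):
        name = token.lstrip("-")
        if name.startswith("peepholes="):
            # Preserve the original dash style ("-" or "--").
            dashes = token[: len(token) - len(name)]
            rewritten.append(f"{dashes}peepholes=none")
        else:
            rewritten.append(token)
    return " ".join(rewritten)


# Illustrative flag string only (an assumption, not the project's real settings).
example_flags = "-reorder-blocks=ext-tsp -peepholes=all -dyno-stats"
print(disable_peepholes(example_flags))
```

The rest of the flag string is left as it was, which matches the narrow scope described above: only the peephole pass is switched off.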

Performance Impact

I measured that enabling or disabling LLVM-BOLT's peephole optimization has almost no impact on performance; the pyperformance benchmarks show only noise-level differences.

pyperformance

| Benchmark | as-is | to-be |
|---|---|---|
| async_generators | 531 ms | 550 ms: 1.04x slower |
| async_tree_cpu_io_mixed | 1.10 sec | 1.10 sec: 1.01x slower |
| chameleon | 10.3 ms | 10.4 ms: 1.01x slower |
| chaos | 104 ms | 103 ms: 1.01x faster |
| bench_thread_pool | 1.24 ms | 1.25 ms: 1.01x slower |
| coroutines | 44.0 ms | 44.2 ms: 1.01x slower |
| coverage | 151 ms | 147 ms: 1.02x faster |
| crypto_pyaes | 118 ms | 118 ms: 1.01x slower |
| deepcopy_reduce | 4.68 us | 4.70 us: 1.01x slower |
| deltablue | 5.32 ms | 5.28 ms: 1.01x faster |
| django_template | 53.0 ms | 54.3 ms: 1.02x slower |
| dulwich_log | 110 ms | 111 ms: 1.01x slower |
| fannkuch | 630 ms | 611 ms: 1.03x faster |
| float | 116 ms | 115 ms: 1.01x faster |
| generators | 112 ms | 112 ms: 1.00x slower |
| genshi_text | 34.3 ms | 34.5 ms: 1.01x slower |
| go | 214 ms | 212 ms: 1.01x faster |
| hexiom | 9.28 ms | 9.32 ms: 1.01x slower |
| json_loads | 38.6 us | 39.1 us: 1.01x slower |
| logging_format | 10.1 us | 10.0 us: 1.01x faster |
| logging_silent | 157 ns | 156 ns: 1.01x faster |
| logging_simple | 9.04 us | 9.14 us: 1.01x slower |
| mako | 15.4 ms | 15.7 ms: 1.02x slower |
| meteor_contest | 165 ms | 166 ms: 1.01x slower |
| nbody | 166 ms | 160 ms: 1.04x faster |
| nqueens | 130 ms | 132 ms: 1.02x slower |
| pathlib | 26.9 ms | 27.3 ms: 1.01x slower |
| pickle_list | 5.94 us | 5.90 us: 1.01x faster |
| pidigits | 303 ms | 303 ms: 1.00x slower |
| pprint_safe_repr | 1.11 sec | 1.11 sec: 1.01x slower |
| pprint_pformat | 2.26 sec | 2.27 sec: 1.00x slower |
| python_startup | 12.7 ms | 12.7 ms: 1.00x slower |
| python_startup_no_site | 9.40 ms | 9.39 ms: 1.00x faster |
| raytrace | 461 ms | 463 ms: 1.00x slower |
| regex_compile | 194 ms | 196 ms: 1.01x slower |
| regex_effbot | 4.87 ms | 4.80 ms: 1.01x faster |
| richards | 66.6 ms | 67.5 ms: 1.01x slower |
| scimark_fft | 497 ms | 492 ms: 1.01x faster |
| scimark_lu | 178 ms | 183 ms: 1.03x slower |
| scimark_monte_carlo | 107 ms | 106 ms: 1.01x faster |
| scimark_sor | 182 ms | 180 ms: 1.01x faster |
| scimark_sparse_mat_mult | 6.30 ms | 6.36 ms: 1.01x slower |
| spectral_norm | 165 ms | 169 ms: 1.02x slower |
| sqlglot_optimize | 83.0 ms | 83.4 ms: 1.00x slower |
| sqlglot_normalize | 172 ms | 173 ms: 1.00x slower |
| sqlite_synth | 3.89 us | 3.84 us: 1.01x faster |
| sympy_expand | 742 ms | 745 ms: 1.00x slower |
| unpickle_list | 7.53 us | 7.50 us: 1.00x faster |
| xml_etree_generate | 132 ms | 131 ms: 1.01x faster |
| Geometric mean | (ref) | 1.00x slower |

Benchmark hidden because not significant (31): 2to3, async_tree_none, async_tree_io, async_tree_memoization, bench_mp_pool, deepcopy, deepcopy_memo, docutils, genshi_xml, html5lib, json_dumps, mdp, pickle, pickle_dict, pickle_pure_python, pyflate, regex_dna, regex_v8, sqlglot_parse, sqlglot_transpile, sympy_integrate, sympy_sum, sympy_str, telco, tornado_http, unpack_sequence, unpickle, unpickle_pure_python, xml_etree_parse, xml_etree_iterparse, xml_etree_process
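To see why per-benchmark swings of a few percent can still average out to roughly 1.00x, the following minimal sketch computes to-be/as-is ratios and their geometric mean for a handful of rows copied from the table above. The row selection and the helper `to_seconds` are illustrative assumptions; the benchmark tool computes its own geometric mean over its full result set, so this subset is not expected to reproduce the reported figure exactly.

```python
from statistics import geometric_mean

# Conversion factors for the time units that appear in the table.
UNIT_TO_SECONDS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "sec": 1.0}


def to_seconds(text: str) -> float:
    """Convert a string such as '531 ms' to seconds (hypothetical helper)."""
    value, unit = text.split()
    return float(value) * UNIT_TO_SECONDS[unit]


# (benchmark, as-is, to-be): a small subset of rows from the table.
ROWS = [
    ("async_generators", "531 ms", "550 ms"),
    ("fannkuch", "630 ms", "611 ms"),
    ("nbody", "166 ms", "160 ms"),
    ("scimark_lu", "178 ms", "183 ms"),
    ("json_loads", "38.6 us", "39.1 us"),
    ("logging_silent", "157 ns", "156 ns"),
]

# A ratio above 1.0 means the to-be build was slower on that benchmark.
ratios = {
    name: to_seconds(after) / to_seconds(before)
    for name, before, after in ROWS
}
gmean = geometric_mean(list(ratios.values()))

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.3f}x")
print(f"geometric mean of subset: {gmean:.3f}x")
```

The geometric mean is used instead of the arithmetic mean because the inputs are ratios: a 1.04x slowdown and a 1.04x speedup should cancel out, which multiplication (averaging in log space) captures.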

@corona10 (Member, Author)

@gvanrossum @carljm @erlend-aasland
I will merge this PR by next week if there is no serious objection.

@carljm (Member) left a comment

This looks fine to me.

Do we know why/how BOLT's peephole optimizations break compatibility? It seems that either we are implicitly relying on some undefined behavior, or the optimization is broken (and the BOLT team would likely be interested to know about that.)

@corona10 (Member, Author)

> Do we know why/how BOLT's peephole optimizations break compatibility?

I haven't gotten that far yet. I think I need to look at the assembly code of the BOLTed binary,
and I will try to investigate this point as well.


@corona10 merged commit a62ff97 into python:main on Apr 5, 2023
@corona10 deleted the gh-101525 branch on April 5, 2023 00:10
gaogaotiantian pushed a commit to gaogaotiantian/cpython that referenced this pull request on Apr 8, 2023
warsaw pushed a commit to warsaw/cpython that referenced this pull request on Apr 11, 2023

Reviewers

@carljm approved these changes
@erlend-aasland: awaiting requested review (code owner)
@gvanrossum: awaiting requested review


3 participants

@corona10 @carljm @bedevere-bot
