Experimental optimization techniques using LLVM-BOLT have been applied in CPython 3.12, but in order to provide this as an official feature, it must be ensured that the optimized binary does not change the expected behavior.

It has been confirmed that the peephole optimization provided by LLVM-BOLT re-raises the#53093 issue, and it passes the unit tests normally when this is disabled.

A detailed explanation of LLVM_BOLT will be shared at this year's Python Language Summit lightning talk, and materials will be published after the presentation.
(Also, I already shared the draft material at Discord of Python core team)

Performance Impact

I measured that enabling or disabling peephole optimization of LLVM-BOLT has almost no impact on performance, and it shows only noise-level results in the pyperformance benchmarks also.

pyperformance

Benchmark	as-is	to-be
async_generators	531 ms	550 ms: 1.04x slower
async_tree_cpu_io_mixed	1.10 sec	1.10 sec: 1.01x slower
chameleon	10.3 ms	10.4 ms: 1.01x slower
chaos	104 ms	103 ms: 1.01x faster
bench_thread_pool	1.24 ms	1.25 ms: 1.01x slower
coroutines	44.0 ms	44.2 ms: 1.01x slower
coverage	151 ms	147 ms: 1.02x faster
crypto_pyaes	118 ms	118 ms: 1.01x slower
deepcopy_reduce	4.68 us	4.70 us: 1.01x slower
deltablue	5.32 ms	5.28 ms: 1.01x faster
django_template	53.0 ms	54.3 ms: 1.02x slower
dulwich_log	110 ms	111 ms: 1.01x slower
fannkuch	630 ms	611 ms: 1.03x faster
float	116 ms	115 ms: 1.01x faster
generators	112 ms	112 ms: 1.00x slower
genshi_text	34.3 ms	34.5 ms: 1.01x slower
go	214 ms	212 ms: 1.01x faster
hexiom	9.28 ms	9.32 ms: 1.01x slower
json_loads	38.6 us	39.1 us: 1.01x slower
logging_format	10.1 us	10.0 us: 1.01x faster
logging_silent	157 ns	156 ns: 1.01x faster
logging_simple	9.04 us	9.14 us: 1.01x slower
mako	15.4 ms	15.7 ms: 1.02x slower
meteor_contest	165 ms	166 ms: 1.01x slower
nbody	166 ms	160 ms: 1.04x faster
nqueens	130 ms	132 ms: 1.02x slower
pathlib	26.9 ms	27.3 ms: 1.01x slower
pickle_list	5.94 us	5.90 us: 1.01x faster
pidigits	303 ms	303 ms: 1.00x slower
pprint_safe_repr	1.11 sec	1.11 sec: 1.01x slower
pprint_pformat	2.26 sec	2.27 sec: 1.00x slower
python_startup	12.7 ms	12.7 ms: 1.00x slower
python_startup_no_site	9.40 ms	9.39 ms: 1.00x faster
raytrace	461 ms	463 ms: 1.00x slower
regex_compile	194 ms	196 ms: 1.01x slower
regex_effbot	4.87 ms	4.80 ms: 1.01x faster
richards	66.6 ms	67.5 ms: 1.01x slower
scimark_fft	497 ms	492 ms: 1.01x faster
scimark_lu	178 ms	183 ms: 1.03x slower
scimark_monte_carlo	107 ms	106 ms: 1.01x faster
scimark_sor	182 ms	180 ms: 1.01x faster
scimark_sparse_mat_mult	6.30 ms	6.36 ms: 1.01x slower
spectral_norm	165 ms	169 ms: 1.02x slower
sqlglot_optimize	83.0 ms	83.4 ms: 1.00x slower
sqlglot_normalize	172 ms	173 ms: 1.00x slower
sqlite_synth	3.89 us	3.84 us: 1.01x faster
sympy_expand	742 ms	745 ms: 1.00x slower
unpickle_list	7.53 us	7.50 us: 1.00x faster
xml_etree_generate	132 ms	131 ms: 1.01x faster
Geometric mean	(ref)	1.00x slower

Benchmark hidden because not significant (31): 2to3, async_tree_none, async_tree_io, async_tree_memoization, bench_mp_pool, deepcopy, deepcopy_memo, docutils, genshi_xml, html5lib, json_dumps, mdp, pickle, pickle_dict, pickle_pure_python, pyflate, regex_dna, regex_v8, sqlglot_parse, sqlglot_transpile, sympy_integrate, sympy_sum, sympy_str, telco, tornado_http, unpack_sequence, unpickle, unpickle_pure_python, xml_etree_parse, xml_etree_iterparse, xml_etree_process

Issue:Make BOLT as stable feature #101525

pythongh-101525: Disable peephole optimization process of BOLT

16f15a1

bedevere-bot added the awaiting core review label

Apr 2, 2023

bedevere-bot mentioned this pull request

Apr 2, 2023

Make BOLT as stable feature#101525

Open

7 tasks

corona10 added the skip news label

Apr 2, 2023

corona10 requested review fromcarljm,erlend-aasland andgvanrossum

April 2, 2023 08:53

Copy link

MemberAuthor

corona10 commentedApr 2, 2023

@gvanrossum @carljm @erlend-aasland
I will merge this PR by next week if there is no serious objection.

carljm approved these changes

Apr 4, 2023

View reviewed changes

Copy link

Member

carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This looks fine to me.

Do we know why/how BOLT's peephole optimizations break compatibility? It seems that either we are implicitly relying on some undefined behavior, or the optimization is broken (and the BOLT team would likely be interested to know about that.)

bedevere-bot added awaiting merge and removed awaiting core review labels

Apr 4, 2023

Copy link

MemberAuthor

corona10 commentedApr 5, 2023

Do we know why/how BOLT's peephole optimizations break compatibility?

I haven't reached that point yet, I think I need to acquire assembly code after BOLTed,
I will try to investigate the point too.

corona10 merged commita62ff97 intopython:main

Apr 5, 2023

bedevere-bot removed the awaiting merge label

Apr 5, 2023

corona10 deleted the gh-101525 branch

April 5, 2023 00:10

gaogaotiantian pushed a commit to gaogaotiantian/cpython that referenced this pull request

Apr 8, 2023

pythongh-101525: Disable peephole optimization process of BOLT (pytho…

1614139

…ngh-103187)Co-authored-by: Dong-hee Na <donghee.na@linecorp.com>

warsaw pushed a commit to warsaw/cpython that referenced this pull request

Apr 11, 2023

pythongh-101525: Disable peephole optimization process of BOLT (pytho…

9ca78d7

…ngh-103187)Co-authored-by: Dong-hee Na <donghee.na@linecorp.com>

corona10 mentioned this pull request

May 16, 2023

[WIP] gh-104523 overhaul build rules for optimized binaries#104525

Draft

Labels

skip news

Movatterモバイル変換

Navigation Menu

Search code, repositories, users, issues, pull requests...

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

gh-101525: Disable peephole optimization process of BOLT#103187

gh-101525: Disable peephole optimization process of BOLT#103187

Uh oh!

Conversation

corona10 commentedApr 2, 2023•
edited
Loading

Uh oh!

Background