gh-97933: add opcode for more efficient comprehension execution #101310

Closed

carljm wants to merge 3 commits into python:main from carljm:inlinecomp


Conversation

Member

carljm commented Jan 25, 2023 • edited by bedevere-bot

This avoids allocating a throwaway single-use function object every time we run a comprehension. Otherwise it shouldn't have any user-visible impact; the comprehension is still a separate code object and runs in its own frame, just as before. Tracebacks look the same, etc. We just have a new COMPREHENSION opcode that builds a frame directly from a code object and an optional closure and inline-calls it, without creating a function object.
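To see the function-object allocation this PR targets, the compiled bytecode can be inspected with the standard `dis` module. The following is a minimal sketch, not part of the PR; the helper names `nested_code_names` and `opnames` are made up for illustration. What it prints for the list comprehension depends on the interpreter: on versions that compile comprehensions as nested functions it should report a `<listcomp>` code object and a `MAKE_FUNCTION` instruction, while versions that inline comprehensions in the compiler should report neither. Generator expressions are included for comparison because they keep their own code object.

```python
import dis
import types


def nested_code_names(code):
    """Return the names of the code objects stored in code.co_consts."""
    return [c.co_name for c in code.co_consts if isinstance(c, types.CodeType)]


def opnames(code):
    """Return the opcode names of the instructions in code (top level only)."""
    return [ins.opname for ins in dis.get_instructions(code)]


# Compile without executing, so the name `l` does not need to exist.
listcomp = compile("[x for x in l]", "<example>", "eval")
genexpr = compile("(x for x in l)", "<example>", "eval")

# The list comprehension result varies by Python version (see the note above).
print("listcomp nested code:", nested_code_names(listcomp))
print("listcomp builds a function:", "MAKE_FUNCTION" in opnames(listcomp))

# A generator expression is compiled to its own code object plus a function.
print("genexpr nested code:", nested_code_names(genexpr))
print("genexpr builds a function:", "MAKE_FUNCTION" in opnames(genexpr))
```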

In a micro-benchmark of comprehension execution time, this looks like it saves about 25%:

On inlinecomp (e4a68550426dbea79bd9fc6ff0a75395891d35b7):
➜ ./python -m pyperf timeit -s 'l = [1, 2, 3, 4, 5]' '[x for x in l]'
.....................
Mean +- std dev: 243 ns +- 14 ns
➜ ./python -m pyperf timeit -s 'l = [1, 2, 3, 4, 5]' '[x for x in l]'
.....................
Mean +- std dev: 240 ns +- 11 ns

On main (f02fa64bf2d03ef7a28650c164e17a5fb5d8543d):
➜ ./python -m pyperf timeit -s 'l = [1, 2, 3, 4, 5]' '[x for x in l]'
.....................
Mean +- std dev: 324 ns +- 11 ns
➜ ./python -m pyperf timeit -s 'l = [1, 2, 3, 4, 5]' '[x for x in l]'
.....................
Mean +- std dev: 329 ns +- 18 ns
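The "about 25%" figure can be checked from the four reported means. This is only a back-of-the-envelope sketch: it averages the two runs per branch and ignores the reported standard deviations, which is a simplification rather than how pyperf compares results.

```python
from statistics import mean

# Mean timings in nanoseconds, copied from the pyperf output above.
inlinecomp_ns = [243, 240]
main_ns = [324, 329]

# Fraction of the baseline time that the inlinecomp branch needs.
ratio = mean(inlinecomp_ns) / mean(main_ns)
saving = 1 - ratio

print(f"inlinecomp takes {ratio:.1%} of the time on main, a saving of {saving:.1%}")
```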

Currently this doesn't handle async comprehensions or generator expressions; those still create a function object and CALL it. In principle I think they could be handled as well, but to keep the PR smaller I'll defer that to a second PR if this one is merged.

Closes #97933.

Issue: Inline dict/list/set comprehensions in the compiler for better performance #97933

add opcode for more efficient comprehension execution

e4a6855

carljm requested review from iritkatriel and markshannon as code owners

January 25, 2023 00:53

bedevere-bot added the awaiting review label

Jan 25, 2023

bedevere-bot mentioned this pull request

Jan 25, 2023

Inline dict/list/set comprehensions in the compiler for better performance #97933

Closed

📜🤖 Added by blurb_it.

ee2ad56

Member, Author

carljm commented Jan 25, 2023

@markshannon or @iritkatriel (or anyone else from faster-cpython team) -- if you could kick off a PyPerformance run with your fancy Github Action runner, I'd be much obliged!

carljm added the performance (Performance or resource usage) label

Jan 25, 2023

Member

iritkatriel commented Jan 25, 2023

> @markshannon or @iritkatriel (or anyone else from faster-cpython team) -- if you could kick off a PyPerformance run with your fancy Github Action runner, I'd be much obliged!

Done.

Member

iritkatriel commented Jan 25, 2023

Benchmark results are here: https://github.com/faster-cpython/benchmarking/blob/main/results/bm-20230125-3.12.0a4+-ee2ad56/bm-20230125-linux-x86_64-carljm-inlinecomp-3.12.0a4+-ee2ad56-vs-base.md

(shows no difference overall).

Member, Author

carljm commented Jan 25, 2023

I can't see the detailed results (private repo), but if there's no impact overall then I guess they aren't that interesting :) We must not have any benchmarks that make heavy use of comprehensions.

Member

markshannon commented Jan 25, 2023

All benchmarks:

Benchmark	bm-20230124-linux-x86_64-python-f02fa64bf2d03ef7a286-3.12.0a4+-f02fa64	bm-20230125-linux-x86_64-carljm-inlinecomp-3.12.0a4+-ee2ad56
2to3	251 ms	249 ms: 1.01x faster
async_generators	349 ms	355 ms: 1.02x slower
async_tree_memoization	624 ms	656 ms: 1.05x slower
asyncio_tcp	488 ms	492 ms: 1.01x slower
chaos	66.2 ms	64.6 ms: 1.02x faster
bench_thread_pool	775 us	781 us: 1.01x slower
coroutines	24.8 ms	25.6 ms: 1.03x slower
deepcopy_reduce	2.91 us	2.96 us: 1.02x slower
deepcopy_memo	33.8 us	34.6 us: 1.02x slower
docutils	2.55 sec	2.51 sec: 1.02x faster
fannkuch	373 ms	369 ms: 1.01x faster
float	73.2 ms	72.0 ms: 1.02x faster
create_gc_cycles	1.47 ms	1.44 ms: 1.02x faster
gc_traversal	4.30 ms	3.64 ms: 1.18x faster
generators	76.5 ms	75.9 ms: 1.01x faster
go	138 ms	134 ms: 1.03x faster
gunicorn	1.07 ms	1.06 ms: 1.00x faster
json	4.62 ms	4.69 ms: 1.01x slower
json_dumps	9.32 ms	9.57 ms: 1.03x slower
json_loads	24.5 us	24.2 us: 1.01x faster
logging_format	6.40 us	6.31 us: 1.01x faster
logging_silent	92.8 ns	91.7 ns: 1.01x faster
logging_simple	5.76 us	5.80 us: 1.01x slower
mako	9.80 ms	9.70 ms: 1.01x faster
mdp	2.69 sec	2.51 sec: 1.07x faster
pathlib	17.7 ms	17.9 ms: 1.01x slower
pickle	10.1 us	10.2 us: 1.02x slower
pickle_dict	30.9 us	32.4 us: 1.05x slower
pickle_list	4.12 us	4.29 us: 1.04x slower
pickle_pure_python	286 us	288 us: 1.01x slower
pidigits	189 ms	190 ms: 1.00x slower
pycparser	1.15 sec	1.09 sec: 1.05x faster
pyflate	402 ms	400 ms: 1.01x faster
python_startup	8.98 ms	8.89 ms: 1.01x faster
python_startup_no_site	6.50 ms	6.44 ms: 1.01x faster
raytrace	281 ms	284 ms: 1.01x slower
regex_compile	127 ms	128 ms: 1.01x slower
regex_dna	210 ms	201 ms: 1.05x faster
regex_effbot	3.49 ms	3.42 ms: 1.02x faster
regex_v8	22.4 ms	21.3 ms: 1.05x faster
richards	41.7 ms	42.6 ms: 1.02x slower
scimark_fft	301 ms	303 ms: 1.01x slower
scimark_monte_carlo	65.6 ms	64.7 ms: 1.01x faster
scimark_sparse_mat_mult	3.96 ms	3.99 ms: 1.01x slower
spectral_norm	95.3 ms	96.2 ms: 1.01x slower
sqlglot_optimize	51.0 ms	50.5 ms: 1.01x faster
sqlglot_normalize	105 ms	103 ms: 1.02x faster
sympy_expand	453 ms	450 ms: 1.01x faster
sympy_integrate	19.7 ms	19.7 ms: 1.00x faster
sympy_sum	154 ms	155 ms: 1.00x slower
telco	6.26 ms	6.46 ms: 1.03x slower
thrift	737 us	748 us: 1.01x slower
tornado_http	93.6 ms	94.5 ms: 1.01x slower
unpack_sequence	46.7 ns	44.4 ns: 1.05x faster
unpickle	13.2 us	13.1 us: 1.01x faster
unpickle_pure_python	197 us	201 us: 1.02x slower
xml_etree_iterparse	109 ms	106 ms: 1.02x faster
xml_etree_process	54.1 ms	53.8 ms: 1.01x faster
Geometric mean	(ref)	1.00x faster

Benchmark hidden because not significant (33): aiohttp, async_tree_none, async_tree_cpu_io_mixed, async_tree_io, chameleon, bench_mp_pool, coverage, crypto_pyaes, dask, deepcopy, deltablue, django_template, djangocms, dulwich_log, genshi_text, genshi_xml, hexiom, html5lib, meteor_contest, mypy, nbody, nqueens, pprint_safe_repr, pprint_pformat, scimark_lu, scimark_sor, sqlglot_parse, sqlglot_transpile, sqlite_synth, sympy_str, unpickle_list, xml_etree_parse, xml_etree_generate
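For readers unfamiliar with how a summary line like "Geometric mean" is derived, here is a small sketch of the usual calculation: each benchmark contributes the ratio base time / new time, and the ratios are combined with a geometric mean. It uses only four rows copied from the table, so its result is not expected to match the 1.00x reported for the full suite, and the exact aggregation used by the benchmarking tool is an assumption here.

```python
from statistics import geometric_mean

# (base, inlinecomp) timings for a few rows of the table above.
# Units cancel within each row, so rows with different units can be mixed.
rows = {
    "gc_traversal": (4.30, 3.64),
    "mdp": (2.69, 2.51),
    "async_tree_memoization": (624, 656),
    "pickle_dict": (30.9, 32.4),
}

# A ratio above 1 means the inlinecomp branch is faster than base.
speedups = {name: base / new for name, (base, new) in rows.items()}

for name, ratio in speedups.items():
    label = f"{ratio:.2f}x faster" if ratio >= 1 else f"{1 / ratio:.2f}x slower"
    print(f"{name}: {label}")

subset_gm = geometric_mean(speedups.values())
print(f"geometric mean over this subset: {subset_gm:.3f}")
```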

This was referenced Feb 14, 2023

Allow the f_func field of the _PyInterpreterFrame struct to be any object (and rename it) #96237

Closed

add comprehensions benchmark python/pyperformance#265

Merged

Merge branch 'main' into inlinecomp

544b30c

* main: (225 commits)
  pythongh-102056: Fix a few bugs in error handling of exception printing code (python#102078)
  pythongh-102011: use sys.exception() instead of sys.exc_info() in docs where possible (python#102012)
  pythongh-101566: Sync with zipp 3.14. (pythonGH-102018)
  pythonGH-99818: improve the documentation for zipfile.Path and Traversable (pythonGH-101589)
  pythongh-88233: zipfile: handle extras after a zip64 extra (pythonGH-96161)
  pythongh-101981: Apply HOMEBREW related environment variables (pythongh-102074)
  pythongh-101907: Stop using `_Py_OPCODE` and `_Py_OPARG` macros (pythonGH-101912)
  pythongh-101819: Adapt _io types to heap types, batch 1 (pythonGH-101949)
  pythongh-101981: Build macOS as recommended by the devguide (pythonGH-102070)
  pythongh-97786: Fix compiler warnings in pytime.c (python#101826)
  pythongh-101578: Amend PyErr_{Set,Get}RaisedException docs (python#101962)
  Misc improvements to the float tutorial (pythonGH-102052)
  pythongh-85417: Clarify behaviour on branch cuts in cmath module (python#102046)
  pythongh-100425: Update tutorial docs related to sum() accuracy (FH-101854)
  Add missing 'is' to `cmath.log()` docstring (python#102049)
  pythongh-100210: Correct the comment link for unescaping HTML (python#100212)
  pythongh-97930: Also include subdirectory in makefile. (python#102030)
  pythongh-99735: Use required=True in argparse subparsers example (python#100927)
  Fix incorrectly documented attribute in csv docs (python#101250)
  pythonGH-84783: Make the slice object hashable (pythonGH-101264)
  ...

Member, Author

carljm commented Mar 8, 2023

Closing this for now in favor of #101441

May reopen this approach if PEP 709 (implemented by that PR) is rejected.

carljm closed this

Mar 8, 2023

Labels

awaiting review performance

4 participants
