gh-97933: add opcode for more efficient comprehension execution #101310


Closed
carljm wants to merge 3 commits into python:main from carljm:inlinecomp

carljm (Member) commented Jan 25, 2023 (edited by bedevere-bot):

This avoids allocating a throwaway single-use function object every time we run a comprehension. Otherwise it shouldn't have any user-visible impact; the comprehension is still a separate code object and runs in its own frame, just as before. Tracebacks look the same, etc. We just have a new COMPREHENSION opcode that builds a frame directly from a code object and an optional closure and inline-calls it, without creating a function object.
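As a rough illustration of the overhead being removed, the behavior on main can be imitated from Python by wrapping a code object in a fresh function object, calling it once, and discarding it. This is only a sketch: `stand_in_comprehension` and `run_via_function_object` are hypothetical names, the stand-in's code object merely plays the role of the one the compiler emits for a comprehension, and the real work happens in the interpreter's C code rather than through `types.FunctionType`.

```python
import types


def stand_in_comprehension(iterator):
    # Hypothetical stand-in for a compiler-generated comprehension code
    # object: it receives an already-created iterator and builds the list.
    return [x * 2 for x in iterator]


COMP_CODE = stand_in_comprehension.__code__


def run_via_function_object(code, iterable):
    """Imitate the pre-PR path: allocate a function object around the code
    object, call it exactly once, and let it be garbage collected."""
    throwaway = types.FunctionType(code, globals())  # allocated per evaluation
    return throwaway(iter(iterable))


result = run_via_function_object(COMP_CODE, [1, 2, 3])
print(result)  # [2, 4, 6]
```

The opcode described above would skip the `types.FunctionType` step and build the frame from the code object directly; that step has no pure-Python equivalent, so it is not shown here.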

In a micro-benchmark of comprehension execution time, this looks like it saves about 25%:

On inlinecomp (e4a68550426dbea79bd9fc6ff0a75395891d35b7):

    ➜ ./python -m pyperf timeit -s 'l = [1, 2, 3, 4, 5]' '[x for x in l]'
    .....................
    Mean +- std dev: 243 ns +- 14 ns
    ➜ ./python -m pyperf timeit -s 'l = [1, 2, 3, 4, 5]' '[x for x in l]'
    .....................
    Mean +- std dev: 240 ns +- 11 ns

On main (f02fa64bf2d03ef7a28650c164e17a5fb5d8543d):

    ➜ ./python -m pyperf timeit -s 'l = [1, 2, 3, 4, 5]' '[x for x in l]'
    .....................
    Mean +- std dev: 324 ns +- 11 ns
    ➜ ./python -m pyperf timeit -s 'l = [1, 2, 3, 4, 5]' '[x for x in l]'
    .....................
    Mean +- std dev: 329 ns +- 18 ns
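The "about 25%" figure can be sanity-checked from the four means quoted above. The short calculation below averages the two runs per branch (a simplification that ignores the reported standard deviations) and derives the relative time saved and the equivalent speedup; the variable names are illustrative only.

```python
from statistics import mean

# Mean timings in nanoseconds, copied from the pyperf output above.
inlinecomp_ns = [243, 240]
main_ns = [324, 329]

new = mean(inlinecomp_ns)  # 241.5
old = mean(main_ns)        # 326.5

saving = 1 - new / old     # fraction of time saved relative to main
speedup = old / new        # how many times faster the branch is

print(f"time saved: {saving:.1%}, speedup: {speedup:.2f}x")
# prints "time saved: 26.0%, speedup: 1.35x"
```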

Currently this doesn't handle async comprehensions or generator expressions; those still create a function object and CALL it. In principle I think they could be handled as well, but to keep the PR smaller I'll defer that to a second PR if this one is merged.

Closes #97933.

carljm (Member, Author) commented:

@markshannon or @iritkatriel (or anyone else from faster-cpython team) -- if you could kick off a PyPerformance run with your fancy GitHub Action runner, I'd be much obliged!

carljm added the performance (Performance or resource usage) label Jan 25, 2023
iritkatriel (Member) commented:

> @markshannon or @iritkatriel (or anyone else from faster-cpython team) -- if you could kick off a PyPerformance run with your fancy GitHub Action runner, I'd be much obliged!

Done.


iritkatriel (Member) commented:

Benchmark results are here: https://github.com/faster-cpython/benchmarking/blob/main/results/bm-20230125-3.12.0a4+-ee2ad56/bm-20230125-linux-x86_64-carljm-inlinecomp-3.12.0a4+-ee2ad56-vs-base.md

(shows no difference overall).


carljm (Member, Author) commented:

I can't see the detailed results (private repo), but if there's no impact overall then I guess they aren't that interesting :) We must not have any benchmarks that make heavy use of comprehensions.

markshannon (Member) commented:

All benchmarks:

| Benchmark | bm-20230124-linux-x86_64-python-f02fa64bf2d03ef7a286-3.12.0a4+-f02fa64 | bm-20230125-linux-x86_64-carljm-inlinecomp-3.12.0a4+-ee2ad56 |
|---|---|---|
| 2to3 | 251 ms | 249 ms: 1.01x faster |
| async_generators | 349 ms | 355 ms: 1.02x slower |
| async_tree_memoization | 624 ms | 656 ms: 1.05x slower |
| asyncio_tcp | 488 ms | 492 ms: 1.01x slower |
| chaos | 66.2 ms | 64.6 ms: 1.02x faster |
| bench_thread_pool | 775 us | 781 us: 1.01x slower |
| coroutines | 24.8 ms | 25.6 ms: 1.03x slower |
| deepcopy_reduce | 2.91 us | 2.96 us: 1.02x slower |
| deepcopy_memo | 33.8 us | 34.6 us: 1.02x slower |
| docutils | 2.55 sec | 2.51 sec: 1.02x faster |
| fannkuch | 373 ms | 369 ms: 1.01x faster |
| float | 73.2 ms | 72.0 ms: 1.02x faster |
| create_gc_cycles | 1.47 ms | 1.44 ms: 1.02x faster |
| gc_traversal | 4.30 ms | 3.64 ms: 1.18x faster |
| generators | 76.5 ms | 75.9 ms: 1.01x faster |
| go | 138 ms | 134 ms: 1.03x faster |
| gunicorn | 1.07 ms | 1.06 ms: 1.00x faster |
| json | 4.62 ms | 4.69 ms: 1.01x slower |
| json_dumps | 9.32 ms | 9.57 ms: 1.03x slower |
| json_loads | 24.5 us | 24.2 us: 1.01x faster |
| logging_format | 6.40 us | 6.31 us: 1.01x faster |
| logging_silent | 92.8 ns | 91.7 ns: 1.01x faster |
| logging_simple | 5.76 us | 5.80 us: 1.01x slower |
| mako | 9.80 ms | 9.70 ms: 1.01x faster |
| mdp | 2.69 sec | 2.51 sec: 1.07x faster |
| pathlib | 17.7 ms | 17.9 ms: 1.01x slower |
| pickle | 10.1 us | 10.2 us: 1.02x slower |
| pickle_dict | 30.9 us | 32.4 us: 1.05x slower |
| pickle_list | 4.12 us | 4.29 us: 1.04x slower |
| pickle_pure_python | 286 us | 288 us: 1.01x slower |
| pidigits | 189 ms | 190 ms: 1.00x slower |
| pycparser | 1.15 sec | 1.09 sec: 1.05x faster |
| pyflate | 402 ms | 400 ms: 1.01x faster |
| python_startup | 8.98 ms | 8.89 ms: 1.01x faster |
| python_startup_no_site | 6.50 ms | 6.44 ms: 1.01x faster |
| raytrace | 281 ms | 284 ms: 1.01x slower |
| regex_compile | 127 ms | 128 ms: 1.01x slower |
| regex_dna | 210 ms | 201 ms: 1.05x faster |
| regex_effbot | 3.49 ms | 3.42 ms: 1.02x faster |
| regex_v8 | 22.4 ms | 21.3 ms: 1.05x faster |
| richards | 41.7 ms | 42.6 ms: 1.02x slower |
| scimark_fft | 301 ms | 303 ms: 1.01x slower |
| scimark_monte_carlo | 65.6 ms | 64.7 ms: 1.01x faster |
| scimark_sparse_mat_mult | 3.96 ms | 3.99 ms: 1.01x slower |
| spectral_norm | 95.3 ms | 96.2 ms: 1.01x slower |
| sqlglot_optimize | 51.0 ms | 50.5 ms: 1.01x faster |
| sqlglot_normalize | 105 ms | 103 ms: 1.02x faster |
| sympy_expand | 453 ms | 450 ms: 1.01x faster |
| sympy_integrate | 19.7 ms | 19.7 ms: 1.00x faster |
| sympy_sum | 154 ms | 155 ms: 1.00x slower |
| telco | 6.26 ms | 6.46 ms: 1.03x slower |
| thrift | 737 us | 748 us: 1.01x slower |
| tornado_http | 93.6 ms | 94.5 ms: 1.01x slower |
| unpack_sequence | 46.7 ns | 44.4 ns: 1.05x faster |
| unpickle | 13.2 us | 13.1 us: 1.01x faster |
| unpickle_pure_python | 197 us | 201 us: 1.02x slower |
| xml_etree_iterparse | 109 ms | 106 ms: 1.02x faster |
| xml_etree_process | 54.1 ms | 53.8 ms: 1.01x faster |
| Geometric mean | (ref) | 1.00x faster |

Benchmark hidden because not significant (33): aiohttp, async_tree_none, async_tree_cpu_io_mixed, async_tree_io, chameleon, bench_mp_pool, coverage, crypto_pyaes, dask, deepcopy, deltablue, django_template, djangocms, dulwich_log, genshi_text, genshi_xml, hexiom, html5lib, meteor_contest, mypy, nbody, nqueens, pprint_safe_repr, pprint_pformat, scimark_lu, scimark_sor, sqlglot_parse, sqlglot_transpile, sqlite_synth, sympy_str, unpickle_list, xml_etree_parse, xml_etree_generate
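For readers unfamiliar with how the "Geometric mean" row is formed, the sketch below computes per-benchmark speed ratios (base time divided by branch time) and their geometric mean for a handful of rows copied from the results above. It is a simplified illustration on a five-row subset, so its value is not the full-suite figure, and the exact aggregation pyperf applies (including how non-significant benchmarks are treated) may differ.

```python
from statistics import geometric_mean

# (base, inlinecomp) timings for a few rows of the results above.
# Units differ between rows but cancel out within each ratio.
rows = {
    "2to3": (251, 249),                    # ms
    "async_tree_memoization": (624, 656),  # ms
    "gc_traversal": (4.30, 3.64),          # ms
    "mdp": (2.69, 2.51),                   # sec
    "pickle_dict": (30.9, 32.4),           # us
}

# A ratio above 1.0 means the inlinecomp branch was faster on that benchmark.
ratios = {name: base / branch for name, (base, branch) in rows.items()}
subset_gmean = geometric_mean(list(ratios.values()))

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}x")
print(f"geometric mean of this subset: {subset_gmean:.2f}x")
```

The geometric mean is the usual choice for averaging ratios because a 1.05x speedup and a 1.05x slowdown cancel out exactly, which an arithmetic mean would not guarantee.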


* main: (225 commits)
  pythongh-102056: Fix a few bugs in error handling of exception printing code (python#102078)
  pythongh-102011: use sys.exception() instead of sys.exc_info() in docs where possible (python#102012)
  pythongh-101566: Sync with zipp 3.14. (pythonGH-102018)
  pythonGH-99818: improve the documentation for zipfile.Path and Traversable (pythonGH-101589)
  pythongh-88233: zipfile: handle extras after a zip64 extra (pythonGH-96161)
  pythongh-101981: Apply HOMEBREW related environment variables (pythongh-102074)
  pythongh-101907: Stop using `_Py_OPCODE` and `_Py_OPARG` macros (pythonGH-101912)
  pythongh-101819: Adapt _io types to heap types, batch 1 (pythonGH-101949)
  pythongh-101981: Build macOS as recommended by the devguide (pythonGH-102070)
  pythongh-97786: Fix compiler warnings in pytime.c (python#101826)
  pythongh-101578: Amend PyErr_{Set,Get}RaisedException docs (python#101962)
  Misc improvements to the float tutorial (pythonGH-102052)
  pythongh-85417: Clarify behaviour on branch cuts in cmath module (python#102046)
  pythongh-100425: Update tutorial docs related to sum() accuracy (FH-101854)
  Add missing 'is' to `cmath.log()` docstring (python#102049)
  pythongh-100210: Correct the comment link for unescaping HTML (python#100212)
  pythongh-97930: Also include subdirectory in makefile. (python#102030)
  pythongh-99735: Use required=True in argparse subparsers example (python#100927)
  Fix incorrectly documented attribute in csv docs (python#101250)
  pythonGH-84783: Make the slice object hashable (pythonGH-101264)
  ...
carljm (Member, Author) commented:

Closing this for now in favor of #101441

May reopen this approach if PEP 709 (implemented by that PR) is rejected.

carljm closed this Mar 8, 2023

Reviewers: markshannon (code owner, awaiting requested review), iritkatriel (code owner, awaiting requested review)

Labels: awaiting review, performance (Performance or resource usage)

Linked issue (may be closed by merging this pull request): Inline dict/list/set comprehensions in the compiler for better performance

4 participants: @carljm, @iritkatriel, @markshannon, @bedevere-bot
