Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

bpo-46939: Specialize calls to Python classes (POSTCALL edition)#31936

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Conversation

Fidget-Spinner
Copy link
Member

@Fidget-SpinnerFidget-Spinner commentedMar 16, 2022
edited by bedevere-bot
Loading

@Fidget-Spinner
Copy link
MemberAuthor

Final alternative method, if we don't want all the stack manipulation:PRECALL_PY_CLASS shall increfself, then set oparg ofPOST_CALL to 0 to signal it to returnself. ThenPOST_CALL accesses into above the TOS (into the already decrefed arguments) to findarg[0] (self).

Downside is that decref during error handling is tricky. Current method of pushing things onto the stack handles that for us.

@markshannon
Copy link
Member

Have you benchmarked this?
I'm worried that the extra 6 bytes per call, plus the extra code, is going to cost more than the specialization gains.

@Fidget-Spinner
Copy link
MemberAuthor

Have you benchmarked this? I'm worried that the extra 6 bytes per call, plus the extra code, is going to cost more than the specialization gains.

Not yet. I plan to do so tomorrow (your time). The results should be available by friday.

@Fidget-Spinner
Copy link
MemberAuthor

Fidget-Spinner commentedMar 18, 2022
edited
Loading

compare_to RETURN_VALUE hack PR -> POSTCALL:

Slower (13):- scimark_lu: 184 ms +- 1 ms -> 191 ms +- 3 ms: 1.04x slower- go: 228 ms +- 2 ms -> 235 ms +- 2 ms: 1.03x slower- pathlib: 30.1 ms +- 0.5 ms -> 30.9 ms +- 0.4 ms: 1.03x slower- pidigits: 302 ms +- 1 ms -> 309 ms +- 1 ms: 1.02x slower- scimark_sor: 184 ms +- 2 ms -> 187 ms +- 4 ms: 1.02x slower- logging_simple: 9.09 us +- 0.14 us -> 9.22 us +- 0.14 us: 1.01x slower- json_dumps: 19.2 ms +- 0.2 ms -> 19.4 ms +- 0.2 ms: 1.01x slower- hexiom: 11.0 ms +- 0.1 ms -> 11.1 ms +- 0.1 ms: 1.01x slower- richards: 76.1 ms +- 1.6 ms -> 77.1 ms +- 1.8 ms: 1.01x slower- logging_silent: 171 ns +- 4 ns -> 173 ns +- 4 ns: 1.01x slower- telco: 10.1 ms +- 0.2 ms -> 10.2 ms +- 0.1 ms: 1.01x slower- dulwich_log: 119 ms +- 1 ms -> 120 ms +- 1 ms: 1.01x slower- deltablue: 6.38 ms +- 0.06 ms -> 6.44 ms +- 0.08 ms: 1.01x slowerFaster (11):- unpickle: 20.1 us +- 0.2 us -> 19.6 us +- 0.1 us: 1.02x faster- sqlite_synth: 3.49 us +- 0.13 us -> 3.41 us +- 0.07 us: 1.02x faster- regex_v8: 37.6 ms +- 0.8 ms -> 36.9 ms +- 0.3 ms: 1.02x faster- nqueens: 133 ms +- 1 ms -> 130 ms +- 1 ms: 1.02x faster- unpickle_list: 7.63 us +- 0.15 us -> 7.50 us +- 0.14 us: 1.02x faster- spectral_norm: 165 ms +- 1 ms -> 162 ms +- 4 ms: 1.02x faster- django_template: 56.8 ms +- 2.0 ms -> 56.0 ms +- 0.5 ms: 1.01x faster- unpack_sequence: 74.0 ns +- 0.7 ns -> 73.0 ns +- 0.8 ns: 1.01x faster- nbody: 147 ms +- 2 ms -> 145 ms +- 1 ms: 1.01x faster- regex_dna: 334 ms +- 1 ms -> 330 ms +- 2 ms: 1.01x faster- pickle: 16.1 us +- 0.2 us -> 15.9 us +- 0.3 us: 1.01x fasterBenchmark hidden because not significant (33): 2to3, chameleon, chaos, crypto_pyaes, fannkuch, float, html5lib, json_loads, logging_format, mako, meteor_contest, pickle_dict, pickle_list, pickle_pure_python, pyflate, python_startup, python_startup_no_site, raytrace, regex_compile, regex_effbot, scimark_fft, scimark_monte_carlo, scimark_sparse_mat_mult, sympy_expand, sympy_integrate, sympy_sum, sympy_str, tornado_http, unpickle_pure_python, xml_etree_parse, xml_etree_iterparse, xml_etree_generate, xml_etree_processGeometric mean: 1.00x slower

compare_to main->POSTCALL:

Slower (9):- pidigits: 297 ms +- 1 ms -> 309 ms +- 1 ms: 1.04x slower- fannkuch: 603 ms +- 6 ms -> 626 ms +- 5 ms: 1.04x slower- logging_simple: 8.91 us +- 0.12 us -> 9.22 us +- 0.14 us: 1.03x slower- richards: 75.4 ms +- 1.8 ms -> 77.1 ms +- 1.8 ms: 1.02x slower- logging_format: 10.1 us +- 0.2 us -> 10.2 us +- 0.2 us: 1.02x slower- pickle_dict: 47.6 us +- 0.2 us -> 48.3 us +- 0.2 us: 1.02x slower- mako: 17.7 ms +- 0.1 ms -> 18.0 ms +- 0.1 ms: 1.02x slower- go: 232 ms +- 2 ms -> 235 ms +- 2 ms: 1.01x slower- unpickle_list: 7.42 us +- 0.15 us -> 7.50 us +- 0.14 us: 1.01x slowerFaster (21):- raytrace: 486 ms +- 4 ms -> 445 ms +- 3 ms: 1.09x faster- chaos: 113 ms +- 1 ms -> 105 ms +- 1 ms: 1.07x faster- float: 126 ms +- 1 ms -> 118 ms +- 1 ms: 1.07x faster- nbody: 150 ms +- 2 ms -> 145 ms +- 1 ms: 1.03x faster- chameleon: 11.3 ms +- 0.2 ms -> 10.9 ms +- 0.1 ms: 1.03x faster- unpack_sequence: 75.2 ns +- 0.9 ns -> 73.0 ns +- 0.8 ns: 1.03x faster- scimark_fft: 538 ms +- 12 ms -> 523 ms +- 3 ms: 1.03x faster- regex_effbot: 5.32 ms +- 0.06 ms -> 5.21 ms +- 0.04 ms: 1.02x faster- scimark_sor: 191 ms +- 2 ms -> 187 ms +- 4 ms: 1.02x faster- deltablue: 6.57 ms +- 0.07 ms -> 6.44 ms +- 0.08 ms: 1.02x faster- regex_dna: 336 ms +- 12 ms -> 330 ms +- 2 ms: 1.02x faster- telco: 10.4 ms +- 0.4 ms -> 10.2 ms +- 0.1 ms: 1.02x faster- dulwich_log: 123 ms +- 1 ms -> 120 ms +- 1 ms: 1.02x faster- json_loads: 42.9 us +- 0.4 us -> 42.2 us +- 0.3 us: 1.02x faster- regex_v8: 37.5 ms +- 0.8 ms -> 36.9 ms +- 0.3 ms: 1.02x faster- pyflate: 711 ms +- 9 ms -> 702 ms +- 6 ms: 1.01x faster- unpickle: 19.9 us +- 0.3 us -> 19.6 us +- 0.1 us: 1.01x faster- regex_compile: 224 ms +- 1 ms -> 221 ms +- 1 ms: 1.01x faster- spectral_norm: 164 ms +- 3 ms -> 162 ms +- 4 ms: 1.01x faster- json_dumps: 19.6 ms +- 0.3 ms -> 19.4 ms +- 0.2 ms: 1.01x faster- tornado_http: 207 ms +- 4 ms -> 205 ms +- 4 ms: 1.01x fasterBenchmark hidden because not significant (27): 2to3, crypto_pyaes, django_template, hexiom, html5lib, logging_silent, meteor_contest, nqueens, pathlib, pickle, pickle_list, pickle_pure_python, python_startup, python_startup_no_site, scimark_lu, scimark_monte_carlo, scimark_sparse_mat_mult, sqlite_synth, sympy_expand, sympy_integrate, sympy_sum, sympy_str, unpickle_pure_python, xml_etree_parse, xml_etree_iterparse, xml_etree_generate, xml_etree_processGeometric mean: 1.01x faster

@Fidget-Spinner
Copy link
MemberAuthor

Fidget-Spinner commentedMar 24, 2022
edited
Loading

@markshannon in short,POSTCALL isslightly slower than theRETURN_VALUE hack. But both approaches are still 1.01x faster thanmain.

Shall we go ahead withPOSTCALL then? I can update this PR with main if you're satisfied.

@Fidget-Spinner
Copy link
MemberAuthor

test_dis and_bootstrap_external are core developer merge race conditions :(.

@Fidget-Spinner
Copy link
MemberAuthor

Fidget-Spinner commentedApr 6, 2022
edited
Loading

If this is accepted, I have one more specialization in mind before 3.11 beta. Namely, calls to Python classes with non-Python__init__. E.g this common code pattern:

class MyOwnException(Exception): ......raise MyOwnException()   # Optimize this...

@brandtbucher
Copy link
Member

Should this be closed in favor of#99331?

@Fidget-SpinnerFidget-Spinner deleted the specialize_py_class_postcall branchJune 22, 2023 13:38
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@markshannonmarkshannonAwaiting requested review from markshannonmarkshannon is a code owner

Assignees
No one assigned
Labels
awaiting core reviewperformancePerformance or resource usage
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

7 participants
@Fidget-Spinner@markshannon@brandtbucher@iritkatriel@the-knights-who-say-ni@ezio-melotti@bedevere-bot

[8]ページ先頭

©2009-2025 Movatter.jp