Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-91049: Introduce set vectorcall field API for PyFunctionObject#92257

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged

Conversation

@adphrost
Copy link
Contributor

@adphrostadphrost commentedMay 3, 2022
edited by bedevere-bot
Loading

Avoid specializing functions with overridden vectorcall field

@bedevere-bot
Copy link

Every change to Pythonrequires a NEWS entry.

Please, add it using theblurb_it Web app or theblurb command-line tool.

@adphrost
Copy link
ContributorAuthor

@markshannon Can you let me know if there should be any other checks inceval.c besides the one added in this PR?

For context, this check fixes an edge case where calling function with a set vectorfield only worked with a unpacked tuple, e.g. f(1) failed butf(*(1,)) succeeded (now both succeed)

Copy link
Contributor

@itamaroitamaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

thanks for working on this@adphrost !

intpositional_args=total_args-KWNAMES_LEN();
// Check if the call can be inlined or not
if (Py_TYPE(function)==&PyFunction_Type&&tstate->interp->eval_frame==NULL) {
if (Py_TYPE(function)==&PyFunction_Type&&tstate->interp->eval_frame==NULL&& ((PyFunctionObject*)function)->vectorcall==_PyFunction_Vectorcall) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

this probably exceeds line length

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

What is the max line length allowed? And is there a linter I can run to handle this for me?


This PR introduces an API call where vectorcall field can be modified like so:

`void PyFunction_SetVectorcall(PyFunctionObject *func, vectorcallfunc vectorcall);`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Please document the function in Doc/c-api/function.rst and in the NEWS entry, refer to use with:

:c:func:`PyFunction_SetVectorcall`

New public functions should also be documented in What's New in Python 3.11 > C API > New features: Doc/whatsnew/3.11.rst.

Copy link
Member

@markshannonmarkshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

A few comments.
Have you attempted to measure the performance impact of this? (I'd expect it to be small or negligible).

Change how assertion is doneCo-authored-by: Itamar Ostricher <itamarost@gmail.com>
@ghost
Copy link

ghost commentedMay 3, 2022
edited by ghost
Loading

All commit authors signed the Contributor License Agreement.
CLA signed

@itamaro
Copy link
Contributor

Have you attempted to measure the performance impact of this? (I'd expect it to be small or negligible).

@markshannon is there a simple way to ask a buildbot to run pyperformance comparing this against the base revision? we can do that manually, but a buildbot doing it for us would be nice :)

@itamaro
Copy link
Contributor

circling back to@markshannon 's performance impact question. -- tldrGeometric mean: 1.00x faster

some more details:

rebased this PR ontoa4460f2 and ran pyperformance ona4460f2 and on the rebased branch

both baseline and branch were built using./configure --enable-optimizations --with-lto && make -j

used pyperformance 1.0.5 (installed directly from GitHub) on bare metal AWS machine (Linux-5.13.0-1023-aws-x86_64-with-glibc2.31 with 72 logical CPUs), and excluded the sqlalchemy benchmarks because they tried building greenlet and failed

pyperformance run --benchmarks=-sqlalchemy_declarative,-sqlalchemy_imperative --python builds/3.12-opt/python -o main.json --rigorouspyperformance run --benchmarks=-sqlalchemy_declarative,-sqlalchemy_imperative --python builds/3.12-gh-91049-vectorcall-opt/python -o gh-91049.json --rigorous

compare results:

pyperf compare_to main.json gh-91049.json -GSlower (20):- regex_effbot: 3.02 ms +- 0.01 ms -> 3.17 ms +- 0.01 ms: 1.05x slower- meteor_contest: 110 ms +- 2 ms -> 114 ms +- 4 ms: 1.04x slower- pidigits: 200 ms +- 0 ms -> 208 ms +- 0 ms: 1.04x slower- telco: 7.17 ms +- 0.21 ms -> 7.39 ms +- 0.34 ms: 1.03x slower- html5lib: 64.0 ms +- 2.7 ms -> 65.6 ms +- 2.6 ms: 1.02x slower- unpickle_list: 5.30 us +- 0.08 us -> 5.40 us +- 0.05 us: 1.02x slower- pyflate: 430 ms +- 3 ms -> 438 ms +- 4 ms: 1.02x slower- sympy_sum: 168 ms +- 2 ms -> 171 ms +- 2 ms: 1.02x slower- xml_etree_iterparse: 110 ms +- 1 ms -> 112 ms +- 3 ms: 1.02x slower- async_tree_memoization: 640 ms +- 25 ms -> 650 ms +- 25 ms: 1.01x slower- chaos: 71.8 ms +- 0.7 ms -> 72.8 ms +- 0.6 ms: 1.01x slower- hexiom: 6.71 ms +- 0.06 ms -> 6.80 ms +- 0.06 ms: 1.01x slower- sympy_str: 299 ms +- 3 ms -> 302 ms +- 4 ms: 1.01x slower- spectral_norm: 101 ms +- 1 ms -> 103 ms +- 1 ms: 1.01x slower- sympy_integrate: 21.5 ms +- 0.2 ms -> 21.8 ms +- 0.2 ms: 1.01x slower- logging_silent: 105 ns +- 1 ns -> 107 ns +- 3 ns: 1.01x slower- raytrace: 303 ms +- 2 ms -> 306 ms +- 4 ms: 1.01x slower- regex_v8: 23.1 ms +- 0.3 ms -> 23.3 ms +- 0.1 ms: 1.01x slower- unpickle_pure_python: 239 us +- 2 us -> 240 us +- 2 us: 1.01x slower- 2to3: 282 ms +- 2 ms -> 283 ms +- 1 ms: 1.00x slowerFaster (25):- genshi_xml: 57.1 ms +- 2.7 ms -> 54.3 ms +- 1.8 ms: 1.05x faster- scimark_sparse_mat_mult: 4.93 ms +- 0.06 ms -> 4.70 ms +- 0.10 ms: 1.05x faster- scimark_lu: 112 ms +- 2 ms -> 109 ms +- 2 ms: 1.03x faster- json_dumps: 13.3 ms +- 0.2 ms -> 12.9 ms +- 0.1 ms: 1.03x faster- scimark_fft: 337 ms +- 3 ms -> 331 ms +- 3 ms: 1.02x faster- nbody: 97.6 ms +- 1.8 ms -> 95.9 ms +- 1.8 ms: 1.02x faster- pathlib: 23.0 ms +- 1.2 ms -> 22.6 ms +- 0.8 ms: 1.02x faster- pickle_pure_python: 324 us +- 3 us -> 318 us +- 3 us: 1.02x faster- genshi_text: 23.2 ms +- 0.4 ms -> 22.8 ms +- 0.3 ms: 1.02x faster- unpack_sequence: 46.5 ns +- 0.6 ns -> 45.7 ns +- 0.7 ns: 1.02x faster- fannkuch: 401 ms +- 6 ms -> 396 ms +- 4 ms: 1.01x faster- chameleon: 7.06 ms +- 0.06 ms -> 6.98 ms +- 0.08 ms: 1.01x faster- pickle_dict: 27.5 us +- 0.1 us -> 27.2 us +- 1.0 us: 1.01x faster- pickle: 10.2 us +- 0.1 us -> 10.1 us +- 0.1 us: 1.01x faster- xml_etree_parse: 170 ms +- 2 ms -> 168 ms +- 3 ms: 1.01x faster- xml_etree_process: 56.8 ms +- 0.5 ms -> 56.3 ms +- 0.5 ms: 1.01x faster- nqueens: 88.0 ms +- 1.1 ms -> 87.2 ms +- 1.1 ms: 1.01x faster- json_loads: 26.3 us +- 0.3 us -> 26.1 us +- 0.2 us: 1.01x faster- async_tree_none: 549 ms +- 15 ms -> 544 ms +- 12 ms: 1.01x faster- async_tree_cpu_io_mixed: 771 ms +- 13 ms -> 766 ms +- 10 ms: 1.01x faster- mako: 10.5 ms +- 0.0 ms -> 10.4 ms +- 0.1 ms: 1.01x faster- crypto_pyaes: 77.2 ms +- 0.8 ms -> 76.8 ms +- 0.9 ms: 1.01x faster- xml_etree_generate: 81.2 ms +- 0.6 ms -> 80.8 ms +- 0.5 ms: 1.00x faster- regex_compile: 142 ms +- 1 ms -> 141 ms +- 1 ms: 1.00x faster- scimark_sor: 121 ms +- 1 ms -> 120 ms +- 1 ms: 1.00x fasterBenchmark hidden because not significant (18): async_tree_io, deltablue, django_template, dulwich_log, float, go, logging_format, logging_simple, pickle_list, python_startup, python_startup_no_site, regex_dna, richards, scimark_monte_carlo, sqlite_synth, sympy_expand, tornado_http, unpickleGeometric mean: 1.00x faster

thanks@ansley for doing the perf analysis on this :)

Copy link
Member

@vstinnervstinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Oh. This change reminds me my oldhttps://peps.python.org/pep-0510/ nice :-)

staticPyObject*
override_vectorcall(
PyObject*callable,PyObject*const*args,size_tnargsf,PyObject*kwnames) {
Py_RETURN_NONE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Could you modify the test to return the string "overridden" instead of just returning None?

(Contributed by Wenzel Jakob in:gh:`93012`.)

* Add new function:c:func:`PyFunction_SetVectorcall` to the C API
which sets the vectorcall field of a given PyFunctionObject
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

You may explain the purpose of the function here, like mention Cinder JIT and Pyjion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I'd rather not have a "purpose". We should state clearly what the function does, not bless some particular use case.

@adphrost
Could you add a warning that the new function pointer must haveexactly the same behavior as the unaltered function.
I am a little concerned that C extensions will abuse this, and we will have deal with the bug reports.

CPython extensions providing optimized execution of Python bytecode (like Cinder JIT and Pyjion)
can benefit from being able to modify the vectorcall field on instances of PyFunctionObject to allow calling the optimized path (e.g. JIT-compiled) directly.

This PR introduces an API call where vectorcall field can be modified like so:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This text lands intohttps://docs.python.org/dev/whatsnew/changelog.html#changelog which is a changlog, you should not mention "a PR" there.

I suggest to just copy/paste what you wrote in whatsnew.

(Contributed by Wenzel Jakob in:gh:`93012`.)

* Add new function:c:func:`PyFunction_SetVectorcall` to the C API
which sets the vectorcall field of a given PyFunctionObject
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Suggested change
which sets the vectorcall field of a given PyFunctionObject
which sets the vectorcall field of a given:c:type:`PyFunctionObject`.

@itamaro
Copy link
Contributor

@vstinner@markshannon I think all comments were addressed - any further feedback?

..c:function::voidPyFunction_SetVectorcall(PyFunctionObject *func, vectorcallfunc vectorcall)
Set the vectorcall field of a given function object *func*
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Suggested change
Set the vectorcall field of a given function object *func*
Set the vectorcall field of a given function object *func*.

(Contributed by Wenzel Jakob in:gh:`93012`.)

* Add new function:c:func:`PyFunction_SetVectorcall` to the C API
which sets the vectorcall field of a given:c:type:`PyFunctionObject`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Suggested change
which sets the vectorcall field of a given:c:type:`PyFunctionObject`
which sets the vectorcall field of a given:c:type:`PyFunctionObject`.

* Add new function:c:func:`PyFunction_SetVectorcall` to the C API
which sets the vectorcall field of a given:c:type:`PyFunctionObject`
Warning: extensions using this API must preserve the behavior
of the unaltered function!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I don't think that this document is the right place to put a warning. Why this warning is not in the documentation of the function?

which sets the vectorcall field of a given:c:type:`PyFunctionObject`

Warning: extensions using this API must preserve the behavior
of the unaltered function!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Please remove this warning.

@@ -0,0 +1,5 @@
Add new function:c:func:`PyFunction_SetVectorcall` to the C API
which sets the vectorcall field of a given:c:type:`PyFunctionObject`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Suggested change
which sets the vectorcall field of a given:c:type:`PyFunctionObject`
which sets the vectorcall field of a given:c:type:`PyFunctionObject`.

@itamaro
Copy link
Contributor

@vstinner@markshannon thanks for the feedback. with@markshannon 's last response I think the remaining feedback is minor styling and phrasing. if that's all that left for this to be merged, I'm happy to perform these changes, but also totally ok with a core dev making these last changes when merging 😄

@itamaro
Copy link
Contributor

pushed a fix toLOAD_ATTR_GETATTRIBUTE_OVERRIDDEN to deopt on function version instead of argcount

@markshannon I also reviewed the interpreter loop and didn't found any other cases that inline calls without deopting on function version. the methodology I used is searching for occurrences ofinlined_py_calls - let me know if this should cover everything!

(btw, I noticed thatBINARY_SUBSCR_GETITEM containsCALL_STAT_INC(inlined_py_calls) twice - is that intentional?)

Copy link
Member

@Fidget-SpinnerFidget-Spinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

The parts touching the specializer and specialized opcodes LGTM.

@Fidget-Spinner
Copy link
Member

@itamaro

(btw, I noticed that BINARY_SUBSCR_GETITEM contains CALL_STAT_INC(inlined_py_calls) twice - is that intentional?)

No that's wrong.

Copy link
Member

@markshannonmarkshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

One minor formatting nit. Otherwise looks good.

@itamaro
Copy link
Contributor

(btw, I noticed that BINARY_SUBSCR_GETITEM contains CALL_STAT_INC(inlined_py_calls) twice - is that intentional?)

No that's wrong.

I'll file a separate issue for that

@markshannon
Copy link
Member

markshannon commentedSep 7, 2022
edited
Loading

All looks good to me.
@vstinner anything else before I merge this?

Copy link
Member

@Fidget-SpinnerFidget-Spinner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

The test LGTM in general but I have a few comments. Thanks!

itamaro reacted with thumbs up emoji
1. test specialization doesn't trigger when vectorcall is already overridden2. test specialization gets deopted when vectorcall is overridden after specialization triggered
@gvanrossum
Copy link
Member

Almost ready. Who's on the hook to get this merged?

itamaro reacted with thumbs up emoji

Copy link
Member

@markshannonmarkshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Looks good now.

itamaro reacted with thumbs up emoji
@markshannon
Copy link
Member

I'll merge once the conflict has been resolved.

@itamaro
Copy link
Contributor

I'll merge once the conflict has been resolved.

resolved!

@markshannonmarkshannon merged commita41ed97 intopython:mainSep 15, 2022
@vstinner
Copy link
Member

PyFunction_SetVectorcall() is really a cool feature, I love it! I'm not sure yet how people will manage to abuse it, but I love it :-D

itamaro reacted with heart emojitonybaloney reacted with rocket emoji

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@vstinnervstinnervstinner left review comments

@itamaroitamaroitamaro left review comments

@Fidget-SpinnerFidget-SpinnerFidget-Spinner left review comments

@markshannonmarkshannonmarkshannon approved these changes

@erlend-aaslanderlend-aaslanderlend-aasland approved these changes

Assignees

No one assigned

Labels

None yet

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

9 participants

@adphrost@bedevere-bot@itamaro@markshannon@carljm@Fidget-Spinner@gvanrossum@vstinner@erlend-aasland

[8]ページ先頭

©2009-2025 Movatter.jp