Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-90370: Avoid temporaryvarargs tuple creation in argument passing#30312

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Closed
colorfulappl wants to merge27 commits intopython:mainfromcolorfulappl:opt_ac

Conversation

colorfulappl
Copy link
Contributor

@colorfulapplcolorfulappl commentedDec 31, 2021
edited by erlend-aasland
Loading

When "Augument Clinic generated code" are parsing arguments, the args are packed to a tuple before passing to callee. This may be unnecessary.

Pass a raw pointer which points to on-stack varargs, and a varargssize integer to indicate how many varargs are passed, can save the time of tuple creation/destruction and value copy.

arhadthedev, dgiger42, and erlend-aasland reacted with thumbs up emojieendebakpt reacted with eyes emoji
@the-knights-who-say-ni

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept this contribution by verifying everyone involved has signed thePSF contributor agreement (CLA).

CLA Missing

Our records indicate the following people have not signed the CLA:

@colorfulappl

For legal reasons we need all the people listed to sign the CLA before we can look at your contribution. Please followthe steps outlined in the CPython devguide to rectify this issue.

If you have recently signed the CLA, please wait at least one business day
before our records are updated.

You cancheck yourself to see if the CLA has been received.

Thanks again for the contribution, we look forward to reviewing it!

Copy link
Member

@sweeneydesweeneyde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Just a couple of comments here.

I wonder if it would be simpler to restrict the scope of such a change to only signatures likedef f(*args, kw1=<default1>, kw2=<default2>, ...), likeprint()/min()/max() have.

@@ -3341,16 +3342,16 @@ test_vararg_and_posonly(PyObject *module, PyObject *const *args, Py_ssize_t narg
for (Py_ssize_t i = 0; i < nargs - 1; ++i) {
PyTuple_SET_ITEM(__clinic_args, i, args[1 + i]);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This seems suspicious -- isn't __clinic_args a pointer-to-PyObject-pointer, not a pointer-to-tuple?

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

My fault. The tuple doesn't need to be created.

if (!args) {
goto exit;
}
a = args[0];
__clinic_args = args[1];
return_value = test_vararg_impl(module, a, __clinic_args);
__clinic_args = (PyObject *const *)args[1];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

It would be nice to avoid such unsafe casts. Maybe passing aPyObject *** into_PyArg_UnpackKeywordsWithVarargFast instead?

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This copy and cast are unnecessary.
Maybe__clinic_args = args + 1; is enough, note theargs refers to the 2nd argument of function test_vararg, assuming it's not assigned in line 3389.

buf[vararg] = (PyObject *)&args[vararg];

/* copy required positional args */
for (i = 0; i < vararg; i++) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Isn't it possible to avoid copying at all by directly passing some sub-buffer of the*args buffer into the..._impl function?

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

You are right. None of the positional args or variable-length args need to be copied. I will let _PyArg_UnpackKeywordsWithVarargFast parse default and keyword args only.

@isidenticalisidentical self-requested a reviewJanuary 21, 2022 14:04
@isidentical
Copy link
Member

Thanks for the PR and benchmarks@colorfulappl, I am current blocked a bit but plan to review this soon!

@colorfulappl
Copy link
ContributorAuthor

Thanks for the PR and benchmarks@colorfulappl, I am current blocked a bit but plan to review this soon!

Thank you@isidentical !
With@sweeneyde 's comment I have found that my patch is actually buggy.
I am writing a new patch but blocked by some affairs last a few days.
You can review after I push a new commit and ignore this version. Sorry :)

isidentical reacted with thumbs up emoji

@isidentical
Copy link
Member

Sure, then just ping me whenever the new patch ready and I'll try to take a look at it then!

colorfulappl reacted with thumbs up emoji

@colorfulappl
Copy link
ContributorAuthor

Sorry for such a long delay.

When generate parsing code for function with varargs, this PR:

  1. Introduce a new function_PyArg_UnpackKeywordsWithVarargKwonly.
    It's a copy of_PyArg_UnpackKeywordsWithVararg but copying of positional args and packing of varargs are deleted.
    _PyArg_UnpackKeywordsWithVarargKwonly returns an array pointer of parsed keyword args only.
  2. Positional args are not copied toargsbuf.argsbuf only saves parsed keyword args.
  3. Passvarargsize and vararg array pointer to XXXX_impl, this avoid pack/unpack of varargs.

I'd really appreciate it if you could take a look and give me some advice@isidentical .

@colorfulappl
Copy link
ContributorAuthor

Benchmark Result

Benchmark:
https://bugs.python.org/file50533

Environment:
OS: macOS 12.3
CPU: Apple M1 Pro
Compiler: Apple clang version 13.1.6
Build args: --enable-optimizations --with-lto

+-----------------------------------------------------+---------+-----------------------+| Benchmark                                           | base    | opt                   |+=====================================================+=========+=======================+| print(a)                                            | 133 ns  | 126 ns: 1.06x faster  |+-----------------------------------------------------+---------+-----------------------+| print(a, b, c)                                      | 337 ns  | 326 ns: 1.04x faster  |+-----------------------------------------------------+---------+-----------------------+| print(a, b, c, *v)                                  | 679 ns  | 661 ns: 1.03x faster  |+-----------------------------------------------------+---------+-----------------------+| print(a, sep='', file=stdout)                       | 134 ns  | 125 ns: 1.07x faster  |+-----------------------------------------------------+---------+-----------------------+| print(a, sep='', flush=True, file=stdout)           | 571 ns  | 563 ns: 1.01x faster  |+-----------------------------------------------------+---------+-----------------------+| print(*v, sep='', flush=True, file=stdout)          | 833 ns  | 819 ns: 1.02x faster  |+-----------------------------------------------------+---------+-----------------------+| print(a, b, c, *v, sep='', flush=True, file=stdout) | 1.18 us | 1.16 us: 1.02x faster |+-----------------------------------------------------+---------+-----------------------+| Geometric mean                                      | (ref)   | 1.03x faster          |+-----------------------------------------------------+---------+-----------------------+

@colorfulappl
Copy link
ContributorAuthor

I found there are some bugs when argument clinic processes varargs and made a fix in#32092.
Maybe this patch should wait until these bugs are fixed?

erlend-aasland reacted with thumbs up emoji

@arhadthedev

This comment was marked as outdated.

@colorfulappl
Copy link
ContributorAuthor

we prefer keeping all of the PR history since it makes reviewing easier, so please do not force-push.

Thanks for the advice; I will pay attention to it.

I added some test cases for the class method__new__ (without actually creating the objects, only packing and returning the arguments). And I found that when the__new__ or__init__ function accepts a vararg only, the argument clinic generated code does not check keyword arguments.

e.g.

# Signature of this function:## @classmethod# _testclinic.TestClassVarargOnly.__new__(cls, *args)test_class_new_vararg_only(kw=1)# Here we expect an error like "TypeError: TestClassVarargOnly.__new__() got an unexpected keyword argument 'kw'",# but nothing raised.

We may need to fix it in another issue.

erlend-aasland reacted with thumbs up emoji

@colorfulappl

This comment was marked as outdated.

@arhadthedev

This comment was marked as outdated.

max_pos
), indent=4))
p.converter.parser_name,
p.name,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I would expect herep.converter.name instead.

I'm trying your patch for converting some vararg functions in the math module (see theold conversion attempt), and it seems that without this change autogenerated code is broken, because it references a Python parameter name instead of C (i.e.,coordinates, instead ofargs):

staticPyObject*math_hypot_impl(PyObject*module,Py_ssize_tvarargssize,PyObject*const*args);staticPyObject*math_hypot(PyObject*module,PyObject*const*args,Py_ssize_tnargs){PyObject*return_value=NULL;Py_ssize_tvarargssize=nargs;PyObject*const*__clinic_args;if (!_PyArg_CheckPositional("hypot",nargs,0,PY_SSIZE_T_MAX)) {        gotoexit;    }__clinic_args=coordinates+0;return_value=math_hypot_impl(module,varargssize,__clinic_args);exit:returnreturn_value;}

Seedocs. BTW, I think there should be a test for that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Hardcoding "args" will work too, but I'm not sure it's a good idea.

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Thanks for the advice.
When the function signature isMETH_FASTCALL, the pass-in parameter is always namedargs, so replacingp.name withargs works.
I made a fix and added two tests. It looks correct now.

def test_vararg_with_default_with_other_name(self):
with self.assertRaises(TypeError):
ac_tester.vararg_with_default_with_other_name()
self.assertEqual(ac_tester.vararg_with_default_with_other_name(1, b=False), (1, (), False))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

You could add an exception test with_b.

@skirpichev
Copy link
Contributor

Not sure, if this is a good idea, but could you consider using nargs instead of varargsize for *_impl() functions? This will reduce needs for changing the body of converted function (take math_gcd as an example to convert from scratch).

@colorfulappl
Copy link
ContributorAuthor

When "Augument Clinic generated code" are parsing arguments, the args are packed to a tuple before passing to callee. This may be unnecessary.

This patch intends to reduce the overhead on passingvararg from Argument Clinic generated code to user-implemented *.impl() function.

Before this patch,vararg arguments are packed to a tuple, *_impl() function fetches these arguments by access the tuple items. In such a case, the number ofvararg is implied in the tuple object.
After this patch, a pointer which points tovararg array is passed to *_impl() function directly, so thevarargssize is necessary to tell the user how manyvararg arguments is received.

but could you consider using nargs instead of varargsize for *_impl() functions?

@skirpichev Do you mean pass bothnargs andmax_pos to *_impl() function and let the user to calculate the number ofvararg arguments? This may be confusing for users, and violate the purpose of Argument Clinic (Argument Clinic How-To - Abstract):

Argument Clinic is a preprocessor for CPython C files. Its purpose is to automate all the boilerplate involved with writing argument parsing code for “builtins”.

@skirpichev
Copy link
Contributor

Indeed, my suggestion only valid for only-vararg case (as gcd, lcm and so on). Not sure if this case worth a special treatment...

@erlend-aaslanderlend-aasland changed the titlebpo-46212: Avoid temporaryvarargs tuple creation in argument passinggh-90370: Avoid temporaryvarargs tuple creation in argument passingFeb 27, 2023
@skirpichev
Copy link
Contributor

@colorfulappl, could you fix merge conflicts?

@skirpichevskirpichev added the staleStale PR or inactive for long period of time. labelOct 26, 2024
@skirpichev
Copy link
Contributor

Thanks for your work, but this has merge conflicts. Also, issue seems to be fixed. I'm closing this.

@skirpichevskirpichev removed the staleStale PR or inactive for long period of time. labelNov 8, 2024
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@skirpichevskirpichevskirpichev requested changes

@sweeneydesweeneydesweeneyde left review comments

@isidenticalisidenticalAwaiting requested review from isidentical

Assignees
No one assigned
Labels
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

12 participants
@colorfulappl@the-knights-who-say-ni@isidentical@hauntsaninja@erlend-aasland@arhadthedev@skirpichev@sweeneyde@serhiy-storchaka@ezio-melotti@bedevere-bot@AlexWaygood

[8]ページ先頭

©2009-2025 Movatter.jp