Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-143361: Pass PY_VECTORCALL_ARGUMENTS_OFFSET in _Py_CallBuiltinClass_StackRefSteal#143367

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged

Conversation

@a12k
Copy link
Contributor

@a12ka12k commentedJan 2, 2026

This PR fixes a missed optimization.

The function_Py_CallBuiltinClass_StackRefSteal usesSTACKREFS_TO_PYOBJECTS. However, the code was not passing thePY_VECTORCALL_ARGUMENTS_OFFSET flag to the callee. This forced vector-callable types to reallocate and copy the arguments whenever they needed to prepend an argument.

Verification
Using LLDB to inspectlong_vectorcall when reached via this path:

  • Before fix:nargsf = 1
  • After fix:nargsf = 0x8000000000000001 (High bit correctly set)

Benchmarks
I ran a small benchmark file allocating a class in a tight loop to trigger the path:

  • Baseline: 1.6193s
  • Optimized: 1.5560s
  • Result: ~3.9% speedup in this specific specialized call path.

gh-143361

Benchmark script
importtimeclassWorker:def__init__(self,a,b,c):self.a=aself.b=bself.c=cdefrun_benchmark(iterations):cls=Workerstart=time.perf_counter()for_inrange(iterations):cls(1,2,3)cls(4,5,6)cls(7,8,9)cls(10,11,12)cls(13,14,15)returntime.perf_counter()-startrun_benchmark(10000)#warm it upITERS=1_000_000print(f"Time:{run_benchmark(ITERS):.4f}s")</details>

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I don't think we need this, this is internal.

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I mentally went back and forth on this. It's true that guideline is that "strictly internal changes with no user-visible effects" do not warrant an entry. My reasoning for inclusion was that a 3.9% speedup in a core code path is indeed significant enough to external, end-users that this change warrants an entry. It also seems to be how this sort of change inceval.c. has been handled in the past. However, I’ll gladly defer to the guidance of the core developers who review the PR.

hauntsaninja reacted with thumbs up emoji
@Fidget-SpinnerFidget-Spinner merged commitb538c28 intopython:mainJan 2, 2026
93 of 95 checks passed
@a12ka12k deleted the fix-vectorcall-offset-ceval branchJanuary 2, 2026 20:53
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@StanFromIrelandStanFromIrelandStanFromIreland left review comments

@Fidget-SpinnerFidget-SpinnerFidget-Spinner approved these changes

@markshannonmarkshannonAwaiting requested review from markshannonmarkshannon is a code owner

Assignees

No one assigned

Labels

None yet

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

3 participants

@a12k@Fidget-Spinner@StanFromIreland

[8]ページ先頭

©2009-2026 Movatter.jp