Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

GH-133231: Changes to executor management to support proposedsys._jit module#133287

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
markshannon merged 8 commits intopython:mainfromfaster-cpython:current-executor
May 4, 2025

Conversation

markshannon
Copy link
Member

@markshannonmarkshannon commentedMay 2, 2025
edited by bedevere-appbot
Loading

This PR:

  • Tracks the current executor on the thread-state, not the previous one
  • Sets the current executor to NULL when entering PyEval_EvalDefault and resetting it when leaving.
  • Batch executors for deallocation to avoid having to constantly incref executors; this is an ad-hoc form of deferred reference counting.

Should simplify the implementation ofsys._jit.is_enabled() as it is simplytstate->current_executor != NULL and doesn't require additional state to be tracked.

… Batch executors for deallocation to avoid having to constantly incref executors; this is an ad-hoc form of deferred reference counting.
@markshannon
Copy link
MemberAuthor

The msvc jit failure might be fixed by#132746

@Fidget-Spinner
Copy link
Member

The msvc jit failure might be fixed by#132746

Wow I did not expect that to fix it.

Copy link
Member

@brandtbucherbrandtbucher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I think the overall approach is neat, but I think there are a couple of potential issues (and possible cleanups) here.

Long story short, anything that uses theentry_frame for this purpose is going to run into issues with the tail-calling interpreter (whether it's your approach, or mine). It's a hard problem.

Comment on lines +993 to +996
typedef struct {
_PyInterpreterFrame frame;
_PyStackRef stack[1];
} _PyEntryFrame;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Is this needed? We only need one slot inlocalsplus, which_PyInterpreterFrame already has.

Copy link
MemberAuthor

@markshannonmarkshannonMay 3, 2025
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

We now need two slots. One on the stack for the return value, as before, and one local variable to hold the current executor.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Ah, I see. Forgot about the return value.

Maybe add a comment that the one stack slot is for the return value, and the one local is for the executor.

@bedevere-app
Copy link

When you're done making the requested changes, leave the comment:I have made the requested changes; please review again.

@brandtbucher
Copy link
Member

Idead: if we have a structstruct { PyExecutorObject *executor, struct node *prev} node;, we can stack allocate one of these in everyGOTO_TIER_TWO call and restore it afterwards. Then keep the top on thetstate.

Something like this:

diff --git a/Python/ceval_macros.h b/Python/ceval_macros.hindex e1d2673848c..c77c2cafee7 100644--- a/Python/ceval_macros.h+++ b/Python/ceval_macros.h@@ -360,10 +360,13 @@ do {                                                   \     OPT_STAT_INC(traces_executed);                     \     _PyExecutorObject *_executor = (EXECUTOR);         \     jit_func jitted = _executor->jit_code;             \+    node prev = tstate->current_executor;              \+    tstate->current_executor = (node){_executor, &prev}; \     /* Keep the shim frame alive via the executor: */  \     Py_INCREF(_executor);                              \     next_instr = jitted(frame, stack_pointer, tstate); \     Py_DECREF(_executor);                              \+    tstate->current_executor = prev;                   \     Py_CLEAR(tstate->previous_executor);               \     frame = tstate->current_frame;                     \     stack_pointer = _PyFrame_GetStackPointer(frame);   \@@ -379,6 +382,9 @@ do { \     OPT_STAT_INC(traces_executed); \     next_uop = (EXECUTOR)->trace; \     assert(next_uop->opcode == _START_EXECUTOR); \+    /* prev is declared local to the tier two interpreter: */ \+    prev = tstate->current_executor; \+    tstate->current_executor = (node){_executor, &prev}; \     goto enter_tier_two; \ } while (0) #endif@@ -386,6 +392,7 @@ do { \ #define GOTO_TIER_ONE(TARGET)                                         \     do                                                                \     {                                                                 \+        tstate->current_executor = prev;                              \         next_instr = (TARGET);                                        \         OPT_HIST(trace_uop_execution_counter, trace_run_length_hist); \         _PyFrame_SetStackPointer(frame, stack_pointer);               \

No entry frame needed.

@markshannon
Copy link
MemberAuthor

I don't see how that would work, as macros don't have scope. But its moot anyway as wecan use the entry frame.

The entry frame exists in the tail-calling interpreter, it is theentry variable that we can't use. But we don't need to. Anywhere wereturn, either inINTERPRETER_EXIT orexit_unwind, the frame is the entry frame:frame == &entry.frame.

@markshannon
Copy link
MemberAuthor

I have made the requested changes; please review again

@bedevere-app
Copy link

Thanks for making the requested changes!

@brandtbucher: please review the changes made to this pull request.

@brandtbucher
Copy link
Member

I don't see how that would work, as macros don't have scope.

I mean, it would just be a local declared in the_PyEval_EvalFrameDefault itself for the tail-calling interpreter, and would become local to whatever bytecode is using the macro on other builds.

But as you say...

But its moot anyway as we can use the entry frame.

The entry frame exists in the tail-calling interpreter, it is theentry variable that we can't use. But we don't need to. Anywhere wereturn, either inINTERPRETER_EXIT orexit_unwind, the frame is the entry frame:frame == &entry.frame.

Ah, this is what I missed. Makes sense: we're always able to set it when entering because we haveentry handy. And we're always able to reset it when exiting because it's just the current frame. Thanks for explaining.

Copy link
Member

@brandtbucherbrandtbucher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Looks good, thanks. I'm just going to kick off some JIT benchmarks to be safe.

next_instr = (TARGET); \
assert(tstate->current_executor == NULL); \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Minor: this seems redundant, given that we're setting it one line above.

__attribute__((musttail)) return jitted(frame, stack_pointer, tstate); \
} while (0)

#undef GOTO_TIER_ONE
#define GOTO_TIER_ONE(TARGET) \
do { \
tstate->current_executor = NULL; \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

You can either do this here, or in the second half of the_Py_JIT implementation ofGOTO_TIER_TWO inceval_macros.h, after the JIT code returns. No need to do it in both places.

markshannon reacted with thumbs up emoji
@@ -5316,7 +5322,6 @@ dummy_func(
}

tier2 op(_START_EXECUTOR, (executor/4 --)) {
Copy link
Member

@brandtbucherbrandtbucherMay 3, 2025
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Side note, this instruction now exists solely to set thecurrent_executor in the tier two interpreter. We should try to remove it as a follow-up (doesn't need to be in this PR).

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

We can replace all uses ofcurrent_executor with an operand, which will remove thecurrent_executor local variable from the tier 2 interpreter and the_JIT_EXECUTOR patch from the JIT template.

We can also shrink_CHECK_VALIDITY a bit by replacingcurrent_executor->vm_data.valid with*validity_ptr:

tier2op(_CHECK_VALIDITY, (validity_ptr/4--)) {DEOPT_IF(*validity_ptr==0);}

For another PR.

@brandtbucher
Copy link
Member

Looks fine.

@markshannonmarkshannon merged commitac7d5ba intopython:mainMay 4, 2025
71 checks passed
diegorusso added a commit to diegorusso/cpython that referenced this pull requestMay 4, 2025
* origin/main: (111 commits)pythongh-91048: Add filename and line number to external inspection routines (pythonGH-133385)pythongh-131178: Add tests for `ast` command-line interface (python#133329)  Regenerate pcbuild.sln in Visual Studio 2022 (python#133394)pythongh-133042: disable HACL* HMAC on Emscripten (python#133064)pythongh-133351: Fix remote PDB's multi-line block tab completion (python#133387)pythongh-109700: Improve stress tests for interpreter creation (pythonGH-109946)pythongh-81793: Skip tests for os.link() to symlink on Android (pythonGH-133388)pythongh-126835: Rename `ast_opt.c` to `ast_preprocess.c` and related stuff after moving const folding to the peephole optimizier (python#131830)pythongh-91048: Relax test_async_global_awaited_by to fix flakyness (python#133368)pythongh-132457: make staticmethod and classmethod generic (python#132460)pythongh-132805: annotationlib: Fix handling of non-constant values in FORWARDREF (python#132812)pythongh-132426: Add get_annotate_from_class_namespace replacing get_annotate_function (python#132490)pythongh-81793: Always call linkat() from os.link(), if available (pythonGH-132517)pythongh-122559: Synchronize C and Python implementation of the io module about pickling (pythonGH-122628)pythongh-69605: Add PyREPL import autocomplete feature to 'What's New' (python#133358)  bpo-44172: Keep reference to original window in curses subwindow objects (pythonGH-26226)pythonGH-133231: Changes to executor management to support proposed `sys._jit` module (pythonGH-133287)pythongh-133363: Fix Cmd completion for lines beginning with `! ` (python#133364)pythongh-132983: Introduce `_zstd` bindings module (pythonGH-133027)pythonGH-91048: Add utils for printing the call stack for asyncio tasks (python#133284)  ...
@markshannonmarkshannon deleted the current-executor branchMay 5, 2025 11:53
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@brandtbucherbrandtbucherbrandtbucher approved these changes

@ericsnowcurrentlyericsnowcurrentlyAwaiting requested review from ericsnowcurrentlyericsnowcurrently is a code owner

@savannahostrowskisavannahostrowskiAwaiting requested review from savannahostrowskisavannahostrowski is a code owner

Assignees
No one assigned
Labels
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

3 participants
@markshannon@Fidget-Spinner@brandtbucher

[8]ページ先頭

©2009-2025 Movatter.jp