Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

GH-128375: Better instrument forFOR_ITER#128445

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged

Conversation

markshannon
Copy link
Member

@markshannonmarkshannon commentedJan 3, 2025
edited by bedevere-appbot
Loading

This PR:

  • Changes thePOP_TOP and the end of afor loop toPOP_ITER (it pops the iterator)
  • Removes theNOT_TAKEN followingFOR_ITER and instruments theFOR_ITER andPOP_ITER instead
  • Fixes handling ofEXTENDED_ARGS inco_branches().

The last one isn't strictly related to the issue, but came up while fixing the main issue.

Changing the instrumentation pattern has two effects:

  • For uninstrumented code, there is one less instruction dispatch (theNOT_TAKEN) per iteration, at the cost of one additional dispatch when the loop is exited.
  • For instrumented code, once the non-exit branch isDISABLEd, theFOR_ITER can be specialized and jitted.

Performance is neutral on the standard benchmarks for tier1, as expected.

Comment on lines 908 to 910
code_offset_to_line(code, src)-base,
code_offset_to_line(code, left)-base,
code_offset_to_line(code, right)-base
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Suggested change
code_offset_to_line(code,src)-base,
code_offset_to_line(code,left)-base,
code_offset_to_line(code,right)-base
code_offset_to_line(code,src)-base,
code_offset_to_line(code,left)-base,
code_offset_to_line(code,right)-base,


tier1 inst(INSTRUMENTED_END_FOR, (receiver, value -- receiver)) {
/* Need to create a fake StopIteration error here,
* to conform to PEP 380 */
//* to conform to PEP 380ED_NO*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

?

* This has the benign side effect that if value is
* finalized it will see the location as the FOR_ITER's.
*/
frame->instr_ptr = prev_instr;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Is there a precedent for this? Seems like the kind of thing that can cause confusion/problems later on.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

The tier 2 code generator doesn't like it either. I think I'll need to add a new annotation "no_saved_ip" to tell the code generator not to updateframe->instr_ptr.

Any suggestions for a better name than "no_saved_ip"?

@iritkatriel
Copy link
Member

I was wondering if we couldn't, instead of putting inNOT_TAKEN instructions, have a static table on the code object with the fallthrough locations. They are only needed for instrumentation, so could be processed only when instrumentation is enabled.

@nedbat
Copy link
Member

Trying this PR compared tofa985be, more of my tests fail (on a work-in-progress branch to account for BRANCH_LEFT etc). I'll debug my code, but I wish there were a way to get a concrete plan to work from.

@markshannon
Copy link
MemberAuthor

I was wondering if we couldn't, instead of putting in NOT_TAKEN instructions, have a static table on the code object with the fallthrough locations. They are only needed for instrumentation, so could be processed only when instrumentation is enabled.

I think that would be worse overall, for a few reasons:

  • We need to be able to disable the left and right events separately. Without aNOT_TAKEN instruction, eachPOP_JUMP... would needPOP_JUMP...INSTRUMENTED,POP_JUMP...INSTRUMENTED_LEFT_ONLY, andPOP_JUMP...INSTRUMENTED_RIGHT_ONLY.
    With the current 4POP_JUMP variants that would be 12 instructions instead of the 6 we need withNOT_TAKEN.
  • Such a table would be a lot bigger than extra space taken by theNOT_TAKEN instructions
  • It wouldn't be significantly faster when not instrumented. The cost ofNOT_TAKEN is very low as it normally skipped and is eliminated entirely in the jit.
  • It would be quite a lot slower when instrumented thanks to the table lookups.

@nedbat
Copy link
Member

nedbat commentedJan 3, 2025
edited
Loading

Is this intended?

continue.py:

foriinrange(10):a=icontinue# 3a=99asserta==9# 5

Disassembled with this branch:

% python3 -m dis -O continue.py  0          0       RESUME                   0  1          2       LOAD_NAME                0 (range)             4       PUSH_NULL             6       LOAD_SMALL_INT          10             8       CALL                     1            16       GET_ITER      L1:   18       FOR_ITER                 5 (to L2)            22       STORE_NAME               1 (i)  2         24       LOAD_NAME                1 (i)            26       STORE_NAME               2 (a)  3         28       JUMP_BACKWARD            7 (to L1)  1   L2:   32       END_FOR            34       POP_ITER  5         36       LOAD_NAME                2 (a)            38       LOAD_SMALL_INT           9            40       COMPARE_OP              88 (bool(==))            44       POP_JUMP_IF_TRUE         3 (to L3)            48       NOT_TAKEN            50       LOAD_COMMON_CONSTANT     0 (AssertionError)            52       RAISE_VARARGS            1      L3:   54       LOAD_CONST               0 (None)            56       RETURN_VALUE

sys.monitoring events usingrun_sysmon.py:

3.14.0a3+ (heads/pr/128445:511733c91b3, Jan  3 2025, 10:10:48) [Clang 16.0.0 (clang-1600.0.26.6)]PY_START: continue.py@0 #0LINE: continue.py #1BRANCH_LEFT: continue.py@18->22 #1->1LINE: continue.py #2LINE: continue.py #3JUMP: continue.py@28->18 #3->1LINE: continue.py #1BRANCH_RIGHT: continue.py@34->36 #1->5                  ******LINE: continue.py #5BRANCH_RIGHT: continue.py@44->54 #5->5PY_RETURN: continue.py@56 #5

Is it right that the starred line has a BRANCH_RIGHT event from a POP_ITER instruction?

@markshannon
Copy link
MemberAuthor

Is it right that the starred line has a BRANCH_RIGHT event from a POP_ITER instruction?

No. It should be from theFOR_ITER. The problem was that thePOP_ITER was being instrumented withINSTRUMENT_LINE which lost the previous instruction offset. I think I have that fixed now.

@nedbat
Copy link
Member

Two more examples of branch event surprises.run_sysmon.py is used to see the sys.monitoring events. These are using commitcbe6c7c from PR#128445.

ifdebug.py
forvaluein [True,False]:ifvalue:if__debug__:x=4else:x=6

Disassembled:

% python3 -m dis -O ifdebug.py  0          0       RESUME                   0  1          2       LOAD_CONST               0 ((True, False))             4       GET_ITER      L1:    6       FOR_ITER                18 (to L3)            10       STORE_NAME               0 (value)  2         12       LOAD_NAME                0 (value)            14       TO_BOOL            22       POP_JUMP_IF_FALSE        6 (to L2)            26       NOT_TAKEN  3         28       NOT_TAKEN  4         30       LOAD_SMALL_INT           4            32       STORE_NAME               1 (x)            34       JUMP_BACKWARD           16 (to L1)  6   L2:   38       LOAD_SMALL_INT           6            40       STORE_NAME               1 (x)            42       JUMP_BACKWARD           20 (to L1)  1   L3:   46       END_FOR            48       POP_ITER            50       LOAD_CONST               1 (None)            52       RETURN_VALUE

Events:

3.14.0a3+ (heads/pr/128445:cbe6c7c085c, Jan  4 2025, 05:39:38) [Clang 16.0.0 (clang-1600.0.26.6)]PY_START: ifdebug.py@0 #0LINE: ifdebug.py #1BRANCH_LEFT: ifdebug.py@6->10 #1->1LINE: ifdebug.py #2BRANCH_LEFT: ifdebug.py@22->28 #2->3LINE: ifdebug.py #3BRANCH_LEFT: ifdebug.py@28->30 #3->4        ****LINE: ifdebug.py #4JUMP: ifdebug.py@34->6 #4->1LINE: ifdebug.py #1BRANCH_RIGHT: ifdebug.py@22->38 #2->6LINE: ifdebug.py #6JUMP: ifdebug.py@42->6 #6->1BRANCH_RIGHT: ifdebug.py@6->50 #1->1PY_RETURN: ifdebug.py@52 #1

The starred line is a BRANCH event from a NOT_TAKEN instruction.

1176_async.py
importasyncioasyncdefasync_gen():yield4asyncdefasync_test():asyncforiinasync_gen():print(i+8)asyncio.run(async_test())

Disassembled:

% python3 -m dis -O 1176_async.py  0          0       RESUME                   0  1          2       LOAD_SMALL_INT           0             4       LOAD_CONST               0 (None)             6       IMPORT_NAME              0 (asyncio)             8       STORE_NAME               0 (asyncio)  3         10       LOAD_CONST               1 (<code object async_gen at 0x103781c50, file "1176_async.py", line 3>)            12       MAKE_FUNCTION            14       STORE_NAME               1 (async_gen)  6         16       LOAD_CONST               2 (<code object async_test at 0x1037c5b80, file "1176_async.py", line 6>)            18       MAKE_FUNCTION            20       STORE_NAME               2 (async_test) 10         22       LOAD_NAME                0 (asyncio)            24       LOAD_ATTR                6 (run)            44       PUSH_NULL            46       LOAD_NAME                2 (async_test)            48       PUSH_NULL            50       CALL                     0            58       CALL                     1            66       POP_TOP            68       LOAD_CONST               0 (None)            70       RETURN_VALUEDisassembly of <code object async_gen at 0x103781c50, file "1176_async.py", line 3>:   3          0       RETURN_GENERATOR              2       POP_TOP       L1:    4       RESUME                   0   4          6       LOAD_SMALL_INT           4              8       CALL_INTRINSIC_1         4 (INTRINSIC_ASYNC_GEN_WRAP)             10       YIELD_VALUE              0             12       RESUME                   5             14       POP_TOP             16       LOAD_CONST               0 (None)             18       RETURN_VALUE  --   L2:   20       CALL_INTRINSIC_1         3 (INTRINSIC_STOPITERATION_ERROR)             22       RERAISE                  1ExceptionTable:  L1 to L2 -> L2 [0] lastiDisassembly of <code object async_test at 0x1037c5b80, file "1176_async.py", line 6>:   6           0       RETURN_GENERATOR               2       POP_TOP        L1:    4       RESUME                   0   7           6       LOAD_GLOBAL              1 (async_gen + NULL)              16       CALL                     0              24       GET_AITER        L2:   26       GET_ANEXT              28       LOAD_CONST               0 (None)        L3:   30       SEND                     3 (to L6)        L4:   34       YIELD_VALUE              1        L5:   36       RESUME                   3              38       JUMP_BACKWARD_NO_INTERRUPT 5 (to L3)        L6:   40       END_SEND        L7:   42       STORE_FAST               0 (i)   8          44       LOAD_GLOBAL              3 (print + NULL)              54       LOAD_FAST                0 (i)              56       LOAD_SMALL_INT           8              58       BINARY_OP                0 (+)              62       CALL                     1              70       POP_TOP              72       JUMP_BACKWARD           25 (to L2)   7    L8:   76       CLEANUP_THROW        L9:   78       JUMP_BACKWARD_NO_INTERRUPT 20 (to L6)       L10:   80       END_ASYNC_FOR              82       LOAD_CONST               0 (None)              84       RETURN_VALUE  --   L11:   86       CALL_INTRINSIC_1         3 (INTRINSIC_STOPITERATION_ERROR)              88       RERAISE                  1ExceptionTable:  L1 to L2 -> L11 [0] lasti  L2 to L4 -> L10 [1]  L4 to L5 -> L8 [3]  L5 to L7 -> L10 [1]  L7 to L8 -> L11 [0] lasti  L8 to L9 -> L10 [1]  L10 to L11 -> L11 [0] lasti

Events:

3.14.0a3+ (heads/pr/128445:cbe6c7c085c, Jan  4 2025, 05:39:38) [Clang 16.0.0 (clang-1600.0.26.6)]PY_START: 1176_async.py@0 #0LINE: 1176_async.py #1LINE: 1176_async.py #3LINE: 1176_async.py #6LINE: 1176_async.py #10PY_START: 1176_async.py@4 #6LINE: 1176_async.py #7PY_START: 1176_async.py@4 #3LINE: 1176_async.py #4LINE: 1176_async.py #812JUMP: 1176_async.py@72->26 #8->7LINE: 1176_async.py #7PY_RESUME: 1176_async.py@12 #4PY_RETURN: 1176_async.py@18 #4PY_RETURN: 1176_async.py@84 #7PY_RETURN: 1176_async.py@70 #10

Here there are no BRANCH events at all. I would expect some for theasync for line, analogous to how synchronousfor is evented.

@markshannon
Copy link
MemberAuthor

These examples are really helpful, but I don't think they are relevant to this PR.
Could you add these examples to theissue, otherwise they get lost.

Theifdebug.py example is a case of theNOT_TAKEN not being removed when the correspondingJUMP is removed. The1176_async.py is a result of the weird way thatasync for is currently implemented. I've added tasks for both.

* the FOR_ITER as the previous instruction.
* This has the benign side effect that if value is
* finalized it will see the location as the FOR_ITER's.
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Can't we just teachPOP_ITER to look atprev_instr-1?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

prev_instr is the previously executed instruction, which is probably not the previous instruction in the code.
There is no way forPOP_ITER to tell where the jump came from if we overwriteframe->instr_ptr

iritkatriel reacted with thumbs up emoji
@markshannonmarkshannon merged commitf826bec intopython:mainJan 6, 2025
67 checks passed
srinivasreddy pushed a commit to srinivasreddy/cpython that referenced this pull requestJan 8, 2025
@markshannonmarkshannon deleted the better-instrument-for-iter branchJanuary 10, 2025 16:22
Yhg1s added a commit to Yhg1s/cpython that referenced this pull requestJan 21, 2025
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@iritkatrieliritkatrieliritkatriel approved these changes

@ericsnowcurrentlyericsnowcurrentlyAwaiting requested review from ericsnowcurrentlyericsnowcurrently is a code owner

Assignees
No one assigned
Labels
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

3 participants
@markshannon@iritkatriel@nedbat

[8]ページ先頭

©2009-2025 Movatter.jp