Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

GH-122548: Implement branch taken and not taken events for sys.monitoring#122564

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged

Conversation

@markshannon
Copy link
Member

@markshannonmarkshannon commentedAug 1, 2024
edited by bedevere-appbot
Loading

This is a draft PR.

Before merging, it needs:

  • Feedback on whether this fixes the problem
  • Docs
  • Benchmarking results

@nedbat
@jaltmayerpizzorno

@nedbat
Copy link
Member

Thanks! I will try this out in the next few days.

@nedbat
Copy link
Member

Playing with it on some simple code, the basics look really good here. I'll work on adapting coverage.py to get some timings.

@markshannon
Copy link
MemberAuthor

I've addco_branches() iterator to this PR, which provides an iterator ofsrc, not_taken, taken triples.
To do this I've changed theBRANCH_NOT_TAKEN event source offset so that it matches theBRANCH_TAKEN source offset.
That way you can track offsets during recording and only map back to locations during reporting, which should make things a bit more efficient.

@jaltmayerpizzorno
Copy link

jaltmayerpizzorno commentedAug 14, 2024
edited
Loading

Isco_branches() supposed to change contents as the function is executed? I am not sure I mind if it changes, but it's surprising to me -- I thought it'd be a static static list of the branches present. I'm seeing it change for this:

deffoo():foriinrange(3):print(i)
>>> list(foo.__code__.co_branches())[(24, 30, 58)]>>> foo()012>>> list(foo.__code__.co_branches())[]

This is without requesting anysys.monitoring.

Edit: mmm, I don't seeco_branches() change in other cases, this may be a bug.

@jaltmayerpizzorno
Copy link

jaltmayerpizzorno commentedAug 14, 2024
edited
Loading

I'm missing some branch events for this code:

deffoo(n):whilen<3:print(n)n+=1returnNoneif__name__=="__main__":foo(0)

Here's what happens:

  1          0       RESUME                   0  2          2       LOAD_FAST                0 (n)             4       LOAD_CONST               1 (3)             6       COMPARE_OP              18 (bool(<))            10       POP_JUMP_IF_FALSE       26 (to L3)            14       NOT_TAKEN  3   L1:   16       LOAD_GLOBAL              1 (print + NULL)            26       LOAD_FAST                0 (n)            28       CALL                     1            36       POP_TOP  4         38       LOAD_FAST                0 (n)            40       LOAD_CONST               2 (1)            42       BINARY_OP               13 (+=)            46       STORE_FAST               0 (n)  2         48       LOAD_FAST                0 (n)            50       LOAD_CONST               1 (3)            52       COMPARE_OP              18 (bool(<))            56       POP_JUMP_IF_FALSE        2 (to L2)            60       JUMP_BACKWARD           24 (to L1)      L2:   64       NOT_TAKEN  6   L3:   66       RETURN_CONST             0 (None)line ex3.py:1line ex3.py:8brch ex3.py 8:3-8:25 "__name__ == "__main__"" (@16) -> 9:4-9:7 "foo" (@22)line ex3.py:9line ex3.py:2brch ex3.py 2:10-2:13 "n<3" (foo@10) -> 3:8-3:13 "print" (foo@16)line ex3.py:30line ex3.py:4line ex3.py:2line ex3.py:31line ex3.py:4line ex3.py:2line ex3.py:32line ex3.py:4line ex3.py:2brch ex3.py 2:10-2:13 "n<3" (foo@56) -> 2:10-2:13 "n<3" (foo@64)brch ex3.py 2:10-2:13 "n<3" (foo@60) -> 6:11-6:15 "None" (foo@66)line ex3.py:6

I was expecting branch events for iterations 1 and 2. This is without ever returningDISABLE (and with the same handler for bothBRANCH_TAKEN andBRANCH_NOT_TAKEN).

@markshannon
Copy link
MemberAuthor

markshannon commentedAug 15, 2024
edited
Loading

Is co_branches() supposed to change contents as the function is executed?

No it should be immutable, likeco_lines.

... this may be a bug.

It is a bug. I'll take a look.


Fixed now

@markshannon
Copy link
MemberAuthor

I'm missing some branch events for this code:

def foo(n):    while n<3:    ...

This is mostly likely because I haven't merged in the fix in#122934 yet.

@markshannon
Copy link
MemberAuthor

Should all be fixed now.

jaltmayerpizzorno reacted with thumbs up emoji

@nedbat
Copy link
Member

It is great to see progress on this, thanks for pushing it forward. I have some questions about the behavior ofwith statements.

This is my program for seeing the sys.monitoring events:

run_sysmon.py
# Licensed under the Apache License: http://www.apache.org/licenses/LICENSE-2.0# For details: https://github.com/nedbat/coveragepy/blob/master/NOTICE.txt"""Run sys.monitoring on a file of Python code."""importfunctoolsimportsysprint(sys.version)the_program=sys.argv[1]code=compile(open(the_program).read(),filename=the_program,mode="exec")my_id=sys.monitoring.COVERAGE_IDsys.monitoring.use_tool_id(my_id,"run_sysmon.py")register=functools.partial(sys.monitoring.register_callback,my_id)events=sys.monitoring.eventsdefbytes_to_lines(code):"""Make a dict mapping byte code offsets to line numbers."""b2l= {}cur_line=0forbstart,bend,linenoincode.co_lines():forboffsetinrange(bstart,bend,2):b2l[boffset]=linenoreturnb2ldefshow_off(label,code,instruction_offset):ifcode.co_filename==the_program:b2l=bytes_to_lines(code)print(f"{label}:{code.co_filename}@{instruction_offset} #{b2l[instruction_offset]}")defshow_line(label,code,line_number):ifcode.co_filename==the_program:print(f"{label}:{code.co_filename} #{line_number}")defshow_off_off(label,code,instruction_offset,destination_offset):ifcode.co_filename==the_program:b2l=bytes_to_lines(code)print(f"{label}:{code.co_filename}@{instruction_offset}->{destination_offset} "+f"#{b2l[instruction_offset]}->{b2l[destination_offset]}"        )defsysmon_py_start(code,instruction_offset):show_off("PY_START",code,instruction_offset)sys.monitoring.set_local_events(my_id,code,events.PY_RETURN|events.PY_RESUME|events.LINE|events.BRANCH_TAKEN|events.BRANCH_NOT_TAKEN|events.JUMP,    )defsysmon_py_resume(code,instruction_offset):show_off("PY_RESUME",code,instruction_offset)returnsys.monitoring.DISABLEdefsysmon_py_return(code,instruction_offset,retval):show_off("PY_RETURN",code,instruction_offset)returnsys.monitoring.DISABLEdefsysmon_line(code,line_number):show_line("LINE",code,line_number)returnsys.monitoring.DISABLEdefsysmon_branch(code,instruction_offset,destination_offset):show_off_off("BRANCH",code,instruction_offset,destination_offset)returnsys.monitoring.DISABLEdefsysmon_branch_taken(code,instruction_offset,destination_offset):show_off_off("BRANCH_TAKEN",code,instruction_offset,destination_offset)returnsys.monitoring.DISABLEdefsysmon_branch_not_taken(code,instruction_offset,destination_offset):show_off_off("BRANCH_NOT_TAKEN",code,instruction_offset,destination_offset)returnsys.monitoring.DISABLEdefsysmon_jump(code,instruction_offset,destination_offset):show_off_off("JUMP",code,instruction_offset,destination_offset)returnsys.monitoring.DISABLEsys.monitoring.set_events(my_id,events.PY_START|events.PY_UNWIND,)register(events.PY_START,sysmon_py_start)register(events.PY_RESUME,sysmon_py_resume)register(events.PY_RETURN,sysmon_py_return)# register(events.PY_UNWIND, sysmon_py_unwind_arcs)register(events.LINE,sysmon_line)#register(events.BRANCH, sysmon_branch)register(events.BRANCH_TAKEN,sysmon_branch_taken)register(events.BRANCH_NOT_TAKEN,sysmon_branch_not_taken)register(events.JUMP,sysmon_jump)exec(code)

My test program (withbranches.py):

fromcontextlibimportnullcontextdefok():withnullcontext():return"good"defbad():withnullcontext():1/0print(ok())try:print(bad())except:print("whoops")print("done")

I get this output:

% python3.14 run_sysmon.py withbranches.py3.14.0a0 (heads/pr/122564:538da381d47, Aug 15 2024, 08:09:47) [Clang 15.0.0 (clang-1500.3.9.4)]PY_START: withbranches.py@0 #0LINE: withbranches.py #1LINE: withbranches.py #3LINE: withbranches.py #7LINE: withbranches.py #11PY_START: withbranches.py@0 #3LINE: withbranches.py #4LINE: withbranches.py #5LINE: withbranches.py #4PY_RETURN: withbranches.py@58 #4goodLINE: withbranches.py #13LINE: withbranches.py #14PY_START: withbranches.py@0 #7LINE: withbranches.py #8LINE: withbranches.py #9LINE: withbranches.py #8BRANCH_NOT_TAKEN: withbranches.py@80->86 #8->8LINE: withbranches.py #15LINE: withbranches.py #16whoopsLINE: withbranches.py #17donePY_RETURN: withbranches.py@96 #17

I have two questions:

  1. When an exception happens in thewith, there's a BRANCH_NOT_TAKEN event on thewith line, but when there is no exception, there's noBRANCH_TAKEN event. A with statement isn't where I would expect a branch event at all. Is this correct?
  2. Although I returnsys.monitoring.DISABLE for every event, I get two LINE events for thewith lines (4 and 8).

@markshannon
Copy link
MemberAuthor

markshannon commentedAug 15, 2024
edited
Loading

With statements are complicated thingshttps://peps.python.org/pep-0343/#specification-the-with-statement.

withmngr:# line 1BODY# line 2

is equivalent to:

__enter = mngr.__enter__,   # line 1__exit = mngr.__exit__  # line 1__enter()  # line 1try:    BODY  # line 2except:    __flag = __exit(*sys.exc_info())     # line 1    if not __flag:   #  <--   this is the branch        raiseelse:    __exit(None, None, None)     # line 1

It looks like we decided (well, I decided and no one disagreed 🙂) that the call to__exit__ when leaving thewith statement should have the location of thewith.#88099.

@nedbat
Copy link
Member

I see, thanks. I mistakenly thought the branch was "exception happened or it didn't," but it's "exception happened and was handled by the context manager or it wasn't." I don't think that's a branch that a coverage.py user would want exposed to them, but I could be swayed.

The re-visiting of thewith line during exit I'm familiar with, was just surprised that a disabled line event was repeated. It's OK with me to leave it that way since it will only be two events for the line, no matter how many times thewith is entered. We should update the sys.monitoring docs to indicate that it could happen.

@jaltmayerpizzorno
Copy link

#123044

@jaltmayerpizzorno
Copy link

#123048

@jaltmayerpizzorno
Copy link

#123050

@markshannon
Copy link
MemberAuthor

@nedbatwith statements aren't the only case where two line events will be generated for the same line:

deffoo(a,b):return (a+b)

Execution order:a thenb then+ resulting in two line events for the linereturn (a +

DISABLE will disable each event separately, not all events for the same line.

@markshannon
Copy link
MemberAuthor

markshannon commentedAug 16, 2024
edited
Loading

I've updated this branch to useBRANCH_LEFT andBRANCH_RIGHT instead ofBRANCH_NOT_TAKEN andBRANCH_TAKEN.

I've left theBRANCH_NOT_TAKEN andBRANCH_TAKEN names insys.monitoring.events as pseudonyms, to not disrupt any experiments or tests.

jaltmayerpizzorno reacted with thumbs up emoji

@jaltmayerpizzorno
Copy link

#123076

@nedbat
Copy link
Member

I'm trying to use the new events to track what branches have happened. I'm having a problem because the events report
on bytecode offsets, but I need line numbers. Mapping from the offsets to the line numbers isn't giving me the data I need.

This is forloop.py, where I'm focused on the if statement on line 3:

defforloop(seq):forninseq:ifn>5:# line 3breakprint('Done')forloop([1,9])

Disassembled it looks like this:

1:0-1:0             0       RESUME                   02:13-2:16           2       LOAD_FAST                0 (seq)2:13-2:16           4       GET_ITER2:13-2:16    L1:    6       FOR_ITER                14 (to L3)2:13-2:16          10       NOT_TAKEN2:8-2:9            12       STORE_FAST               1 (n)3:11-3:12          14       LOAD_FAST                1 (n)3:15-3:16          16       LOAD_CONST               1 (5)3:11-3:16          18       COMPARE_OP             148 (bool(>))3:11-3:16          22       POP_JUMP_IF_TRUE         3 (to L2)3:11-3:16          26       NOT_TAKEN3:11-3:16          28       JUMP_BACKWARD           13 (to L1)  --         L2:   32       NOP4:12-4:17          34       POP_TOP4:12-4:17          36       JUMP_FORWARD             2 (to L4)2:13-2:16    L3:   38       END_FOR2:13-2:16          40       POP_TOP5:4-5:9      L4:   42       LOAD_GLOBAL              1 (print + NULL)5:10-5:16          52       LOAD_CONST               2 ('Done')5:4-5:17           54       CALL                     15:4-5:17           62       POP_TOP5:4-5:17           64       RETURN_CONST             0 (None)

When I run it with sys.monitoring, I get these events:

3.14.0a0 (heads/pr/122564:8fc38dd8506, Aug 25 2024, 07:11:06) [Clang 15.0.0 (clang-1500.3.9.4)]PY_START: forloop.py@0 #0LINE: forloop.py #1LINE: forloop.py #7PY_START: forloop.py@0 #1LINE: forloop.py #2BRANCH_NOT_TAKEN: forloop.py@6->12 #2->2LINE: forloop.py #3BRANCH_NOT_TAKEN: forloop.py@22->28 #3->3JUMP: forloop.py@28->6 #3->2LINE: forloop.py #2BRANCH_TAKEN: forloop.py@22->32 #3->NoneLINE: forloop.py #4JUMP: forloop.py@36->42 #4->5LINE: forloop.py #5DonePY_RETURN: forloop.py@64 #5PY_RETURN: forloop.py@28 #7

The@XX notations are bytecode offsets, the #xx notations are line numbers obtained by mapping the offsets through co_lines().

There are two branch events from line 3. The first is BRANCH_NOT_TAKEN 22->28, but that maps to line 3->3. Bytecode 28 is a jump backward, but how can I properly interpret this as a branch not taken from line 3 to line 2?

The second event is BRANCH_TAKEN 22->32, which maps to line 3->None. What should I do with a NOP with no line number?

@nedbat
Copy link
Member

Sorry, I guess my example is duplicating#123050?

@nedbat
Copy link
Member

@markshannon@iritkatriel Is there some way we can work together to push this forward?

@jaltmayerpizzorno
Copy link

jaltmayerpizzorno commentedSep 16, 2024
edited
Loading

Sorry, I guess my example is duplicating#123050?

Yes, I think so; with the current code in this pull, the example from#123050 also yields aNOP without a line number.

@nedbat
Copy link
Member

@markshannon I'm pleased to see more activity here. What's the best way to talk about how it's behaving? I tried a simple program:

for i in range(10):    a = iassert a == 9

Using a build of this PR, I disassembled it as:

  0          0       RESUME                   0  1          2       LOAD_NAME                0 (range)             4       PUSH_NULL             6       LOAD_SMALL_INT          10             8       CALL                     1            16       GET_ITER      L1:   18       FOR_ITER                 6 (to L2)            22       NOT_TAKEN            24       STORE_NAME               1 (i)  2         26       LOAD_NAME                1 (i)            28       STORE_NAME               2 (a)            30       JUMP_BACKWARD            8 (to L1)  1   L2:   34       END_FOR            36       POP_TOP  3         38       LOAD_NAME                2 (a)            40       LOAD_SMALL_INT           9            42       COMPARE_OP              88 (bool(==))            46       POP_JUMP_IF_TRUE         3 (to L3)            50       NOT_TAKEN            52       LOAD_COMMON_CONSTANT     0 (AssertionError)            54       RAISE_VARARGS            1      L3:   56       LOAD_CONST               0 (None)            58       RETURN_VALUE

Looking at the sys.monitoring events, I see this:

3.14.0a2+ (heads/pr/122564:a604a085fb3, Dec 14 2024, 11:47:37) [Clang 16.0.0 (clang-1600.0.26.6)]PY_START: foo.py@0 #0LINE: foo.py #1BRANCH_LEFT: foo.py@18->24 #1->1LINE: foo.py #2JUMP: foo.py@30->18 #2->1LINE: foo.py #1BRANCH_RIGHT: foo.py@18->38 #1->3LINE: foo.py #3BRANCH_RIGHT: foo.py@46->56 #3->3PY_RETURN: foo.py@58 #3

(@ are byte offsets, # are line numbers)

The BRANCH_LEFT is indicating a branch from line 1 to line 1, so we never "see" the branch from 1 to 2. Is that something you can fix, or is there some way I should be adjusting for that?

@markshannon
Copy link
MemberAuthor

There is no branch from line 1 to 2, the "left" branch is from line 1 to line 1.

This code:

1.foriins:2.A3.B

this is roughly equivalent to:

1.tmp1=iter(i);tmp2=next(tmp1);iftmp2isNULLgotoend;s=tmp22.Aend:3.B

The "left" branch is fromif tmp2 is NULL goto end tos = tmp2 which are both on line 1.

ADDOP_COMPARE(c,LOC(e),asdl_seq_GET(e->v.Compare.ops,n));
ADDOP(c,LOC(e),TO_BOOL);
ADDOP_JUMP(c,LOC(e),cond ?POP_JUMP_IF_TRUE :POP_JUMP_IF_FALSE,next);
ADDOP(c,NO_LOCATION,NOT_TAKEN);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

The peephole optimiser replaces a conditional jump with const condition by an unconditional jump and a NOP. We probably need another peepholer rule to remove any NOT_TAKEN this leaves behind.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Alternatively, could we not add NOT_TAKEN in codegen, and have the assembler inject it where it's needed?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I worry this might be not work if the optimizer merges are rearranges branches.
It is possible that the correspondence to the original source could be lost.

@jaltmayerpizzorno
Copy link

Sorry for the delay in responding: I am about to leave on a holiday trip and things tend to get hectic then.

I'd love to have this to be merged in! Being able to disable just each direction of a branch event at a time and getting a list of the possible branches without having to go through the bytecode are great additions and make these events much easier to use.

Where are we with#123050 and other related issues? Not that I think resolving them should block what's been done so far.

@markshannon
Copy link
MemberAuthor

Where are we with#123050 and other related issues?

Let's discuss that on those issues. Being able to disable the branch events is orthogonal to mapping branch locations back to source.

casePOP_JUMP_IF_TRUE:
casePOP_JUMP_IF_NONE:
casePOP_JUMP_IF_NOT_NONE:
caseFOR_ITER:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Might be worth adding a macro for conditional branches next to this one

#defineIS_UNCONDITIONAL_JUMP_OPCODE(opcode) \

@nedbat
Copy link
Member

There is no branch from line 1 to 2, the "left" branch is from line 1 to line 1.

Maybe first we should decide: from the user's point of view, does line 1 have two possible destinations (line 2 and line 3) or does it not? Should line 1 be considered a branch for the purposes of branch coverage?

If we can't map the collected data back to a user-understandable interpretation, then we haven't finished the feature. Can you help me puzzle through this? I'd like to get efficient branch coverage measurement. So far, we seem to be talking about different things (byte codes vs lines) and we aren't meeting in the middle.

@markshannon
Copy link
MemberAuthor

Maybe first we should decide: from the user's point of view, does line 1 have two possible destinations (line 2 and line 3) or does it not? Should line 1 be considered a branch for the purposes of branch coverage?

Can we discuss this on the issue? This PR is about adding the ability to disable the two branches independently.

So far, we seem to be talking about different things (byte codes vs lines) and we aren't meeting in the middle.

Indeed. This PR is about disabling branches and offsets.
The other discussion, about mapping offsets to lines, is just as important. But this PR isn't the place for that discussion.

Issue#122548 or any of the related issues that@jaltmayerpizzorno opened would the place to discuss that.

@markshannonmarkshannon merged commitd2f1d91 intopython:mainDec 19, 2024
57 checks passed
@markshannonmarkshannon deleted the monitoring-branch-taken branchDecember 19, 2024 16:59
Python 3.14a1 3607 (Add pseudo instructions JUMP_IF_TRUE/FALSE)
Python 3.14a1 3608 (Add support for slices)
Python 3.14a2 3609 (Add LOAD_SMALL_INT and LOAD_CONST_IMMORTAL instructions, remove RETURN_CONST)
Python 3.14a3 3610 (Add NOT_TAKEN instruction)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This says 3610, but the implementation says 3611.

This says a3, but a3 is already released.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Thanks for spotting that.

srinivasreddy pushed a commit to srinivasreddy/cpython that referenced this pull requestDec 23, 2024
srinivasreddy pushed a commit to srinivasreddy/cpython that referenced this pull requestJan 8, 2025
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@nedbatnedbatnedbat left review comments

@iritkatrieliritkatrieliritkatriel approved these changes

@ericsnowcurrentlyericsnowcurrentlyAwaiting requested review from ericsnowcurrentlyericsnowcurrently is a code owner

+1 more reviewer

@hroncokhroncokhroncok left review comments

Reviewers whose approvals may not affect merge requirements

Assignees

No one assigned

Labels

None yet

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

5 participants

@markshannon@nedbat@jaltmayerpizzorno@iritkatriel@hroncok

[8]ページ先頭

©2009-2025 Movatter.jp