Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-91048: Add support for reconstructing async call stacks#103976

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Closed
mpage wants to merge8 commits intopython:mainfrommpage:port-awaiters

Conversation

mpage
Copy link
Contributor

@mpagempage commentedApr 28, 2023
edited by bedevere-bot
Loading

When profiling an async Python application it's useful to see both the
stack for the currently executing task as well as the chain of coroutines
that are transitively awaiting the task. Consider the following example,
where T represents a task, C represents a coroutine, and A '->' B
indicates A is awaiting B.

T0    +---> T1|     |     |C0    |     C2|     |     |v     |     vC1    |     C3|     |+-----|

The async stack from C3 would be C3, C2, C1, C0. In contrast, the
synchronous call stack while C3 is executing is only C3, C2. It's
possible to reconstruct this view in most cases using what is
currently available in CPython, however it's difficult to do so
efficiently, and would be very challenging to do so, let alone
efficiently, in an out of process profiler that leverages eBPF.

This introduces a new field onto coroutines and async generators
that makes it easy to efficiently reconstruct the async call stack.
The field stores an owned reference, set by the interpreter, to
the coroutine or async generator that is awaiting the field's owner.
To reconstruct the chain of coroutines/async generators one only
needs to walk the new field backwards.

Intermediate awaitables (e.g.Task,_GatheringFuture) complicate
maintaining a complete chain of awaiters. A special method,__set_awaiter__
is introduced to simplify the process. Types can provide an implementation
of this method to forward the awaiter on child objects.

itamaro, cpvandehey, Glinte, and smheidrich reacted with heart emojicarljm, Glinte, and smheidrich reacted with rocket emoji
When profiling an async Python application it's useful to see both thestack for the currently executing task as well as the chain of coroutinesthat are transitively awaiting the task. Consider the following example,where T represents a task, C represents a coroutine, and A '->' Bindicates A is awaiting B.    T0    +---> T1    |     |     |    C0    |     C2    |     |     |    v     |     v    C1    |     C3    |     |    +-----|The async stack from C3 would be C3, C2, C1, C0. In contrast, thesynchronous call stack while C3 is executing is only C3, C2. It'spossible to reconstruct this view in most cases using what iscurrently available in CPython, however it's difficult to do soefficiently, and would be very challenging to do so, let aloneefficiently, in an out of process profiler that leverages eBPF.This introduces a new field onto coroutines and async generatorsthat makes it easy to efficiently reconstruct the async call stack.The field stores an owned reference, set by the interpreter, tothe coroutine or async generator that is awaiting the field's owner.To reconstruct the chain of coroutines/async generators one onlyneeds to walk the new field backwards.Intermediate awaitables (e.g. `Task`, `_GatheringFuture`) complicatemaintaining a complete chain of awaiters. A special method, `__set_awaiter__`is introduced to simplify the process. Types can provide an implementationof this method to forward the awaiter on child objects.
@mpagempage marked this pull request as ready for reviewApril 28, 2023 20:33
Copy link
Member

@carljmcarljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Nice! Do you have pyperformance benchmark results?

@mpage
Copy link
ContributorAuthor

Nice! Do you have pyperformance benchmark results?

Nope, will run them.

carljm reacted with thumbs up emoji

@arhadthedev
Copy link
Member

This branch has conflicts that must be resolved
[...]
Python/generated_cases.c.h

I tried to regeneratePython/generated_cases.c.h by porting the make rule:

cpython/Makefile.pre.in

Lines 1490 to 1503 ina679c3d

regen-cases:
# Regenerate Python/generated_cases.c.h
# and Python/opcode_metadata.h
# from Python/bytecodes.c
# using Tools/cases_generator/generate_cases.py
PYTHONPATH=$(srcdir)/Tools/cases_generator \
$(PYTHON_FOR_REGEN) \
$(srcdir)/Tools/cases_generator/generate_cases.py \
--emit-line-directives \
-o $(srcdir)/Python/generated_cases.c.h.new \
-m $(srcdir)/Python/opcode_metadata.h.new \
$(srcdir)/Python/bytecodes.c
$(UPDATE_FILE) $(srcdir)/Python/generated_cases.c.h $(srcdir)/Python/generated_cases.c.h.new
$(UPDATE_FILE) $(srcdir)/Python/opcode_metadata.h $(srcdir)/Python/opcode_metadata.h.new

to the Windows cmd. However, I get an exception:

D:\Oleg\cpython>set PYTHONPATH=D:\Oleg\cpythonD:\Oleg\cpython>python Tools/cases_generator/generate_cases.py --emit-line-directives Python/generated_cases.c.h.new Python/opcode_metadata.h.newRunning Release|x64 interpreter...Traceback (most recent call last):  File "D:\Oleg\cpython\Tools\cases_generator\generate_cases.py", line 1259, in <module>    main()  File "D:\Oleg\cpython\Tools\cases_generator\generate_cases.py", line 1250, in main    a.parse()  # Raises SyntaxError on failure    ^^^^^^^^^  File "D:\Oleg\cpython\Tools\cases_generator\generate_cases.py", line 557, in parse    self.parse_file(filename, instrs_idx)  File "D:\Oleg\cpython\Tools\cases_generator\generate_cases.py", line 568, in parse_file    with open(filename) as file:         ^^^^^^^^^^^^^^FileNotFoundError: [Errno 2] No such file or directory: 'Python/generated_cases.c.h.new'D:\Oleg\cpython>python Tools/cases_generator/generate_cases.py --emit-line-directives Python/generated_cases.c.h Python/opcode_metadata.hRunning Release|x64 interpreter...Traceback (most recent call last):  File "D:\Oleg\cpython\Tools\cases_generator\generate_cases.py", line 1259, in <module>    main()  File "D:\Oleg\cpython\Tools\cases_generator\generate_cases.py", line 1250, in main    a.parse()  # Raises SyntaxError on failure    ^^^^^^^^^  File "D:\Oleg\cpython\Tools\cases_generator\generate_cases.py", line 557, in parse    self.parse_file(filename, instrs_idx)  File "D:\Oleg\cpython\Tools\cases_generator\generate_cases.py", line 583, in parse_file    raise psr.make_syntax_error(  File "Python/generated_cases.c.h", line 4658    }    ^SyntaxError: Couldn't find '// BEGIN BYTECODES //' in Python/generated_cases.c.hD:\Oleg\cpython>pythonRunning Release|x64 interpreter...Python 3.12.0a5+ (tags/v3.12.0a5-186-gf300a1fa4c:f300a1fa4c, Apr 30 2023, 22:46:31) [MSC v.1929 64 bit (AMD64)] on win32

How should I call it correctly?

@carljm
Copy link
Member

@arhadthedev it would be better to make a new separate issue for difficulties regenerating cases on Windows, rather than conflating it into this PR, which isn't really related.

arhadthedev reacted with thumbs up emoji

@gvanrossum
Copy link
Member

FYI you should be able to run the script without any parameters and it will use the right files.

Also I thought it is run automatically by the Windows build.bat script.

@mpage
Copy link
ContributorAuthor

mpage commentedMay 1, 2023
edited
Loading

PyPerformance results:https://gist.github.com/mpage/b8f712396755c3fee277f84352058608

Looks neutral except for unpack sequence (1.23x faster) and unpickle (1.30x slower). I'm surprised that this change would affect either of these benchmarks. Are they considered to be reliable?

@carljm
Copy link
Member

carljm commentedMay 1, 2023
edited
Loading

Just chatted with@mpage and there are a few things we could do to streamline this more:

  1. TheFRAME_OWNED_BY_GENERATOR pre-check shouldn't be necessary for correctness, we can probably just eliminate it?
  2. Since we are doing a runtime check here based on characteristics of the code object (is it anasync def) it would make more sense to push that into the compiler. We could extract this into a dedicatedSET_AWAITER opcode that the compiler emits only in async defs; then no runtime check is needed.
  3. Alternatively if we don't want it in a separate opcode, we could at least specialize away the check by having two versions ofSEND_GEN, one where the caller is known to be an async def and one where it isn't.
  4. We could decide to have this feature enabled only by a runtime flag (since it's for improved debugging/profiling). In that case we probably wouldn't want it handled in the compiler, since then we'd need different sets of pycs for flag on/off.

My inclination is for#2 (do it in the compiler, new opcode.) Would welcome comments from other reviewers (e.g.@markshannon,@gvanrossum,@willingc).

@markshannon
Copy link
Member

This adds considerable overhead to the fast path of generators, generator expressions and creating coroutines.
It would be more efficient to link the tasks.
So instead of

T0    +---> T1|     |     |C0    |     C2|     |     |v     |     vC1    |     C3|     |+-----|

We would have

T0  ---> T1|        |C0       C2|        |v        vC1       C3

The original could then be reconstructed as needed.

@njsmith
Copy link
Contributor

I guess this is a similar question to Mark's... this seems like something that you could do much more simply directly in asyncio, without touching the core interpreter? When a task doesawait some_task, then it invokesasyncio.Task.__await__, which is arbitrary code that asyncio controls -- wouldn't it be simpler for that code to make a note about the relationship between the tasks? In some cases it might even let you chain together relationships/"causality" through regularFutures, not just tasks.

Also, I might be missing something, but I think the code here gives the wrong answer in cases where a task has multiple waiters, which is allowed by asyncio semantics:

async def wait_on(task):    await taskasync def main():    sleep_task = asyncio.ensure_future(asyncio.sleep(10))    task1 = asyncio.ensure_future(wait_on(sleep_task))    task2 = asyncio.ensure_future(wait_on(sleep_task))

I think in this case the link from task1 -> sleep_task would get overwritten and lost?

@mpage
Copy link
ContributorAuthor

This adds considerable overhead to the fast path of generators, generator expressions and creating coroutines. It would be more efficient to link the tasks. So instead of

The original could then be reconstructed as needed.

It's worth mentioning that the choice to maintain the async stacks on coroutines was to make it as simple as possible to reconstruct the stack from an eBPF probe. I think that linking tasks should be doable, it'll mean that we have to recreate the logic of_PyGen_yf in a probe, but that isn't terribly complicated.

Where were you thinking that we would update the links between tasks?

@mpage
Copy link
ContributorAuthor

I guess this is a similar question to Mark's... this seems like something that you could do much more simply directly in asyncio, without touching the core interpreter? When a task doesawait some_task, then it invokesasyncio.Task.__await__, which is arbitrary code that asyncio controls -- wouldn't it be simpler for that code to make a note about the relationship between the tasks? In some cases it might even let you chain together relationships/"causality" through regularFutures, not just tasks.

It probably doesn't matter in practice, but since__await__ is an ordinary method there's nothing that guarantees that whatever calls__await__ will await the awaitable that is returned (wow that was a mouthful!).

Also, I might be missing something, but I think the code here gives the wrong answer in cases where a task has multiple waiters, which is allowed by asyncio semantics:

async def wait_on(task):    await taskasync def main():    sleep_task = asyncio.ensure_future(asyncio.sleep(10))    task1 = asyncio.ensure_future(wait_on(sleep_task))    task2 = asyncio.ensure_future(wait_on(sleep_task))

I think in this case the link from task1 -> sleep_task would get overwritten and lost?

Yep, this is a limitation of the implementation, the last awaiter wins. This has worked out OK for us so far.

@gvanrossum
Copy link
Member

I'm not going to be able to give this enough attention any time soon. Would it be okay to consider this for 3.13 instead of 3.12? 3.12 feature freeze (beta 1) was supposed to be today, and while the RM has postponed that by two weeks, he also gave guidance that this is not meant as an encouragement for new features (the SC ruled on a few pending PEPs, allowing some, postponing others to 3.13).

carljm reacted with thumbs up emoji

@carljm
Copy link
Member

I was already assuming this was under discussion for 3.13, not a candidate for 3.12.

gvanrossum reacted with thumbs up emoji

@jbower-fb
Copy link
Contributor

I'm having another stab at this taking into account what was discussed here. For more details see my comment here:#91048 (comment).

@ambv
Copy link
Contributor

Thanks for this, we went withgh-124640 instead.

@ambvambv closed thisJan 22, 2025
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@carljmcarljmcarljm left review comments

@markshannonmarkshannonAwaiting requested review from markshannonmarkshannon is a code owner

@gvanrossumgvanrossumAwaiting requested review from gvanrossum

@willingcwillingcAwaiting requested review from willingcwillingc is a code owner

Assignees
No one assigned
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

9 participants
@mpage@arhadthedev@carljm@gvanrossum@markshannon@njsmith@jbower-fb@ambv@bedevere-bot

[8]ページ先頭

©2009-2025 Movatter.jp