Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

gh-91887: Store strong references to pending tasks#121264

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Open
alexhartl wants to merge5 commits intopython:main
base:main
Choose a base branch
Loading
fromalexhartl:strong-refs-for-bg-tasks

Conversation

alexhartl
Copy link

@alexhartlalexhartl commentedJul 2, 2024
edited by bedevere-appbot
Loading

This adds a_pending_tasks set toBaseEventLoop. OnTask creation, a (strong) reference to the task is added to this set in_register_task. When a task completes, the respective reference is removed from_pending_tasks in_unregister_task. See the discussion at#91887.

elenakrittik reacted with thumbs up emoji
@ghost
Copy link

ghost commentedJul 2, 2024
edited by ghost
Loading

All commit authors signed the Contributor License Agreement.
CLA signed

Copy link
Member

@ZeroIntensityZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Very nice PR! Just a few nitpicks.

return True

def __schedule_callbacks(self):
"""Internal: Ask the event loop to call all callbacks.
def _finish_execution(self):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

For clarification, what's the reason for this name change? This is still, effectively, scheduling all callbacks -- "finish execution" is a bit more ambiguous to me.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Yes, inFuture, "_schedule_callbacks" is perfectly fine to describe what the function does. When overriding this function inTask, I think_finish_execution is more meaningful to indicate that it will be called when the task completes.

ZeroIntensity reacted with thumbs up emojiAronmasF reacted with heart emoji

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

It might be a good idea to add a test to make sure that this actually fixes#91887, to make sure that someone doesn't accidentally break this in the future.

Likely would be something like:

deftest_strong_task_references(self):called=Falseasyncdefcoro():nonlocalcalledcalled=Trueasyncdefmain():asyncio.create_task(coro())loop=asyncio.new_event_loop()try:loop.run_until_complete(main())finally:loop.close()self.assertTrue(called)

alexhartland others added2 commitsJuly 11, 2024 08:30
Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
Copy link
Member

@ZeroIntensityZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

LGTM

@1st1
Copy link
Member

I can ponder on this during the core sprint (and think how this will play with uvloop).

@freakboy3742
Copy link
Contributor

@alexhartl I'm at the CPython core team sprint, so I've taken the liberty of merging with main so I can discuss this with@1st1 and others. There were some conflicts introduced as a result of#120974.

@alexhartl
Copy link
Author

Thank you for picking this up@freakboy3742 ! Yes, the last time I checked, the asyncio C code appeared to be in the middle of some restructuring. Let me know if I can help with anything.

paravoid added a commit to paravoid/ircstream that referenced this pull requestSep 30, 2024
Store strong references into a global set as well. Hopefully can getremoved one day, aspython/cpython#121264 wasmerged just this week :)
Copy link
Member

@1st11st1 left a comment
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I'm blocking this PR from being merged to ponder on it (feel free to dismiss the block in a week). For one I'm really not sure I like the event loop to have new APIs, I think this is better be solved at asyncio level in an event loop independent way.

cc@ambv@pablogsal

@bedevere-app
Copy link

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phraseI have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@1st1
Copy link
Member

1st1 commentedOct 9, 2024

@alexhartl have you considered just making_scheduled_tasks a regular set instead of usingweakref.WeakSet()? What's the downside of that? IMO it doesn't matter much where a strong reference is stored -- in the event loop or in the asyncio module. Either way a proper cleanup is required, but using a regular set would be a trivial change and everything in the ecosystem would just work.

@gvanrossum
Copy link
Member

@alexhartl have you considered just making_scheduled_tasks a regular set instead of usingweakref.WeakSet()? What's the downside of that? IMO it doesn't matter much where a strong reference is stored -- in the event loop or in the asyncio module. Either way a proper cleanup is required, but using a regular set would be a trivial change and everything in the ecosystem would just work.

@1st1 But that would keep the awkward design where all loops share the global "all tasks" set, which is not how it's ever used. (I believe this design was just a historical accident; I've never found an explanation.)

@1st1
Copy link
Member

1st1 commentedOct 9, 2024

@1st1 But that would keep the awkward design where all loops share the global "all tasks" set, which is not how it's ever used. (I believe this design was just a historical accident; I've never found an explanation.)

Yeah, but what's awkward about it? There's a low level API that manages theall tasks set and other event loops (at least uvloop)already use that API. Adding yet another tracking API for running tasks is harder than switching the tracking API wealready have to just use a regular set. The additional tracking will only introduce additional, albeit minuscule, overhead. At least this is how I see this. I do think it would be quite elegant to make the existing APIsasyncio._register_task andasyncio._unregister_task have some additional functionality.

Adding new API (like what this PR is doing) means that other loops will have to always implement it (or be broken). Which I obviously can do for uvloop, quite easily, but it will grow the API surface which is already huge.

Lastly, a minor point: event loop doesn't have a lot to do with tasks. The loop is mostly concerned with running callbacks. Task is a self-contained primitive that just schedules callbacks to the event loop. So it rubs me the wrong way to introduce tracking to the loop for Tasks, I believe a global threadlocal mapping is a better solution, which is what "all tasks" can be.

@alexhartl
Copy link
Author

alexhartl commentedOct 10, 2024
edited
Loading

I'd prefer to avoid holding strong references in global scope to reduce the potential for memory leaks. I.e. the loop owns these references and we can be sure that everything is cleaned up at latest when the loop is destructed.

Also, the modifications that@kumaraditya303 did lately were a lot about performance improvements. Registering all tasks in a dict, and unregistering in a done callback is quite likely to negate these improvements to some extent, no matter where this dict is stored. Having this set and weakset separate might give us the opportunity to make this behaviour optional, so that the user is able to retain the performance improvements.

Adding new API (like what this PR is doing) means that other loops will have to always implement it (or be broken). Which I obviously can do for uvloop, quite easily, but it will grow the API surface which is already huge.

Yes, true.

Lastly, a minor point: event loop doesn't have a lot to do with tasks. The loop is mostly concerned with running callbacks. Task is a self-contained primitive that just schedules callbacks to the event loop. So it rubs me the wrong way to introduce tracking to the loop for Tasks, I believe a global threadlocal mapping is a better solution, which is what "all tasks" can be.

Technically, that is true. But from a user's perspective it's the loop that basically represents the state of asyncio. I think it would make a lot of sense if tasks are owned by this loop.

@1st1
Copy link
Member

I'd prefer to avoid holding strong references in global scope to reduce the potential for memory leaks. I.e. the loop owns these references and we can be sure that everything is cleaned up at latest when the loop is destructed.

IMO this isn't a strong argument. Actual applications rarely have more than one event loop during the whole program run (be it a short program or a web server). Now, if there's a bug not cleaning up tasks properly, then the bug is better to be actually fixed, otherwise the fact that event loop clears its taskswon't help at all.

Also, the modifications that@kumaraditya303 did lately were a lot about performance improvements.

What are those modifications? Link?

Registering all tasks in a dict, and unregistering in a done callback is quite likely to negate these improvements to some extent, no matter where this dict is stored.

This is such a minor micro-performance thing in a context of the relatively heavy cost of 'await' and the Task abstraction itself that IMO it's not worth talking about it. Adding/removing a Task from a dict will not be detectable even in a micro-benchmark, it will be below noise floor.

Technically, that is true. But from a user's perspective it's the loop that basically represents the state of asyncio. I think it would make a lot of sense if tasks are owned by this loop.

I see the logic here, but I still think that adding this API to the loop does not make sense in light of other APIs to track tasks thatalready exist in asyncio:_register_task(task),_enter_task(task).

Also I find adding_-leading APIs to the event loop inelegant.

So bottom line, I'm -1 on merging this PR as is. Please let's work together to re-align it and make it better fit with the existing asyncio APIs.


I suggest we keep in place the part of this PR that deterministically calls_unregister_task._scheduled_tasks becomes aset(). That's it.

@alexhartl
Copy link
Author

Sure. Are there other opinions? Otherwise, I can create a PR on this suggestion.

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@1st11st11st1 requested changes

@ZeroIntensityZeroIntensityZeroIntensity approved these changes

@asvetlovasvetlovAwaiting requested review from asvetlovasvetlov is a code owner

@gvanrossumgvanrossumAwaiting requested review from gvanrossum

@kumaraditya303kumaraditya303Awaiting requested review from kumaraditya303kumaraditya303 is a code owner

@willingcwillingcAwaiting requested review from willingcwillingc is a code owner

Assignees
No one assigned
Projects
Status: In Progress
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

7 participants
@alexhartl@1st1@freakboy3742@gvanrossum@ZeroIntensity@hugovk@kumaraditya303

[8]ページ先頭

©2009-2025 Movatter.jp