bpo-29842: Make Executor.map less eager so it handles large/unbounded… #707


Closed
MojoVampire wants to merge 6 commits into python:main from MojoVampire:fix-issue-29842

Conversation

@MojoVampire (Contributor) commented Mar 18, 2017 (edited by bedevere-bot)
@the-knights-who-say-ni

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).

Unfortunately we couldn't find an account corresponding to your GitHub username on bugs.python.org (b.p.o) to verify you have signed the CLA. This is necessary for legal reasons before we can look at your contribution. Please follow these steps to help rectify the issue:

  1. If you don't have an account on b.p.o, please create one.
  2. Make sure your GitHub username is listed in "Your Details" at b.p.o.
  3. If you have not already done so, please sign the PSF contributor agreement. The "bugs.python.org username" requested by the form is the "Login name" field under "Your Details".
  4. If you just signed the CLA, please wait at least one US business day and then check "Your Details" on bugs.python.org to see if your account has been marked as having signed the CLA (the delay is due to a person having to manually check your signed CLA).
  5. Reply here saying you have completed the above steps.

Thanks again for your contribution and we look forward to looking at it!

@mention-bot

@MojoVampire, thanks for your PR! By analyzing the history of the files in this pull request, we identified @birkenfeld, @brianquinlan and @ezio-melotti to be potential reviewers.

@MojoVampire (Contributor, Author)

I've already submitted a contributor agreement pre-GitHub migration. I just updated my b.p.o. user profile (josh.r) to link to my GitHub account name. Is that sufficient, or do I need to submit a new contributor agreement based on my GitHub e-mail address?

@MojoVampire (Contributor, Author)

Hmm... Is the failure of continuous-integration/travis-ci/pr something real? Clicking Details just tells me it can't find a python/cpython repository at all...

serhiy-storchaka added the type-feature (A feature request or enhancement) label Mar 19, 2017
@pkch

You can also take a look at my implementation that I uploaded to https://github.com/pkch/executors. It does something more like what I described in the issue tracker, the main benefit being that it's not blocking.

methane requested a review from pitrou, July 25, 2018 21:33

@methane (Member) left a comment:

LGTM at code level.

@bedevere-bot

A Python core developer has requested some changes be made to your pull request before we can consider merging it. Please address their requests, along with any other requests in reviews from core developers.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase "I have made the requested changes; please review again". I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@leezu

@MojoVampire could you share your plans about this PR? Do you plan to drive it forward?


@MojoVampire (Contributor, Author)

I have made the requested changes; please review again.

I did not add a Misc/NEWS entry since the file no longer exists (it's autogenerated from commit messages now, correct?).

@bedevere-bot

Thanks for making the requested changes!

@methane: please review the changes made to this pull request.

@tirkarthi (Member)

> I did not add a Misc/NEWS entry since the file no longer exists (it's autogenerated from commit messages now, correct?).

NEWS entries can be generated using blurb or blurb-it.

Please see: https://devguide.python.org/committing/?highlight=news#what-s-new-and-news-entries


@MojoVampire (Contributor, Author) commented May 6, 2019 (edited)

I have made the requested changes; please review again.

Actually made the Misc/NEWS entry properly this time. Sorry for the confusion; I haven't made a PR since the new Misc/NEWS regime began and didn't know about the blurb tool. Thanks for the assist, @tirkarthi.


@bedevere-bot

Thanks for making the requested changes!

@methane: please review the changes made to this pull request.

@pitrou (Member) left a comment:

Some comments below. A significant issue is that this changes the behaviour of shutdown(wait=True) to not wait for completion of all pending futures. I don't think that's an acceptable change.


> By default, a reasonable number of tasks are queued beyond the number of workers; an explicit *prefetch* count may be provided to specify how many extra tasks should be queued.

Member:

Using "chunks" here would be more precise than "tasks".

Contributor (Author):

The documentation for chunksize uses the phrasing "this method chops iterables into a number of chunks which it submits to the pool as separate tasks", and since not all executors even use chunks (ThreadPoolExecutor ignores the argument), I figured I'd stick with "tasks". That does leave the docs without a term for a single work item; they use "chunks" and "tasks" loosely as synonyms.

    self.assertCountEqual(finished, range(10))
    # No guarantees on how many tasks dispatched,
    # but at least one should have been dispatched
    self.assertGreater(len(finished), 0)
Member:

I think this change breaks compatibility. The doc for shutdown says:

> If wait is True then this method will not return until all the pending futures are done executing and the resources associated with the executor have been freed.

So all futures should have executed, instead of being cancelled.
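The contract being quoted can be exercised directly. The following is a standalone sketch (not code from the PR) showing that, with the current ThreadPoolExecutor, shutdown(wait=True) returns only after every submitted task has actually run:

```python
import time
from concurrent.futures import ThreadPoolExecutor

results = []

def work(i):
    time.sleep(0.01)
    results.append(i)  # list.append is thread-safe in CPython

ex = ThreadPoolExecutor(max_workers=2)
for i in range(10):
    ex.submit(work, i)

# Blocks until all pending futures are done executing; none are cancelled.
ex.shutdown(wait=True)
print(sorted(results))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```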

Contributor (Author):

At the time I wrote it, it didn't conflict with the documentation precisely; the original documentation said that map was "Equivalent to map(func, *iterables) except func is executed asynchronously and several calls to func may be made concurrently", but it didn't guarantee that any actual futures exist (it's implemented in terms of submit and futures, but doesn't actually require such a design).

That said, it looks like you updated the documentation to add "the iterables are collected immediately rather than lazily;", which, if considered a guarantee, rather than a warning, would make this a breaking change even ignoring the "cancel vs. wait" issue.

Do you have any suggestions? If strict adherence to your newly (as of late 2017) documented behavior is needed, I suppose I could change the default behavior from "reasonable prefetch" to "exhaustive prefetch", so when prefetch isn't passed, every task is submitted, but it would be kind of annoying to lose the "good by default" behavior of limited prefetching.

The reason I cancelled rather than waiting on the result is that I was trying to follow the normal use pattern for map: since the results are yielded lazily, if the iterator goes away or is closed explicitly (or you explicitly shut down the executor), you're done. Having the outstanding futures complete when you can no longer see the results means you're either:

  1. Expecting the tasks to complete without running out the Executor.map (which doesn't work with Py3's map at all, so the analogy to map should allow it; if you don't run it out, you have no guarantees anything was done)
  2. Not planning to use any further results (in which case running any submitted but unscheduled futures means doing work no one can see the results of)
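The cancel-on-close behaviour being defended can be sketched in a few lines. This is a hypothetical illustration of the approach (a helper named lazy_map with a bounded prefetch deque), not the actual patch: outstanding futures are topped up as results are consumed, and cancelled rather than awaited when the result iterator is discarded.

```python
import collections
import itertools
from concurrent.futures import ThreadPoolExecutor

def lazy_map(executor, fn, iterable, prefetch=4):
    """Sketch: submit at most `prefetch` tasks ahead of consumption."""
    args = iter(iterable)
    pending = collections.deque(
        executor.submit(fn, a) for a in itertools.islice(args, prefetch))

    def result_iterator():
        try:
            while pending:
                yield pending.popleft().result()
                for a in itertools.islice(args, 1):  # top up the queue
                    pending.append(executor.submit(fn, a))
        finally:
            while pending:  # iterator closed early: cancel, don't wait
                pending.pop().cancel()

    return result_iterator()

with ThreadPoolExecutor(max_workers=2) as ex:
    squares = list(lazy_map(ex, lambda x: x * x, range(6)))
print(squares)  # [0, 1, 4, 9, 16, 25]
```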

Member:

Actually, I think there are two problems to discuss:

  1. What happens when shutdown(wait=True) is called. Currently it waits for all outstanding tasks. I don't think we can change that (the explicit wait flag exists for a reason).
  2. Whether map() can be silently switched to a lazy mode of operation. There's a (perhaps minor) problem with that. Currently, if one of the iterables raises an error, map() propagates the exception. With your proposal, the exception may be raised later inside the result iterator.

I think 2) might easily be worked around by introducing a separate method (lazy_map?).

It seems it would be good to discuss those questions on the mailing list.
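Point 2 is observable with the current eager map(). Because the iterables are consumed in full at call time, an exception raised by the input escapes the map() call itself rather than surfacing later from the result iterator (a standalone sketch, not code from the PR):

```python
from concurrent.futures import ThreadPoolExecutor

def faulty():
    yield 1
    raise ValueError("bad input")

with ThreadPoolExecutor(max_workers=1) as ex:
    try:
        # Today, map() drains its iterables eagerly, so this raises here,
        # before any result is requested.
        ex.map(abs, faulty())
        raised_at_call = False
    except ValueError:
        raised_at_call = True

print(raised_at_call)  # True
```

Under a lazy map(), the same ValueError would instead pop out of next() on the result iterator, which is the compatibility concern raised above.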

@MojoVampire (Contributor, Author) commented May 6, 2019 (edited):

Yeah, the problem with using the "lazy_map" name is that it feels like recreating the same annoying distinctions between map and imap from the Py2 era, and it would actually have Executor.map (which is supposed to match map, which lazily consumes the input(s)) be less similar to map than Executor.lazy_map.

If it's necessary to gain acceptance, I could change the default behavior to use prefetch=sys.maxsize - self._max_workers. It would match the pre-existing behavior for just about anything that conceivably worked before (modulo the tiny differences in memory usage of deque vs. list for storing the futures) since:

  1. All tasks would be submitted fully up front, so shutdown(wait=True) would in fact wait on them (and no further calls to submit would occur in the generator, so submitting wouldn't occur post-shutdown, which would raise a RuntimeError and cause the cancellation).
  2. It wouldn't be lazy for anything by default (it would either work eagerly or crash, in the same manner it currently behaves)

If you passed a reasonable prefetch, you wouldn't have these behaviors (and we should document that interaction), but at least existing code would continue to work identically.

Member:

I don't have a strong opinion. I think discussing those alternatives on the ML, to gather more opinions and arguments, would be useful.

@bedevere-bot

A Python core developer has requested some changes be made to your pull request before we can consider merging it. Please address their requests, along with any other requests in reviews from core developers.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase "I have made the requested changes; please review again". I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

…lds no reference to result at the moment it yields
Reduce line lengths to PEP8 limits
@flavianh

@MojoVampire I think you need to comment on your PR with "I have made the requested changes; please review again" for it to pass.

@github-actions

This PR is stale because it has been open for 30 days with no activity.

github-actions bot added the stale (Stale PR or inactive for a long period of time) label Feb 19, 2022
rhettinger requested a review from applio, May 10, 2022 21:56
@erlend-aasland (Contributor)

@MojoVampire, are you going to follow up on this PR?

erlend-aasland added the pending (The issue will be closed if no feedback is provided) label Jun 29, 2022
github-actions bot removed the stale (Stale PR or inactive for a long period of time) label Jun 30, 2022
@MojoVampire (Contributor, Author)

@erlend-aasland: I'd like to apply this, but I never got any idea of what would constitute an acceptable final result. Executor.map is, frankly, useless for many of its intended purposes right now; in an effort to improve performance on huge inputs, you end up prefetching the entire input and pre-scheduling all the tasks before you can process any of them.
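The eagerness being described is easy to demonstrate (a standalone sketch, not part of the patch): the current Executor.map drains its entire input and submits every task before the first result is ever requested.

```python
from concurrent.futures import ThreadPoolExecutor

consumed = 0

def numbers(n):
    global consumed
    for i in range(n):
        consumed += 1
        yield i

with ThreadPoolExecutor(max_workers=2) as ex:
    it = ex.map(lambda x: x, numbers(1000))
    # The generator was fully drained and all 1000 tasks submitted
    # during the map() call itself, before we asked for any result.
    print(consumed)  # 1000
    first = next(it)
print(first)  # 0
```

With an unbounded input (e.g. itertools.count()), the map() call above would never return at all, which is the failure mode this PR set out to fix.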

@erlend-aasland (Contributor)

cc @kumaraditya303, is this something you'd be interested in reviewing?

@kumaraditya303 (Contributor)

> is this something you'd be interested in reviewing?

I'll take a look soon, but it seems there are two PRs for the same thing and this one has conflicts, so maybe we should continue on #18566.


@kumaraditya303 (Contributor)

Closing this in favor of #18566, thanks for the PR!


AA-Turner removed the pending (The issue will be closed if no feedback is provided) label Apr 6, 2025

Reviewers

@pitrou — awaiting requested review
@methane — awaiting requested review
@applio — awaiting requested review
@pkch — requested changes

Assignees

No one assigned

Labels

awaiting changes, type-feature (A feature request or enhancement)

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

17 participants

@MojoVampire @the-knights-who-say-ni @mention-bot @pkch @bedevere-bot @leezu @tirkarthi @flavianh @erlend-aasland @kumaraditya303 @methane @pitrou @brettcannon @serhiy-storchaka @Mariatta @AA-Turner @ezio-melotti
