bpo-29842: Make Executor.map less eager so it handles large/unbounded input iterables appropriately #707
Conversation
the-knights-who-say-ni commented Mar 18, 2017
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA). Unfortunately we couldn't find an account corresponding to your GitHub username on bugs.python.org (b.p.o) to verify you have signed the CLA. This is necessary for legal reasons before we can look at your contribution. Please follow these steps to help rectify the issue:
Thanks again for your contribution and we look forward to looking at it!
mention-bot commented Mar 18, 2017
@MojoVampire, thanks for your PR! By analyzing the history of the files in this pull request, we identified @birkenfeld, @brianquinlan and @ezio-melotti to be potential reviewers.
MojoVampire commented Mar 18, 2017
I've already submitted a contributor agreement pre-GitHub migration. I just updated my b.p.o. user profile (josh.r) to link to my GitHub account name. Is that sufficient, or do I need to submit a new contributor agreement based on my GitHub e-mail address?
MojoVampire commented Mar 18, 2017
Hmm... Is the failure of continuous-integration/travis-ci/pr something real? Clicking Details just tells me it can't find a python/cpython repository at all...
pkch commented May 17, 2017
You can also take a look at my implementation that I uploaded to https://github.com/pkch/executors. It does something more like what I described in the issue tracker, the main benefit being that it's not blocking.
methane left a comment
LGTM at code level.
bedevere-bot commented Jul 25, 2018
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase "I have made the requested changes; please review again".
leezu commented Nov 13, 2018
@MojoVampire could you share your plans about this PR? Do you plan to drive it forward?
MojoVampire commented May 6, 2019
I have made the requested changes; please review again. I did not add a Misc/NEWS entry since the file no longer exists (it's autogenerated from commit messages now, correct?).
bedevere-bot commented May 6, 2019
Thanks for making the requested changes! @methane: please review the changes made to this pull request.
tirkarthi commented May 6, 2019
NEWS entries can be generated using blurb or blurb-it. Please see: https://devguide.python.org/committing/?highlight=news#what-s-new-and-news-entries
MojoVampire commented May 6, 2019 • edited
I have made the requested changes; please review again. Actually made the Misc/NEWS entry properly. Sorry for the confusion; I haven't made a PR since the new Misc/NEWS regime began and didn't know about the blurb tool. Thanks for the assist @tirkarthi
bedevere-bot commented May 6, 2019
Thanks for making the requested changes! @methane: please review the changes made to this pull request.
pitrou left a comment
Some comments below. A significant issue is that this changes the behaviour of shutdown(wait=True) to not wait for completion of all pending futures. I don't think that's an acceptable change.
By default, a reasonable number of tasks are queued beyond the number of workers; an explicit *prefetch* count may be provided to specify how many extra tasks should be queued.
Using "chunks" here would be more precise than "tasks".
The documentation for chunksize uses the phrasing "this method chops iterables into a number of chunks which it submits to the pool as separate tasks", and since not all executors even use chunks (ThreadPoolExecutor ignores the argument), I figured I'd stick with "tasks". It does kind of leave out a term to describe a single work item; the docs use chunks and tasks as synonyms, with no term for a single work item.
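For illustration, a hypothetical usage sketch of the proposed keyword (the prefetch argument is what this PR adds; it does not exist in the released concurrent.futures API, and the fetch workload below is made up):

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    # Stand-in for real work; in practice this might be an HTTP request.
    return len(url)

urls = (f"https://example.com/page/{i}" for i in range(1_000_000))

with ThreadPoolExecutor(max_workers=4) as executor:
    # With a bounded prefetch, only max_workers + prefetch work items are
    # queued at any moment, so the huge generator is consumed incrementally
    # instead of being submitted in full before the first result appears.
    for size in executor.map(fetch, urls, prefetch=8):
        if size > 40:
            break  # stop early; the rest of the input is never pulled
```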
self.assertCountEqual(finished, range(10))
# No guarantees on how many tasks dispatched,
# but at least one should have been dispatched
self.assertGreater(len(finished), 0)
I think this change breaks compatibility. The doc for shutdown says:
If wait is True then this method will not return until all the pending futures are done executing and the resources associated with the executor have been freed.
So all futures should have executed, instead of being cancelled.
At the time I wrote it, it didn't conflict with the documentation precisely; the original documentation said that map was "Equivalent to map(func, *iterables) except func is executed asynchronously and several calls to func may be made concurrently.", but it didn't guarantee that any actual futures exist (the method is implemented in terms of submit and futures, but nothing in the docs actually requires such a design).
That said, it looks like you updated the documentation to add "the iterables are collected immediately rather than lazily;", which, if considered a guarantee, rather than a warning, would make this a breaking change even ignoring the "cancel vs. wait" issue.
Do you have any suggestions? If strict adherence to your newly (as of late 2017) documented behavior is needed, I suppose I could change the default behavior from "reasonable prefetch" to "exhaustive prefetch", so when prefetch isn't passed, every task is submitted, but it would be kind of annoying to lose the "good by default" behavior of limited prefetching.
The reason I cancelled rather than waiting on the result is that I was trying to follow the normal use pattern for map; since the results are yielded lazily, if the iterator goes away or is closed explicitly (or you explicitly shut down the executor), you're done; having the outstanding futures complete when you're not able to see the results means you're either:
- Expecting the tasks to complete without running out the Executor.map (which doesn't work with Py3's map at all, so the analogy to map should allow it; if you don't run it out, you have no guarantees anything was done)
- Not planning to use any further results (in which case running any submitted but unscheduled futures means doing work no one can see the results of)
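A minimal standalone sketch of the lazy, cancel-on-close pattern described above (illustrative only, not the PR's implementation; the lazy_map name and the prefetch default are made up):

```python
import collections
import itertools
from concurrent.futures import ThreadPoolExecutor

def lazy_map(executor, fn, iterable, prefetch=8):
    """Yield fn(x) lazily, keeping at most `prefetch` futures queued."""
    args = iter(iterable)
    pending = collections.deque(
        executor.submit(fn, a) for a in itertools.islice(args, prefetch)
    )
    try:
        while pending:
            yield pending.popleft().result()
            # Top the window back up with one more work item, if any remain.
            for a in itertools.islice(args, 1):
                pending.append(executor.submit(fn, a))
    finally:
        # If the caller closes the generator early (or drops it), cancel any
        # work whose result no one will ever observe.
        for fut in pending:
            fut.cancel()

with ThreadPoolExecutor(max_workers=2) as ex:
    results = lazy_map(ex, lambda x: x * x, itertools.count())
    print(next(results), next(results))  # 0 1
    results.close()  # still-queued futures are cancelled here
```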
Actually, I think there are two problems to discuss:
- What happens when shutdown(wait=True) is called. Currently it waits for all outstanding tasks. I don't think we can change that (the explicit wait flag exists for a reason).
- Whether map() can be silently switched to a lazy mode of operation. There's a (perhaps minor) problem with that. Currently, if one of the iterables raises an error, map() propagates the exception. With your proposal, the exception may be raised later inside the result iterator.
I think 2) might easily be worked around by introducing a separate method (lazy_map?).
It seems it would be good to discuss those questions on the mailing-list.
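A small illustration of point 2 (a hypothetical snippet; flaky_args is made up to show where the exception surfaces):

```python
from concurrent.futures import ThreadPoolExecutor

def flaky_args():
    yield 1
    yield 2
    raise ValueError("broken input iterable")

with ThreadPoolExecutor() as ex:
    try:
        # Today map() consumes the whole input up front, so the ValueError
        # escapes from this call, before any result has been requested.
        results = ex.map(abs, flaky_args())
    except ValueError:
        print("raised at map() call time")
    # Under a lazy map(), the same error would only surface while iterating:
    #     for r in results: ...   # ValueError raised here instead
```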
MojoVampire May 6, 2019 • edited
Yeah, the problem with using the "lazy_map" name is that it feels like recreating the same annoying distinctions between map and imap from the Py2 era, and it would actually have Executor.map (which is supposed to match map, which lazily consumes the input(s)) be less similar to map than Executor.lazy_map.
If it's necessary to gain acceptance, I could change the default behavior to use prefetch=sys.maxsize - self._max_workers. It would match the pre-existing behavior for just about anything that conceivably worked before (modulo the tiny differences in memory usage of deque vs. list for storing the futures) since:
- All tasks would be submitted fully up front, so shutdown(wait=True) would in fact wait on them (and no further calls to submit would occur in the generator, so submitting wouldn't occur post-shutdown, which would raise a RuntimeError and cause the cancellation)
- It wouldn't be lazy for anything by default (it would either work eagerly or crash, in the same manner it currently behaves)
If you passed a reasonable prefetch, you wouldn't have these behaviors (and we should document that interaction), but at least existing code would continue to work identically.
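A tiny sketch of the arithmetic behind that proposed default (names are illustrative; nothing below is the PR's actual code):

```python
import sys

def effective_prefetch(max_workers, prefetch=None):
    """Return how many extra work items may sit queued beyond the workers."""
    if prefetch is None:
        # Proposed default: effectively unlimited, so every input item is
        # submitted up front, matching today's eager map() behaviour.
        return sys.maxsize - max_workers
    if prefetch < 0:
        raise ValueError("prefetch count may not be negative")
    return prefetch

# Default: submission window (workers + prefetch) is sys.maxsize, i.e. eager.
print(4 + effective_prefetch(4))              # sys.maxsize on this platform
# An explicit prefetch bounds the window, enabling lazy consumption.
print(4 + effective_prefetch(4, prefetch=8))  # 12
```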
I don't have a strong opinion. I think discussing those alternatives on the ML, to gather more opinions and arguments, would be useful.
bedevere-bot commented May 6, 2019
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated. Once you have made the requested changes, please leave a comment on this pull request containing the phrase "I have made the requested changes; please review again".
…lds no reference to result at the moment it yields
Reduce line lengths to PEP8 limits
flavianh commented Jul 20, 2019
@MojoVampire I think you need to comment your PR with "I have made the requested changes; please review again".
This PR is stale because it has been open for 30 days with no activity.
erlend-aasland commented Jun 29, 2022
@MojoVampire, are you going to follow up this PR?
MojoVampire commented Jul 12, 2022
@erlend-aasland: I'd like to apply this, but I never got any idea of what would constitute an acceptable final result. Executor.map is, frankly, useless for many of its intended purposes right now; in an effort to improve performance on huge inputs, you end up prefetching the entire input and pre-scheduling all the tasks before you can process any of them.
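To make the failure mode concrete (an illustrative snippet, not from the PR): the built-in map is lazy, while the current Executor.map consumes its entire input before yielding anything, so an unbounded iterable can never be processed.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor

with ThreadPoolExecutor(max_workers=4) as ex:
    # The built-in map is lazy: this prints five squares and stops.
    for n in map(lambda x: x * x, itertools.islice(itertools.count(), 5)):
        print(n)

    # The current eager Executor.map would never get this far on the same
    # unbounded input:
    #     ex.map(lambda x: x * x, itertools.count())  # submits forever, never yields
```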
erlend-aasland commented Jul 14, 2022
cc @kumaraditya303, is this something you'd be interested in reviewing?
kumaraditya303 commented Jul 15, 2022
I'll take a look soon, but it seems there are two PRs for the same thing and this one has conflicts so maybe we should continue on #18566
kumaraditya303 commented Jul 18, 2022
Closing this in favor of #18566, thanks for the PR!
https://bugs.python.org/issue29842