
Issue 19675

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in the Python Developer Guide.

Classification
Title: Pool dies with excessive workers, but does not cleanup
Type: resource usage
Stage: resolved
Components: Library (Lib)
Versions: Python 3.7, Python 3.6, Python 2.7

Process
Status: closed
Resolution: fixed
Dependencies:
Superseder:
Assigned To:
Nosy List: Vladimir Feinberg, Winterflower, davin, dsoprea, mdk, sbt
Priority: normal
Keywords: patch

Created on 2013-11-21 01:52 by dsoprea, last changed 2022-04-11 14:57 by admin. This issue is now closed.

Files
File name                          Uploaded                             Description
pool.py.patch_2.7.6_20131120-1959  dsoprea, 2013-11-21 01:52            Patch to pool.py for Pool fork-fail cleanup bug.
trace.txt                          Vladimir Feinberg, 2017-02-07 15:04  Trace of bug happening on my platform
bug19675git.patch                  Winterflower, 2017-02-12 20:26
Pull Requests
URL      Status  Linked
PR 57    closed  dsoprea, 2017-02-14 17:34
PR 5614  merged  mdk, 2018-02-10 22:44
Messages (12)
msg203556 - Author: Dustin Oprea (dsoprea)* - Date: 2013-11-21 01:52

If you provide a number of processes to a Pool that the OS can't fulfill, Pool will raise an OSError and die, but it does not clean up any of the processes that it has forked.

This is a session in Python where I can allocate a large, but fulfillable, number of processes (just to exhibit what's possible in my current system):

>>> from multiprocessing import Pool
>>> p = Pool(500)
>>> p.close()
>>> p.join()

Now, this is a request that will fail. However, even after this fails, I can't allocate even a single worker:

>>> p = Pool(700)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/__init__.py", line 232, in Pool
    return Pool(processes, initializer, initargs, maxtasksperchild)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 159, in __init__
    self._repopulate_pool()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 222, in _repopulate_pool
    w.start()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/process.py", line 130, in start
    self._popen = Popen(self)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/forking.py", line 121, in __init__
    self.pid = os.fork()
OSError: [Errno 35] Resource temporarily unavailable
>>> p = Pool(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/__init__.py", line 232, in Pool
    return Pool(processes, initializer, initargs, maxtasksperchild)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 159, in __init__
    self._repopulate_pool()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 222, in _repopulate_pool
    w.start()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/process.py", line 130, in start
    self._popen = Popen(self)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/forking.py", line 121, in __init__
    self.pid = os.fork()
OSError: [Errno 35] Resource temporarily unavailable

The only way to clean this up is to close the parent (the interpreter).

I'm submitting a patch for 2.7.6 that intercepts exceptions and cleans up the workers before bubbling the error. The affected method is _repopulate_pool(), and it appears to be the same in 2.7.6, 3.3.3, and probably every other recent version of Python.

This is the old version:

        for i in range(self._processes - len(self._pool)):
            w = self.Process(target=worker,
                             args=(self._inqueue, self._outqueue,
                                   self._initializer,
                                   self._initargs, self._maxtasksperchild)
                            )
            self._pool.append(w)
            w.name = w.name.replace('Process', 'PoolWorker')
            w.daemon = True
            w.start()
            debug('added worker')

This is the new version:

        try:
            for i in range(self._processes - len(self._pool)):
                w = self.Process(target=worker,
                                 args=(self._inqueue, self._outqueue,
                                       self._initializer,
                                       self._initargs, self._maxtasksperchild)
                                )
                self._pool.append(w)
                w.name = w.name.replace('Process', 'PoolWorker')
                w.daemon = True
                w.start()
                debug('added worker')
        except:
            debug("Process creation error. Cleaning-up (%d) workers."
                  % (len(self._pool)))
            for process in self._pool:
                if process.is_alive() is False:
                    continue
                process.terminate()
                process.join()
            debug("Processing cleaning-up. Bubbling error.")
            raise

This is what happens now: I can go from requesting a number that's too high to immediately requesting one that's also high but within limits, and there's no problem, as all resources have been freed:

>>> from multiprocessing import Pool
>>> p = Pool(700)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/__init__.py", line 232, in Pool
    return Pool(processes, initializer, initargs, maxtasksperchild)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 159, in __init__
    self._repopulate_pool()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/pool.py", line 224, in _repopulate_pool
    w.start()
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/process.py", line 130, in start
    self._popen = Popen(self)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/multiprocessing/forking.py", line 121, in __init__
    self.pid = os.fork()
OSError: [Errno 35] Resource temporarily unavailable
>>> p = Pool(500)
>>> p.close()
>>> p.join()
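On an unpatched interpreter, the only recourse the report leaves a caller is to reap the leaked children from the parent process. A minimal user-side sketch of that workaround (the helper name is hypothetical, not part of the patch):

```python
import multiprocessing as mp

def pool_or_cleanup(n):
    """Try to build a pool of n workers; if construction fails,
    reap any children forked before the failure (hypothetical
    helper illustrating the failure mode described above)."""
    try:
        return mp.Pool(n)
    except OSError:
        # On an unpatched interpreter, workers forked before the
        # failing os.fork() call are still alive at this point.
        for child in mp.active_children():
            child.terminate()
            child.join()
        return None
```

On a patched interpreter, `mp.active_children()` in the except branch should already be empty, since the pool cleans up after itself before re-raising.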
msg287235 - Author: Vladimir Feinberg (Vladimir Feinberg) - Date: 2017-02-07 15:04

I'm still hitting this issue in Python 3.6.0 :: Anaconda 4.3.0 (64-bit), on CentOS release 6.5. Is there a reason this patch has been ignored?
msg287648 - Author: Camilla Montonen (Winterflower) - Date: 2017-02-12 20:26

I've reformatted Dustin Oprea's original patch to be compatible with git, applied it to master, and executed the _test_multiprocessing suite. All tests pass locally.
msg287658 - Author: Camilla Montonen (Winterflower) - Date: 2017-02-12 22:29

@dsoprea: would you like to open a PR for this issue on GitHub? If not, are you happy for me to do it?
msg287660 - Author: Dustin Oprea (dsoprea)* - Date: 2017-02-12 23:38

https://github.com/python/cpython/pull/57
msg287709 - Author: Davin Potts (davin)* (Python committer) - Date: 2017-02-13 16:50

@Winterflower: Thank you for encouraging @dsoprea to create the new PR and for working to convert the previous patch.

@dsoprea: Thank you for taking the time to create the PR, especially after this has been sitting unloved for so long.

Though the new workflow using PRs is still in a bit of a state of flux, my understanding is that we will want to have one PR per feature branch (i.e. one for each of 2.7, 3.6, and 3.7) that we want to target.

Now that we seem to have spawned two parallel discussion tracks (one here and one in the PR, https://github.com/python/cpython/pull/57), I'm not sure how best to resolve that, but for the time being I'll offer code-related comments here, as they're much more likely to be preserved (and thus discoverable) for posterity: we do need some sort of tests around this to complete the patch -- something that would exercise both the non-exception and exception paths (and thus would detect that intended call to util.debug()).
msg287711 - Author: Dustin Oprea (dsoprea)* - Date: 2017-02-13 17:00

Okay. Thanks for weighing in.

I'm trying to figure out how to write the tests. The existing set of tests for multiprocessing is a near nightmare. It seems like I might have to use one of the existing "source code" definitions to test for the no-cleanup (traditional) scenario, but then have to define a new one to define the pool with an initializer that fails. However, then, I'll have to define three new TestCase-based classes to test for the various start methods.

Any suggestions?
msg287712 - Author: Davin Potts (davin)* (Python committer) - Date: 2017-02-13 17:19

For triggering the exception, supplying a Process target that deliberately fails sounds right.

As for tests for the various start methods (fork/forkserver/spawn): if you are looking at the 3.x branches, you'll find this has been consolidated so that one test could conceivably be written to handle multiple variants (see Lib/test/_test_multiprocessing.py).
msg287785 - Author: Dustin Oprea (dsoprea)* - Date: 2017-02-14 17:34

I don't think this can be tested. Throwing exceptions in the remote process causes exceptions that can't be caught in the same way (when the initializer fails, the pool just attempts to recreate the process over and over), and I don't think it'd be acceptable to try to spawn too many processes in order to induce the original problem. There's not a lot of surface area to what we're doing here. We can't simulate failures any other way.

Can I get agreement on this from someone?
msg288044 - Author: Camilla Montonen (Winterflower) - Date: 2017-02-17 23:04

Would it be possible to see a review of the existing attempts at testing this issue?
msg289556 - Author: (ppperry) - Date: 2017-03-14 00:07

Can't you just mock the Process class to have a start method that always raises an error?
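That suggestion can be sketched without spawning any real processes: drive the cleanup pattern from the original patch with mock workers whose start() raises. Everything below (the spawn_workers helper and its names) is illustrative, not CPython code:

```python
from unittest import mock

def spawn_workers(process_factory, count):
    """Illustrative restatement of the patched _repopulate_pool()
    logic: start `count` workers; on any failure, terminate and
    join the workers already created before re-raising."""
    pool = []
    try:
        for _ in range(count):
            w = process_factory()
            pool.append(w)
            w.start()
    except Exception:
        for w in pool:
            if w.is_alive():
                w.terminate()
                w.join()
        raise
    return pool

# Mock "processes": the third one fails to start, mimicking
# OSError: [Errno 35] Resource temporarily unavailable from os.fork().
workers = [mock.Mock() for _ in range(3)]
workers[2].start.side_effect = OSError("Resource temporarily unavailable")
factory = mock.Mock(side_effect=workers)

try:
    spawn_workers(factory, 3)
except OSError:
    pass

# The workers created before the failure were cleaned up.
assert workers[0].terminate.called and workers[0].join.called
assert workers[1].terminate.called and workers[1].join.called
```

The same mock-the-start-method idea could plug into the real test suite by patching the Process class a Pool uses, which avoids both remote-process exceptions and exhausting OS process limits.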
msg329273 - Author: Julien Palard (mdk)* (Python committer) - Date: 2018-11-04 22:40

New changeset 5d236cafd7126e640fb25541fcc7e0a494450143 by Julien Palard in branch 'master':
bpo-19675: Terminate processes if construction of a pool is failing. (GH-5614)
https://github.com/python/cpython/commit/5d236cafd7126e640fb25541fcc7e0a494450143
History
Date                 User               Action  Args
2022-04-11 14:57:53  admin              set     github: 63874
2018-11-05 08:37:42  mdk                set     status: open -> closed; resolution: fixed; stage: patch review -> resolved
2018-11-04 22:50:14  ppperry            set     nosy: -ppperry
2018-11-04 22:40:34  mdk                set     nosy: +mdk; messages: +msg329273
2018-02-10 22:44:19  mdk                set     stage: patch review; pull_requests: +pull_request5424
2017-03-14 00:07:41  ppperry            set     nosy: +ppperry; messages: +msg289556
2017-02-17 23:04:27  Winterflower       set     messages: +msg288044
2017-02-14 17:34:38  dsoprea            set     messages: +msg287785; pull_requests: +pull_request60
2017-02-13 17:19:22  davin              set     messages: +msg287712
2017-02-13 17:00:12  dsoprea            set     messages: +msg287711
2017-02-13 16:50:40  davin              set     versions: +Python 2.7, Python 3.7
2017-02-13 16:50:29  davin              set     messages: +msg287709
2017-02-12 23:38:04  dsoprea            set     messages: +msg287660
2017-02-12 22:29:35  Winterflower       set     messages: +msg287658
2017-02-12 20:26:39  Winterflower       set     files: +bug19675git.patch; messages: +msg287648
2017-02-11 10:10:19  Winterflower       set     nosy: +Winterflower
2017-02-07 15:35:35  ned.deily          set     nosy: +davin
2017-02-07 15:04:37  Vladimir Feinberg  set     files: +trace.txt; versions: +Python 3.6, -Python 2.7, Python 3.3; nosy: +Vladimir Feinberg; messages: +msg287235
2013-11-21 02:18:03  ned.deily          set     keywords: +patch; nosy: +sbt
2013-11-21 01:52:19  dsoprea            create
Supported by The Python Software Foundation. Powered by Roundup.
Copyright © 1990-2022, Python Software Foundation