Issue 3789

This issue tracker has been migrated to GitHub, and is currently read-only.
For more information, see the GitHub FAQs in Python's Developer Guide.

classification
Title: multiprocessing deadlocks when sending large data through Queue with timeout
Type:
Stage:
Components: Library (Lib)
Versions: Python 2.6

process
Status: closed
Resolution: not a bug
Dependencies:
Superseder:
Assigned To: jnoller
Nosy List: DavidDecotigny, jnoller
Priority: normal
Keywords:

Created on 2008-09-05 22:35 by DavidDecotigny, last changed 2022-04-11 14:56 by admin. This issue is now closed.

Files
File name: c.py
Uploaded: DavidDecotigny, 2008-09-05 22:35
Description: Example showing the bug ("Happy" never displayed)

Messages (6)
msg72640 - Author: David Decotigny (DavidDecotigny) Date: 2008-09-05 22:35

With the attached script, calling demo() with, for example, datasize=40*1024*1024 and timeout=1 deadlocks: the program never terminates.

The bug appears on Linux (RHEL4) / Intel x86 with the "multiprocessing" module shipped with Python 2.6b3, and I think it can be easily reproduced on other Unices. It also appears with Python 2.5 and the standalone processing package 0.52 (https://developer.berlios.de/bugs/?func=detailbug&bug_id=14453&group_id=9001).

After a quick investigation, it seems to be a deadlock between waitpid in the parent process and a pipe::send in the "_feed" thread of the child process. The problem seems to be that "_feed" is still sending data (the data is large) to the pipe when the parent process has already called waitpid (because of the "short" timeout). The pipe fills up because no consumer is reading the data (the consumer is already in waitpid), and hence the "_feed" thread in the child blocks forever. Since the child process does a _feed.join() before exiting (after function f), it never exits. And hence the waitpid in the parent process never returns, because the child never exits.

This no longer happens if I use timeout=None or a larger timeout (e.g. 10 seconds), because in both cases waitpid is called /after/ the "_feed" thread in the child process has sent all of its data through the pipe.

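The attached c.py is not reproduced on this page. The following is only a minimal Python 3 sketch of the situation described above, written so that it does not hang: the parent never reads from the queue, waits for the child with a bounded join(), and reports whether the child is still alive. The helper name child_alive_after_join, the data sizes, and the use of the "fork" start method (which assumes a Unix-like platform) are all assumptions made for illustration, not part of the original report.

```python
import multiprocessing as mp

# Assumption: a Unix-like platform where the "fork" start method exists.
ctx = mp.get_context("fork")


def f(datasize, q):
    # Python 3 counterpart of putting a large list on the queue.
    q.put(list(range(datasize)))


def child_alive_after_join(datasize, join_timeout):
    """Hypothetical helper: start a child that puts data on a queue,
    never read the queue, and report whether the child is still
    running after join(join_timeout)."""
    q = ctx.Queue()
    worker = ctx.Process(target=f, args=(datasize, q))
    worker.start()
    worker.join(join_timeout)  # bounded wait, so this sketch cannot hang
    still_alive = worker.is_alive()
    if still_alive:
        # The child is waiting for its feeder thread, which is blocked
        # writing to a full pipe. Clean up and discard the queue.
        worker.terminate()
        worker.join()
    return still_alive


if __name__ == "__main__":
    # Payload far larger than a typical pipe buffer.
    print(child_alive_after_join(1_000_000, 2))
    # Tiny payload that should fit in the pipe buffer.
    print(child_alive_after_join(10, 10))
```

On a typical Linux system one would expect the large payload to leave the child alive and the tiny payload to let it exit, since only the latter fits into the pipe buffer without a reader.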
msg72655 - Author: David Decotigny (DavidDecotigny) Date: 2008-09-06 00:38

A quick fix in user code, when we are sure we don't need the child process once a timeout happens, is to call worker.terminate() in an except Empty clause.

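A minimal Python 3 sketch of that workaround could look as follows. The helper name get_or_terminate, the sizes, and the "fork" start method are assumptions for illustration. Note that the multiprocessing documentation warns that terminating a process that is using a queue can leave the queue corrupted, so this sketch discards the queue afterwards.

```python
import multiprocessing as mp
import queue

# Assumption: a Unix-like platform where the "fork" start method exists.
ctx = mp.get_context("fork")


def f(datasize, q):
    q.put(list(range(datasize)))


def get_or_terminate(datasize, timeout):
    """Hypothetical helper: return the child's result, or None if
    get() timed out, in which case the child is terminated."""
    q = ctx.Queue()
    worker = ctx.Process(target=f, args=(datasize, q))
    worker.start()
    try:
        result = q.get(timeout=timeout)
    except queue.Empty:
        # We no longer need the child, so kill it rather than let
        # join() wait for a feeder thread that can never finish.
        worker.terminate()
        result = None
    worker.join()  # returns: the child has either finished or been killed
    return result


if __name__ == "__main__":
    print(get_or_terminate(2_000_000, 0.001) is None)
    print(get_or_terminate(10, 10))
```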
msg72657 - Author: Jesse Noller (jnoller) (Python committer) Date: 2008-09-06 01:10

See http://docs.python.org/dev/library/multiprocessing.html#multiprocessing-programming

Specifically:

Joining processes that use queues

Bear in mind that a process that has put items in a queue will wait before terminating until all the buffered items are fed by the “feeder” thread to the underlying pipe. (The child process can call the Queue.cancel_join() method of the queue to avoid this behaviour.)

This means that whenever you use a queue you need to make sure that all items which have been put on the queue will eventually be removed before the process is joined. Otherwise you cannot be sure that processes which have put items on the queue will terminate. Remember also that non-daemonic processes will automatically be joined.

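The guideline quoted above amounts to "remove the items first, join afterwards". A minimal Python 3 sketch, assuming a single item per child: a timed-out get() is simply retried, and join() is only called once the item has been received (or the child has been given up on). The helper name get_then_join, the retry budget, and the "fork" start method are assumptions for illustration.

```python
import multiprocessing as mp
import queue

# Assumption: a Unix-like platform where the "fork" start method exists.
ctx = mp.get_context("fork")


def f(datasize, q):
    q.put(list(range(datasize)))


def get_then_join(datasize, timeout, max_attempts):
    """Hypothetical helper: retry get() up to max_attempts times,
    and join the child only after its item has been removed."""
    q = ctx.Queue()
    worker = ctx.Process(target=f, args=(datasize, q))
    worker.start()
    result = None
    for _ in range(max_attempts):
        try:
            result = q.get(timeout=timeout)
            break
        except queue.Empty:
            # Nothing was consumed from the pipe, so trying again is safe.
            continue
    if result is None:
        # Retry budget exhausted: give up on the child explicitly
        # instead of joining a process that may never exit.
        worker.terminate()
    worker.join()
    return result


if __name__ == "__main__":
    data = get_then_join(200_000, 0.05, 400)
    print(len(data) if data is not None else None)
```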
msg72658 - Author: Jesse Noller (jnoller) (Python committer) Date: 2008-09-06 01:16

In a later release, I'd like to massage this in such a way that you do not have to wait for a child queue to be drained prior to calling join.

One way to work around this, David, is to call Queue.cancel_join_thread():

    def f(datasize, q):
        q.cancel_join_thread()
        q.put(range(datasize))

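As a rough Python 3 illustration of the effect of this workaround, the sketch below checks that a child which calls cancel_join_thread() exits even though the parent never reads the queue. The helper name child_exits_without_drain, the sizes, and the "fork" start method are assumptions for illustration. The multiprocessing documentation notes that this method is likely to cause enqueued data to be lost, so it only fits cases where the data is no longer wanted.

```python
import multiprocessing as mp

# Assumption: a Unix-like platform where the "fork" start method exists.
ctx = mp.get_context("fork")


def f(datasize, q):
    # Tell the queue not to wait for its feeder thread at process exit.
    q.cancel_join_thread()
    q.put(list(range(datasize)))


def child_exits_without_drain(datasize, join_timeout):
    """Hypothetical helper: return True if the child exits within
    join_timeout even though nobody reads from the queue."""
    q = ctx.Queue()
    worker = ctx.Process(target=f, args=(datasize, q))
    worker.start()
    worker.join(join_timeout)
    exited = not worker.is_alive()
    if not exited:
        # Not expected with cancel_join_thread(); clean up regardless.
        worker.terminate()
        worker.join()
    return exited


if __name__ == "__main__":
    print(child_exits_without_drain(1_000_000, 10))
```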
msg72659 - Author: David Decotigny (DavidDecotigny) Date: 2008-09-06 01:45

Thank you Jesse. When I read this passage, I naively thought that a timeout raised in a get() would not be harmful: that somehow the whole get() request would be aborted. But now I realize that this would make things rather complicated and dangerous: the data would get dropped, and would never be recovered by a subsequent get().

So thank you for the hint, and leave things as they are, it's better.

msg72660 - Author: Jesse Noller (jnoller) (Python committer) Date: 2008-09-06 01:55

No problem David, you're the 4th person to ask me about this in the past 2 months :)

History
Date                 User               Action  Args
2022-04-11 14:56:38  admin              set     github: 48039
2008-09-06 01:55:37  jnoller            set     messages: + msg72660
2008-09-06 01:45:24  DavidDecotigny     set     messages: + msg72659
2008-09-06 01:20:34  jnoller            set     status: open -> closed; resolution: not a bug
2008-09-06 01:16:24  jnoller            set     messages: + msg72658
2008-09-06 01:10:44  jnoller            set     messages: + msg72657
2008-09-06 00:38:34  DavidDecotigny     set     messages: + msg72655
2008-09-05 22:36:18  benjamin.peterson  set     assignee: jnoller; nosy: + jnoller
2008-09-05 22:35:25  DavidDecotigny     create