Bug report
When using a ProcessPoolExecutor with forked child processes, if one of the child processes dies abruptly (a segmentation fault, not a Python exception) while data is simultaneously being sent into the call queue, the parent process hangs forever.
Reproduction
```python
import ctypes
from concurrent.futures import ProcessPoolExecutor

def segfault():
    ctypes.string_at(0)

def func(i, data):
    print(f"Start {i}.")
    if i == 1:
        segfault()
    print(f"Done {i}.")
    return i

data = list(range(100_000_000))
count = 10

with ProcessPoolExecutor(2) as pool:
    list(pool.map(func, range(count), [data] * count))
print("OK")
```
In Python 3.8.10 this raises a BrokenProcessPool exception, whereas in 3.9.13 and 3.10.5 it hangs.
Analysis
When a crash happens in a child process, all workers are terminated and they stop reading from the communication pipes. However, if data is being sent into the call queue at the same time, the queue feeder thread, which writes data from the buffer to the pipe (multiprocessing.queues.Queue._feed), can get stuck in send_bytes(obj) when the Unix pipe it is writing to is full.
_ExecutorManagerThread is then blocked in self.join_executor_internals() (called from self.terminate_broken()) on self.call_queue.join_thread() (cpython/Lib/concurrent/futures/process.py, line 515 at da49128). The main thread itself is blocked on self._executor_manager_thread.join() (cpython/Lib/concurrent/futures/process.py, line 775 at da49128), in the __exit__ method of the Executor.
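The blocking behaviour of the feeder thread can be reproduced in isolation. Below is a minimal sketch of my own (not from the report above; the 20 MB payload and the 5-second watchdog are arbitrary choices) that wedges a queue's feeder thread inside send_bytes() on Linux by never reading the other end of the pipe:
```python
import multiprocessing
import threading

def main():
    q = multiprocessing.Queue()
    # Anything much larger than the OS pipe buffer (64 KiB by default on
    # Linux) will do; 20 MB is an arbitrary choice for this sketch.
    q.put(b"x" * 20_000_000)
    q.close()

    # join_thread() waits for the feeder thread, which is stuck inside
    # send_bytes() because nothing ever reads the other end of the pipe.
    waiter = threading.Thread(target=q.join_thread, daemon=True)
    waiter.start()
    waiter.join(timeout=5)
    print("join_thread() still blocked after 5 s:", waiter.is_alive())

if __name__ == "__main__":
    main()
```
Here q.join_thread() never returns because the feeder cannot finish flushing its buffer; this is exactly the state that join_executor_internals() ends up waiting on.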
Proposed solution
Drain the call queue buffer either in the terminate_broken method, before calling join_executor_internals, or in the queue's close method.
I will create a pull request with a possible implementation.
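As a rough illustration only (the helper below and its name are invented for this sketch, not CPython API, and may differ from the actual pull request), the drain could look something like this. It relies on the private Queue attributes _buffer, _notempty, _reader and _ignore_epipe as they exist in CPython's multiprocessing.queues.Queue:
```python
import multiprocessing.queues

# Hypothetical helper, invented for this sketch; the actual patch may differ.
def drain_broken_call_queue(call_queue: multiprocessing.queues.Queue) -> None:
    # Make the feeder treat a broken pipe as a clean shutdown (the executor
    # already sets this flag on its call queue).
    call_queue._ignore_epipe = True
    # Drop everything still waiting in the in-process buffer so the feeder
    # has nothing further to write to the dead workers.
    with call_queue._notempty:
        call_queue._buffer.clear()
        call_queue._notempty.notify()
    # Clearing the buffer is not enough if the feeder is already blocked
    # inside send_bytes() on a full pipe; closing the read end breaks the
    # pipe, so the blocked write raises BrokenPipeError and the feeder exits.
    call_queue._reader.close()
    call_queue.close()        # mark the queue closed and send the sentinel
    call_queue.join_thread()  # now returns instead of hanging forever
```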
Your environment
- CPython versions tested on: reproduced in 3.10.5 and 3.9.13 (works correctly in 3.8.10, where BrokenProcessPool is raised)
- Operating system and architecture: Linux, x86_64