Closed
Description
Bug report
Submitting many tasks to a `concurrent.futures.ProcessPoolExecutor` pool
deadlocks with all three start methods.
When running the same example with `multiprocessing.pool.Pool` we have NOT been
able to cause a deadlock.
Different sets of parameters affect how likely it is to get a deadlock:
- All start methods (`spawn`, `fork`, and `forkserver`) exhibit the deadlock
(the examples below use the `spawn` method)
- It's possible to get a deadlock with `NUM_PROCESSES` 1-24
- As long as `NUM_TASKS` is high, `TASK_DATA` and `TASK_SIZE` can be low/removed and
still cause a hang (see example script)
- Set `DO_PRINT = False` for a higher probability of hanging
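For comparison, here is a scaled-down sketch of the same workload driven through `multiprocessing.pool.Pool`, which (as noted above) we have not been able to deadlock. The reduced `NUM_TASKS` and the `run()` helper are illustrative, not taken from the original repro script:

```python
import multiprocessing.pool

NUM_TASKS = 1000  # scaled down from 500000 for a quick run
NUM_PROCESSES = 4


def task(task_idx, task_name):
    """Same dummy work as in the repro script below."""
    s = ""
    for i in range(10):
        s += str(i)
    return task_idx


def run():
    # Submit the tasks through multiprocessing.pool.Pool instead of
    # concurrent.futures.ProcessPoolExecutor.
    with multiprocessing.pool.Pool(processes=NUM_PROCESSES) as pool:
        async_results = [
            pool.apply_async(task, (i, f"task{i}")) for i in range(NUM_TASKS)
        ]
        return [r.get() for r in async_results]


if __name__ == "__main__":
    print(len(run()))
```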
Example stack trace excerpts from hung runs

Reading the queue:
- 1 thread stuck at:
  ```
  read (libpthread-2.27.so)
  recv_bytes (multiprocessing/connection.py:221)
  get (multiprocessing/queues.py:103)
  ```
- other threads stuck at:
  ```
  do_futex_wait.constprop.1 (libpthread-2.27.so)
  _multiprocessing_SemLock_acquire_impl (semaphore.c:355)
  get (multiprocessing/queues.py:102)
  ```

Writing the queue:
- 1 thread stuck at:
  ```
  write (libpthread-2.27.so)
  send_bytes (multiprocessing/connection.py:205)
  put (multiprocessing/queues.py:377)
  ```
- other threads stuck at:
  ```
  do_futex_wait.constprop.1 (libpthread-2.27.so)
  _multiprocessing_SemLock_acquire_impl (semaphore.c:355)
  put (multiprocessing/queues.py:376)
  ```
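When a hang like this occurs, one way to capture comparable stack traces from inside Python (rather than attaching a native debugger) is the standard library's `faulthandler` module. This is a generic diagnostic sketch, not part of the original report; the `dump_all_thread_stacks` helper is hypothetical:

```python
import faulthandler
import tempfile

# Optionally arm a watchdog: if the process is still running after `timeout`
# seconds, dump every thread's stack to stderr.
# faulthandler.dump_traceback_later(timeout=60, exit=False)


def dump_all_thread_stacks():
    """Return the current stacks of all threads as a string."""
    # faulthandler needs a real file descriptor, so write via a temp file.
    with tempfile.TemporaryFile(mode="w+") as f:
        faulthandler.dump_traceback(file=f, all_threads=True)
        f.seek(0)
        return f.read()


if __name__ == "__main__":
    print(dump_all_thread_stacks())
```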
Example script exhibiting deadlock behavior
```python
#!/usr/bin/env python3
""" Example that hangs with concurrent.futures.ProcessPoolExecutor """
import multiprocessing
import concurrent.futures

# Tweaking parameters
NUM_TASKS = 500000
TASK_DATA = 1
TASK_SIZE = 1
DO_PRINT = True  # Set to false for almost guaranteed hang
START_METHOD = "spawn"  # Does not seem to matter
NUM_PROCESSES = 4  # multiprocessing.cpu_count()


def main():
    print("Starting pool")
    ctx = multiprocessing.get_context(START_METHOD)
    with concurrent.futures.ProcessPoolExecutor(
        max_workers=NUM_PROCESSES, mp_context=ctx
    ) as pool:
        future_results = submit_to_pool(pool)
        print("Collecting results")
        assert False  # Never reached
        collect_results(future_results)


def collect_results(future_results):
    return [r.result() for r in future_results]


def submit_to_pool(pool):
    future_results = []
    for task_idx in range(NUM_TASKS):
        if DO_PRINT and task_idx % 20000 == 0:
            # Too much printing here makes the hang go away!!!
            print("\nsubmit", task_idx)
        task_name = f"task{task_idx}" * TASK_DATA
        future_results.append(pool.submit(task, task_idx, task_name))
    return future_results


def task(task_idx, task_name):
    """ Do some dummy work """
    s = ""
    for i in range(TASK_SIZE):
        s += str(i)
    if DO_PRINT:
        # Too much printing here makes the hang go away!!!
        print(".", end="", flush=True)


if __name__ == "__main__":
    main()
```
Environment
- My environment:
- Ubuntu 18.04.6 LTS (bionic)
- Python 3.10.5
- My colleague's environment:
- Ubuntu 22.04.2 LTS (jammy)
- Either:
- Python 3.10.5
- Python 3.11.0rc1
Details
Detailed stack traces in comments.
Linked PRs
- gh-105829: Fix concurrent.futures.ProcessPoolExecutor deadlock #108513
- [3.11] gh-105829: Fix concurrent.futures.ProcessPoolExecutor deadlock (GH-108513) #109783
- [3.12] gh-105829: Fix concurrent.futures.ProcessPoolExecutor deadlock (GH-108513) #109784
- gh-109917, gh-105829: Fix concurrent.futures _ThreadWakeup.wakeup() #110129