Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

The main thread hang at drop_gil due to FORCE_SWITCHING after daemon thread exited #96387

Closed
Labels
interpreter-core(Objects, Python, Grammar, and Parser dirs)type-bugAn unexpected behavior, bug, or error
@milizao

Description

@milizao

Bug report

Main thread hang atdrop_gil after daemon thread exited, my Python version is 3.9.2, but I think the same issue is exists in the Python 3.10 newest version.

(gdb) info threads  Id   Target Id                                   Frame * 1    Thread 0xffff909dd010 (LWP 226769) "python" 0x0000ffff909355c0 in futex_wait_cancelable (private=<optimized out>, expected=0,     futex_word=0xaaaadf99e8d4 <_PyRuntime+524>) at ../sysdeps/unix/sysv/linux/futex-internal.h:88(gdb) bt#0  0x0000ffff909355c0 in futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0xaaaadf99e8d4 <_PyRuntime+524>)    at ../sysdeps/unix/sysv/linux/futex-internal.h:88#1  __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0xaaaadf99e8d8 <_PyRuntime+528>, cond=0xaaaadf99e8a8 <_PyRuntime+480>) at pthread_cond_wait.c:508#2  __pthread_cond_wait (cond=cond@entry=0xaaaadf99e8a8 <_PyRuntime+480>, mutex=mutex@entry=0xaaaadf99e8d8 <_PyRuntime+528>) at pthread_cond_wait.c:638#3  0x0000aaaadf35d184 in drop_gil (ceval=0xaaaadf99e820 <_PyRuntime+344>, ceval@entry=0xaaaadf300274 <PyObject_Hash+116>, ceval2=ceval2@entry=0xaaaafbc61ff0,     tstate=tstate@entry=0xaaaafbc64a00) at Python/ceval_gil.h:178#4  0x0000aaaadf2c05c4 in eval_frame_handle_pending (tstate=0xaaaafbc64a00) at Python/ceval.c:882#5  _PyEval_EvalFrameDefault (tstate=<optimized out>, f=0xffff83929d40, throwflag=<optimized out>) at Python/ceval.c:1402#6  0x0000aaaadf2bee1c in _PyEval_EvalFrame (throwflag=0, f=0xffff83929d40, tstate=0xaaaafbc64a00) at ./Include/internal/pycore_ceval.h:40#7  function_code_fastcall (tstate=0xaaaafbc64a00, co=<optimized out>, args=<optimized out>, nargs=1, globals=<optimized out>) at Objects/call.c:330#8  0x0000aaaadf31ac04 in _PyObject_VectorcallTstate (kwnames=<optimized out>, nargsf=<optimized out>, args=<optimized out>, callable=<optimized out>,     tstate=<optimized out>) at ./Include/cpython/abstract.h:118#9  _PyObject_CallNoArg (func=<optimized out>) at ./Include/cpython/abstract.h:172#10 call_unbound_noarg (self=<optimized out>, func=0xffff83691ee0, unbound=<optimized out>) at Objects/typeobject.c:1515#11 slot_tp_finalize (self=<optimized out>) at Objects/typeobject.c:6992#12 0x0000aaaadf30011c in PyObject_CallFinalizer (self=self@entry=0xffff837ea6d0) at Objects/object.c:195#13 0x0000aaaadf302694 in PyObject_CallFinalizerFromDealloc (self=self@entry=0xffff837ea6d0) at Objects/object.c:213#14 0x0000aaaadf314b1c in subtype_dealloc (self=0xffff837ea6d0) at Objects/typeobject.c:1265#15 0x0000aaaadf2f3f4c in _Py_DECREF (op=<optimized out>) at ./Include/object.h:430#16 _Py_XDECREF (op=<optimized out>) at ./Include/object.h:497#17 insertdict (mp=mp@entry=0xffff8383de80, key=0xffff837450b0, hash=-5519671433175608536, value=value@entry=0xaaaadf94be68 <_Py_NoneStruct>)    at Objects/dictobject.c:1123#18 0x0000aaaadf2f52d4 in PyDict_SetItem (op=op@entry=0xffff8383de80, key=<optimized out>, value=value@entry=0xaaaadf94be68 <_Py_NoneStruct>)    at Objects/dictobject.c:1573#19 0x0000aaaadf2ffa48 in _PyModule_ClearDict (d=0xffff8383de80) at Objects/moduleobject.c:612#20 0x0000aaaadf2ffc84 in _PyModule_Clear (m=m@entry=0xffff837f1a40) at Objects/moduleobject.c:560#21 0x0000aaaadf3798e8 in _PyImport_Cleanup (tstate=tstate@entry=0xaaaafbc64a00) at Python/import.c:606#22 0x0000aaaadf38e63c in Py_FinalizeEx () at Python/pylifecycle.c:1426#23 0x0000aaaadf2c9110 in Py_RunMain () at Modules/main.c:679#24 0x0000aaaadf2c99f8 in pymain_main (args=args@entry=0xffffd3754d88) at Modules/main.c:707#25 0x0000aaaadf2c9ad4 in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at Modules/main.c:731#26 0x0000ffff906c1f80 in __libc_start_main (main=0x0, argc=0, argv=0x0, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,     stack_end=<optimized out>) at ../csu/libc-start.c:308#27 0x0000aaaadf2c87bc in _start ()Backtrace stopped: previous frame identical to this frame (corrupt stack?)(gdb) py-btTraceback (most recent call first):  File "/home/ossadm/zhao/runtime.py", line 8, in __del__    def __del__(self):(gdb)

Rough analysis

In Python version 3.9.2, thetake_gil is implemented as following:

staticvoidtake_gil(PyThreadState*tstate){...while (_Py_atomic_load_relaxed(&gil->locked)) {unsigned longsaved_switchnum=gil->switch_number;unsigned longinterval= (gil->interval >=1 ?gil->interval :1);inttimed_out=0;COND_TIMED_WAIT(gil->cond,gil->mutex,interval,timed_out);/* If we timed out and no switch occurred in the meantime, it is time           to ask the GIL-holding thread to drop it. */if (timed_out&&_Py_atomic_load_relaxed(&gil->locked)&&gil->switch_number==saved_switchnum)        {if (tstate_must_exit(tstate)) {MUTEX_UNLOCK(gil->mutex);PyThread_exit_thread();// [2]            }assert(is_tstate_valid(tstate));SET_GIL_DROP_REQUEST(interp);// [1]        }    }...}

A daemon thread may callSET_GIL_DROP_REQUEST and then exits fromPyThread_exit_thread, so this thread will not get GIL really. This happens if the main thread is on the finalizing procedure and it calls_PyRuntimeState_SetFinalizing inPy_FinalizeEx before the daemon thread gets GIL (The daemon thread wait 5ms at most before exit).

After that, the flaggil_drop_request is set, but only the main thread is left. Then, if the main thread runs intodrop_gil again, it will hang atFORCE_SWITCHING:

COND_WAIT(gil->switch_cond,gil->switch_mutex);

Reproducible example

This issue could be reproduced by:

$python-mhttp.server5678&$for ((i=0;;i++));doecho $i;python-Wallservice.py;done

And the scripts used are:

runtime.py

importsocketclassTestUnclosedSocket:def__init__(self):self.s=socket.socket()def__del__(self):self.s.connect(('127.0.0.1',5678))ins=TestUnclosedSocket()

service.py

importthreadingimportruntimeimportsyssys.setswitchinterval(0.001)defcalc():sum=0foriinrange(100000):sum+=i*idefdaemon_func():whileTrue:calc()if__name__=='__main__':threading.Thread(target=daemon_func,name="daemon",daemon=True).start()calc()

I change GIL switch interval to 1ms to increase the probability here.

Metadata

Metadata

Assignees

No one assigned

    Labels

    interpreter-core(Objects, Python, Grammar, and Parser dirs)type-bugAn unexpected behavior, bug, or error

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions


      [8]ページ先頭

      ©2009-2025 Movatter.jp