Description
Bug report
Bug description:
Reported by @pablogsal / @godlygeek from memray
Stack trace:
https://gist.github.com/pablogsal/513fa8b0c29cda852ce11c86ce3b1345
We have two threads, the main thread (M) and a daemon thread (D). The main thread starts _Py_Finalize() and performs a global stop-the-world. The daemon thread is disabling profiling and so tries to perform a stop-the-world specific to its interpreter:
Lines 2256 to 2267 in 9745976
static void
stop_the_world(struct _stoptheworld_state *stw)
{
    _PyRuntimeState *runtime = &_PyRuntime;
    PyMutex_Lock(&stw->mutex);
    if (stw->is_global) {
        _PyRWMutex_Lock(&runtime->stoptheworld_mutex);
    }
    else {
        _PyRWMutex_RLock(&runtime->stoptheworld_mutex);
    }
M: _PyEval_StopTheWorldAll():
M: acquires runtime->stoptheworld->mutex
M: acquires RW lock runtime->stoptheworld_mutex in W (exclusive) mode
M: ... waits on threads
D: _PyEval_StopTheWorld(interp):
D: acquires interp->stoptheworld->mutex
D: ... blocks trying to acquire runtime->stoptheworld_mutex in R mode. Later, the daemon thread will hang in _PyThreadState_HangThread() when trying to re-attach its thread state.
M: _PyEval_StopTheWorldAll() finishes, marks the interpreter as finalizing
M: ...
M: calls _PyGC_CollectNoFail() which tries to run _PyEval_StopTheWorld(interp)
M: ... blocks trying to acquire interp->stoptheworld->mutex, which is still held by the daemon thread!
Deadlock!
Summary:
The daemon thread holds interp->stoptheworld->mutex and is hanging because the interpreter is shutting down.
The main thread is trying to perform the shutdown procedure, including calling the GC a few times, which requires interp->stoptheworld->mutex.
Fix???
- Release the previously acquired interp->stoptheworld->mutex when hanging the thread if necessary? Crosses a bunch of abstraction barriers, which is messy and tricky
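To make the interleaving easier to check, here is a toy single-threaded model of the steps above. It is a sketch, not CPython code: the struct, the enum, and the `main_thread_can_run_gc()` helper are all hypothetical names, locks are modelled as plain owner fields, and the `release_on_hang` flag only stands in for the idea of dropping interp->stoptheworld->mutex before the daemon thread hangs. It does not show how that would actually be implemented.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: each lock is just a field recording which thread owns it. */
enum thread_id { NOBODY, MAIN_THREAD, DAEMON_THREAD };

struct model {
    enum thread_id interp_stw_mutex;   /* models interp->stoptheworld->mutex */
    enum thread_id runtime_stw_mutex;  /* models runtime->stoptheworld->mutex */
    enum thread_id rw_writer;          /* W-mode holder of runtime->stoptheworld_mutex */
    bool interp_finalizing;
};

/* Take the lock if it is free; report whether the caller now owns it. */
static bool try_acquire(enum thread_id *lock, enum thread_id who)
{
    if (*lock != NOBODY) {
        return false;
    }
    *lock = who;
    return true;
}

/* Release the lock, but only if the caller is the current owner. */
static void release(enum thread_id *lock, enum thread_id who)
{
    if (*lock == who) {
        *lock = NOBODY;
    }
}

/*
 * Replays the steps from the report. Returns true if the main thread can take
 * the per-interpreter mutex for its GC pass, false if it would block forever.
 * release_on_hang is a hypothetical switch modelling the proposed fix.
 */
bool main_thread_can_run_gc(bool release_on_hang)
{
    struct model m = { NOBODY, NOBODY, NOBODY, false };
    bool ok;

    /* M: _PyEval_StopTheWorldAll() takes the runtime mutex and the RW lock (W). */
    ok = try_acquire(&m.runtime_stw_mutex, MAIN_THREAD);
    assert(ok);
    ok = try_acquire(&m.rw_writer, MAIN_THREAD);
    assert(ok);

    /* D: _PyEval_StopTheWorld(interp) takes the per-interpreter mutex ... */
    ok = try_acquire(&m.interp_stw_mutex, DAEMON_THREAD);
    assert(ok);
    /* ... and then blocks on the RW lock in R mode, since M holds it in W mode. */
    assert(m.rw_writer == MAIN_THREAD);

    /* M: the global stop-the-world finishes; the interpreter is now finalizing. */
    m.interp_finalizing = true;

    /* D: cannot re-attach its thread state while the interpreter is finalizing,
     * so it parks forever (the report names _PyThreadState_HangThread()).
     * Without the fix it keeps the per-interpreter mutex while parked. */
    if (m.interp_finalizing && release_on_hang) {
        release(&m.interp_stw_mutex, DAEMON_THREAD);
    }

    /* M: _PyGC_CollectNoFail() -> _PyEval_StopTheWorld(interp) needs that mutex. */
    return try_acquire(&m.interp_stw_mutex, MAIN_THREAD);
}
```

Calling `main_thread_can_run_gc(false)` replays the reported sequence and returns false, because the parked daemon thread still owns the per-interpreter mutex. Calling it with `true` returns true. The model deliberately ignores everything else a real fix would have to deal with, such as the abstraction barriers mentioned above.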
CPython versions tested on:
CPython main branch
Operating systems tested on:
No response