Description
Describe the bug
There appears to be a deadlock that reproduces reliably when tcmalloc (i.e. loaded via LD_PRELOAD) and async-profiler (AP) are used together.
AP side
The profiler calls glibc’s dl_iterate_phdr(), which takes a lock on the glibc side to ensure that no library is dlclose()'d while its symbols are being iterated. That is, the profiler:
- locks dl_load_write_lock
- performs some malloc() calls in its callback, which may grow TCMalloc’s local cache and then reserve memory from the OS, locking tcmalloc::Static::pageheap_lock_
TCMalloc side
When TCMalloc actually reserves memory from the OS:
- it locks tcmalloc::Static::pageheap_lock_ to perform the memory allocation
- when this happens, it records the current stack trace for the heap profiler
- the heap profiler then fetches the stack trace, calling glibc’s ELF symbol lookups and locking dl_load_write_lock (this can perform some mallocs, but they do not re-trigger the heap profiler thanks to a recursive-call check, so tcmalloc cannot deadlock itself)

The two sides therefore acquire the same two locks in opposite orders, so each thread can end up waiting for the lock the other one holds.
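The lock ordering described for the AP side and the TCMalloc side can be illustrated with a minimal sketch. It uses two plain std::mutex objects as hypothetical stand-ins for dl_load_write_lock and tcmalloc::Static::pageheap_lock_ (none of the names below come from glibc, tcmalloc, or async-profiler), and it uses try_lock() for the second acquisition so that the program reports the inversion instead of hanging:

```cpp
#include <atomic>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical stand-ins for the two real locks; these are plain mutexes,
// not the glibc or tcmalloc objects themselves.
static std::mutex loader_lock;    // plays the role of dl_load_write_lock
static std::mutex pageheap_lock;  // plays the role of tcmalloc::Static::pageheap_lock_

struct InversionResult {
    bool profiler_blocked;   // AP-like thread could not take pageheap_lock
    bool allocator_blocked;  // tcmalloc-like thread could not take loader_lock
};

// Spin until `counter` reaches `target`; used as a tiny two-thread barrier.
static void wait_for(const std::atomic<int>& counter, int target) {
    while (counter.load(std::memory_order_acquire) < target) {
        std::this_thread::yield();
    }
}

InversionResult simulate_lock_inversion() {
    std::atomic<int> holding_first{0};
    std::atomic<int> tried_second{0};
    bool profiler_got_second = false;
    bool allocator_got_second = false;

    auto worker = [&](std::mutex& first, std::mutex& second, bool& got_second) {
        std::lock_guard<std::mutex> hold(first);
        holding_first.fetch_add(1, std::memory_order_release);
        wait_for(holding_first, 2);  // both threads now hold their first lock

        // A blocking lock() here would never return; try_lock() only reports it.
        got_second = second.try_lock();
        if (got_second) second.unlock();

        tried_second.fetch_add(1, std::memory_order_release);
        wait_for(tried_second, 2);  // keep `first` held until both have tried
    };

    // AP-like path: loader lock first, then the page heap lock.
    std::thread profiler(worker, std::ref(loader_lock), std::ref(pageheap_lock),
                         std::ref(profiler_got_second));
    // tcmalloc-like path: page heap lock first, then the loader lock.
    std::thread allocator(worker, std::ref(pageheap_lock), std::ref(loader_lock),
                          std::ref(allocator_got_second));
    profiler.join();
    allocator.join();
    return {!profiler_got_second, !allocator_got_second};
}
```

In this sketch both fields of the result are true: each thread finds its second lock already held by the other. With blocking lock() calls in place of try_lock(), that is the point at which both threads would wait forever.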
Reproduction Steps
Running with tcmalloc loaded via LD_PRELOAD on x86_64, specifically so that tcmalloc relies on the libunwind-based stack trace mechanism involved in the deadlock, should reproduce the hang at a fairly high rate. See https://github.com/gperftools/gperftools/wiki/gperftools'-stacktrace-capturing-methods-and-their-issues
Async-profiler version
Latest, and in theory any version since the dl_iterate locks were introduced
Environment details
x86_64 machine, libunwind-based stack trace mechanism for tcmalloc, tcmalloc loaded via LD_PRELOAD
Thanks to @trazfr for identifying and root-causing this internally. For our Datadog fork of AP, we are looking at writing a custom allocator for use within the callback function and eliminating the explicit malloc calls used by the CodeCache.
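As a rough illustration of the "no malloc inside the callback" idea, here is a minimal sketch of a dl_iterate_phdr() callback that only copies data into a caller-provided, fixed-capacity table. This is not the Datadog fork's or async-profiler's implementation: LibraryTable, record_library, snapshot_loaded_libraries, and the capacity constants are hypothetical names chosen for the example, and a real fix would also need to cover the CodeCache allocations mentioned above.

```cpp
#include <link.h>  // dl_iterate_phdr, struct dl_phdr_info (glibc/Linux)

#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical fixed-capacity table; the sizes are arbitrary illustration values.
constexpr std::size_t kMaxLibraries = 256;
constexpr std::size_t kMaxNameLength = 256;

struct LibraryEntry {
    std::uintptr_t base_address;
    char name[kMaxNameLength];
};

struct LibraryTable {
    LibraryEntry entries[kMaxLibraries];
    std::size_t count;
    bool truncated;  // set when more libraries exist than the table can hold
};

// Callback run by dl_iterate_phdr() while the loader lock is held.
// It only writes into preallocated storage and never calls malloc().
static int record_library(struct dl_phdr_info* info, std::size_t, void* data) {
    auto* table = static_cast<LibraryTable*>(data);
    if (table->count == kMaxLibraries) {
        table->truncated = true;
        return 1;  // a non-zero return value stops the iteration
    }
    LibraryEntry& entry = table->entries[table->count++];
    entry.base_address = static_cast<std::uintptr_t>(info->dlpi_addr);
    const char* name = info->dlpi_name ? info->dlpi_name : "";
    std::strncpy(entry.name, name, kMaxNameLength - 1);
    entry.name[kMaxNameLength - 1] = '\0';
    return 0;
}

// Fills `table` with the currently loaded objects and returns how many were recorded.
std::size_t snapshot_loaded_libraries(LibraryTable& table) {
    table.count = 0;
    table.truncated = false;
    dl_iterate_phdr(record_library, &table);
    return table.count;
}
```

The table would typically be allocated once up front (for example as a static object), and any work that needs the heap, such as parsing symbols, would run after dl_iterate_phdr() has returned and the loader lock is no longer held. This only removes the dl_load_write_lock to pageheap_lock_ edge on the profiler side; it does not change tcmalloc's behaviour.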