Description
Describe the bug
When running async-profiler from the CLI, asprof hangs and the VM eventually becomes unresponsive. This happens when attaching with the following command:
$ ./bin/asprof collect -d 10s <pid>
asprof never returns and we never see the "profiling started" message; the command just hangs indefinitely. This happens with 3.0 and all the latest nightly builds.
This may be circumstantial, but in some instances running the load command before running a collect allows us to get samples. However, we get the following error when attempting to load the profiler:
$ ./bin/asprof load <pid>
Connected to remote JVM
JVM response code = -1
was not loaded.
lib.so: cannot open shared object file: No such file or directory
Again, I'm not sure if this is luck or an actual workaround, but it has worked a handful of times.
In some cases, even when the VM becomes unresponsive, I do see that libasyncProfiler.so is mapped into the program's address space. However, any command run from the asprof binary simply hangs.
7fc4ef787000-7fc4ef78d000 r--p 00000000 00:1d 43533 /tmp/async-profiler-3.0-e6e0494-linux-x64/lib/libasyncProfiler.so
7fc4ef78d000-7fc4ef7dd000 r-xp 00006000 00:1d 43533 /tmp/async-profiler-3.0-e6e0494-linux-x64/lib/libasyncProfiler.so
7fc4ef7dd000-7fc4ef7f1000 r--p 00056000 00:1d 43533 /tmp/async-profiler-3.0-e6e0494-linux-x64/lib/libasyncProfiler.so
7fc4ef7f1000-7fc4ef7f2000 ---p 0006a000 00:1d 43533 /tmp/async-profiler-3.0-e6e0494-linux-x64/lib/libasyncProfiler.so
7fc4ef7f2000-7fc4ef7f4000 r--p 0006a000 00:1d 43533 /tmp/async-profiler-3.0-e6e0494-linux-x64/lib/libasyncProfiler.so
7fc4ef7f4000-7fc4ef7f5000 rw-p 0006c000 00:1d 43533 /tmp/async-profiler-3.0-e6e0494-linux-x64/lib/libasyncProfiler.so
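For reference, mappings like these can be listed from /proc/<pid>/maps. This is a minimal sketch, assuming a Linux /proc filesystem. The function name show_profiler_maps and its optional second argument are illustrative only and are not part of asprof.

```shell
# Hypothetical helper (not part of async-profiler): print the memory
# mappings of libasyncProfiler.so for a process.
#   $1 = pid of the target JVM
#   $2 = optional maps file to read instead of /proc/<pid>/maps
show_profiler_maps() {
    maps_file="${2:-/proc/$1/maps}"
    # -F matches the library name as a fixed string, not as a regex
    grep -F 'libasyncProfiler.so' "$maps_file"
}
```

Calling show_profiler_maps <pid> prints nothing and returns a non-zero status when the library is not mapped into the process.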
A note on the setup: our JVM command is prefixed with a custom ld.so, used to locate native libs that are deployed for JNI usage and are not part of the program's jar.
/path/to/ld.so /path/to/java <args>
Another note is that we only see this on certain deployments. Some services can be profiled successfully, whereas a handful of others experience this issue. We see it across VMs, both GraalVM and OpenJDK, and across versions 11, 17 and 21. The one I'm focusing on in this bug report is a vanilla Java 17 Temurin VM.
If you have suggestions, I can provide additional debug output. Thanks!
Expected vs. actual behavior
I expect asprof to attach and run a profile successfully. Instead, it hangs as described above.
Reproduction Steps
Simply running a profile against an affected service triggers it:
./bin/asprof collect -d 10s <pid>
Additional Information/Context
No response
Async-profiler version
async-profiler-3.0-e6e0494-linux-x64
Environment details
Linux 6.9.0 x86_64 GNU/Linux
openjdk 17.0.4 2022-07-19
OpenJDK Runtime Environment Temurin-17.0.4+8 (build 17.0.4+8)
OpenJDK 64-Bit Server VM Temurin-17.0.4+8 (build 17.0.4+8, mixed mode, sharing)