Description
Describe the bug
Profiles on Intel and AMD show full stack traces leading down to the md5_implCompressMB function; profiles on Arm show the md5_implCompressMB function directly under "all", with no other stack frames.
This is with cstack=vm, which I now use as the default after the resolution of #1325.
This happens on a variety of benchmarks; I'll pick a specific one for the example below.
Expected vs. actual behavior
I expect a stack trace like this one I got on AMD:
Instead I get one like this on Arm Graviton 4:
Reproduction Steps
Run the DaCapo luindex benchmark (large size). See #1325 for details on how to download the benchmark and provision EC2 instances.
Profile the benchmark using this command:
java -agentpath:async-profiler-4.0-f627b31-linux-arm64/lib/libasyncProfiler.so=start,cstack=vm,event=cpu,wall,file=profile.jfr -jar dacapo-23.11-MR2-chopin.jar luindex --size large -n 1
Convert JFR to flame graph with this command:
java -jar jfr-converter.jar --inverted --state default -o html profile.jfr profile.html
Observe the missing stack frames :)
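To put a number on the missing frames instead of eyeballing the flame graph, something like the following sketch might help. It assumes the same JFR file has also been converted to collapsed-stack text (I believe the converter accepts `-o collapsed`, but treat that as an assumption), where each line is `frame1;frame2;...;leaf count`. The function name `count_single_frame_stacks` and the file name `profile.collapsed` are made up for illustration and are not part of async-profiler.

```python
import sys
from collections import Counter


def count_single_frame_stacks(lines):
    """Tally samples whose stack consists of exactly one frame.

    Expects collapsed-stack lines of the form "a;b;c 123".
    Returns (Counter mapping the lone frame to its sample count, total samples).
    """
    single = Counter()
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # The sample count is the last space-separated token on the line.
        stack, _, count = line.rpartition(" ")
        if not stack or not count.isdigit():
            continue  # skip lines that do not look like "stack count"
        samples = int(count)
        total += samples
        if ";" not in stack:
            # No separator means the leaf frame sits directly under the root.
            single[stack] += samples
    return single, total


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python3 count_truncated.py profile.collapsed
    with open(sys.argv[1], encoding="utf-8") as f:
        single, total = count_single_frame_stacks(f)
    for frame, samples in single.most_common(10):
        share = 100.0 * samples / total if total else 0.0
        print(f"{samples:>10} {share:6.2f}%  {frame}")
```

If the Arm profile reports a large share of samples for md5_implCompressMB as a single-frame stack while the AMD profile reports close to none, that would match what the two flame graphs show.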
Additional Information/Context
JFR files collected with async-profiler for AMD (m7a) and Arm (m8g):
dacapo-luindex-large-m7a.metal-48xl-profile.jfr.gz
dacapo-luindex-large-m8g.metal-24xl-profile.jfr.gz
Async-profiler version
async-profiler-4.0-f627b31-linux-arm64
Environment details
I always use nightly async-profiler, because the last stable release still crashes on Arm (see #1319). My scripts use the GitHub API to pull the most recent nightly at the time of the run.
OS: Ubuntu 24.04, using Amazon's standard image at the path /aws/service/canonical/ubuntu/server/24.04/stable/current/
JDK: OpenJDK24U-jdk_x64_linux_hotspot_24.0.1_9
CPU: m7a.metal-48xl (AMD EPYC 9R14), m8g.metal-24xl (Arm Neoverse-V2 Graviton 4)
The application is running on AWS EC2 metal instances, not in a container.