KEMBAR78
Special handling of prologue and epilogue of compiled methods with cstack=vm by apangin · Pull Request #1449 · async-profiler/async-profiler · GitHub
Skip to content

Conversation

apangin
Copy link
Member

@apangin apangin commented Aug 22, 2025

Description

Add special cases in VMStructs-based stack walker to unwind partially constructed top frame in the method prologue or epilogue.

This PR focues on x86-64 and AArch64 only.

Related issues

Resolves #1444, #1441

Motivation and context

This is an important milestone towards making cstack=vm the default unwinding mode.

In HotSpot JVM, the majority of JIT compiled methods have common prologue and epilogue. Prologue includes stack overflow check, frame setup and nmethod entry barrier (JDK 21+). Epilogue pops frame off the stack, performs safepoint poll or stack watermark check and returns from the method. Prologue and epilogue are usually most problematic for stack unwinding, since they deal with transitional stack states where the topmost frame is in the middle of construction or destruction. This PR aims to detect when a sample falls in the method prologue or epilogue and, depending on the currently executed instruction, unwind the top frame accordingly.

How has this been tested?

  • Added --cstack vm option to all "recovery" tests.
  • Manually ran spring-petclinic and watched the number of "unknown" and "break_compiled" frames.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@apangin apangin force-pushed the prologue-epilogue branch from 4361a02 to ca3c501 Compare August 23, 2025 13:13
instruction_t* ip = (instruction_t*)pc;
instruction_t* entry = (instruction_t*)nm->entry();
if (ip <= entry) {
pc = link();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been wondering about this as I've seen link() recovery in many locations,
Shouldn't we add a restriction to only allow this recovery strategy on the top frame (depth == 0)?

It's not a blocker for this as this is already done in many other locations, I'm just wondering if we should add this additional restriction in future changes

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a fair point - link register is only used to unwind the topmost frame. But the corresponding instruction sequence never occurs in the intermediary frames.

// cmp sp, x8, followed by ret
return true;
} else if ((ip[0] & 0xff000010) == 0x54000000 && ip[1] == 0xd65f03c0) {
// b.cond, followed by ret
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: in the comment on JDK 17+ is written as b.hi offset but here it's b.cond

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. But there is no contradiction: the expression here checks for any conditional branch, not only b.hi.

} else if ((ip[0] & 0x9f00001f) == 0x90000008 && (ip[-1] & 0xff8003ff) == 0x910003ff) {
// adrp x8, preceded by add sp
return true;
} else if (ip[0] == 0xeb2863ff && ip[2] == 0xd65f03c0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we also check for the b.cond at ip[1]?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking ret is enough: these two instructions unambiguously identify the epilogue sequence. Checking ip[1] would be also fine, but I prefer to keep the number of comparisons minimal.

out = p.profile("-d 2 -e cpu -i 1ms --cstack vm -o collapsed");
Assert.isGreater(out.ratio("Numbers.main;test/recovery/Numbers.loop"), 0.8);
Assert.isGreater(out.ratio("Numbers.main;test/recovery/Numbers.loop;test/recovery/Numbers.avg"), 0.5);
Assert.isLess(out.ratio("unknown|break_compiled"), 0.002);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you think about extracting this unknown|break_compiled to a constant & add other errors to it such as

  • break_entry_frame
  • break_interpreted
  • break_deopt

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR only improves unwinding of JIT compiled frames, it does not touch interpreted or entry frames. Some interpreter edge cases are not covered yet - they will be addressed separately.

As to the constant, I think it won't make code more maintainable here.

@apangin apangin merged commit 5454c9b into master Aug 28, 2025
43 checks passed
@apangin apangin deleted the prologue-epilogue branch August 28, 2025 08:43
jbachorik pushed a commit to DataDog/async-profiler that referenced this pull request Sep 4, 2025
jbachorik pushed a commit to DataDog/async-profiler that referenced this pull request Sep 4, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

cstack=vm should collect stacktraces at method epilogue

2 participants