KEMBAR78
Add --nostop to always start profiler recording even when --ttsp or --start/--end are used by openorclose · Pull Request #1046 · async-profiler/async-profiler · GitHub
Skip to content

Conversation

openorclose
Copy link
Contributor

@openorclose openorclose commented Nov 5, 2024

Description

  • Added --nostop to always start profiler recording even when --ttsp or --start/--end are used

Related issues

#833

Motivation and context

In particular, this allows us to continue recording events in --ttsp mode

How has this been tested?

Added a test for the original ttsp mode and also the newly added flag.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@openorclose openorclose closed this Nov 5, 2024
@openorclose openorclose deleted the ttsp branch November 5, 2024 15:45
@openorclose openorclose restored the ttsp branch November 5, 2024 15:46
@openorclose openorclose reopened this Nov 5, 2024
_mcache = value == NULL ? 1 : (unsigned char)strtol(value, NULL, 0);

CASE("recordonly")
_start_alongside_traps = true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's hard to search and follow the logic, when asprof argument, agent argument and the variable in the code are all named differently. Let's use the same name if possible.
My suggestion is nostop (similarly to nostop option for handling signals in gdb), but other suggestions are welcome.
Consequently, arguments and the variable name will be nostop, --nostop, _nostop.

Comment on lines 80 to 82
" --timebetween a b record when a is executed, and then when b is executed, and duration between\n"
" --onlyttsp only time-to-safepoint profiling \n"
" --includettsp profile as per normal, but include time-to-safepoint information\n"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand you wanted to shortcut certain use cases, but having many options to do the same thing causes unnecessary ambiguity or even confusion, considering that asprof arguments do not match agent arguments.
If we add just one new option (e.g. --nostop as mentioned above), this will be unambiguous, will match agent argument, and will cover all use cases relatively simply. Moreover, it will remain backward compatible.
IMO, --ttsp --nostop is not much worse than --includettsp.

I would probably agree with additional --window A B option to record profiler.Window event between execution of A and B, but this is a quite rare use cases, so it does not deserve a separate option, I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will use nostop for now, anyone else feel free to suggest others

params << ",begin=SafepointSynchronize::begin,end=RuntimeService::record_safepoint_synchronized";

} else if (arg == "--includettsp") {
fprintf(stderr, "ZZZZ");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debug leftover.

public void includettsp(TestProcess p) throws Exception {
p.profile("-d 5 --includettsp -f %f.jfr");
if (!containsSamplesOutsideWindow(p)) {
throw new RuntimeException("Expected to find samples outside of ttsp window, but did not find any");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

}
}

private boolean containsSamplesOutsideWindow(TestProcess p) throws Exception {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be shortened and simplified a bit.

  1. events list is redundant; we may collect profiler windows and execution samples in the first pass.
  2. Profiler windows can be stored in a TreeMap as startTime->endTime. This will provide sorting and lookup for the closest window out of the box.
  3. Search loop can be simplified with stream().anyMatch().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!

import java.io.BufferedOutputStream;
import java.io.FileOutputStream;

class Ttsp {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This example needs some comments. How does it expose long TTSP issue? Why getAllStackTraces() is called in a loop? Writing 1GB files in a loop may not be a good thing to do in a test - IO is not what we are testing here. Furthermore, not all systems may even have a spare gigabyte in /tmp volume.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I adapted this example from an online article regarding debugging ttsp issues, will rewrite with explanation

@apangin
Copy link
Member

apangin commented Nov 12, 2024

Thank you for the PR.
The feature is indeed useful, but I think it can be implemented in a simpler way - added some comments inline.

@openorclose openorclose changed the title Allow recording windows between a start and stop function Add --nostop to always start profiler recording even when --ttsp or --start/--end are used Dec 6, 2024
apangin and others added 2 commits December 9, 2024 00:20
Signed-off-by: Andrei Pangin <1749416+apangin@users.noreply.github.com>
static private byte loop() {
byte[] byteArray = new byte[1024 * 1024 * 1024];
for (int i = 0; i < 10000; i++) {
new Random().nextBytes(byteArray);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Random.nextBytes contains a loop with a safepoint check - it does not cause long time-to-safepoint pauses.

public static void main(String[] args) throws Exception {
new Thread(() -> {
while (true) {
System.gc();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ThreadMXBean.dumpAllThreads is a much cheaper way to force global safepoint.

}
}).start();

Thread.sleep(1000);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Relying on Thread.sleep in tests in error-prone.

new Thread(() -> {
while (true) {
loop();
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A loop in this example is not very useful. The only long ttsp pause happens the very first time when allocating new byte[1024 * 1024 * 1024] array; there are no ttsp pauses afterwards.

@apangin
Copy link
Member

apangin commented Dec 9, 2024

Changes look good, except for the test, which does not really demonstrate time-to-safepoint pauses.
I'll merge this PR and push a revised test afterwards.

@apangin apangin merged commit d7ab642 into async-profiler:master Dec 9, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants