Reduce resonance between methods in CPU smoke test #1465

Baraa-Hasheesh · 2025-09-08T06:49:35Z

Description

It was observed that it's possible for either method1 or method2 not to be sampled in certain runs
When checking the logs it was noticed that the sample distribution was 50% on method3 & the other 50% either method1 or method2.

This change makes the loop counter for each method unique to avoid possible synchronization between them & method3

Related issues

N/A

Motivation and context

reduce the possibility of coordination of timing between the methods

How has this been tested?

Test was re-run 100s of times on various systems including ones where the failure was observed

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

fandreuz · 2025-09-08T10:16:17Z

test/test/smoke/Cpu.java


    private static void method1() {
-        for (int i = 0; i < 1000000; i++) {
+        for (int i = 0; i < 314159; i++) {


Why does this number improve the likelihood of getting samples in all methods?

If there's a resonance problem, how about we let all three run in parallel?

it reduces possible resonance hence reducing the possibility of all 3 methods synchronizing hence increasing the probability of sampling all

As discussed with @apangin internally that would defeat the intention of the test, which is that hot methods should be sampled even if the duration of each call is less than the sampling duration

Running all 3 in parallel would make them in practice have higher run time than the sampling duration & completely change the nature of the test from a single threaded app to a multi threaded one

Could you explain how you choose the numbers?

Discussed in person, thanks

apangin · 2025-09-08T10:32:36Z

What about method3? Should it's duration also be changed? (maybe 3 or 7 milliseconds)

Baraa-Hasheesh · 2025-09-08T10:37:29Z

What about method3? Should it's duration also be changed? (maybe 3 or 7 milliseconds)

Sure let me re-run the tests with the new duration to make sure everything is working as expected

Baraa-Hasheesh · 2025-09-08T11:18:53Z

What about method3? Should it's duration also be changed? (maybe 3 or 7 milliseconds)

@apangin neither values of 7 & 3 are stable with the current iteration counts, I would recommend keeping it as 10

apangin · 2025-09-08T13:42:06Z

What does it mean it's not stable? And how does 10ms duration make it "more stable", given that it correlates with the sampling interval?

Baraa-Hasheesh · 2025-09-08T14:03:34Z

What does it mean it's not stable? And how does 10ms duration make it "more stable", given that it correlates with the sampling interval?

@apangin I test the stability of a given option by simply re-running the same test 100s of times, then measure how likely it's to fail

In the case of using 7,3ms the failure rate was quite high,
Both never managed to go a full 100 runs without failing

As for why 10ms is more stable, it would require an actual full analysis of how the CPU executes the java byte code & create some models to predict when samples would happen, & I honestly didn't see the value of going through that given that the current numbers were observed to be quite stable on multiple systems

krk · 2025-09-08T14:25:01Z

Are these values tuned to a specific CPU type/size?

Baraa-Hasheesh · 2025-09-08T14:27:35Z

Are these values tuned to a specific CPU type/size?

I tried to run this changes across multiple CPU types & sizes & didn't observe any instabilities on the configuration I ran against

apangin · 2025-09-09T20:28:13Z

neither values of 7 & 3 are stable with the current iteration counts, I would recommend keeping it as 10

This is contrary to the declared intention of this PR, which is supposed to reduce correlation with the sampling interval. You are saying that the test is more stable when method3 runs for 10ms, which is exactly the sampling interval. This sounds counterintuitive.

I don't mind skipping time consuming analysis if test behavior matches expectation, but here I see the opposite. I've always said that the purpose of a test is not to "pass", but to validate assumptions. My assumption here is, if a test runs 3 methods for a comparable amount of time each, all 3 methods should be visible in a profile. Where am I wrong? What makes you sure that it's a test issue rather than a profiler bug or a JVM bug?

Baraa-Hasheesh · 2025-09-10T07:48:44Z

This is contrary to the declared intention of this PR, which is supposed to reduce correlation with the sampling interval

The intention here is reduce possible synchronization of all 3 methods together as a single unit with the sampling interval.
Not to approach each method individually

I would describe as reducing the chance that all 3 methods come to an agreement to actively prevent one of them from being sampled from the profiler by coordinating their run time.

My assumption here is, if a test runs 3 methods for a comparable amount of time each, all 3 methods should be visible in a profile

This is a fair assumption but kinda goes against what the test actually runs, as the methods do not strictly run for comparable time to one another in a controlled manner,

Method1 & 2 can take different times to complete depending on the system they run on (CPU resources, JDK version & optimizations)
While method3 is written with a strict minimum run time (10ms Currently)

For example let's assume we change method3 to run for 3ms & method2/1 have a combined run time of 7ms on a random system. (total run time will be 3 + 7 = 10ms = sampling duration)
We would end-up with all 3 methods (as a single unit) having a synchronization against the 10ms sampling time which might cause sampling issues.

What makes you sure that it's a test issue rather than a profiler bug or a JVM bug

If it's a JVM bug it would have been viewable in the generated byte code, (For example one method is completely optimized out)
As for If it's a AGCT bug, that's quite unlikely.

Technically speaking it's a limitation of the profiler due to the realities of how sampling is collected.
=> The profiler collects the samples via a time based approach (for CPU samples), so it's possible for that approach to produce some biased sampling results for strictly periodic/consistent work loads as we do here in this test.

While this is is indeed a limitation of the profiler, I wouldn't personally consider it a bug per say

Now, regarding this test,
It does expose a real limitation of the profiler when profiling strictly periodic/consistent work loads
This issue is indeed true & can happen on all profiling engines/modes of the profiler
=> For example, nativemem tracing with interval of 10000bytes, & having two methods back to back running in a loop with methodA allocating 9999bytes & methodB allocating 1 byte (MethodA will never be sampled)

If this is to be considered a bug/limitation that needs to be fixed, that would need to be discussed in depth & don't believe that this PR is the proper place to handle that discussion.

For this test in particular to reduce the chance of failures that may happen due to possible biased sampling of periodic/consistent work loads we should re-write this test to no longer be strictly periodic/consistent,
In other words introduce some randomness to shift the run time for each method at each iteration

IMO that would make it a better smoke test for async-profiler, as most real applications don't have strictly periodic/consistent loads,
& would reduce the possibility of time bias for the whole work load as a single unit.

What are you thoughts about this?

Signed-off-by: Bara' Hasheesh <bara.hasheesh@gmail.com>

Reduce resonance between methods in CPU smoke test

bfd7d40

fandreuz reviewed Sep 8, 2025

View reviewed changes

Baraa-Hasheesh marked this pull request as draft September 10, 2025 16:10

nanotime

3fb9e8c

Baraa-Hasheesh marked this pull request as ready for review September 17, 2025 16:19

compare non float

b6f0905

Signed-off-by: Bara' Hasheesh <bara.hasheesh@gmail.com>

apangin merged commit 07b3e74 into async-profiler:master Sep 17, 2025
20 checks passed

krk pushed a commit to krk/async-profiler that referenced this pull request Sep 22, 2025

Eliminate period bias in CPU smoke test (async-profiler#1465)

361af18

apangin mentioned this pull request Oct 2, 2025

Consider replacing currentTimeMillis with nanoTime #1495

Closed

Reduce resonance between methods in CPU smoke test #1465

Reduce resonance between methods in CPU smoke test #1465

Uh oh!

Conversation

Baraa-Hasheesh commented Sep 8, 2025

Description

Related issues

Motivation and context

How has this been tested?

Uh oh!

fandreuz Sep 8, 2025

Choose a reason for hiding this comment

Uh oh!

Baraa-Hasheesh Sep 8, 2025

Choose a reason for hiding this comment

Uh oh!

fandreuz Sep 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

fandreuz Sep 8, 2025

Choose a reason for hiding this comment

Uh oh!

apangin commented Sep 8, 2025

Uh oh!

Baraa-Hasheesh commented Sep 8, 2025

Uh oh!

Baraa-Hasheesh commented Sep 8, 2025

Uh oh!

apangin commented Sep 8, 2025

Uh oh!

Baraa-Hasheesh commented Sep 8, 2025

Uh oh!

krk commented Sep 8, 2025

Uh oh!

Baraa-Hasheesh commented Sep 8, 2025

Uh oh!

apangin commented Sep 9, 2025

Uh oh!

Baraa-Hasheesh commented Sep 10, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

fandreuz Sep 8, 2025 •

edited

Loading