KEMBAR78
Add custom stacktrace renderer which is length limit aware by jack-berg · Pull Request #7281 · open-telemetry/opentelemetry-java · GitHub
Skip to content

Conversation

@jack-berg
Copy link
Member

@jack-berg jack-berg commented Apr 16, 2025

Replace Throwable#printStackTrace(PrintStream) exception stacktrace rendering with a custom implementation which is aware of attribute length limits, and exits early to avoid unnecessary work. The result is significantly improved memory and cpu, with gains proportional to the difference between the total stack trace length and the attribute length limit. I.e. if you have a really long stack trace (i.e. 100k+ chars) and relatively small length limit (i.e. 1k chars), you'll have more to gain with this improvement.

Benchmark results:

Benchmark                                            (exceptionParam)  (lengthLimit)  (renderer)  Mode  Cnt      Score      Error   Units
StacktraceRenderBenchmark.render                               SIMPLE             10         JDK  avgt    5   2241.467 ±   60.057   ns/op
StacktraceRenderBenchmark.render:gc.alloc.rate                 SIMPLE             10         JDK  avgt    5   8101.709 ±  219.657  MB/sec
StacktraceRenderBenchmark.render:gc.alloc.rate.norm            SIMPLE             10         JDK  avgt    5  19064.012 ±    0.001    B/op
StacktraceRenderBenchmark.render:gc.count                      SIMPLE             10         JDK  avgt    5      4.000             counts
StacktraceRenderBenchmark.render:gc.time                       SIMPLE             10         JDK  avgt    5      6.000                 ms
StacktraceRenderBenchmark.render                               SIMPLE             10      CUSTOM  avgt    5     52.201 ±    2.985   ns/op
StacktraceRenderBenchmark.render:gc.alloc.rate                 SIMPLE             10      CUSTOM  avgt    5   4379.620 ±  255.944  MB/sec
StacktraceRenderBenchmark.render:gc.alloc.rate.norm            SIMPLE             10      CUSTOM  avgt    5    240.000 ±    0.001    B/op
StacktraceRenderBenchmark.render:gc.count                      SIMPLE             10      CUSTOM  avgt    5      2.000             counts
StacktraceRenderBenchmark.render:gc.time                       SIMPLE             10      CUSTOM  avgt    5      3.000                 ms
StacktraceRenderBenchmark.render                               SIMPLE           1000         JDK  avgt    5   2201.715 ±   12.929   ns/op
StacktraceRenderBenchmark.render:gc.alloc.rate                 SIMPLE           1000         JDK  avgt    5   8679.632 ±   52.630  MB/sec
StacktraceRenderBenchmark.render:gc.alloc.rate.norm            SIMPLE           1000         JDK  avgt    5  20064.011 ±    0.001    B/op
StacktraceRenderBenchmark.render:gc.count                      SIMPLE           1000         JDK  avgt    5      5.000             counts
StacktraceRenderBenchmark.render:gc.time                       SIMPLE           1000         JDK  avgt    5      8.000                 ms
StacktraceRenderBenchmark.render                               SIMPLE           1000      CUSTOM  avgt    5    682.286 ±   21.344   ns/op
StacktraceRenderBenchmark.render:gc.alloc.rate                 SIMPLE           1000      CUSTOM  avgt    5  11936.384 ±  370.103  MB/sec
StacktraceRenderBenchmark.render:gc.alloc.rate.norm            SIMPLE           1000      CUSTOM  avgt    5   8552.004 ±    0.001    B/op
StacktraceRenderBenchmark.render:gc.count                      SIMPLE           1000      CUSTOM  avgt    5      6.000             counts
StacktraceRenderBenchmark.render:gc.time                       SIMPLE           1000      CUSTOM  avgt    5      8.000                 ms
StacktraceRenderBenchmark.render                               SIMPLE         100000         JDK  avgt    5   2345.609 ±  228.133   ns/op
StacktraceRenderBenchmark.render:gc.alloc.rate                 SIMPLE         100000         JDK  avgt    5   7727.529 ±  734.088  MB/sec
StacktraceRenderBenchmark.render:gc.alloc.rate.norm            SIMPLE         100000         JDK  avgt    5  19024.012 ±    0.002    B/op
StacktraceRenderBenchmark.render:gc.count                      SIMPLE         100000         JDK  avgt    5      4.000             counts
StacktraceRenderBenchmark.render:gc.time                       SIMPLE         100000         JDK  avgt    5      7.000                 ms
StacktraceRenderBenchmark.render                               SIMPLE         100000      CUSTOM  avgt    5   1208.501 ±  382.039   ns/op
StacktraceRenderBenchmark.render:gc.alloc.rate                 SIMPLE         100000      CUSTOM  avgt    5  11661.313 ± 3568.475  MB/sec
StacktraceRenderBenchmark.render:gc.alloc.rate.norm            SIMPLE         100000      CUSTOM  avgt    5  14720.006 ±    0.002    B/op
StacktraceRenderBenchmark.render:gc.count                      SIMPLE         100000      CUSTOM  avgt    5      6.000             counts
StacktraceRenderBenchmark.render:gc.time                       SIMPLE         100000      CUSTOM  avgt    5     10.000                 ms
StacktraceRenderBenchmark.render                              COMPLEX             10         JDK  avgt    5   3885.420 ± 1011.756   ns/op
StacktraceRenderBenchmark.render:gc.alloc.rate                COMPLEX             10         JDK  avgt    5   5556.220 ± 1337.094  MB/sec
StacktraceRenderBenchmark.render:gc.alloc.rate.norm           COMPLEX             10         JDK  avgt    5  22592.020 ±    0.006    B/op
StacktraceRenderBenchmark.render:gc.count                     COMPLEX             10         JDK  avgt    5      3.000             counts
StacktraceRenderBenchmark.render:gc.time                      COMPLEX             10         JDK  avgt    5      4.000                 ms
StacktraceRenderBenchmark.render                              COMPLEX             10      CUSTOM  avgt    5     51.126 ±    2.174   ns/op
StacktraceRenderBenchmark.render:gc.alloc.rate                COMPLEX             10      CUSTOM  avgt    5   4471.278 ±  187.728  MB/sec
StacktraceRenderBenchmark.render:gc.alloc.rate.norm           COMPLEX             10      CUSTOM  avgt    5    240.000 ±    0.001    B/op
StacktraceRenderBenchmark.render:gc.count                     COMPLEX             10      CUSTOM  avgt    5      2.000             counts
StacktraceRenderBenchmark.render:gc.time                      COMPLEX             10      CUSTOM  avgt    5      3.000                 ms
StacktraceRenderBenchmark.render                              COMPLEX           1000         JDK  avgt    5   4198.255 ± 2681.956   ns/op
StacktraceRenderBenchmark.render:gc.alloc.rate                COMPLEX           1000         JDK  avgt    5   5427.646 ± 2823.204  MB/sec
StacktraceRenderBenchmark.render:gc.alloc.rate.norm           COMPLEX           1000         JDK  avgt    5  23592.022 ±    0.016    B/op
StacktraceRenderBenchmark.render:gc.count                     COMPLEX           1000         JDK  avgt    5      3.000             counts
StacktraceRenderBenchmark.render:gc.time                      COMPLEX           1000         JDK  avgt    5      4.000                 ms
StacktraceRenderBenchmark.render                              COMPLEX           1000      CUSTOM  avgt    5    695.620 ±  176.546   ns/op
StacktraceRenderBenchmark.render:gc.alloc.rate                COMPLEX           1000      CUSTOM  avgt    5  11713.686 ± 2748.314  MB/sec
StacktraceRenderBenchmark.render:gc.alloc.rate.norm           COMPLEX           1000      CUSTOM  avgt    5   8528.004 ±    0.001    B/op
StacktraceRenderBenchmark.render:gc.count                     COMPLEX           1000      CUSTOM  avgt    5      6.000             counts
StacktraceRenderBenchmark.render:gc.time                      COMPLEX           1000      CUSTOM  avgt    5      9.000                 ms
StacktraceRenderBenchmark.render                              COMPLEX         100000         JDK  avgt    5   4515.629 ± 3389.181   ns/op
StacktraceRenderBenchmark.render:gc.alloc.rate                COMPLEX         100000         JDK  avgt    5   4890.416 ± 3325.991  MB/sec
StacktraceRenderBenchmark.render:gc.alloc.rate.norm           COMPLEX         100000         JDK  avgt    5  22552.023 ±    0.016    B/op
StacktraceRenderBenchmark.render:gc.count                     COMPLEX         100000         JDK  avgt    5      3.000             counts
StacktraceRenderBenchmark.render:gc.time                      COMPLEX         100000         JDK  avgt    5      5.000                 ms
StacktraceRenderBenchmark.render                              COMPLEX         100000      CUSTOM  avgt    5   1937.578 ±   26.976   ns/op
StacktraceRenderBenchmark.render:gc.alloc.rate                COMPLEX         100000      CUSTOM  avgt    5   8356.442 ±  115.892  MB/sec
StacktraceRenderBenchmark.render:gc.alloc.rate.norm           COMPLEX         100000      CUSTOM  avgt    5  17000.010 ±    0.001    B/op
StacktraceRenderBenchmark.render:gc.count                     COMPLEX         100000      CUSTOM  avgt    5      5.000             counts
StacktraceRenderBenchmark.render:gc.time                      COMPLEX         100000      CUSTOM  avgt    5      7.000                 ms

@jack-berg jack-berg requested a review from a team as a code owner April 16, 2025 15:02
assertThat(new StackTraceRenderer(throwable, 1000).render())
.isEqualTo(jdkStackTrace(throwable, 1000));
assertThat(new StackTraceRenderer(throwable, Integer.MAX_VALUE).render())
.isEqualTo(jdkStackTrace(throwable, Integer.MAX_VALUE));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love this test: for a list of exceptions render each at a variety of length limits, and compare the result to the built-in JDK stack trace renderer.

return value;
}

public static void addExceptionAttributes(
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I've considered is retaining some sort of system property / env var to fallback to the built-in JDK exception rendering. Would offer a nice escape hatch to revert to the built-in jdk rendering in case there's any bugs in the code, but downside is it would be hard to know when it would be safe to finally remove.

@codecov
Copy link

codecov bot commented Apr 16, 2025

Codecov Report

Attention: Patch coverage is 88.50575% with 10 lines in your changes missing coverage. Please review.

Project coverage is 89.75%. Comparing base (5e50aa7) to head (a0ded5e).
Report is 60 commits behind head on main.

Files with missing lines Patch % Lines
...opentelemetry/sdk/internal/StackTraceRenderer.java 86.30% 4 Missing and 6 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main    #7281      +/-   ##
============================================
- Coverage     89.77%   89.75%   -0.03%     
- Complexity     6979     7002      +23     
============================================
  Files           797      798       +1     
  Lines         21165    21240      +75     
  Branches       2056     2071      +15     
============================================
+ Hits          19001    19063      +62     
- Misses         1503     1509       +6     
- Partials        661      668       +7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

private static final DefaultExceptionAttributeResolver INSTANCE =
new DefaultExceptionAttributeResolver();
new DefaultExceptionAttributeResolver(
Boolean.parseBoolean(ConfigUtil.getString(ENABLE_JVM_STACKTRACE_PROPERTY, "false")));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added new config to allow toggling between the naive JVM rendering logic and the new limits aware stacktrace rendering logic.

Given the solid testing in StackTraceRendererTest, I think we should default to the new limits aware rendering logic.

@jack-berg
Copy link
Member Author

@open-telemetry/java-approvers please take a look - I'd like to get this in for the 7/11/25 release.

Copy link
Contributor

@jkwatson jkwatson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible to add a little coverage on the uncovered bits? Otherwise, :shipit:

@jack-berg
Copy link
Member Author

I think the lack of coverage is a bit of a false positive. The key to hitting those uncovered bits to truncate exception stacktraces in all sorts of odd places. This test does that by introducing a random length limit as a parameter of the test. I think the coverage just didn't happen to randomly choose a limit which exercises those bits, but with enough iterations it should.

@jack-berg jack-berg merged commit 4821c37 into open-telemetry:main Jul 10, 2025
28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants