- 
                Notifications
    You must be signed in to change notification settings 
- Fork 5.2k
[Mono]: Add dotnet-gcdump support. #88634
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Mono]: Add dotnet-gcdump support. #88634
Conversation
Switch to calculate live keywords and heap dump trigger. Make Mono profiler provider disabled by default (experimental provider).
Issue introduced by old commit, dotnet@136dfcd Case where string len is 0 would fail the utf8 -> utf16 conversion that in case would fail emitting any EventPipe event doing the conversion. Mono only issue since CoreClr uses different code in generated eventpipehepler source file. .net7 backport candidate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I primarily concentrated on the new heap dump code and assumed that move of the code that moved between files didn't change
| I guess my one comment is to be careful about assuming that  | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, except read/write
| 
 Thanks, will fix, I overlooked it since its async safe, but could fail for other reasons that needs to re-write or re-read until all data is written or read. We have similar loops elsewhere so I will just adopt one of existing implementations. | 
| @davmason, will try to get this merged by tomorrow, if you want to take a look before that it would be great. The majority of this is Mono only changes, but there is a small extension to session type, a new field has been added, indicating that the streaming thread has started (none streaming sessions will set it just before returning from ep_session_start_streaming). We need it on Mono side to make sure we give sessions a decent chance to start the streaming thread, before we fire the gc start event and perform a STW as part of foreground GC (suspending all managed threads, including the streaming threads on Mono). Unfortunately tools like dotnet-gcdump depends on this event being delivered (and not cached in buffer manager nor in TraceEvent library cache) within 5 seconds (normally not a problem), but if we get to STW before streaming thread gets a chance to start, start will hang in buffer manager until complete GC is done and we start to write out the other GC events. | 
| Test failures are either known issues or failing on other PR's as well. | 
Subtle changes like: dotnet@2f2aaae Knocked out a couple of rundown events on Mono that was detected and fixed in dotnet#88634. It would been ideal to catch this issue as part of CI, but turns out that our existing rundownvalidation test only tests a couple of rundown events, and not any of the module/assembly events that was affected by above change. This commit adds validation that the following additional rundown events are included in the rundown event stream: RuntimeStart MethodDCStopInit MethodDCStopComplete LoaderModuleDCStop LoaderDomainModuleDCStop LoadAssemblyDCStop LoadAppDomainDCStop
Subtle changes like: 2f2aaae Knocked out a couple of rundown events on Mono that was detected and fixed in #88634. It would been ideal to catch this issue as part of CI, but turns out that our existing rundownvalidation test only tests a couple of rundown events, and not any of the module/assembly events that was affected by above change. This commit adds validation that the following additional rundown events are included in the rundown event stream: RuntimeStart MethodDCStopInit MethodDCStopComplete LoaderModuleDCStop LoaderDomainModuleDCStop LoadAssemblyDCStop LoadAppDomainDCStop
Add support for
dotnet-gcdumpon Mono.The bulk of the change in
ep-rt-mono.cis splitting the EventPipe provider implementation ofMicrosoft-DotNETRuntimeMonoProfilerandMicrosoft-Windows-DotNETRuntimeinto separate source files.New GC events consumed by
dotnet-gcdumphas been added to the newep-rt-mono-runtime-provider.csource file.PR disable experimental EventPipe mono profiler provider by default to preserve resources. It is possible to re-enable it by setting the following env variable,
MONO_DIAGNOSTICS=--diagnostic-mono-profiler=enabledotnet-gcdumpdepends on event order and that events gets emitted to EventPipe session afterGCStop, ordotnet-gcdumpwill block on read, timeout and closing session that in turn will triggering flush of TraceEvent event cache. This is less of a problem on CoreCLR since it emits many more GC events compared to Mono's implementation (just emitting the needed events used bydotnet-gcdump), but underlying issues is that there is no way to make sure an event emitted into EventPipe will be seen by client, even if its getting flushed into IPC transport on runtime side. This happens because innettrace v4only first event after a sequence point will be directly delivered to client through TraceEvent library, rest of events will be cached until next sequence point is hit and then sorted delivered to TraceEvent client, but if there are no more events emitted into the session, then client will block on read, even if there are events in the TraceEvent client side cache.If we are going to support start/stop event like expected by
dotnet-gcdumpmore deterministically we would need to emit sequence points into EventPipe sessions acting like "barriers" to make sure a specific event gets delivered all the way to the client, pass TraceEvent cache. Until that happens, there is a small risk that tools likedotnet-gcdumpwon't see stop/end events triggering session timeout. If that happens the gcdump will still be complete, but user will need to wait fordotnet-gcdumpto timeout (default 30 seconds) before writing file to disk.