[Profiler] Add GC Events to Python Stack Tracer by sraikund16 · Pull Request #161209 · pytorch/pytorch · GitHub
@sraikund16 sraikund16 commented Aug 21, 2025

Summary:
Adds Python garbage collection (GC) events to Kineto traces and profiler FunctionEvents. The change creates a custom C++ callback in profiler_python.cpp, then defines a Python-callable function in C++ and registers it as a callback for all Python garbage collection. Thread safety is not a concern here because init and teardown happen only on the main thread while holding the GIL.
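
For readers unfamiliar with the hook involved, here is a minimal pure-Python sketch of the same idea. It assumes the registration goes through CPython's standard `gc.callbacks` list; the PR does this from C++ in profiler_python.cpp, and `GCEventRecorder` and its methods are hypothetical names used only for illustration.

```python
import gc
import time


class GCEventRecorder:
    """Hypothetical helper: records one (start_ns, end_ns, generation) span per collection."""

    def __init__(self):
        self.events = []
        self._start_ns = None

    def _callback(self, phase, info):
        # CPython calls every entry in gc.callbacks with phase "start"
        # before a collection and phase "stop" after it.
        if phase == "start":
            self._start_ns = time.perf_counter_ns()
        elif phase == "stop" and self._start_ns is not None:
            self.events.append(
                (self._start_ns, time.perf_counter_ns(), info["generation"])
            )
            self._start_ns = None

    def start(self):
        # "init": register the callback for all future collections.
        gc.callbacks.append(self._callback)

    def stop(self):
        # "teardown": unregister so the hook adds no cost after profiling.
        gc.callbacks.remove(self._callback)


recorder = GCEventRecorder()
recorder.start()
gc.collect()  # force a collection so at least one span is recorded
recorder.stop()
print(f"recorded {len(recorder.events)} GC event(s)")
```

Registering and unregistering from a single thread, as in this sketch, mirrors the main-thread-only init/teardown described above.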

This is currently hidden behind the experimental config because Python tracing tends to be unstable, especially when new features are added. If it turns out not to add too much overhead, it can be turned on by default. NOTE: To enable this you need both with_stack=True and the experimental config option turned on!

Test Plan:
Ran a trace with GC induced and saw the GC event in the trace.

Also added a test.
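
The test plan depends on inducing a collection. One deterministic way to do that is sketched below; this is an illustration using only the standard `gc` module, not necessarily how the PR's test does it, and `Node` and `induce_gc` are hypothetical names.

```python
import gc


class Node:
    """Object that keeps itself alive through a reference cycle."""

    def __init__(self):
        self.me = self


def induce_gc(n_cycles=100):
    """Create `n_cycles` unreachable cycles, then force a full collection.

    Returns the number of unreachable objects that gc.collect() reports.
    """
    was_enabled = gc.isenabled()
    gc.disable()  # keep automatic collections from reclaiming the cycles early
    try:
        gc.collect()  # clear pre-existing garbage so the count is meaningful
        for _ in range(n_cycles):
            Node()  # unreachable immediately, but kept alive by its own cycle
        return gc.collect()
    finally:
        if was_enabled:
            gc.enable()


found = induce_gc(100)
print(f"gc.collect() found {found} unreachable objects")
```

Running a workload like `induce_gc` inside a profiled region should guarantee at least one full collection for the profiler to observe.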

Rollback Plan:

Differential Revision: D80491146

pytorch-bot bot commented Aug 21, 2025

✅ You can merge normally! (1 Unrelated Failure)

As of commit 270799a with merge base 97200c9 (image):

BROKEN TRUNK - The following job failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D80491146

pytorch-bot bot added the `ciflow/trunk` label (Trigger trunk jobs on your pull request) Aug 21, 2025
sraikund16 added the `release notes: profiler`, `feature`, and `topic: new features` labels and removed the `feature` label Aug 21, 2025
sraikund16 added a commit to sraikund16/pytorch that referenced this pull request Aug 21, 2025
Future Work: Add more metadata to the GC event

Test Plan:
Ran a trace with GC induced and saw the GC event in the trace: https://www.internalfb.com/intern/perfdoctor/trace_view?filepath=tree%2Ftraces%2Fdynocli%2Fdevgpu003.rva5.facebook.com%2Frank-0.Aug_21_13_52_27.3127794.pt.trace.json.gz&bucket=gpu_traces

Also added a test.

Rollback Plan:

Reviewed By: ngimel

Differential Revision: D80491146
@sraikund16 sraikund16 force-pushed the export-D80491146 branch 2 times, most recently from b9d10a8 to 6d3272e Compare August 22, 2025 16:49

@sraikund16

@pytorchmergebot merge

@pytorchmergebot

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
Pull Request resolved: pytorch#161209
Approved by: https://github.com/ngimel