-
Notifications
You must be signed in to change notification settings - Fork 25.7k
[dist] expose unsafe_get_ptr for dist.ProcessGroupNCCL.NCCLConfig #161136
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/161136
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit 8f1a9e5 with merge base 7f201ba ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, in Python (in particular CPython), you can always get an address of a python object by the built-in id(obj) method.
torch/csrc/distributed/c10d/init.cpp
Outdated
| .def_readwrite("nvls_ctas", &ncclConfig_t::nvlsCTAs) | ||
| #endif | ||
| .def_property_readonly( | ||
| "ptr", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mind naming it as unsafe_get_ptr? The reason is that in line 3382 above the object is created via std::make_unique thus this object could disappear after one gets the inner pointer. (With the rename, this entry would probably look more like a method than a property.)
torch/csrc/distributed/c10d/init.cpp
Outdated
| .def_property_readonly( | ||
| "ptr", | ||
| [](const ncclConfig_t& self) { | ||
| return reinterpret_cast<int64_t>(&self); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use uintptr_t instead of int64_t?
torch/_C/_distributed_c10d.pyi
Outdated
| cga_cluster_size: int | ||
| min_ctas: int | ||
| max_ctas: int | ||
| ptr: int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need if we make it a unsafe_get_ptr method instead of property.
|
@kwen2501 updated according to your comments. |
|
@pytorchmergebot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
…torch#161136) expose the pointer so that we can create the `ncclConfig_t` object from pytorch and use it elsewhere. this is useful to control the nccl communicator parameters for multiple nccl communicators. Pull Request resolved: pytorch#161136 Approved by: https://github.com/kwen2501
expose the pointer so that we can create the
ncclConfig_tobject from pytorch and use it elsewhere. this is useful to control the nccl communicator parameters for multiple nccl communicators.cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @pragupta