Detect accelerator type when backend is not specified #142216
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/142216
Note: Links to docs will display an error until the docs builds have been completed. ✅ No failures as of commit bf1aec7 with merge base 61dc5e9. This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Today, when a user calls `init_process_group()` without specifying `backend` or `device_id`, we auto-translate it into `cuda:nccl,cpu:gloo`. The idea was to initialize all **default** backends to cover whatever the user may do later. A side effect is increased initialization time and resource usage. This PR changes it to detect the accelerator type on the machine and initialize only the backend for that accelerator.
Pull Request resolved: pytorch#142216 Approved by: https://github.com/wconstab, https://github.com/XilunWu
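For illustration, a minimal sketch of the selection this PR performs, assuming detection via `torch.cuda.is_available()`; the actual implementation likely uses PyTorch's internal accelerator-detection hooks, and the helper name `detect_default_backend` is hypothetical:

```python
import torch

def detect_default_backend() -> str:
    # Pick only the backend matching the accelerator that is actually
    # present, instead of eagerly initializing every default backend.
    if torch.cuda.is_available():
        return "cuda:nccl"  # CUDA/ROCm accelerator detected -> NCCL only
    return "cpu:gloo"       # CPU-only machine -> Gloo only

# Post-PR, dist.init_process_group() with no `backend` argument behaves
# roughly like dist.init_process_group(backend=detect_default_backend()).
```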
Update doc to reflect change brought by #142216 cc H-Huang awgu wanchaol fegin fduwjj wz337 wconstab d4l3k c-p-i-o [ghstack-poisoned]
Update doc to reflect change brought by #142216 Pull Request resolved: #142404 Approved by: https://github.com/XilunWu
The documentation was inconsistent with the logic introduced in #162157 and modified in #142216. This update ensures the documentation matches the actual behavior of the code. Pull Request resolved: #162158 Approved by: https://github.com/wconstab
Stack from ghstack (oldest at bottom):
Today, when a user calls `init_process_group()` without specifying `backend` or `device_id`, we auto-translate it into `cuda:nccl,cpu:gloo`. The idea was to initialize all **default** backends to cover whatever the user may do later. A side effect is increased initialization time and resource usage.
This PR changes it to detect the accelerator type on the machine and initialize only the backend for that accelerator, as sketched below.
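A single-process usage sketch of the new behavior; the `MASTER_ADDR`/`MASTER_PORT` values here are placeholder assumptions for illustration only:

```python
import os
import torch.distributed as dist

# Placeholder rendezvous settings for a single-process illustration.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")

# No backend or device_id given: after this PR, only the backend for the
# detected accelerator is initialized (e.g. NCCL on a CUDA machine, Gloo
# on a CPU-only machine), rather than cuda:nccl,cpu:gloo.
dist.init_process_group(rank=0, world_size=1)
print(dist.get_backend())  # e.g. "nccl" on a CUDA box, "gloo" on CPU-only
dist.destroy_process_group()
```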
cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @c-p-i-o