Add warning about removed sm50 and sm60 arches #158301
torch/cuda/__init__.py (outdated)
    )
    if current_arch < min_arch:
        warnings.warn(
            old_gpu_warn
Looks like incorrect_binary_warn is never used. However, it's probably the more accurate warning.
LGTM. Just had one small suggestion.
Wait, these warnings are saying two opposite things lol. Can we rationalize these messages to be more aligned with the state of the world:
Looks great, thanks!
@pytorchmergebot merge -f "lint is green"
Merge started
Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).
Questions? Feedback? Please reach out to the PyTorch DevX Team.
Just adding a note that in the future we might want to re-evaluate the "cur_arch > max_arch" case, as there could be scenarios where the binary can still support the device. "cur_arch < min_arch", however, is definitely not supported. For example, if we build for architectures up to sm80, running on sm86 would still work. The same reasoning applies to sm120.
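To make the two cases in the note above concrete, here is a minimal sketch in plain Python. The helper names `parse_arch` and `classify_capability` are hypothetical and not part of `torch`. The sketch assumes the arch list has the shape shown in this PR's test output (`sm_NN` and `compute_NN` strings) and that a capability `(major, minor)` maps to `major * 10 + minor`.

```python
def parse_arch(arch):
    """Turn 'sm_86' or 'compute_120' into the integer 86 or 120."""
    return int(arch.split("_")[1])


def classify_capability(capability, arch_list):
    """Classify a device capability against the architectures a binary was built for.

    Hypothetical helper for illustration only. Returns one of:
      'unknown'          - no sm_* entries to compare against
      'too_old'          - below the minimum built arch (definitely unsupported)
      'newer_than_built' - above the maximum built arch (may still work)
      'in_range'         - between the minimum and maximum built arch
    """
    major, minor = capability
    current = major * 10 + minor
    built = [parse_arch(a) for a in arch_list if a.startswith("sm_")]
    if not built:
        return "unknown"
    if current < min(built):
        return "too_old"
    if current > max(built):
        return "newer_than_built"
    return "in_range"
```

With the arch list from the test output, a capability of `(5, 0)` falls in the `too_old` bucket, while a build that stops at `sm_80` puts an sm86 device in `newer_than_built`, which is the case the note suggests revisiting.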
@pytorchbot cherry-pick --onto release/2.8 -c critical
Related to #157517

Detect when users are executing torch build with cuda 12.8/12.9 and running on Maxwell or Pascal architectures.
We would like to include reference to the issue: #157517 as well as ask people to install CUDA 12.6 builds if they are running on sm50 or sm60 architectures.

Test:
```
>>> torch.cuda.get_arch_list()
['sm_70', 'sm_75', 'sm_80', 'sm_86', 'sm_90', 'sm_100', 'sm_120', 'compute_120']
>>> torch.cuda.init()
/home/atalman/.conda/envs/py312/lib/python3.12/site-packages/torch/cuda/__init__.py:263: UserWarning:
    Found <GPU Name> which is of cuda capability 5.0.
    PyTorch no longer supports this GPU because it is too old.
    The minimum cuda capability supported by this library is 7.0.
  warnings.warn(
/home/atalman/.conda/envs/py312/lib/python3.12/site-packages/torch/cuda/__init__.py:268: UserWarning:
    Support for Maxwell and Pascal architectures is removed for CUDA 12.8+ builds.
    Please see #157517
    Please install CUDA 12.6 builds if you require Maxwell or Pascal support.
```

Pull Request resolved: #158301
Approved by: https://github.com/nWEIdia, https://github.com/albanD

(cherry picked from commit fb731fe)
Cherry picking #158301
The cherry pick PR is at #158478 and it is recommended to link a critical cherry pick PR with an issue. The following tracker issues are updated:
Details for Dev Infra team
Raised by workflow job
Add warning about removed sm50 and sm60 arches (#158301)

Related to #157517

Detect when users are executing torch build with cuda 12.8/12.9 and running on Maxwell or Pascal architectures.
We would like to include reference to the issue: #157517 as well as ask people to install CUDA 12.6 builds if they are running on sm50 or sm60 architectures.

Test:
```
>>> torch.cuda.get_arch_list()
['sm_70', 'sm_75', 'sm_80', 'sm_86', 'sm_90', 'sm_100', 'sm_120', 'compute_120']
>>> torch.cuda.init()
/home/atalman/.conda/envs/py312/lib/python3.12/site-packages/torch/cuda/__init__.py:263: UserWarning:
    Found <GPU Name> which is of cuda capability 5.0.
    PyTorch no longer supports this GPU because it is too old.
    The minimum cuda capability supported by this library is 7.0.
  warnings.warn(
/home/atalman/.conda/envs/py312/lib/python3.12/site-packages/torch/cuda/__init__.py:268: UserWarning:
    Support for Maxwell and Pascal architectures is removed for CUDA 12.8+ builds.
    Please see #157517
    Please install CUDA 12.6 builds if you require Maxwell or Pascal support.
```

Pull Request resolved: #158301
Approved by: https://github.com/nWEIdia, https://github.com/albanD

(cherry picked from commit fb731fe)

Co-authored-by: atalman <atalman@fb.com>
@pytorchbot revert -m="Diff reverted internally" -c="ghfirst"
This Pull Request has been reverted by a revert inside Meta. To re-land this change, please open another pull request, assign the same reviewers, fix the CI failures that caused the revert and make sure that the failing CI runs on the PR by applying the proper ciflow label (e.g., ciflow/trunk).
@pytorchbot successfully started a revert job. Check the current status here.
This reverts commit fb731fe. Reverted #158301 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](#158301 (comment)))
@atalman your PR has been successfully reverted.
Force-pushed from e17ea11 to 7706737
Move code fixes
Revert "conda" This reverts commit 2853662.
Revert "use tos accept" This reverts commit 8b34264.
Revert "conda" This reverts commit 2853662.
Revert "Revert "conda"" This reverts commit e732654.
Revert "Revert "use tos accept"" This reverts commit c456c54.
Revert "Revert "conda"" This reverts commit bb4fa09.
fix fix_arch_list fix fixes
Force-pushed from e8cd442 to 100002a
if torch.version.cuda is not None:  # on ROCm we don't want this check
    CUDA_VERSION = torch._C._cuda_getCompiledVersion()  # noqa: F841
if (
    torch.version.cuda is not None and torch.cuda.get_arch_list()
The new version adds a check for torch.cuda.get_arch_list().
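A rough sketch of why such a guard can matter, written in plain Python so it runs without a GPU. The function `maybe_warn_removed_arch` and the constant `REMOVED_ARCH_MSG` are hypothetical names, not the code in this PR. One plausible failure mode (an assumption here, not something the thread confirms) is that taking `min()` over an empty arch list raises `ValueError`, so the sketch returns early when the list is empty or when there is no CUDA version, as on ROCm builds.

```python
import warnings

# Hypothetical constant; the wording follows the warning quoted in this PR's test output.
REMOVED_ARCH_MSG = (
    "Support for Maxwell and Pascal architectures is removed for CUDA 12.8+ builds. "
    "Please install CUDA 12.6 builds if you require Maxwell or Pascal support."
)


def maybe_warn_removed_arch(cuda_version, arch_list, capability):
    """Warn when a Maxwell (5.x) or Pascal (6.x) device is older than every built arch.

    Returns True if a warning was issued, False otherwise.
    """
    # No CUDA version (e.g. ROCm) or an empty arch list: nothing to compare against.
    if cuda_version is None or not arch_list:
        return False
    built = [int(a.split("_")[1]) for a in arch_list if a.startswith("sm_")]
    if not built:
        return False
    major, minor = capability
    current = major * 10 + minor
    if current < min(built) and major in (5, 6):
        warnings.warn(REMOVED_ARCH_MSG, UserWarning)
        return True
    return False
```

In a real session the three arguments would presumably come from `torch.version.cuda`, `torch.cuda.get_arch_list()`, and `torch.cuda.get_device_capability()`.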
Related to #157517

Detect when users are executing torch build with cuda 12.8/12.9 and running on Maxwell or Pascal architectures.
We would like to include reference to the issue: #157517 as well as ask people to install CUDA 12.6 builds if they are running on sm50 or sm60 architectures.

Test:
```
>>> torch.cuda.get_arch_list()
['sm_70', 'sm_75', 'sm_80', 'sm_86', 'sm_90', 'sm_100', 'sm_120', 'compute_120']
>>> torch.cuda.init()
/home/atalman/.conda/envs/py312/lib/python3.12/site-packages/torch/cuda/__init__.py:263: UserWarning:
    Found <GPU Name> which is of cuda capability 5.0.
    PyTorch no longer supports this GPU because it is too old.
    The minimum cuda capability supported by this library is 7.0.
  warnings.warn(
/home/atalman/.conda/envs/py312/lib/python3.12/site-packages/torch/cuda/__init__.py:268: UserWarning:
    Support for Maxwell and Pascal architectures is removed for CUDA 12.8+ builds.
    Please see #157517
    Please install CUDA 12.6 builds if you require Maxwell or Pascal support.
```

Please note I reverted the original PR #158301 because it broke internal users. This is a reland that adds a check for a non-empty torch.cuda.get_arch_list().

Pull Request resolved: #158700
Approved by: https://github.com/huydhn, https://github.com/Skylion007, https://github.com/eqy
Related to #157517
Detect when users run a torch build compiled with CUDA 12.8/12.9 on Maxwell or Pascal architectures.
We would like to include a reference to issue #157517 and ask people to install CUDA 12.6 builds if they are running on sm50 or sm60 architectures.
Test:
cc @ptrblck @msaroufim @eqy @jerryzh168 @albanD @malfet