[CD] [aarch64] Add CUDA 13.0 sbsa nightly build #161257
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/161257
Note: Links to docs will display an error until the docs builds have been completed.
❌ 3 New Failures, 2 Unrelated Failures — as of commit 2325f1b with merge base b2db293.
NEW FAILURES - The following jobs have failed:
FLAKY - The following jobs failed but were likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed from 3503cd5 to 98cadf4
Is it possible to add those archs for the next releases?
Overall looks good, let's add THOR support.
"This release adds support of SM110 GPUs for arm64-sbsa on Linux." from 13.0 release notes https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html
This PR is part of the CUDA 13 upstream bringup (#159779), adding sm_80 to sm_120 support for the CUDA 13 SBSA build. You can check the wheel size in https://github.com/pytorch/pytorch/actions/runs/17186608287: 2.18 GB with --compress-mode=size enabled. Compared with the 12.9 wheel at 3.28 GB, that is 1.1 GB of savings and ~33.5% smaller. #161378 seems to be a duplicate, as this PR will add support for Thor (sm_110), Spark (sm_120-compatible), and GB300 (sm_100-compatible). Thanks.
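For context, a minimal sketch (not from this PR; the range check is an assumption derived from the arch list above) of how a user could verify at runtime that their GPU's compute capability falls within the wheel's supported range:

```cuda
// Minimal sketch: check whether the current GPU's compute capability lies in
// the sm_80..sm_121 range targeted by the cu130 SBSA wheel discussed above.
// Uses only the standard CUDA runtime API; the range bounds come from the PR
// discussion and are otherwise an assumption.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
  cudaDeviceProp prop;
  if (cudaGetDeviceProperties(&prop, /*device=*/0) != cudaSuccess) {
    std::printf("no CUDA device found\n");
    return 1;
  }
  int sm = prop.major * 10 + prop.minor;  // e.g. 11.0 (Thor) -> 110
  bool covered = (sm >= 80 && sm <= 121);
  std::printf("detected sm_%d, covered by cu130 SBSA wheel: %s\n",
              sm, covered ? "yes" : "no");
  return 0;
}
```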
Thanks! Very kind.
After adding sm_110, the build fails in nvshmem, which doesn't support building for sm_110: https://github.com/pytorch/pytorch/actions/runs/17200303693/job/48789533255?pr=161257
Which nvshmem are you using? https://pypi.jetson-ai-lab.io/sbsa/cu130/torch/2.9.0.dev20250823+cu130.g3e5b021 Also, the minimum cuSPARSELt version for Thor is 0.8.0.
As mentioned by the NVSHMEM team, NVSHMEM is by definition intended for datacenter GPUs (clusters) and is therefore not expected to support Thor. https://docs.nvidia.com/nvshmem/release-notes-install-guide/release-notes/release-3324.html
Force-pushed from cd5e111 to bc3abe9
Force-pushed from 86a0315 to 83d2623
Previous build failures show that more places call into undefined references than just the header file. Introduced _NVSHMEM_DEVICELIB_SUPPORTED as a helper to decide whether to set NVSHMEM_HOSTLIB_ONLY before including nvshmem.h, and to guard whether the unsupported functions are run. (83d2623)
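A minimal sketch of the guard pattern described above: _NVSHMEM_DEVICELIB_SUPPORTED and NVSHMEM_HOSTLIB_ONLY are the macros named in this PR, but the build-time condition and the guarded function below are hypothetical stand-ins, not the PR's actual code.

```cuda
// Sketch only: BUILD_FOR_SM110 is a hypothetical build flag standing in for
// however the real build detects that an NVSHMEM-unsupported arch (sm_110)
// is in the target list.
#ifndef BUILD_FOR_SM110
#define _NVSHMEM_DEVICELIB_SUPPORTED 1
#endif

#ifndef _NVSHMEM_DEVICELIB_SUPPORTED
// Ask nvshmem.h for host-only declarations so no device-library symbols
// (which would be undefined references on sm_110) are pulled in.
#define NVSHMEM_HOSTLIB_ONLY
#endif
#include <nvshmem.h>

void run_collective_op() {  // hypothetical call site
#ifdef _NVSHMEM_DEVICELIB_SUPPORTED
  // Normal path: device-library functions are available to call and link.
#else
  // Guarded path: skip functions the device library does not provide.
#endif
}
```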
@pytorchbot merge -i
Merge failed. Reason: Approvers from one of the following sets are needed:
@pytorchmergebot merge -f "All required signal look good"
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
pytorch#159779 CUDA SBSA build for CUDA 13.0
1. Supported archs: sm_80 to sm_120, including support for Thor (sm_110), SPARK (sm_121), and GB300 (sm_103). "This release adds support of SM110 GPUs for arm64-sbsa on Linux." (from the 13.0 release notes: https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html)
2. Use -compress-mode=size for binary size reduction: the 13.0 wheel is 2.18 GB; compared with the 12.9 wheel at 3.28 GB, that is 1.1 GB of savings and ~33.5% smaller.
3. Refactored the libs_to_copy list into common libs and version-specific libs.
TODO: add the other CUDA archs in the existing x86 support matrix to the SBSA build as well.
Pull Request resolved: pytorch#161257
Approved by: https://github.com/nWEIdia, https://github.com/atalman
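To illustrate item 2, a hedged sketch of how the size-oriented fatbin compression might be passed to nvcc; --compress-mode=size is the flag named above, while the file name and arch subset are illustrative assumptions:

```cuda
// Illustration of item 2 above. nvcc's --compress-mode=size trades
// decompression speed for a smaller fatbin. The compile command is shown as
// a comment; kernel.cu and the arch subset are illustrative, not the PR's
// actual build invocation:
//
//   nvcc -gencode arch=compute_80,code=sm_80 \
//        -gencode arch=compute_110,code=sm_110 \
//        -gencode arch=compute_120,code=sm_120 \
//        --compress-mode=size -c kernel.cu -o kernel.o

__global__ void noop_kernel() {}  // placeholder kernel so this file compiles
```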
cc @H-Huang @awgu @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @pragupta @ptrblck @atalman @nWEIdia @malfet