Add CUDA 12.8 manywheel x86 Builds to Binaries Matrix #145792
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/145792
Note: Links to docs will display an error until the docs builds have been completed.
⏳ No Failures, 121 Pending
As of commit d9d6492 with merge base 2af8767.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
|
Will keep the +PTX for nightlies. |
"nvidia-cusolver-cu12==11.7.2.55; platform_system == 'Linux' and platform_machine == 'x86_64' | "
"nvidia-cusparse-cu12==12.5.7.53; platform_system == 'Linux' and platform_machine == 'x86_64' | "
"nvidia-cusparselt-cu12==0.6.3; platform_system == 'Linux' and platform_machine == 'x86_64' | "
"nvidia-nccl-cu12==2.21.5; platform_system == 'Linux' and platform_machine == 'x86_64' | "
We should add another PR updating NCCL
|
Hi @malfet, the binaries are uploaded to the cu128 bucket.
@tinglvv please fix lint
|
Libtorch build failure: https://github.com/pytorch/pytorch/actions/runs/13042203635/job/36386381759. It seems the binary size might be too large, so we need to refine TORCH_CUDA_ARCH_LIST based on #39968. Skipping the libtorch wheel addition for now.
|
Lint keeps getting this failure, unsure of the reason |
|
To unblock, try passing this flag: #39968 (comment). That might let it link while we figure out the best way to reduce the code size.
cuda_version_nodot=$(echo $CUDA_VERSION | tr -d '.')
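For reference, the line above strips the dot from the CUDA version so the result can be used in short tags such as the cu128 bucket mentioned earlier in this thread. A minimal sketch, assuming CUDA_VERSION is 12.8; the `bucket` variable is a hypothetical name used only for illustration and is not taken from the build script:

```shell
# Assumption: the build environment supplies CUDA_VERSION; it is fixed here for the example.
CUDA_VERSION="12.8"

# Delete every '.' so that "12.8" becomes "128".
cuda_version_nodot=$(echo "$CUDA_VERSION" | tr -d '.')

# Hypothetical use: compose a short tag from the dotless version.
bucket="cu${cuda_version_nodot}"
echo "$bucket"
```

Quoting `$CUDA_VERSION` is only a defensive habit here; the unquoted form in the diff behaves the same for a value like 12.8.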
TORCH_CUDA_ARCH_LIST="5.0;6.0;7.0;7.5;8.0;8.6"
We should add --host-linker-script=use-lcs to the TORCH_NVCC_FLAGS at the top of this file. That should fix this issue without changing the CUDA_ARCH_LIST.
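A minimal sketch of that suggestion, assuming TORCH_NVCC_FLAGS is a plain space-separated string. The empty starting value is a placeholder, since the thread does not show what the file currently sets:

```shell
# Placeholder starting value (assumption; the real value is set at the top of the build script).
TORCH_NVCC_FLAGS=""

# Append the flag proposed in the review, but only if it is not already present.
case " ${TORCH_NVCC_FLAGS} " in
  *" --host-linker-script=use-lcs "*) ;;
  *) TORCH_NVCC_FLAGS="${TORCH_NVCC_FLAGS:+${TORCH_NVCC_FLAGS} }--host-linker-script=use-lcs" ;;
esac
export TORCH_NVCC_FLAGS

# The arch list is left exactly as it appears in the diff.
TORCH_CUDA_ARCH_LIST="5.0;6.0;7.0;7.5;8.0;8.6"

echo "TORCH_NVCC_FLAGS=${TORCH_NVCC_FLAGS}"
echo "TORCH_CUDA_ARCH_LIST=${TORCH_CUDA_ARCH_LIST}"
```

The `case` guard keeps the append idempotent, so sourcing the script twice would not duplicate the flag.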
|
New build failure in libtorch after the ld relink error: https://github.com/pytorch/pytorch/actions/runs/13056929705/job/36430236293. Let me just merge the manywheel changes for now.
|
@pytorchbot merge |
Merge started
Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team
Merge failed
Reason: 1 mandatory check(s) failed. The first few are:
Dig deeper by viewing the failures on hud
|
@pytorchmergebot rebase -b main |
|
@pytorchbot started a rebase job onto refs/remotes/origin/main. Check the current status here |
|
Successfully rebased |
0c296a1 to d9d6492
|
@pytorchmergebot merge -f "lint is passing everything else was tested" |
Merge started
Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team
|
Thanks Andrey for merging the PR. Nightly x86 builds should be available from tonight. |
Apparently, we need to pass
#145570
Adding CUDA 12.8.0 x86 builds first
TODO: resolve libtorch build failure and add build in #146084
cc @atalman @malfet @ptrblck @nWEIdia