[CUDA 13][cuDNN] Bump CUDA 13 to cuDNN 9.13.0 by eqy · Pull Request #162268 · pytorch/pytorch · GitHub


@eqy
Collaborator

@eqy eqy commented Sep 5, 2025

Fixes some d_qk != d_v cases on Hopper that are broken by cuDNN 9.11-9.12

cc @csarofeen @ptrblck @xwang233
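For readers unfamiliar with the `d_qk != d_v` case named in the description: queries and keys share one head dimension (`d_qk`), while values may use a different one (`d_v`), and the attention output then has width `d_v`. The following is a minimal pure-Python reference sketch of that shape relationship. It is not the cuDNN or PyTorch implementation, and the helper name `reference_attention` and the toy inputs are made up for illustration.

```python
import math


def reference_attention(q, k, v):
    """Naive single-head scaled dot-product attention on nested lists.

    Shapes: q is [L][d_qk], k is [S][d_qk], v is [S][d_v]; returns [L][d_v].
    Hypothetical helper for illustration only.
    """
    d_qk = len(q[0])
    d_v = len(v[0])
    scale = 1.0 / math.sqrt(d_qk)
    out = []
    for q_row in q:
        # Scores are dot products over the query/key head dimension d_qk.
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        # Numerically stable softmax over the S key positions.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # The weighted sum runs over rows of v, so each output row has width d_v.
        out.append([sum(w * v_row[j] for w, v_row in zip(weights, v)) for j in range(d_v)])
    return out


# Toy inputs with d_qk = 2 and d_v = 3 (L = 2 queries, S = 3 keys/values).
q = [[1.0, 0.0], [0.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
v = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
out = reference_attention(q, k, v)
print(len(out), len(out[0]))  # prints: 2 3
```

In PyTorch, `torch.nn.functional.scaled_dot_product_attention` likewise accepts a value tensor whose last dimension differs from that of the query and key; which backend handles a given shape depends on the build and hardware.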

@eqy eqy requested review from Skylion007 and drisspg September 5, 2025 15:41
@eqy eqy added the module: cudnn Related to torch.backends.cudnn, and CuDNN support label Sep 5, 2025
@eqy eqy requested review from a team and jeffdaily as code owners September 5, 2025 15:41
@eqy eqy added the open source, topic: not user facing (topic category), and module: sdpa (All things related to torch.nn.functional.scaled_dot_product_attention) labels Sep 5, 2025
@pytorch-bot

pytorch-bot bot commented Sep 5, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/162268

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit f0ef0dc with merge base 2dd529d:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@eqy eqy added the ciflow/trunk Trigger trunk jobs on your pull request label Sep 5, 2025
@drisspg
Contributor

drisspg commented Sep 5, 2025

Is the wheel already on pypi?

Collaborator

@Skylion007 Skylion007 left a comment

Do we need to update CUDNN frontend too? What about CUDA 12 builds?

@eqy
Collaborator Author

eqy commented Sep 5, 2025

> Do we need to update CUDNN frontend too? What about CUDA 12 builds?

Given the proximity to the 2.9 branch cut, I wanted to limit the potential fallout, as the CUDA 12 builds are on a "safe" version, 9.10.2.

In theory the frontend doesn't need to be updated (or we can make that a separate change). The main churn there was around the d_qk != d_v issue above, which was introduced after our current pin, and it should be fixed in frontend 1.14.1.
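The version ranges mentioned in this thread (9.10.2 as the "safe" pin, 9.11-9.12 as affected, 9.13.0 as the fix) can be checked against an integer cuDNN version. This is a minimal sketch, assuming the cuDNN 9 integer encoding `major*10000 + minor*100 + patch` (for example 91300 for 9.13.0) and the older `major*1000 + minor*100 + patch` encoding for smaller values. The helper names are hypothetical and not part of PyTorch.

```python
def decode_cudnn_version(v):
    """Split an integer cuDNN version into (major, minor, patch).

    Assumption: values >= 10000 use major*10000 + minor*100 + patch
    (cuDNN 9 style, e.g. 91300 -> 9.13.0); smaller values use
    major*1000 + minor*100 + patch (e.g. 8902 -> 8.9.2).
    """
    if v >= 10000:
        return v // 10000, (v % 10000) // 100, v % 100
    return v // 1000, (v % 1000) // 100, v % 100


def in_affected_range(v):
    """True for cuDNN 9.11.x and 9.12.x, the range this PR describes as broken."""
    major, minor, _patch = decode_cudnn_version(v)
    return major == 9 and minor in (11, 12)


# Versions named in this thread, written in the assumed integer encoding.
for label, value in [("9.10.2", 91002), ("9.12.0", 91200), ("9.13.0", 91300)]:
    print(label, "affected" if in_affected_range(value) else "not affected")
```

With PyTorch installed, `torch.backends.cudnn.version()` returns an integer of this form (or `None` when cuDNN is unavailable), so its result could be passed to `in_affected_range` after a `None` check.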

@eqy
Collaborator Author

eqy commented Sep 5, 2025

> Is the wheel already on pypi?

Yes, it looks like it: https://pypi.org/project/nvidia-cudnn-cu13/

@eqy
Collaborator Author

eqy commented Sep 6, 2025

@pytorchmergebot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status here

@nWEIdia
Collaborator

nWEIdia commented Sep 6, 2025

cc @atalman @malfet
Maybe we need the wheels copied to AWS S3 as well. https://download.pytorch.org/whl/nightly/nvidia-cudnn-cu13/ is missing 9.13.0.
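A check like the one described above (does the wheel index list a 9.13.0 build?) could be scripted. This is a rough sketch, assuming the index is a simple HTML page whose link text is the wheel filename and that filenames start with `nvidia_cudnn_cu13-<version>-`. The sample HTML, the build numbers in it, and the helper names are hypothetical; the real page layout may differ.

```python
import re
from html.parser import HTMLParser


class _LinkText(HTMLParser):
    """Collects the text of every <a> element on the page."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_a = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_a = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_a = False

    def handle_data(self, data):
        if self._in_a and data.strip():
            self.names.append(data.strip())


def wheel_versions(index_html, dist="nvidia_cudnn_cu13"):
    """Return the set of version strings found in wheel filenames for `dist`."""
    parser = _LinkText()
    parser.feed(index_html)
    pattern = re.compile(r"^" + re.escape(dist) + r"-([^-]+)-.*\.whl$")
    versions = set()
    for name in parser.names:
        match = pattern.match(name)
        if match:
            versions.add(match.group(1))
    return versions


def has_release(versions, release):
    """True if any version equals `release` or extends it (e.g. 9.13.0.NN)."""
    return any(v == release or v.startswith(release + ".") for v in versions)


# Hypothetical index content; the build suffixes are invented for illustration.
SAMPLE = """
<a href="nvidia_cudnn_cu13-9.12.0.46-py3-none-manylinux_2_27_x86_64.whl">nvidia_cudnn_cu13-9.12.0.46-py3-none-manylinux_2_27_x86_64.whl</a>
<a href="nvidia_cudnn_cu13-9.12.0.46-py3-none-win_amd64.whl">nvidia_cudnn_cu13-9.12.0.46-py3-none-win_amd64.whl</a>
"""
found = wheel_versions(SAMPLE)
print("9.13.0 present:", has_release(found, "9.13.0"))
```

To run this against the live index, the page body could be fetched with `urllib.request.urlopen` and decoded before being passed to `wheel_versions`.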

@atalman
Contributor

atalman commented Sep 6, 2025

@nWEIdia correct, we need to update these wheels, otherwise the nightly build will fail.
@eqy Do we need this PR for Windows? I believe Windows also requires an AMI update: https://github.com/pytorch/test-infra/blob/main/aws/ami/windows/scripts/Installers/Install-CUDA-Tools.ps1#L32

@eqy
Collaborator Author

eqy commented Sep 6, 2025

@atalman it is a nice-to-have for Windows as we did not make cuDNN default for non-H100/B200

@atalman
Contributor

atalman commented Sep 6, 2025

@eqy let's roll back the Windows portion of this, since I don't think we can do another AMI update before the branch cut. I believe you can provide a forward fix for this.

@atalman
Contributor

atalman commented Sep 6, 2025

Thank you, @eqy. Linux cuDNN was uploaded to download.pytorch.org.

@nWEIdia
Collaborator

nWEIdia commented Sep 6, 2025

> Thank you, @eqy. Linux cuDNN was uploaded to download.pytorch.org.

Thank you Andrey!

pytorchmergebot pushed a commit that referenced this pull request Sep 6, 2025
daisyden pushed a commit to daisyden/pytorch that referenced this pull request Sep 8, 2025
Fixes some `d_qk` != `d_v` cases on Hopper that are broken by cuDNN 9.11-9.12

Pull Request resolved: pytorch#162268
Approved by: https://github.com/drisspg, https://github.com/Skylion007
markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
mansiag05 pushed a commit to mansiag05/pytorch that referenced this pull request Sep 22, 2025
cleonard530 pushed a commit to cleonard530/pytorch that referenced this pull request Sep 22, 2025
dsashidh pushed a commit to dsashidh/pytorch that referenced this pull request Sep 26, 2025

Labels

ciflow/trunk (Trigger trunk jobs on your pull request)
Merged
module: cudnn (Related to torch.backends.cudnn, and CuDNN support)
module: sdpa (All things related to torch.nn.functional.scaled_dot_product_attention)
open source
topic: not user facing (topic category)

6 participants