[cuDNN][Convolution] Disable cuDNN for 3D convolutions with kernel size != 1 for cuDNN 9.8+ #163581
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/163581
Note: Links to docs will display an error until the docs builds have been completed. ✅ No Failures as of commit b3318d7 with merge base cf28ab2. This comment was automatically generated by Dr. CI and updates every 15 minutes.
Disabling cuDNN for this specific op is unfortunately not a solution for us: without cuDNN we quickly run into OOM on this conv3d.
Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
aten/src/ATen/native/Convolution.cpp
Outdated
    }
    // broken on cuDNN 9.8
    if (cudnn_version >= 90800) {
      if (input.scalar_type() == at::kBFloat16 || input.scalar_type() == at::kHalf) {
Is it broken just for inputs, or for weights as well? Also, don't we have a reduced_precision_float predicate?
I do not think we have convolutions that mix input and weight types, as IIRC this fails checks.
Are you referring to reduced precision reductions? Those are for cuBLAS matmuls only and control the use of split-k.
Or if you're referring to automatic mixed-precision/AMP that's doing casts of weight and input before the computation.
      }
    }
  }
  if (!input.is_cuda() || !cudnn_enabled) {
Q: Shouldn't this check be at the very beginning of the function?
Yes, trying not to create a merge conflict for myself: #163171
Note that we still need this to disable cuDNN for this case in general, as that is preferable to silent numerical incorrectness.
@pytorchbot cherry-pick --onto release/2.9 --fixes #163539 -c critical
Approving to unblock, but if the sizes in the issue don't require 64-bit indexing, how will that help?
Good catch! Fumbled an edit here.
Ouch, so this should disable cuDNN for all 3D cases?
aten/src/ATen/native/Convolution.cpp
Outdated
    if (cudnn_version >= 90800) {
      if (input.scalar_type() == at::kBFloat16 || input.scalar_type() == at::kHalf) {
        for (auto val : weight.sizes()) {
          if (val != 1) {
This is also checking the channel dimensions of weight, not just the filter? Also, this disables all cuDNN convolutions, not just 3D?
That's a good point, in theory it would only need to be disabled for channels-first cases...
    if (cudnn_version >= 90800) {
      if (cudnn_conv_suggest_memory_format(input, weight) == at::MemoryFormat::Contiguous &&
          (input.scalar_type() == at::kBFloat16 || input.scalar_type() == at::kHalf)) {
        for (int i = 2; i < weight.dim(); i++) {
does it affect both 2d and 3d, or 3d only?
I'll check with the team to confirm, but my understanding is that it's 3D only, as the user who reported it said they could work around it with a 2D conv.
so you need to also check weight.dim() == 5?
@pytorchmergebot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
…ze != 1 for cuDNN 9.8+ (#163581) To workaround #163539. Still confirming whether 9.10 is affected. The original test states that the convolution is "large," but note that the input size does not appear to require 64-bit indexing. Pull Request resolved: #163581 Approved by: https://github.com/ngimel, https://github.com/malfet Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com> (cherry picked from commit e2817ac)
Cherry picking #163581: The cherry pick PR is at #164027 and it is linked with issue #163539. The following tracker issues are updated. Details for Dev Infra team: Raised by workflow job.
…ze != 1 for cuDNN 9.8+ (#164027) [cuDNN][Convolution] Disable cuDNN for 3D convolutions with kernel size != 1 for cuDNN 9.8+ (#163581) To workaround #163539. Still confirming whether 9.10 is affected. The original test states that the convolution is "large," but note that the input size does not appear to require 64-bit indexing. Pull Request resolved: #163581 Approved by: https://github.com/ngimel, https://github.com/malfet (cherry picked from commit e2817ac) Co-authored-by: Eddie Yan <eddiey@nvidia.com> Co-authored-by: Nikita Shulga <2453524+malfet@users.noreply.github.com>
To workaround #163539
Still confirming whether 9.10 is affected. The original test states that the convolution is "large," but note that the input size does not appear to require 64-bit indexing.
cc @csarofeen @ptrblck @xwang233 @msaroufim @jerryzh168 @jgong5 @mingfeima @XiaobingSuper @sanchitintel @ashokei @jingxu10 @voznesenskym @penguinwu @EikanWang @Guobing-Chen @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov @coconutruben