Migrate thnn_conv_depthwise2d from THC to ATen #62006
Conversation
Closes gh-24646, gh-24647

There is no `TensorIterator` equivalent to these kernels, so this is just migrating the existing kernels over to the ATen style.

I've benchmarked for contiguous tensors with this script:

```
import torch

shape = (10, 10, 100, 100)
x = torch.randn(*shape, device='cuda')
w = torch.randn((10, 1, 5, 5), device='cuda')

for _ in range(100):
    torch.nn.functional.conv2d(x, w, groups=10)
```

and similarly for backwards. I see these as the same to within measurement error.

|                   | Master (us) | This PR (us) |
|------------------:|:-----------:|:------------:|
| Forward           | 133.5       | 133.6        |
| Backward (input)  | 1,102       | 1,119        |
| Backward (weight) | 2,220       | 2,217        |
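The backward benchmark script isn't shown; here is a minimal sketch of how it could be timed under the same setup (the use of `requires_grad`, `backward()`, and `torch.cuda.synchronize()` is an assumption about the harness, not the script actually used for the numbers in the table above):

```
import torch

shape = (10, 10, 100, 100)
x = torch.randn(*shape, device='cuda', requires_grad=True)
w = torch.randn((10, 1, 5, 5), device='cuda', requires_grad=True)

for _ in range(100):
    # groups == in_channels is the depthwise case this PR migrates.
    out = torch.nn.functional.conv2d(x, w, groups=10)
    # Backward populates both the input and weight gradients.
    out.backward(torch.ones_like(out))
    x.grad = None
    w.grad = None

# Make sure all queued CUDA work has finished before reading a timer.
torch.cuda.synchronize()
```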
💊 CI failures summary and remediations

As of commit c01b02e (more details on the Dr. CI page and at hud.pytorch.org/pr/62006):

🕵️ 4 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

| Job | Step | Action |
|---|---|---|
| | Install dependencies | 🔁 rerun |
```
conv_depthwise2d_backward_out(
    *self, *grad_output, grad_input, *weight,
    kernel_size[1], kernel_size[0],
    stride[1], stride[0],
```
this is scary, can we add a test that would have caught it?
I've added a gradcheck test that fails for the previous version.
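For reference, a minimal standalone sketch of the kind of gradcheck test that would catch swapped height/width arguments (the shapes, stride, and padding here are illustrative assumptions; the actual test added in the PR lives in the PyTorch test suite and may differ):

```
import torch
from torch.autograd import gradcheck

# float64 inputs and a non-square kernel with asymmetric stride/padding,
# so that mixing up the height and width arguments changes the result
# and the numerical and analytical gradients no longer agree.
x = torch.randn(2, 4, 8, 8, device='cuda', dtype=torch.double, requires_grad=True)
w = torch.randn(4, 1, 3, 5, device='cuda', dtype=torch.double, requires_grad=True)

def depthwise(x, w):
    # groups == in_channels is the depthwise case exercised by this PR.
    return torch.nn.functional.conv2d(x, w, stride=(1, 2), padding=(1, 2), groups=4)

assert gradcheck(depthwise, (x, w))
```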
```
 static inline bool cudnn_conv_use_channels_last(const at::Tensor& input, const at::Tensor& weight) {
   // disable NHWC for float64 input.
-  if (!detail::getCUDAHooks().compiledWithCuDNN() ||
+  if (!at::detail::getCUDAHooks().compiledWithCuDNN() ||
```
Without this I get compilation errors because it's looking in the wrong namespace:

```
/home/peter/git/pytorch/aten/src/ATen/native/ConvUtils.h(87): error: namespace "at::native::detail" has no member "getCUDAHooks"
```
Test failures look unrelated.

@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@ngimel

This pull request has been reverted by acaac70.
Reverting this PR since it breaks |
Stack from ghstack:
Differential Revision: D29883676