Migrate thnn_conv2d from THC to ATen by peterbell10 · Pull Request #63428 · pytorch/pytorch · GitHub

Conversation

@peterbell10
Collaborator

@peterbell10 peterbell10 commented Aug 17, 2021

Closes gh-24644, closes gh-24645

[ghstack-poisoned]
@facebook-github-bot
Contributor

facebook-github-bot commented Aug 17, 2021

💊 CI failures summary and remediations

As of commit d525b31 (more details on the Dr. CI page):


  • 4/4 failures possibly* introduced in this PR
    • 1/4 non-scanned failure(s)

🕵️ 3 new failures recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See GitHub Actions build linux-bionic-py3.8-gcc9-coverage / test (default, 1, 2, linux.2xlarge) (1/3)

Step: "Test PyTorch"

2021-08-20T00:54:21.1750308Z Build left local git repository checkout dirty
2021-08-20T00:54:11.4198113Z real	65m2.425s
2021-08-20T00:54:11.4198504Z user	154m9.866s
2021-08-20T00:54:11.4198806Z sys	9m15.562s
2021-08-20T00:54:11.4199155Z + assert_git_not_dirty
2021-08-20T00:54:11.4200552Z + [[ linux-bionic-py3.8-gcc9-coverage-default != *rocm* ]]
2021-08-20T00:54:11.4201605Z + [[ linux-bionic-py3.8-gcc9-coverage-default != *xla* ]]
2021-08-20T00:54:11.4202354Z ++ git status --porcelain
2021-08-20T00:54:21.1747545Z + git_status='?? third_party/breakpad/'
2021-08-20T00:54:21.1748444Z + [[ -n ?? third_party/breakpad/ ]]
2021-08-20T00:54:21.1749439Z + echo 'Build left local git repository checkout dirty'
2021-08-20T00:54:21.1750308Z Build left local git repository checkout dirty
2021-08-20T00:54:21.1751249Z + echo 'git status --porcelain:'
2021-08-20T00:54:21.1752071Z git status --porcelain:
2021-08-20T00:54:21.1752902Z + echo '?? third_party/breakpad/'
2021-08-20T00:54:21.1753574Z ?? third_party/breakpad/
2021-08-20T00:54:21.1754123Z + exit 1
2021-08-20T00:54:21.1754604Z + cleanup
2021-08-20T00:54:21.1755058Z + retcode=1
2021-08-20T00:54:21.1755554Z + set +x
2021-08-20T00:54:21.1756157Z =================== sccache compilation log ===================
2021-08-20T00:54:21.1927719Z =========== If your build fails, please take a look at the log above for possible reasons ===========

See GitHub Actions build linux-xenial-cuda11.1-py3.6-gcc7 / test (default, 1, 2, linux.8xlarge.nvidia.gpu) (2/3)

Step: "Test PyTorch"

2021-08-20T04:55:37.8355658Z Build left local git repository checkout dirty
2021-08-20T04:55:37.2628682Z real	90m46.063s
2021-08-20T04:55:37.2629427Z user	146m37.825s
2021-08-20T04:55:37.2629798Z sys	35m31.740s
2021-08-20T04:55:37.2630216Z + assert_git_not_dirty
2021-08-20T04:55:37.2631773Z + [[ linux-xenial-cuda11.1-py3.6-gcc7-default != *rocm* ]]
2021-08-20T04:55:37.2633920Z + [[ linux-xenial-cuda11.1-py3.6-gcc7-default != *xla* ]]
2021-08-20T04:55:37.2635110Z ++ git status --porcelain
2021-08-20T04:55:37.8351327Z + git_status='?? third_party/breakpad/'
2021-08-20T04:55:37.8352916Z + [[ -n ?? third_party/breakpad/ ]]
2021-08-20T04:55:37.8354353Z + echo 'Build left local git repository checkout dirty'
2021-08-20T04:55:37.8355658Z Build left local git repository checkout dirty
2021-08-20T04:55:37.8357098Z + echo 'git status --porcelain:'
2021-08-20T04:55:37.8358423Z git status --porcelain:
2021-08-20T04:55:37.8359709Z + echo '?? third_party/breakpad/'
2021-08-20T04:55:37.8360745Z ?? third_party/breakpad/
2021-08-20T04:55:37.8361546Z + exit 1
2021-08-20T04:55:37.8362249Z + cleanup
2021-08-20T04:55:37.8363027Z + retcode=1
2021-08-20T04:55:37.8363744Z + set +x
2021-08-20T04:55:37.8364622Z =================== sccache compilation log ===================
2021-08-20T04:55:37.8631814Z =========== If your build fails, please take a look at the log above for possible reasons ===========

See GitHub Actions build win-vs2019-cuda10.1-py3 / test (default, 1, 1, windows.8xlarge.nvidia.gpu) (3/3)

Step: "Run test scripts"

2021-08-20T01:01:26.1118929Z AssertionError: True is not false
2021-08-20T01:01:26.1108143Z ======================================================================
2021-08-20T01:01:26.1109359Z FAIL [103.970s]: test_fd_sharing (__main__.TestMultiprocessing)
2021-08-20T01:01:26.1110725Z ----------------------------------------------------------------------
2021-08-20T01:01:26.1111815Z Traceback (most recent call last):
2021-08-20T01:01:26.1114526Z   File "test_multiprocessing.py", line 327, in test_fd_sharing
2021-08-20T01:01:26.1115641Z     self._test_sharing(repeat=TEST_REPEATS)
2021-08-20T01:01:26.1116465Z   File "test_multiprocessing.py", line 283, in _test_sharing
2021-08-20T01:01:26.1117042Z     test_receive()
2021-08-20T01:01:26.1117689Z   File "test_multiprocessing.py", line 278, in test_receive
2021-08-20T01:01:26.1118351Z     self.assertFalse(p.is_alive())
2021-08-20T01:01:26.1118929Z AssertionError: True is not false
2021-08-20T01:01:26.1119281Z 
2021-08-20T01:01:26.1222364Z ----------------------------------------------------------------------
2021-08-20T01:01:26.1223830Z Ran 2 tests in 216.577s
2021-08-20T01:01:26.1224343Z 
2021-08-20T01:01:26.1224984Z FAILED (failures=1)
2021-08-20T01:01:26.1225502Z 
2021-08-20T01:01:26.1226298Z Generating XML reports...
2021-08-20T01:01:26.1228575Z Generated XML report: test-reports\python-unittest\test_multiprocessing\TEST-TestMultiprocessing-20210820005749.xml
2021-08-20T01:01:26.4308770Z Traceback (most recent call last):
2021-08-20T01:01:26.4309925Z   File "run_test.py", line 1095, in <module>

ci.pytorch.org: 1 failed


This comment was automatically generated by Dr. CI. Follow this link to opt out of these comments for your Pull Requests.

Please report bugs/suggestions to the (internal) Dr. CI Users group.


@ezyang ezyang removed their request for review August 19, 2021 03:57
@peterbell10 peterbell10 added the "module: porting" label (Issues related to porting TH/THNN legacy to ATen native) Aug 19, 2021
Collaborator

@ngimel ngimel left a comment

This looks great @peterbell10, I left a few comments.

}

// Allow for empty batch size but not other dimensions
const bool valid_empty = c10::multiply_integers(in_sizes.slice(1)) != 0;
@ngimel (Collaborator)

This is not right if ndim is 3 (then you need in_sizes.numel() > 0)? Btw, do we still allow 3d inputs, or is this code path never exercised?
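
For illustration, a minimal sketch of the distinction being raised, assuming the 3-d path were kept. is_batched is a hypothetical name; the rest follows the diff context above, and the thread below resolves this by dropping the 3-d path instead:

const auto in_sizes = input.sizes();
const bool is_batched = in_sizes.size() == 4;
// A 4-d input is batched, so only the non-batch dims must be non-empty;
// a 3-d input has no batch dim, so every dim must be non-empty.
const bool valid_empty = is_batched
    ? c10::multiply_integers(in_sizes.slice(1)) != 0
    : c10::multiply_integers(in_sizes) != 0;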

@peterbell10 (Author)

You're right, _convolution enforces the shape check before this is ever called. Should I just remove all the dim=3 handling code?

@ngimel (Collaborator)

Yes, please do

if (weight.defined()) {
const auto w_sizes = weight.sizes();
int64_t nInputPlane = w_sizes[1];
if (w_sizes.size() == 2) {
@ngimel (Collaborator)

Can this happen? I thought the weight shape is checked to have 4 dims before dispatching to slow conv?
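
For context, a hedged sketch of the legacy THNN convention this question refers to, as far as I can tell: a 2-d weight is the im2col-flattened view [nOutputPlane, nInputPlane * kH * kW], so the input-plane count has to be divided back out. kH and kW are assumed to be in scope:

const auto w_sizes = weight.sizes();
int64_t nInputPlane = w_sizes[1];
if (w_sizes.size() == 2) {
  // Flattened 2-d weight: dim 1 packs nInputPlane * kH * kW together.
  nInputPlane /= (kH * kW);
}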

int64_t k_ = 1;

// Do GEMM (note: this is a bit confusing because gemm assumes column-major matrices)
if (bias.defined()) {
@ngimel (Collaborator)

Is it just expanding bias to the output size? Should we be able to do the same with .expand() and output_n.copy_, without the gemm call and ones?
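
For illustration, a minimal sketch of this suggestion, assuming output_n is the per-sample output view of shape [nOutputPlane, outputHeight, outputWidth] and bias is [nOutputPlane]:

// Broadcast-copy the bias into the output instead of running gemm
// against a ones vector: view bias as [nOutputPlane, 1, 1] and expand.
output_n.copy_(bias.view({-1, 1, 1}).expand_as(output_n));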

@peterbell10 (Author)

Is it alright if I leave algorithmic changes to a follow-up PR?

@ngimel (Collaborator)

Sure! Then I only have questions about 3d inputs and 2d weights.

}

// Do Bias:
if (grad_bias.defined()) {
@ngimel (Collaborator)

This should also be doable with just a sum call over the appropriate dimensions?
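
For illustration, a minimal sketch of the sum-based alternative, assuming grad_output has shape [batch, nOutputPlane, outputHeight, outputWidth] and ignoring any accumulation the real kernel may do:

// Reduce over every dim except the channel dim; a single reduction
// replaces the per-sample gemv against a ones vector.
grad_bias.copy_(grad_output.sum(c10::IntArrayRef{0, 2, 3}));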

@ngimel
Collaborator

ngimel commented Aug 20, 2021

@ngimel has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@ngimel merged this pull request in b4f5809.

@facebook-github-bot facebook-github-bot deleted the gh/peterbell10/115/head branch August 24, 2021 14:16

Labels

cla signed, Merged, module: porting (Issues related to porting TH/THNN legacy to ATen native), open source
