[JIT][AD]matmul/dropout #17523
Conversation
The non-convolution changes look good. In person: let's merge the non-convolution parts of this PR but hold off on changing _convolution. It is not clear that dropping the really complicated _convolution op into differentiation will allow any backend to do meaningful optimization.
aten/src/ATen/native/Convolution.cpp (outdated)
```cpp
// impl_index 6 - 12 are in _convolution_nogroup
auto returned_tuple = at::_convolution_nogroup(
    input, weight, bias, params.stride, params.padding, params.dilation, params.transposed, params.output_padding);
output = std::get<0>(returned_tuple);
```
std::tie would be nicer
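For readers unfamiliar with the suggestion: `std::tie` binds each tuple element to a named variable in one statement instead of indexing with `std::get`. A minimal, self-contained illustration of the pattern (a generic sketch, not the PR's code):

```cpp
#include <iostream>
#include <string>
#include <tuple>

// Stand-in for any function that returns a tuple of results.
std::tuple<int, std::string> make_result() {
  return std::make_tuple(42, std::string("output"));
}

int main() {
  // std::get<N> style: keep the tuple around and index each element.
  auto returned_tuple = make_result();
  int code = std::get<0>(returned_tuple);

  // std::tie style: bind every element to a named variable in one
  // statement, which is what the review comment suggests.
  std::string name;
  std::tie(code, name) = make_result();

  std::cout << code << " " << name << "\n";
  return 0;
}
```

Applied to the snippet above, the outputs of at::_convolution_nogroup would be bound directly to named tensors instead of going through returned_tuple.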
@ailzhang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Summary:
- Add AD formula for _convolution & matmul & dropout
- add prim::range, fixes #17483

Example:
```
dim = 3
x = range(dim)
```

Pull Request resolved: pytorch/pytorch#17523
Differential Revision: D14254002
Pulled By: ailzhang
fbshipit-source-id: ba60d77b047db347929b72beca2623fb26aec957
```cpp
if (n->matches(
        "aten::dropout(Tensor input, float p, bool train) -> Tensor")) {
  auto train = n->get<bool>(attr::train).value();
```
You never checked that this is a constant. If train is a computed value, this will throw. You have to do n->matches("...", attr::train) above.
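A sketch of the adjustment this comment asks for, applied to the snippet above (assuming the `Node::matches(signature, const_inputs)` overload the reviewer refers to; not the exact committed code):

```cpp
// Passing attr::train as a const-input makes matches() succeed only when
// `train` is a compile-time constant, so the get<bool>(...).value() call
// below cannot throw on a computed value.
if (n->matches(
        "aten::dropout(Tensor input, float p, bool train) -> Tensor",
        /*const_inputs=*/attr::train)) {
  bool train = n->get<bool>(attr::train).value();
  // ... build the appropriate gradient for train / eval mode ...
}
```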
also nit: no need to assign to a local just to return
```cpp
      };
    }),
    Operator(
        "prim::range(int n) -> int[]",
```
Hmm, I'm a bit ambivalent about adding this, since those objects have completely different characteristics in Python.
```python
for i in range(2 ** 64):
    ...
```
is fine in Python 3, but we're using Python 2 semantics in TorchScript? That doesn't seem like the best choice.
Yeah, I wasn't sure about this, but it seems very useful at the moment since it saves me from writing unnecessary for loops in TorchScript.
An alternative I thought about was list comprehensions, which I think would also be super useful in general; I went for prim::range because it's quicker to implement. Possibly we could leave this feature undocumented and remove it once list comprehensions are ready?
I'm fine with adding a special primitive, just don't call it range. If we use Python names we should match Python semantics.
Yeah, that makes sense, I will rename it in a follow-up PR. :) Thanks!
```python
           train: bool):
    mask = torch.empty_like(input)
    mask.bernoulli_(1 - p)
    res = mask * input / (1.0 - p)
```
Why are we replacing the forward code with this? It's not necessary, and avoiding it might open up more optimization opportunities and be less bug-prone.
When I wrote this I had two options:
- changing https://github.com/pytorch/pytorch/blob/master/aten/src/ATen/native/Dropout.cpp#L54 to return the mask as well
- writing a simple dropout implementation in TorchScript.
I went for the latter mainly because it's straightforward to do in a few lines.
I think the design question here is: do we want to keep an AD formula as close as possible to its ATen counterpart (for easier future deduplication and consistency), or do we want to keep AD formulas math-only and simple? In some cases our ATen functions consider backends/special sizes etc., and I'm not certain whether we would want to include those in AD formulas. Let me know your thoughts on this; I would love to have some guidelines on it. Thanks!
Fair enough, I forgot that you need the mask! Let's keep it like that.
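For readers following the thread, the math the replacement forward expresses, and why the mask has to be produced in the forward pass, is roughly the following. This is a standalone ATen/C++ sketch for illustration, not the TorchScript code in the PR:

```cpp
#include <ATen/ATen.h>
#include <utility>

// Inverted dropout forward: sample a keep-mask with probability (1 - p)
// and rescale so the expected value of the output matches the input.
std::pair<at::Tensor, at::Tensor> dropout_forward(
    const at::Tensor& input, double p) {
  at::Tensor mask = at::empty_like(input);
  mask.bernoulli_(1 - p);
  at::Tensor output = input * mask / (1.0 - p);
  return {output, mask};  // the mask must be kept around for backward
}

// Backward: once the mask is fixed, dropout is linear in the input, so
// the gradient is simply the incoming gradient masked and rescaled.
at::Tensor dropout_backward(
    const at::Tensor& grad_output, const at::Tensor& mask, double p) {
  return grad_output * mask / (1.0 - p);
}
```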
```diff
  auto element = getBoolItem(list->elements(), idx);
- push(stack, std::move(element));
+ push(stack, element);
```
Why did we lose the move here? Seems like a regression?
Hmm, IIRC this was done by the clang-tidy hook. It was suggesting that moving a bool doesn't make a perf difference, and I thought that made sense. I can bring it back if this might cause a regression.
Ohhh, I hadn't noticed it's a bool. Yeah, in that case it shouldn't be there.
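For context on the clang-tidy suggestion: moving a trivially-copyable type such as bool is just a copy, so the std::move adds nothing. A tiny standalone illustration (not the interpreter code):

```cpp
#include <utility>
#include <vector>

int main() {
  std::vector<bool> stack;
  bool element = true;

  // For a trivially-copyable type like bool, std::move does not buy
  // anything: both calls end up copying the value, which is why the
  // clang-tidy hook suggested dropping the std::move.
  stack.push_back(std::move(element));
  stack.push_back(element);
  return 0;
}
```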
@zdevito FWIW adding convolution might still be nice because we could wrap the whole CNN in a single DifferentiableGraph and save on constructing autograd graphs (assuming our …)