[jit] Add LSTM to standard library by driazati · Pull Request #14831 · pytorch/pytorch · GitHub

Conversation

@driazati
Contributor

@driazati driazati commented Dec 6, 2018

[WIP]

Adds support for torch.nn.LSTM in Script

  • De-sugaring for self._parameters.values() to a Tensor[]
    • add aten::List for int, float, and Tensor
  • LSTM accepts either a PackedSequence (which is represented as a Tuple[Tensor, Optional[Tensor]]) or a Tensor for its input. While the Python version is unaffected, the TorchScript LSTM only supports the Tuple[Tensor, Optional[Tensor]] form for its input (see the sketch after this list).
    • aten::_wrap_tuple and aten::_unwrap_tuple are used to provide the correct types to the Script compiler, but cannot actually be run
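
As a rough sketch of the calling convention described above (the shapes and lengths below are illustrative, and a plain (data, batch_sizes) tuple stands in for PackedSequence on the TorchScript side):

import torch
from torch.nn.utils.rnn import pack_padded_sequence

lstm = torch.nn.LSTM(input_size=8, hidden_size=16)

# Eager Python accepts either a padded Tensor or a PackedSequence ...
x = torch.randn(5, 3, 8)                        # (seq_len, batch, input_size)
packed = pack_padded_sequence(x, lengths=[5, 4, 2])
out, (h_n, c_n) = lstm(packed)

# ... while the TorchScript LSTM described here only takes the tuple form,
# i.e. (data, batch_sizes) as a Tuple[Tensor, Optional[Tensor]].
tuple_input = (packed.data, packed.batch_sizes)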

@facebook-github-bot facebook-github-bot added the oncall: jit Add this issue/PR to JIT oncall triage queue label Dec 6, 2018
Contributor


What is the reason to use FunctionSchema instead of a string?

Contributor


What in the LSTM cell is the reason for this?

// self._parameters.values()
std::shared_ptr<SugaredValue> call(SourceRange loc, Method& caller, at::ArrayRef<NamedValue> inputs, at::ArrayRef<NamedValue> attributes, size_t n_binders) override {
  std::vector<Value*> params;
  const auto& param_list = module_->get_parameters();
Collaborator


You might also need to check whether the parameter is a buffer or not.
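
For context, a quick eager-mode illustration of the parameter/buffer distinction this comment refers to (the module and attribute names are made up for the example):

import torch
import torch.nn as nn

class M(nn.Module):
    def __init__(self):
        super(M, self).__init__()
        # Registered as a Parameter: shows up in self._parameters
        self.weight = nn.Parameter(torch.randn(3))
        # Registered as a buffer: shows up in self._buffers, not self._parameters
        self.register_buffer('running_mean', torch.zeros(3))

m = M()
print(list(m._parameters.keys()))  # ['weight']
print(list(m._buffers.keys()))     # ['running_mean']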

return x


def _unwrap_tuple(x):
Collaborator


Hmm, do we need to think more about this? I think we ultimately want to remove these unwrap functions if possible; adding more and more might make them hard to revert in the future. At the least we should think about how to provide an API (or something) so users don't need to care about this.

Contributor Author


If we want to support it at the language level instead of having these hacks, I would imagine it would look something like:

@torch.jit.script
def fn(x):
    # type: (Union[int, float]) -> Union[int, float]
    if isinstance(x, int):
        return 3
    else:
        return 3.5

And, similarly to #14533, we would only emit the branches for the types seen at compile time.

Collaborator


Hmm, it's hard to control those metaprogramming conditions since the value might only be known at runtime. It might be OK for us, as a temporary hack, to add wrap/unwrap tuples to support the LSTM module; we just need to be careful not to stack too much on top of them.

}),
Operator(
// "aten::_get_packed_sequence(Tensor a, Tensor b) -> (Tensor, Tensor?)", // TODO: using this causes a segfault
FunctionSchema(
Collaborator


I have the same doubt: why are you using FunctionSchema instead of a string?

Contributor


He said in person: there's no support for Tuples yet when registering operators.

Contributor

@facebook-github-bot facebook-github-bot left a comment


@driazati has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Contributor

@zdevito zdevito left a comment


There needs to be more explanation of what this is trying to accomplish. There seems to be major copy-pasting and adding of specific features to get this to work. There is likely a better way.

};
}),
Operator(
// "aten::_get_packed_sequence(Tensor a, Tensor b) -> (Tensor, Tensor?)", // TODO: using this causes a segfault
Contributor


You should fix the segfault :)

}
};


Contributor


I doubt it is a good idea to add more SugaredValue types simply to support this one module. This needs a more thorough explanation of what is going on, and why these changes are necessary.

def __init__(self, *args, **kwargs):
    super(LSTM, self).__init__('LSTM', *args, **kwargs)

@weak_script_method
Contributor


This seems like a massive copy-paste from somewhere. What is going on?

Contributor

@eellison eellison left a comment


I understand that the issue is that the function can be called in Python with either a Tensor or a PackedSequence. TorchScript will only be able to handle the PackedSequence case.

I think another approach would be:

@torch.jit.script
def fn(x):
    # type: (Tuple[Tensor, Optional[Tensor]])
    if isinstance(x, torch.Tensor):
        ...  # x is typed as Tensor here, and the if branch will be constant prop'd away
    else:
        ...  # x is a tuple here

This would allow the TorchScript code to compile without changing the Python code.

}

template <typename T>
Operation listList(const Node* node) {
Contributor


There is already a Noop defined on line 37


#define CREATE_LIST_OPS(decl_type, c_type)                                                  \
  Operator("aten::len(" decl_type "[] a) -> int", listLen<Shared<c_type>>),                 \
  Operator("aten::list(" decl_type "[] a) -> " decl_type "[]", listList<Shared<c_type>>),   \
Contributor


I'm not sure we want to support list yet... If we do add it, you need to add aliasing information


@zou3519
Contributor

zou3519 commented Dec 18, 2018

Btw this will conflict with #15225

@driazati driazati closed this Jan 3, 2019
facebook-github-bot pushed a commit that referenced this pull request Feb 22, 2019
Summary:
**WIP**

Attempt 2 at #14831

This adds `nn.LSTM` to the jit standard library. Necessary changes to the module itself are detailed in comments. The main limitation is the lack of a true `PackedSequence`; instead, this PR uses an ordinary `tuple` to stand in for `PackedSequence`.

Most of the new code in `rnn.py` is copied from `nn.RNNBase` into `nn.LSTM` to specialize it for LSTM, since for LSTM `hx` is a `Tuple[Tensor, Tensor]` rather than just a `Tensor` as in the other RNN modules.
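
For reference, a short eager-mode sketch of the `hx` difference described above (sizes are illustrative):

import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=4, hidden_size=8, num_layers=1)
rnn = nn.RNN(input_size=4, hidden_size=8, num_layers=1)

x = torch.randn(5, 2, 4)             # (seq_len, batch, input_size)
h0 = torch.zeros(1, 2, 8)            # (num_layers, batch, hidden_size)
c0 = torch.zeros(1, 2, 8)

out, (h_n, c_n) = lstm(x, (h0, c0))  # LSTM: hx is a Tuple[Tensor, Tensor]
out2, h_n2 = rnn(x, h0)              # other RNN modules: hx is a single Tensor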

As a hack it adds an internal annotation `@_parameter_list` to mark that a function returns all the parameters of a module. The weights for `RNN` modules are passed to the corresponding op as a `List[Tensor]`. In Python this list has to be gathered dynamically, since Parameters could be moved from CPU to GPU or be deleted and replaced (e.g. if someone calls `weight_norm` on their module, #15766), but in the JIT parameter lists are immutable, hence a builtin that handles this differently in Python and in the JIT.
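
As a rough eager-mode illustration of what "gathered dynamically" means here (the helper below is made up for this sketch, not the actual internal API):

import torch
import torch.nn as nn
from typing import List

def gather_weights(rnn):
    # type: (nn.Module) -> List[torch.Tensor]
    # Re-read the parameters on every call so that changes such as .cuda()/.cpu()
    # moves or weight_norm re-registering parameters are picked up; a TorchScript
    # parameter list, by contrast, is fixed at compile time.
    return [p for p in rnn.parameters()]

lstm = nn.LSTM(input_size=4, hidden_size=8)
weights = gather_weights(lstm)  # the List[Tensor] handed to the RNN op in eager mode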
Pull Request resolved: #15744

Differential Revision: D14173198

Pulled By: driazati

fbshipit-source-id: 4ee8113159b3a8f29a9f56fe661cfbb6b30dffcd
