use all_weights instead of _parameters in _flat_weights in rnn by ngimel · Pull Request #15766 · pytorch/pytorch · GitHub

Conversation

ngimel (Collaborator) commented Jan 6, 2019

Fixes #15749

soumith (Member) commented Jan 6, 2019

failing tests

ngimel (Collaborator, Author) commented Jan 6, 2019

The build and ASAN failures seem unrelated (moreutils couldn't be installed due to a temporary failure in name resolution), but the cuda9_cudnn7_py2_test failure looks real; I'll take a look next week.

ngimel (Collaborator, Author) commented Jan 7, 2019

The remaining macOS failures are unrelated.

facebook-github-bot (Contributor) left a comment

@soumith is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

apaszke (Contributor) left a comment

Well it really looks like a hot patch for a much deeper issue, which is that weight norm and similar don't play well with our module design... Not being able to reliably access parameters in any way other than by name is just bad.
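
To make the objection concrete, here is a minimal sketch (not from this PR) of the behavior being discussed: `weight_norm` removes the original entry from `_parameters` and registers `_g`/`_v` parameters in its place, so any code that looks the weight up by its original name in `_parameters` misses it, whereas `all_weights` resolves each name with `getattr` and still finds the recomputed tensor.

```python
# Minimal sketch, not from the PR: weight_norm removes the original Parameter
# and registers weight_ih_l0_g / weight_ih_l0_v in its place.
import torch.nn as nn
from torch.nn.utils import weight_norm

gru = nn.GRU(input_size=4, hidden_size=8)
print('weight_ih_l0' in gru._parameters)   # True: registered at construction

weight_norm(gru, name='weight_ih_l0')
print('weight_ih_l0' in gru._parameters)   # False: only the _g and _v parameters remain

# all_weights resolves each name with getattr, so it still sees the recomputed tensor.
flat = [w for layer in gru.all_weights for w in layer]
print(len(flat))                           # 4 tensors for a single-layer GRU with bias
```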

@ailzhang ailzhang added this to the 1.0.1 milestone Jan 8, 2019
@ngimel ngimel deleted the rnn_weight_norm branch January 16, 2019 19:50
@soumith soumith added the cherry-picked This PR was cherry-picked onto a release branch from master label Jan 17, 2019
soumith pushed a commit that referenced this pull request Jan 17, 2019
Summary:
Fixes #15749
Pull Request resolved: #15766

Differential Revision: D13592320

Pulled By: soumith

fbshipit-source-id: 6c3805f576c3df5a2da8bef1e4305eda379718df
soumith pushed a commit that referenced this pull request Jan 18, 2019
Summary:
Fixes #15749
Pull Request resolved: #15766

Differential Revision: D13592320

Pulled By: soumith

fbshipit-source-id: 6c3805f576c3df5a2da8bef1e4305eda379718df
facebook-github-bot pushed a commit that referenced this pull request Feb 22, 2019
Summary:
**WIP**

Attempt 2 at #14831

This adds `nn.LSTM` to the jit standard library. Necessary changes to the module itself are detailed in comments. The main limitation is the lack of a true `PackedSequence`; instead, this PR uses an ordinary `tuple` to stand in for `PackedSequence`.

Most of the new code in `rnn.py` is copied from `nn.RNNBase` into `nn.LSTM` to specialize it for LSTM, since for LSTM `hx` is a `Tuple[Tensor, Tensor]` rather than just a `Tensor` as in the other RNN modules.

As a hack, it adds an internal annotation `@_parameter_list` to mark that a function returns all the parameters of a module. The weights for `RNN` modules are passed to the corresponding op as a `List[Tensor]`. In Python this list has to be gathered dynamically, since Parameters can be moved from CPU to GPU or be deleted and replaced (e.g. if someone calls `weight_norm` on their module, #15766), but in the JIT parameter lists are immutable; hence a builtin to handle this differently in Python and in the JIT.
Pull Request resolved: #15744

Differential Revision: D14173198

Pulled By: driazati

fbshipit-source-id: 4ee8113159b3a8f29a9f56fe661cfbb6b30dffcd
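
For illustration, a hedged sketch of the eager-mode gathering the summary above describes, assuming the internal `_all_weights` name list; the actual `@_parameter_list` builtin and JIT handling live elsewhere and work differently:

```python
# Illustrative only: the real gathering happens inside torch/nn/modules/rnn.py,
# and the JIT treats the parameter list as an immutable builtin instead.
import torch
import torch.nn as nn
from typing import List

def gather_flat_weights(rnn: nn.RNN) -> List[torch.Tensor]:
    # Re-resolve every weight by attribute name at call time, so replacements
    # made after construction (e.g. by weight_norm) are picked up.
    return [getattr(rnn, name) for names in rnn._all_weights for name in names]

rnn = nn.RNN(input_size=3, hidden_size=5)
print([tuple(w.shape) for w in gather_flat_weights(rnn)])
```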

Labels

cherry-picked (This PR was cherry-picked onto a release branch from master) · open source

Development

Successfully merging this pull request may close these issues.

Dynamic GRU weight
