
Dynamic GRU weight #15749

@PetrochukM

🐛 Bug

For weight norm or pruning, this library needs to support dynamically updating weight tensors. My attempt to do so brought up several interesting stack traces, reproduced below.
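For context, the update pattern in the repro imitates what torch.nn.utils.weight_norm does: it deregisters the named parameter and reassigns a plain tensor before each forward pass. A minimal sketch of that usage (hedged; applying it to a cuDNN GRU is exactly the situation that runs into the errors below):

import torch

# Hedged context sketch: weight_norm splits the named parameter into
# `weight_hh_l0_g` / `weight_hh_l0_v` and re-registers `weight_hh_l0` as a
# plain tensor recomputed from them via a pre-forward hook.
gru = torch.nn.GRU(16, 16)
gru = torch.nn.utils.weight_norm(gru, name='weight_hh_l0')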

To Reproduce

Run this script:

import torch

# Constants
size = 16
batch_size = 4
seq_len = 8
device = torch.device('cuda')
input_ = torch.randn(seq_len, batch_size, size).to(device)
hidden = torch.randn(1, batch_size, size).to(device)

gru = torch.nn.GRU(size, size).to(device)

# Update weight with a `torch.tensor`
# NOTE: Similar weight update to torch.nn.utils.weight_norm
data = gru.weight_hh_l0.data
del gru._parameters['weight_hh_l0']
setattr(gru, 'weight_hh_l0', torch.tensor(data))

# Optional call to resolve parameter shapes
gru.flatten_parameters()

# Run forward pass
_, output = gru(input_, hidden)

Without gru.flatten_parameters():

UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  setattr(gru, 'weight_hh_l0', torch.tensor(data))
Traceback (most recent call last):
  File "ddd.py", line 15, in <module>
    _, output = gru(input_, hidden)
  File "/home/michaelp/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/michaelp/.local/lib/python3.6/site-packages/torch/nn/modules/rnn.py", line 179, in forward
    self.dropout, self.training, self.bidirectional, self.batch_first)
RuntimeError: num_ptrs == (num_parameters * (has_biases ? 1 : 2)) ASSERT FAILED at /pytorch/aten/src/ATen/native/cudnn/RNN.cpp:1190, please report a bug to PyTorch.
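Following the UserWarning's recommended construction gives a hedged variant of the update above; it silences the copy warning, but the pointer-count assertion presumably persists, since weight_hh_l0 is still absent from gru._parameters:

# Hedged variant per the UserWarning: clone().detach() instead of torch.tensor.
# This only removes the warning; the cuDNN assertion above is unaffected.
data = gru.weight_hh_l0.data
del gru._parameters['weight_hh_l0']
setattr(gru, 'weight_hh_l0', data.clone().detach())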

With gru.flatten_parameters():

UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  setattr(gru, 'weight_hh_l0', torch.tensor(data))
Traceback (most recent call last):
  File "ddd.py", line 14, in <module>
    gru.flatten_parameters()
  File "/home/michaelp/.local/lib/python3.6/site-packages/torch/nn/modules/rnn.py", line 113, in flatten_parameters
    self.batch_first, bool(self.bidirectional))
RuntimeError: MatrixRef: ArrayRef size 3 not divisible by stride 4
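The "ArrayRef size 3 not divisible by stride 4" message lines up with the parameter count: deleting weight_hh_l0 from _parameters leaves three per-layer tensors where flatten_parameters expects four. A hedged diagnostic:

# Hedged diagnostic: the reassigned weight lives on the module as a plain
# attribute, not in _parameters, so flatten_parameters sees 3 tensors, not 4.
print(list(gru._parameters.keys()))
# e.g. ['weight_ih_l0', 'bias_ih_l0', 'bias_hh_l0']  -- weight_hh_l0 missing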

Expected behavior

That I can update the GRU weight with a new torch.tensor without an error.
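Until then, a hedged stopgap is to re-register the updated tensor as an nn.Parameter, which restores the four-per-layer count. This defeats the point for weight norm, though, where the weight must remain a plain tensor recomputed each forward:

# Hedged workaround: wrapping in nn.Parameter puts the tensor back into
# _parameters, so flatten_parameters and the cuDNN path see 4 weights again.
gru.weight_hh_l0 = torch.nn.Parameter(data.clone())
gru.flatten_parameters()
_, output = gru(input_, hidden)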

Environment

Collecting environment information...
PyTorch version: 1.0.0
Is debug build: No
CUDA used to build PyTorch: 9.0.176

OS: Ubuntu 18.04.1 LTS
GCC version: (Ubuntu 7.3.0-16ubuntu3) 7.3.0
CMake version: version 3.10.2

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration:
GPU 0: Tesla P100-PCIE-16GB
GPU 1: Tesla P100-PCIE-16GB
GPU 2: Tesla P100-PCIE-16GB
GPU 3: Tesla P100-PCIE-16GB

Nvidia driver version: 390.30
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.7.1.3

Versions of relevant libraries:
[pip] Could not collect
[conda] Could not collect
