🐛 Bug
For weight norm or pruning, this library needs to support dynamically updating a module's weight tensors. My attempt to do so brought up several interesting stack traces:
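For context, `torch.nn.utils.weight_norm` performs this kind of update: it removes the named parameter, registers `_g`/`_v` parameters in its place, and re-assigns the recomputed weight as a plain tensor before each forward pass. A minimal sketch (CPU, no forward pass) of what registration looks like on a GRU:

```python
import torch

gru = torch.nn.GRU(16, 16)

# weight_norm removes `weight_hh_l0` from the parameter registry and
# replaces it with `weight_hh_l0_g` (magnitude) and `weight_hh_l0_v`
# (direction); the weight itself becomes a recomputed plain tensor.
gru = torch.nn.utils.weight_norm(gru, name='weight_hh_l0')

params = dict(gru.named_parameters())
print('weight_hh_l0_g' in params)  # True
print('weight_hh_l0_v' in params)  # True
```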
To Reproduce
Run this script:
```python
import torch

# Constants
size = 16
batch_size = 4
seq_len = 8

device = torch.device('cuda')
input_ = torch.randn(seq_len, batch_size, size).to(device)
hidden = torch.randn(1, batch_size, size).to(device)
gru = torch.nn.GRU(size, size).to(device)

# Update weight with a `torch.tensor`
# NOTE: Similar weight update as torch.nn.utils.weight_norm
data = gru.weight_hh_l0.data
del gru._parameters['weight_hh_l0']
setattr(gru, 'weight_hh_l0', torch.tensor(data))

# Optional call to resolve parameter shapes
gru.flatten_parameters()

# Run forward pass
_, output = gru(input_, hidden)
```

Without `gru.flatten_parameters()`:
```
UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  setattr(gru, 'weight_hh_l0', torch.tensor(data))
Traceback (most recent call last):
  File "ddd.py", line 15, in <module>
    _, output = gru(input_, hidden)
  File "/home/michaelp/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/michaelp/.local/lib/python3.6/site-packages/torch/nn/modules/rnn.py", line 179, in forward
    self.dropout, self.training, self.bidirectional, self.batch_first)
RuntimeError: num_ptrs == (num_parameters * (has_biases ? 1 : 2)) ASSERT FAILED at /pytorch/aten/src/ATen/native/cudnn/RNN.cpp:1190, please report a bug to PyTorch.
```
With `gru.flatten_parameters()`:

```
UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  setattr(gru, 'weight_hh_l0', torch.tensor(data))
Traceback (most recent call last):
  File "ddd.py", line 14, in <module>
    gru.flatten_parameters()
  File "/home/michaelp/.local/lib/python3.6/site-packages/torch/nn/modules/rnn.py", line 113, in flatten_parameters
    self.batch_first, bool(self.bidirectional))
RuntimeError: MatrixRef: ArrayRef size 3 not divisible by stride 4
```
Expected behavior
I should be able to update the GRU weight with a new `torch.tensor` without error.
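A workaround that avoids both errors (a sketch, not an officially recommended fix): re-register the updated tensor as an `nn.Parameter` instead of a plain tensor, so it stays in the module's parameter registry and the flat-weight bookkeeping remains consistent. Shown on CPU for portability; the same pattern applies on CUDA.

```python
import torch

size, batch_size, seq_len = 16, 4, 8
input_ = torch.randn(seq_len, batch_size, size)
hidden = torch.randn(1, batch_size, size)
gru = torch.nn.GRU(size, size)

# Re-register the updated weight as an nn.Parameter rather than a plain
# tensor, so nn.Module.__setattr__ puts it back in gru._parameters.
data = gru.weight_hh_l0.data
del gru._parameters['weight_hh_l0']
gru.weight_hh_l0 = torch.nn.Parameter(data.clone())

gru.flatten_parameters()
_, output = gru(input_, hidden)
print(output.shape)  # torch.Size([1, 4, 16])
```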
Environment
```
Collecting environment information...
PyTorch version: 1.0.0
Is debug build: No
CUDA used to build PyTorch: 9.0.176

OS: Ubuntu 18.04.1 LTS
GCC version: (Ubuntu 7.3.0-16ubuntu3) 7.3.0
CMake version: version 3.10.2

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration:
GPU 0: Tesla P100-PCIE-16GB
GPU 1: Tesla P100-PCIE-16GB
GPU 2: Tesla P100-PCIE-16GB
GPU 3: Tesla P100-PCIE-16GB

Nvidia driver version: 390.30
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.7.1.3

Versions of relevant libraries:
[pip] Could not collect
[conda] Could not collect
```