Description
🐛 Bug
_ConvNd.reset_parameters initializes the weight parameters as follows:
init.kaiming_uniform_(self.weight, a=math.sqrt(5))
See https://github.com/pytorch/pytorch/blob/v1.8.1/torch/nn/modules/conv.py#L115
This is not consistent with the documentation:
https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html?highlight=conv2d#torch.nn.Conv2d
I believe the sqrt(5) is incorrect. It was introduced in PR #9038, which appears to be a documentation change. I am not aware of anything in the literature that suggests sqrt(5) should be used.
Originally reported here:
https://www.youtube.com/watch?v=4u8FxNEDUeg&t=6220s
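To make the effect of the `a=math.sqrt(5)` argument concrete, here is a minimal sketch of the sampling bound that `kaiming_uniform_` derives from `a`. It assumes the function's default settings (`mode='fan_in'`, `nonlinearity='leaky_relu'`). The helper name `kaiming_uniform_bound` and the `Conv2d(3, 16, kernel_size=3)` layer shape are hypothetical, chosen only for illustration. The sketch is written in plain Python rather than calling into PyTorch.

```python
import math


def kaiming_uniform_bound(fan_in: int, a: float = 0.0) -> float:
    """Bound b such that weights are drawn from U(-b, b).

    Hypothetical helper that mirrors the arithmetic of
    torch.nn.init.kaiming_uniform_ with mode='fan_in' and
    nonlinearity='leaky_relu' (assumed defaults).
    """
    gain = math.sqrt(2.0 / (1.0 + a ** 2))  # leaky_relu gain for negative slope a
    std = gain / math.sqrt(fan_in)          # target standard deviation
    return math.sqrt(3.0) * std             # uniform bound with that std


# Illustrative layer: Conv2d(3, 16, kernel_size=3) -> fan_in = 3 * 3 * 3
fan_in = 3 * 3 * 3

bound_sqrt5 = kaiming_uniform_bound(fan_in, a=math.sqrt(5))
bound_default = kaiming_uniform_bound(fan_in, a=0.0)

print("bound with a=sqrt(5):", bound_sqrt5)
print("bound with a=0:      ", bound_default)
print("1/sqrt(fan_in):      ", 1.0 / math.sqrt(fan_in))
```

With `a = sqrt(5)` the gain is sqrt(2/6) = sqrt(1/3), so the bound reduces algebraically to 1/sqrt(fan_in). With `a = 0` it is sqrt(6/fan_in). Either value can be compared against the formula given in the linked documentation.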
To Reproduce
Steps to reproduce the behavior:
Expected behavior
Environment
Please copy and paste the output from our environment collection script (or fill out the checklist below manually).
You can get the script and run it with:
wget https://raw.githubusercontent.com/pytorch/pytorch/master/torch/utils/collect_env.py
# For security purposes, please check the contents of collect_env.py before running it.
python collect_env.py
- PyTorch Version (e.g., 1.0):
- OS (e.g., Linux):
- How you installed PyTorch (conda, pip, source):
- Build command you used (if compiling from source):
- Python version:
- CUDA/cuDNN version:
- GPU models and configuration:
- Any other relevant information:
