DOC Adds code comment for _ConvNd.reset_parameters #58931
Conversation
💊 CI failures summary and remediations — as of commit 0e77bc5 (more details on the Dr. CI page):

🕵️ 3 new failures recognized by patterns. The following CI failures do not appear to be due to upstream breakages:

| Job | Step | Action |
|---|---|---|
| Unknown | | 🔁 rerun |
This comment was automatically generated by Dr. CI.
Thanks for adding the clarification!
```python
def reset_parameters(self) -> None:
    # Setting a=sqrt(5) in kaiming_uniform is the same as initializing with
    # uniform(-1/sqrt(k), 1/sqrt(k)), where k = weight.size(1) * prod(*kernel_size)
```
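The equivalence stated in the comment can be checked algebraically. As a sketch (using the standard `kaiming_uniform_` formulas, where samples are drawn from `U(-bound, bound)` with `bound = gain * sqrt(3 / fan_in)` and `gain = sqrt(2 / (1 + a**2))`; the fan-in value below is an arbitrary example):

```python
import math

# kaiming_uniform_ with negative slope a samples from U(-bound, bound), where
#   gain  = sqrt(2 / (1 + a**2))
#   bound = gain * sqrt(3 / fan_in)
# For a conv weight, fan_in = weight.size(1) * prod(kernel_size) = k.
a = math.sqrt(5)
gain = math.sqrt(2.0 / (1.0 + a ** 2))  # = sqrt(1/3)

k = 60 * 3 * 8  # example fan-in: (in_channels / groups) * kernel_h * kernel_w
bound = gain * math.sqrt(3.0 / k)

# With a = sqrt(5), the sqrt(3) factors cancel and the bound collapses to 1/sqrt(k):
assert math.isclose(bound, 1.0 / math.sqrt(k))
```

In other words, `sqrt(2 / (1 + 5)) * sqrt(3 / k) = sqrt(1/3) * sqrt(3) / sqrt(k) = 1 / sqrt(k)`, which is exactly the `uniform(-1/sqrt(k), 1/sqrt(k))` initialization the comment describes.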
Can you also add a line to the effect of: "See for more details.", where the link points to Soumith's comment explaining the calculation (#15314 (comment))?
@jbschlosser has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@jbschlosser merged this pull request in 8130f2f.
Summary: Fixes pytorch#55741 by adding a comment regarding the behavior of `kaiming_uniform_`. The docstring is correct in this case. For example:

```python
import math

import matplotlib.pyplot as plt
import torch
import torch.nn as nn

in_channels = 120
groups = 2
kernel = (3, 8)
m = nn.Conv2d(in_channels=in_channels, groups=groups, out_channels=100, kernel_size=kernel)

k = math.sqrt(groups / (in_channels * math.prod(kernel)))
print(f"k: {k:0.6f}")
print(f"min weight: {m.weight.min().item():0.6f}")
print(f"max weight: {m.weight.max().item():0.6f}")
```

outputs:

```
k: 0.026352
min weight: -0.026352
max weight: 0.026352
```

And when we plot the distribution, it is uniform with the correct bounds:

```python
_ = plt.hist(m.weight.detach().numpy().ravel())
```

(histogram image: weight values uniformly distributed within the computed bounds)

Pull Request resolved: pytorch#58931
Reviewed By: anjali411
Differential Revision: D28689863
Pulled By: jbschlosser
fbshipit-source-id: 98eebf265dfdaceed91f1991fc4b1592c0b3cf37