catch tensor.numel() == 0 in nan detector #140741
This appears to be a diff that was exported from phabricator, but the PR author does not have sufficient permissions to run CI. @HarounH, please do step 2 of internal wiki to get write access so you do not need to get CI approvals in the future. If you think this is a mistake, please contact the Pytorch Dev Infra team. |
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/140741
Note: Links to docs will display an error until the docs builds have been completed.
❗ 1 Active SEVs
There is 1 currently active SEV. If your PR is affected, please view it below:
✅ No Failures
As of commit 2b117c7 with merge base 27c7caf.
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
|
This pull request was exported from Phabricator. Differential Revision: D65956095 |
34eb847 to 9575ada
9575ada to f2c96df
f2c96df to e076a8f
Summary: Pull Request resolved: pytorch#140741 Test Plan: idk what i'm doing here, someone help Reviewed By: shuqiangzhang Differential Revision: D65956095
e076a8f to 2b117c7
|
@pytorchbot merge -f "merging" |
|
You need to provide a reason for using force merge, in the format @pytorchbot merge -f 'Explanation'.
|
@pytorchbot merge -f "no CI failure" |
Merge started
Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
Context: we are now sometimes passing an empty tensor through the system (it's an edge case), and it seems to cause all_reduce to segfault, which is unexpected to me.
Deep Shah and Pavan identified the issue; I'm just pushing for a fix :)
Test Plan: idk what i'm doing here, someone help
Reviewed By: shuqiangzhang
Differential Revision: D65956095
Pull Request resolved: pytorch#140741
Approved by: https://github.com/shuqiangzhang
Context: we are now sometimes passing an empty tensor through the system (it's an edge case), and it seems to cause all_reduce to segfault, which is unexpected to me.
Deep Shah and Pavan identified the issue; I'm just pushing for a fix :)
Test Plan: idk what i'm doing here, someone help
Reviewed By: shuqiangzhang
Differential Revision: D65956095
cc @H-Huang @awgu @kwen2501 @wanchaol @fegin @fduwjj @wz337 @wconstab @d4l3k @c-p-i-o
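For readers skimming the thread, here is a rough sketch of the kind of guard the title describes: return early when the input has zero elements, before doing any NaN scanning. This is not the actual patch. It is a plain-Python stand-in where a list of floats plays the role of the tensor buffer, `len(values)` plays the role of `tensor.numel()`, and the name `contains_nan` is hypothetical. In pure Python the scan would be a no-op on an empty list anyway; the explicit early return is only there to mirror the idea of skipping the check entirely for empty inputs.

```python
import math
from typing import Sequence


def contains_nan(values: Sequence[float]) -> bool:
    """Return True if any element is NaN.

    Hypothetical stand-in for a NaN detector: `values` models a flat
    tensor buffer, and len(values) models tensor.numel().
    """
    # Guard for the empty-input edge case: nothing to scan, so skip
    # the check instead of handing a zero-element buffer to it.
    if len(values) == 0:
        return False

    # Scan every element; math.isnan is the stdlib NaN test for floats.
    return any(math.isnan(v) for v in values)


if __name__ == "__main__":
    print(contains_nan([]))                   # empty input, guard path
    print(contains_nan([0.0, 1.5]))           # no NaN present
    print(contains_nan([1.0, float("nan")]))  # NaN present
```

The point of putting the size check first is that the empty case never reaches the scanning code at all, which is the behavior the PR title asks for.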