KEMBAR78
Update custom Function preserve torch function when inputs returned as-is by soulitzer · Pull Request #109825 · pytorch/pytorch · GitHub
Skip to content

Conversation

@soulitzer
Copy link
Contributor

@soulitzer soulitzer commented Sep 21, 2023

@soulitzer soulitzer requested a review from albanD as a code owner September 21, 2023 21:07
@pytorch-bot
Copy link

pytorch-bot bot commented Sep 21, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/109825

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit b184a54 with merge base 255d1a7 (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

soulitzer added a commit that referenced this pull request Sep 21, 2023
};

auto view_as_fn = [](const at::Tensor& x, const at::Tensor& y) -> at::Tensor {
pybind11::gil_scoped_acquire gil;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not actually sure we need this since we are calling from Python and we don't release the gil, but jvp_user_function does it.

pybind11::gil_scoped_acquire gil;
py::object py_x = py::cast(x);
py::function py_view_as_method = py_x.attr("view_as");
return py_view_as_method(y).cast<at::Tensor>();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you do a quick local check with %timeit to make sure this is not making this case significantly slower please?
If it does, we should use our direct APIs via THPVariable_wrap and cie to avoid the pybind overhead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a quick test, and it seems that this PR adds 0.3-4 us of overhead, but avoiding use of pybind can cut 0.1 us of that away.

  • Current: 2.66 us
  • Without pybind: 2.95 us
  • With pybind 3.05 us

I've updated this PR to avoid using pybind to avoid that overhead.

soulitzer added a commit that referenced this pull request Oct 1, 2023
@soulitzer soulitzer requested a review from albanD October 1, 2023 18:02
Copy link
Collaborator

@albanD albanD left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM!

@soulitzer
Copy link
Contributor Author

@pytorchbot merge

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Oct 2, 2023
@soulitzer soulitzer added release notes: autograd release notes category topic: bug fixes topic category labels Oct 2, 2023
@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@pytorchmergebot
Copy link
Collaborator

Merge failed

Reason: 1 mandatory check(s) failed. The first few are:

Dig deeper by viewing the failures on hud

Details for Dev Infra team Raised by workflow job

Failing merge rule: Core Maintainers

@soulitzer
Copy link
Contributor Author

Failures look legit, going to investigate.

@soulitzer soulitzer requested a review from wz337 as a code owner October 4, 2023 14:54
@soulitzer
Copy link
Contributor Author

soulitzer commented Oct 4, 2023

Failures look legit, going to investigate.

This failures is a expected result of the bug fix. The failure occurs in a test that tests that DDP properly fails when used with LazyModule. The difference is now that we now fail earlier with a different error message.

When DDP tries to replicate the module it calls Broadcast, which is a custom Function that returns input as-is. LazyModules have parameters of instance UninitializedTensorMixin which has a __torch_function__ method that errors if that unitialized parameter is used. With this PR we no longer incorrectly bypass this check, leading to a different error to be observed in the test.

soulitzer added a commit that referenced this pull request Oct 4, 2023
@soulitzer
Copy link
Contributor Author

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@PaliC
Copy link
Contributor

PaliC commented Oct 5, 2023

@pytorchbot revert -m "causing a plethora of internal failures" -c ghfirst

@pytorchmergebot
Copy link
Collaborator

@pytorchbot successfully started a revert job. Check the current status here.
Questions? Feedback? Please reach out to the PyTorch DevX Team

@pytorchmergebot
Copy link
Collaborator

@soulitzer your PR has been successfully reverted.

pytorchmergebot added a commit that referenced this pull request Oct 5, 2023
…turned as-is (#109825)"

This reverts commit 4e73eee.

Reverted #109825 on behalf of https://github.com/PaliC due to causing a plethora of internal failures ([comment](#109825 (comment)))
soulitzer added a commit to soulitzer/pytorch that referenced this pull request Oct 6, 2023
…r… (pytorch#110679)

Summary:
…eturned as-is

reland of pytorch#109825 (comment)

Opening this without ghstack to do codev. In our PR, we changed the signature of `_wrap_outputs`. There is some internal code that calls `_wrap_outputs` directly, so we also need to update that callsite.


Differential Revision: D49983654

Pulled By: soulitzer
pytorchmergebot pushed a commit that referenced this pull request Oct 7, 2023
#110679)

…eturned as-is

reland of #109825 (comment)

Opening this without ghstack to do codev. In our PR, we changed the signature of `_wrap_outputs`. There is some internal code that calls `_wrap_outputs` directly, so we also need to update that callsite.
Pull Request resolved: #110679
Approved by: https://github.com/albanD
@facebook-github-bot facebook-github-bot deleted the gh/soulitzer/234/head branch October 8, 2023 14:23
soulitzer added a commit to soulitzer/pytorch that referenced this pull request Oct 9, 2023
…r… (pytorch#110679)

Summary:
…eturned as-is

reland of pytorch#109825 (comment)

Opening this without ghstack to do codev. In our PR, we changed the signature of `_wrap_outputs`. There is some internal code that calls `_wrap_outputs` directly, so we also need to update that callsite.


Differential Revision: D49983654

Pulled By: soulitzer
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/trunk Trigger trunk jobs on your pull request Merged release notes: autograd release notes category Reverted topic: bug fixes topic category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants