KEMBAR78
[PyTorch] RFC: ExclusivelyOwned by swolchok · Pull Request #59419 · pytorch/pytorch · GitHub
Skip to content

Conversation

@swolchok
Copy link
Contributor

@swolchok swolchok commented Jun 3, 2021

Stack from ghstack:

This introduces ExclusivelyOwned, which allows isolated
pieces of code that can make ownership guarantees to opt out of
reference counting operations on intrusive_ptr and Tensor
entirely. To elaborate, if you know you are the exclusive owner of an
intrusive_ptr or Tensor, moving it into an ExclusivelyOwned will
avoid performing atomic reference counting operations at destruction
time. The documentation comment should provide sufficient explanation; please request changes if not.

Differential Revision: D28885314

NOTE FOR REVIEWERS: This PR has internal Facebook specific changes or comments, please review them on Phabricator!

This introduces ExclusivelyOwned, which allows isolated
pieces of code that can make ownership guarantees to opt out of
reference counting operations on `intrusive_ptr` and `Tensor`
entirely. To elaborate, if you know you are the exclusive owner of an
`intrusive_ptr` or `Tensor`, moving it into an `ExclusivelyOwned` will
avoid performing atomic reference counting operations at destruction
time. The documentation comment should provide sufficient explanation; please request changes if not.

Differential Revision: [D28885314](https://our.internmc.facebook.com/intern/diff/D28885314/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D28885314/)!

[ghstack-poisoned]
@facebook-github-bot
Copy link
Contributor

facebook-github-bot commented Jun 3, 2021

💊 CI failures summary and remediations

As of commit 770a4fe (more details on the Dr. CI page):


  • 2/2 failures possibly* introduced in this PR
    • 1/2 non-scanned failure(s)

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_xenial_cuda11_1_cudnn8_py3_gcc7_build (1/1)

Step: "Build" (full log | diagnosis details | 🔁 rerun)

Jun 14 18:55:18 rm: cannot remove '/var/lib/jenkins/sccache_error.log': No such file or directory
Jun 14 18:55:18 ++++ extract_trap_cmd
Jun 14 18:55:18 ++++ printf '%s\n' ''
Jun 14 18:55:18 +++ printf '%s\n' cleanup
Jun 14 18:55:18 ++ trap -- '
Jun 14 18:55:18 cleanup' EXIT
Jun 14 18:55:18 ++ [[ pytorch-linux-xenial-cuda11.1-cudnn8-py3-gcc7-build != *pytorch-win-* ]]
Jun 14 18:55:18 ++ which sccache
Jun 14 18:55:18 ++ sccache --stop-server
Jun 14 18:55:18 ++ true
Jun 14 18:55:18 ++ rm /var/lib/jenkins/sccache_error.log
Jun 14 18:55:18 rm: cannot remove '/var/lib/jenkins/sccache_error.log': No such file or directory
Jun 14 18:55:18 ++ true
Jun 14 18:55:18 ++ [[ -n '' ]]
Jun 14 18:55:18 ++ [[ pytorch-linux-xenial-cuda11.1-cudnn8-py3-gcc7-build == *rocm* ]]
Jun 14 18:55:18 ++ SCCACHE_ERROR_LOG=/var/lib/jenkins/sccache_error.log
Jun 14 18:55:18 ++ SCCACHE_IDLE_TIMEOUT=1200
Jun 14 18:55:18 ++ RUST_LOG=sccache::server=error
Jun 14 18:55:18 ++ sccache --start-server
Jun 14 18:55:18 sccache: Starting the server...
Jun 14 18:55:18 ++ sccache --zero-stats
Jun 14 18:55:18 Compile requests                      0

1 job timed out:

  • pytorch_linux_xenial_cuda11_1_cudnn8_py3_gcc7_build

This comment was automatically generated by Dr. CI (expand for details).Follow this link to opt-out of these comments for your Pull Requests.

Please report bugs/suggestions to the (internal) Dr. CI Users group.

Click here to manually regenerate this comment.

@swolchok
Copy link
Contributor Author

swolchok commented Jun 3, 2021

an open question: would it be nice if Tensor factory functions could optionally return this? We can't simply change the return type because that would make this opt-out instead of opt-in, but perhaps we could have a template wrapper callOwned<someFactory>() that had the same signature as someFactory, except that it returned ExclusivelyOwned<Tensor>. We could statically check (probably in debug builds only to save resources) that someFactory was actually one of the factory functions defined in native_functions.yaml (e.g., by putting their addresses in a constexpr array of void*and writing a constexpr function to loop over the array) to prevent misuse.


static repr_type nullRepr() {
// REVIEW: do we need to be able to distinguish an empty
// ExclusivelyOwned<Tensor> from an undefined Tensor?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems unlikely

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternately, we could delete the default constructor from ExclusivelyOwned (I am not sure how painful that will be for consumers)

#ifndef NDEBUG
// Needed to pass the debug assertions in ~intrusive_ptr_target.
toDestroy->refcount_ = 0;
toDestroy->weakcount_ = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this is extra invariant on release_resources that the refcount/weakcount may not be accurate

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if release_resources cares about the refcount/weakcount....why?
We can do the stores in all builds if you think it will make a difference to anyone, I would just rather not have the code size and memory traffic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh, wait, it can't care about the refcount & weakcount because they're private to intrusive_ptr_target!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think release_resources ever relies on it, but it's a near miss, as some PyObject binding stuff uses use_count to decide if it has exclusive ownership or not. Just need to doc and socialize this, no action needed.

// End the lifetime of repr_ without executing its dtor, since we
// already did specialized destruction for the exclusively-owned
// case in destroyOwned!
dummy_ = '\0';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

C unions never destruct anyway, though?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

normally, we would have to write an explicit destructor call for exactly that reason.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess my Q is, couldn't we just omit nulling it out here entirely?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I think you're right. the lifetime of the union is going to end anyway. will fix.

swolchok added 2 commits June 9, 2021 10:55
This introduces ExclusivelyOwned, which allows isolated
pieces of code that can make ownership guarantees to opt out of
reference counting operations on `intrusive_ptr` and `Tensor`
entirely. To elaborate, if you know you are the exclusive owner of an
`intrusive_ptr` or `Tensor`, moving it into an `ExclusivelyOwned` will
avoid performing atomic reference counting operations at destruction
time. The documentation comment should provide sufficient explanation; please request changes if not.

Differential Revision: [D28885314](https://our.internmc.facebook.com/intern/diff/D28885314/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D28885314/)!

[ghstack-poisoned]
This introduces ExclusivelyOwned, which allows isolated
pieces of code that can make ownership guarantees to opt out of
reference counting operations on `intrusive_ptr` and `Tensor`
entirely. To elaborate, if you know you are the exclusive owner of an
`intrusive_ptr` or `Tensor`, moving it into an `ExclusivelyOwned` will
avoid performing atomic reference counting operations at destruction
time. The documentation comment should provide sufficient explanation; please request changes if not.

Differential Revision: [D28885314](https://our.internmc.facebook.com/intern/diff/D28885314/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D28885314/)!

[ghstack-poisoned]
// have this traits class, and instead we should just have two
// explicit specializations of ExclusivelyOwned? There would be some
// small amount of code duplication, but perhaps it would be easier to
// understand?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have an opinion here, you should do what you feel is best.

Copy link
Contributor

@ezyang ezyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Speculatively approving this. I don't know how this is going to pan out overall in the codebase but it seems like a useful tool to have in the toolbox

swolchok added 2 commits June 14, 2021 10:32
This introduces ExclusivelyOwned, which allows isolated
pieces of code that can make ownership guarantees to opt out of
reference counting operations on `intrusive_ptr` and `Tensor`
entirely. To elaborate, if you know you are the exclusive owner of an
`intrusive_ptr` or `Tensor`, moving it into an `ExclusivelyOwned` will
avoid performing atomic reference counting operations at destruction
time. The documentation comment should provide sufficient explanation; please request changes if not.

Differential Revision: [D28885314](https://our.internmc.facebook.com/intern/diff/D28885314/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D28885314/)!

[ghstack-poisoned]
This introduces ExclusivelyOwned, which allows isolated
pieces of code that can make ownership guarantees to opt out of
reference counting operations on `intrusive_ptr` and `Tensor`
entirely. To elaborate, if you know you are the exclusive owner of an
`intrusive_ptr` or `Tensor`, moving it into an `ExclusivelyOwned` will
avoid performing atomic reference counting operations at destruction
time. The documentation comment should provide sufficient explanation; please request changes if not.

Differential Revision: [D28885314](https://our.internmc.facebook.com/intern/diff/D28885314/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D28885314/)!

[ghstack-poisoned]
@facebook-github-bot
Copy link
Contributor

This pull request has been merged in 79d7c15.

@facebook-github-bot facebook-github-bot deleted the gh/swolchok/253/head branch June 19, 2021 14:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants