[ATen core IR] Register additional ATen operators as core by SS-JIA · Pull Request #110882 · pytorch/pytorch · GitHub

Conversation

@SS-JIA (Contributor) commented Oct 9, 2023

Stack from ghstack (oldest at bottom):

## Context

For more context, please refer to [this PyTorch forums post](https://dev-discuss.pytorch.org/t/defining-the-core-aten-opset/1464).

This PR registers some additional ATen operators as `core`, based on feedback from the forums post as well as experience from adding other core ATen decompositions.

The ATen operators registered as core in this diff, with the associated reasoning, are listed below (a short numerical-precision sketch for the `expm1`/`log*` entries follows the table):

ATen op | reasoning
--|--
aten::atan2 | This operator often maps to a hardware intrinsic.
aten::diagonal | There is no straightforward decomposition for this operator.
aten::empty_like | Decomposition for this operator would require `as_strided` to retain the strides of the input tensor, which should be avoided.
aten::expm1 | This operator often maps to a hardware intrinsic; furthermore, decomposing it will negatively impact the numerical precision of the output.
aten::full_like | Decomposition for this operator would require `as_strided` to retain the strides of the input tensor, which should be avoided.
aten::log10 | This operator often maps to a hardware intrinsic; furthermore, decomposing it will negatively impact the numerical precision of the output.
aten::log1p | This operator often maps to a hardware intrinsic; furthermore, decomposing it will negatively impact the numerical precision of the output.
aten::log2 | This operator often maps to a hardware intrinsic; furthermore, decomposing it will negatively impact the numerical precision of the output.
aten::pow.Scalar_Tensor | This is a Scalar variant of pow.Tensor_Tensor, which is part of core.
aten::resize | There is no valid decomposition for this operator.
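
As a rough illustration of the numerical-precision argument made for `expm1`, `log1p`, `log10`, and `log2`, here is a small sketch (not part of the PR; the printed values are typical float32 results):

```python
import torch

x = torch.tensor([1e-7], dtype=torch.float32)

# Dedicated ops stay accurate for small inputs ...
print(torch.log1p(x))       # ~1.0000e-07
print(torch.expm1(x))       # ~1.0000e-07

# ... while naive decompositions lose precision because 1 + x rounds to the
# nearest representable float32 value before log/exp ever sees it.
print(torch.log(1 + x))     # ~1.1921e-07
print(torch.exp(x) - 1)     # ~1.1921e-07
```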

@pytorch-bot bot commented Oct 9, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/110882

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 3788d1f with merge base 2e57b1e:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.


SS-JIA added a commit that referenced this pull request Oct 9, 2023
@SS-JIA (Contributor, Author) commented Oct 9, 2023

@pytorchbot merge

@pytorch-bot bot added the ciflow/trunk (Trigger trunk jobs on your pull request) label Oct 9, 2023
@pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status.

```yaml
SparseCsrCPU, SparseCsrCUDA: empty_like_sparse_csr
NestedTensorCPU, NestedTensorCUDA: empty_like_nested
autogen: empty_like.out
tags: core
```
@peterbell10 (Collaborator) commented Oct 9, 2023
Why would `empty_like` require `as_strided`? Can't we use the reference which decomposes to `aten.empty_permuted`?

The same applies to full_like if #110234 gets merged.
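
For reference, a minimal sketch of what such an `empty_permuted`-based decomposition could look like; the helper below is hypothetical and only approximates the actual `_refs` reference, under the assumption that the dimension ordering can be recovered from the input's strides:

```python
import torch

def empty_like_via_empty_permuted(t: torch.Tensor) -> torch.Tensor:
    # Hypothetical helper (not the actual _refs implementation): recover the
    # dimension ordering from the input's strides, outermost (largest stride)
    # first, then allocate a fresh dense tensor with that physical layout.
    # Tie-breaking for equal or zero strides is glossed over here.
    physical_layout = sorted(range(t.dim()), key=t.stride, reverse=True)
    return torch.empty_permuted(t.shape, physical_layout,
                                dtype=t.dtype, device=t.device)

x = torch.empty(2, 3, 4).permute(0, 2, 1)   # non-contiguous "like" input
y = empty_like_via_empty_permuted(x)
assert y.shape == x.shape and y.stride() == x.stride()
```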

@SS-JIA (Contributor, Author)

@peterbell10 that is definitely an oversight on my part, my apologies. In fact, I see that `empty_like` is already being decomposed as part of the core ATen decomposition table, so for `empty_like` this is definitely an erroneous classification.

I've created #110924 to fix this; please take a look. As part of that diff, I also registered the reference implementation of `full_like` as a core decomposition, with some slight changes to the implementation to use torch ops instead of prims.

For my own understanding, is it possible that the input tensor has weird strides that can't translate cleanly to a permutation?

@SS-JIA (Contributor, Author)

Copy-pasting from my comment on the PR:

Okay, never mind. I see why `full_like` cannot be registered as a decomposition this way (it is due to recursion with the `fill.Scalar` decomposition). So this diff will just remove `empty_like` as a core decomposition, and when #110234 lands we can decompose `full_like` as well.
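
A small schematic of the recursion being described, with hypothetical decomposition bodies (not the actual `_refs`/`_decomp` registrations):

```python
import torch

def full_like_decomp(t: torch.Tensor, fill_value):
    # full_like lowered to empty_like + fill ...
    out = torch.empty_like(t)
    return torch.fill(out, fill_value)

def fill_decomp(t: torch.Tensor, value):
    # ... but fill.Scalar itself lowers back to full_like, so registering both
    # decompositions would recurse without terminating.
    return torch.full_like(t, value)
```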

@peterbell10 (Collaborator)

> For my own understanding, is it possible that the input tensor has weird strides that can't translate cleanly to a permutation?

`torch.empty_like` in eager always returns a dense tensor, and only the ordering of dimensions is influenced by the "like" tensor argument.
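
A quick illustration of that behavior; the strides shown in the comments are what a typical eager build produces:

```python
import torch

x = torch.empty(4, 6)[:, ::2]   # non-dense view: shape (4, 3), strides (6, 2)
y = torch.empty_like(x)         # freshly allocated dense tensor
print(x.shape, x.stride())      # torch.Size([4, 3]) (6, 2)
print(y.shape, y.stride())      # torch.Size([4, 3]) (3, 1) -- dense, dim order kept
```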

SS-JIA added a commit that referenced this pull request Oct 10, 2023
## Context

Following up from @peterbell10's comments on #110882.

* `empty_like` was erroneously classified as `core`.
* Register the `refs` implementation of `full_like` as a decomposition, and add it to core. Change it to use `torch.fill` instead of `prims.fill`.


[ghstack-poisoned]
pytorchmergebot pushed a commit that referenced this pull request Oct 10, 2023

## Context

Following up from @peterbell10's comments on #110882.

* `empty_like` was erroneously classified as `core`. It can be decomposed using `empty_permuted` and in fact is currently decomposed this way in the core decomposition table.
* `full_like` can be similarly decomposed to `full_permuted` once #110234 lands. The current decomposition into `empty_like` and `fill` doesn't work because `fill` decomposes to `full_like`, resulting in a recursive loop.
Pull Request resolved: #110924
Approved by: https://github.com/kirklandsign
@facebook-github-bot deleted the gh/SS-JIA/174/head branch October 13, 2023 14:22
pytorchmergebot pushed a commit that referenced this pull request Apr 2, 2024
Same reason as `log10`. `log2` is a core ATen op; we should not decompose it. As #110882 suggested, it often maps to a hardware intrinsic; furthermore, decomposing it will negatively impact the numerical precision of the output.

Pull Request resolved: #123112
Approved by: https://github.com/peterbell10
sanketpurandare pushed a commit to sanketpurandare/pytorch that referenced this pull request Apr 22, 2024
Same reason as `log10`. `log2` is a core ATen op; we should not decompose it. As pytorch#110882 suggested, it often maps to a hardware intrinsic; furthermore, decomposing it will negatively impact the numerical precision of the output.

Pull Request resolved: pytorch#123112
Approved by: https://github.com/peterbell10

Labels: ciflow/trunk, Merged, release notes: export
