[PyTorch] AOTI: add CPU fast path in aoti_torch_empty_strided #110877
This seems to reduce benchmark time by 15-20%. Supersedes D49835545.
Differential Revision: [D49974460](https://our.internmc.facebook.com/intern/diff/D49974460/)
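For readers unfamiliar with the pattern named in the title, here is a minimal sketch of a "CPU fast path" in plain C++. It is not the code from this PR and uses no PyTorch APIs. `DeviceType`, `Buffer`, `registry`, `allocate_cpu`, `empty_strided_generic`, and `empty_strided` are all hypothetical stand-ins. The assumption is that the generic path pays for an indirect lookup on every call, while the fast path checks for CPU up front and calls the CPU allocator directly.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for illustration; none of these are PyTorch types.
enum class DeviceType : int32_t { CPU = 0, CUDA = 1 };

struct Buffer {
  std::vector<int64_t> sizes;
  std::vector<int64_t> strides;
  std::vector<float> storage;
  bool via_fast_path = false;  // records which path produced the buffer
};

// Elements needed so that every strided index stays in bounds
// (assumes non-negative strides).
inline std::size_t storage_size(const std::vector<int64_t>& sizes,
                                const std::vector<int64_t>& strides) {
  assert(sizes.size() == strides.size());
  std::size_t max_offset = 0;
  for (std::size_t i = 0; i < sizes.size(); ++i) {
    if (sizes[i] == 0) return 0;  // an empty dimension means no elements
    max_offset += static_cast<std::size_t>((sizes[i] - 1) * strides[i]);
  }
  return max_offset + 1;
}

// Direct CPU allocation: no lookup, no indirection.
inline Buffer allocate_cpu(const std::vector<int64_t>& sizes,
                           const std::vector<int64_t>& strides) {
  return Buffer{sizes, strides,
                std::vector<float>(storage_size(sizes, strides)), false};
}

using Allocator = std::function<Buffer(const std::vector<int64_t>&,
                                       const std::vector<int64_t>&)>;

// Stand-in for a generic dispatch table keyed by device type.
inline const std::unordered_map<int32_t, Allocator>& registry() {
  static const std::unordered_map<int32_t, Allocator> table = {
      {static_cast<int32_t>(DeviceType::CPU), Allocator(allocate_cpu)},
  };
  return table;
}

// Generic path: hash lookup plus an indirect call on every allocation.
// Throws std::out_of_range for devices missing from the table.
inline Buffer empty_strided_generic(DeviceType device,
                                    const std::vector<int64_t>& sizes,
                                    const std::vector<int64_t>& strides) {
  return registry().at(static_cast<int32_t>(device))(sizes, strides);
}

// Entry point with a CPU fast path in front of the generic path.
inline Buffer empty_strided(DeviceType device,
                            const std::vector<int64_t>& sizes,
                            const std::vector<int64_t>& strides) {
  if (device == DeviceType::CPU) {
    Buffer b = allocate_cpu(sizes, strides);
    b.via_fast_path = true;
    return b;
  }
  return empty_strided_generic(device, sizes, strides);
}
```

Both paths return an equivalent buffer for CPU; only the route differs. Whether a shortcut like this pays off depends on how expensive the generic path is relative to the allocation itself, which is what the benchmark figure above speaks to for the real change.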
Dr. CI: see artifacts and rendered test results at hud.pytorch.org/pr/110877. As of commit 980a485 with merge base 2edc75a, the PR can merge normally; 2 unrelated job failures were flagged as likely due to flakiness present on trunk.
This seems to reduce benchmark time by 15-20%. Supersedes D49835545.

Differential Revision: [D49974460](https://our.internmc.facebook.com/intern/diff/D49974460/)

ghstack-source-id: 203390727
Pull Request resolved: #110877
LGTM. Thanks!
More like a note to myself: we should do the same for other ops in this file
Force-pushed: 0e8b6be to e34523c
Force-pushed: e34523c to 980a485
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 Hours).
I need this to do a cheap and easy output copy in D50023678.

Differential Revision: [D50105080](https://our.internmc.facebook.com/intern/diff/D50105080/)
Pull Request resolved: #110909
Approved by: https://github.com/jansel, https://github.com/chenyang78, https://github.com/desertfire
ghstack dependencies: #110876, #110877
…ave static shape (#110892)

If a Tensor can be reused and has static shape, we can just cache it across iterations. This is meant as a quickly shippable overhead reduction for CPU overhead-bound use cases that we can ship without relying on memory planning.

Differential Revision: [D50023678](https://our.internmc.facebook.com/intern/diff/D50023678/)
Pull Request resolved: #110892
Approved by: https://github.com/bertmaher
ghstack dependencies: #110876, #110877, #110909
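The caching idea in the #110892 message above (keep a reusable, static-shape tensor alive across iterations instead of reallocating it) can be sketched in plain C++ as follows. This is an illustrative sketch only, not the code from that PR: `Scratch` and `StaticShapeCache` are hypothetical names, and a `std::vector<float>` stands in for tensor storage.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a tensor whose shape never changes between runs.
struct Scratch {
  std::vector<int64_t> shape;
  std::vector<float> data;
};

// Allocates the buffer on first use and hands back the same object afterwards.
// This is only valid because the shape is fixed at construction time.
class StaticShapeCache {
 public:
  explicit StaticShapeCache(std::vector<int64_t> shape)
      : shape_(std::move(shape)) {}

  Scratch& get() {
    if (!cached_) {
      ++allocations_;  // counted so callers can observe the reuse
      cached_.reset(new Scratch{shape_, std::vector<float>(numel())});
    }
    return *cached_;
  }

  std::size_t allocations() const { return allocations_; }

 private:
  // Product of all dimensions; an empty shape yields 1 (a scalar).
  std::size_t numel() const {
    std::size_t n = 1;
    for (int64_t dim : shape_) {
      assert(dim >= 0);
      n *= static_cast<std::size_t>(dim);
    }
    return n;
  }

  std::vector<int64_t> shape_;
  std::unique_ptr<Scratch> cached_;
  std::size_t allocations_ = 0;
};
```

Calling `get()` once per iteration returns the same `Scratch` every time, so the allocation cost is paid once. The trade-off is that the buffer's contents persist between iterations, so this only suits values that are fully overwritten before being read.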