[PyTorch] AOTI: add CPU fast path in aoti_torch_empty_strided #110877

swolchok · 2023-10-09T16:57:58Z

Stack from ghstack (oldest at bottom):

This seems to reduce benchmark time by 15-20%. Supersedes D49835545.

Differential Revision: D49974460

pytorch-bot · 2023-10-09T16:58:01Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/110877

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit 980a485 with merge base 2edc75a:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

This seems to reduce benchmark time by 15-20%. Supersedes D49835545. Differential Revision: [D49974460](https://our.internmc.facebook.com/intern/diff/D49974460/) ghstack-source-id: 203390727 Pull Request resolved: #110877

chenyang78

LGTM. Thanks!

desertfire

More like a note to myself: we should do the same for other ops in this file

swolchok · 2023-10-12T23:46:42Z

@pytorchbot merge

pytorchmergebot · 2023-10-12T23:48:35Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).


I need this to do a cheap and easy output copy in D50023678. Differential Revision: [D50105080](https://our.internmc.facebook.com/intern/diff/D50105080/) Pull Request resolved: #110909 Approved by: https://github.com/jansel, https://github.com/chenyang78, https://github.com/desertfire ghstack dependencies: #110876, #110877

…ave static shape (#110892) If a Tensor can be reused and has static shape, we can just cache it across iterations. This is meant as a quickly shippable overhead reduction for CPU overhead-bound use cases that we can ship without relying on memory planning. Differential Revision: [D50023678](https://our.internmc.facebook.com/intern/diff/D50023678/) Pull Request resolved: #110892 Approved by: https://github.com/bertmaher ghstack dependencies: #110876, #110877, #110909
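The caching idea in the commit message above can be sketched roughly as follows. This is a minimal Python illustration of the general technique only, not the code generated by that PR (which targets C++ `thread_local` storage); the names `get_buffer`, `allocate`, `run_iteration`, and `stats` are hypothetical, and a plain list stands in for a tensor.

```python
import threading

# Hypothetical per-thread cache; all names here are illustrative, not from the PR.
_cache = threading.local()

# Counts real allocations so the effect of caching is observable.
stats = {"allocations": 0}


def allocate(shape):
    """Stand-in for an expensive tensor allocation (a flat list of zeros)."""
    stats["allocations"] += 1
    numel = 1
    for dim in shape:
        numel *= dim
    return [0.0] * numel


def get_buffer(name, shape, static_shape):
    """Return a buffer for `name`, reusing a cached one when the shape is static."""
    if not static_shape:
        # Dynamic shape: the size may change between calls, so allocate fresh.
        return allocate(shape)
    buffers = getattr(_cache, "buffers", None)
    if buffers is None:
        # First use on this thread: create the per-thread cache dictionary.
        buffers = _cache.buffers = {}
    if name not in buffers:
        # First iteration on this thread: allocate once and remember it.
        buffers[name] = allocate(shape)
    return buffers[name]


def run_iteration():
    """Simulate one model invocation that writes into a static-shape buffer."""
    buf = get_buffer("buf0", (2, 3), static_shape=True)
    buf[0] += 1.0  # pretend to compute into the buffer
    return buf
```

Calling `run_iteration()` repeatedly on one thread allocates only on the first call. A reused buffer still holds the previous iteration's data, so a sketch like this is only safe when each iteration fully overwrites what it reads.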

[PyTorch] AOTI: add CPU fast path in aoti_torch_empty_strided

34a010f

This seems to reduce benchmark time by 15-20%. Supersedes D49835545. Differential Revision: [D49974460](https://our.internmc.facebook.com/intern/diff/D49974460/) [ghstack-poisoned]

swolchok mentioned this pull request Oct 9, 2023

[PyTorch] -DNDEBUG in inductor codecache builds #110876

Closed

swolchok requested review from chenyang78 and jansel October 9, 2023 17:30

swolchok mentioned this pull request Oct 9, 2023

AOTI: add and use aoti_torch_empty_strided_cpu #110470

Closed

chenyang78 approved these changes Oct 9, 2023

Update on "[PyTorch] AOTI: add CPU fast path in aoti_torch_empty_strided"

e00cdb5

This seems to reduce benchmark time by 15-20%. Supersedes D49835545. Differential Revision: [D49974460](https://our.internmc.facebook.com/intern/diff/D49974460/) [ghstack-poisoned]

swolchok mentioned this pull request Oct 9, 2023

[PyTorch] AOTI: generate reused thread_locals when tensors provably have static shape #110892

Closed

Update on "[PyTorch] AOTI: add CPU fast path in aoti_torch_empty_strided"

0e8b6be

This seems to reduce benchmark time by 15-20%. Supersedes D49835545. Differential Revision: [D49974460](https://our.internmc.facebook.com/intern/diff/D49974460/) [ghstack-poisoned]

swolchok mentioned this pull request Oct 9, 2023

[PyTorch] AOTI: Add aoti_torch_assign_tensors to ABI #110909

Closed

jansel approved these changes Oct 10, 2023

desertfire approved these changes Oct 10, 2023

swolchok added the topic: not user facing topic category label Oct 10, 2023

use V.graph._shape_env on "[PyTorch] AOTI: add CPU fast path in aoti_torch_empty_strided"

e34523c

This seems to reduce benchmark time by 15-20%. Supersedes D49835545. Differential Revision: [D49974460](https://our.internmc.facebook.com/intern/diff/D49974460/) [ghstack-poisoned]

swolchok force-pushed the gh/swolchok/587/head branch from 0e8b6be to e34523c October 10, 2023 19:50

Update on "[PyTorch] AOTI: add CPU fast path in aoti_torch_empty_strided"

980a485

This seems to reduce benchmark time by 15-20%. Supersedes D49835545. Differential Revision: [D49974460](https://our.internmc.facebook.com/intern/diff/D49974460/) [ghstack-poisoned]

swolchok force-pushed the gh/swolchok/587/head branch from e34523c to 980a485 October 12, 2023 17:49

pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Oct 12, 2023

pytorchmergebot added the merging label Oct 12, 2023

pytorchmergebot added Merged and removed merging labels Oct 13, 2023

pytorchmergebot closed this in a2c17a2 Oct 13, 2023

facebook-github-bot deleted the gh/swolchok/587/head branch October 16, 2023 14:24

5 participants
