[PyTorch] AOTI: generate reused thread_locals when tensors provably have static shape by swolchok · Pull Request #110892 · pytorch/pytorch · GitHub

@swolchok swolchok commented Oct 9, 2023

…ave static shape

If a Tensor can be reused and has static shape, we can just cache it across iterations.

This and the following diff are meant as a quickly shippable overhead reduction for CPU overhead-bound use cases that we can ship without relying on memory planning.
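As a rough illustration of the caching idea described above, here is a minimal Python analogy. It is not the code generated by this PR (which emits C++ `thread_local` variables); the names `get_buffer`, `_cache`, and `is_static` are hypothetical, and a `bytearray` stands in for a tensor.

```python
import threading

# Hypothetical per-thread cache; each thread sees its own attributes.
_cache = threading.local()


def get_buffer(name, shape, is_static):
    """Return a scratch buffer, reusing it across calls only for static shapes."""
    numel = 1
    for dim in shape:
        numel *= dim

    if not is_static:
        # The size may change on the next call, so allocate a fresh buffer.
        return bytearray(numel)

    buffers = getattr(_cache, "buffers", None)
    if buffers is None:
        buffers = _cache.buffers = {}

    buf = buffers.get(name)
    if buf is None:
        # First call on this thread: allocate once and keep it for later calls.
        buf = bytearray(numel)
        buffers[name] = buf
    return buf
```

Under these assumptions, repeated calls on one thread with a static shape return the same object, which skips the per-iteration allocation. Keeping the cache per thread avoids sharing scratch memory between concurrent callers.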

Differential Revision: [D50023678](https://our.internmc.facebook.com/intern/diff/D50023678/)

[ghstack-poisoned]

pytorch-bot bot commented Oct 9, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/110892

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit b591468 with merge base 2edc75a:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

swolchok added a commit that referenced this pull request Oct 9, 2023
ghstack-source-id: 203410034
Pull Request resolved: #110892
cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
swolchok added a commit that referenced this pull request Oct 9, 2023
ghstack-source-id: 203460350
@exported-using-ghexport


@desertfire desertfire left a comment

Left comments in the internal diff. Let me know when you think this is ready for review.

@swolchok swolchok added the topic: not user facing topic category label Oct 10, 2023
… provably have static shape"


If a Tensor can be reused and has static shape, we can just cache it across iterations.

This is meant as a quickly shippable overhead reduction for CPU overhead-bound use cases that we can ship without relying on memory planning.

Differential Revision: [D50023678](https://our.internmc.facebook.com/intern/diff/D50023678/)

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label Oct 10, 2023
@pytorchmergebot

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@swolchok swolchok force-pushed the gh/swolchok/588/head branch from ce7b56a to 4e03711 Compare October 10, 2023 19:50
swolchok added a commit that referenced this pull request Oct 10, 2023
ghstack-source-id: 203568593
@exported-using-ghexport

@pytorchmergebot

Merge failed

Reason: New commits were pushed while merging. Please rerun the merge command.

Details for Dev Infra team Raised by workflow job

@swolchok

@pytorchbot merge

@pytorchmergebot

Merge started


@pytorchmergebot

Merge failed

Reason: 1 mandatory check(s) failed. The first few are:

Dig deeper by viewing the failures on hud

Details for Dev Infra team Raised by workflow job

Failing merge rule: Core Maintainers

swolchok added a commit that referenced this pull request Oct 12, 2023
ghstack-source-id: 203869660
@exported-using-ghexport

swolchok added a commit that referenced this pull request Oct 12, 2023
ghstack-source-id: 203924205
@exported-using-ghexport

@swolchok

@pytorchbot merge

@pytorchmergebot

Merge started


@facebook-github-bot facebook-github-bot deleted the gh/swolchok/588/head branch October 17, 2023 14:24
pytorchmergebot pushed a commit that referenced this pull request Nov 30, 2023
…on-AOT mode"


We found a performance regression when using the cpp wrapper in non-AOT mode, caused by the change in #110892.
#110892 only handles the buffer cache in AOT mode, but it removes the `reset` call without checking whether AOT mode is on or off. This PR restricts the buffer-free change so that it applies only when `V.graph.aot_mode` is `True`.
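The gating described here can be sketched as a small Python helper. This is an illustrative assumption about the shape of the fix, not the actual Inductor wrapper code: `codegen_free` and its `is_cached` parameter are hypothetical, and `aot_mode` stands in for the `V.graph.aot_mode` flag mentioned above.

```python
def codegen_free(buffer_name: str, aot_mode: bool, is_cached: bool) -> list[str]:
    """Return the C++ lines to emit when a buffer is no longer needed."""
    if aot_mode and is_cached:
        # Assumed behavior: a cached buffer is kept alive across calls in
        # AOT mode, so no release statement is emitted for it.
        return []
    # In every other case, including all of non-AOT mode, release the buffer.
    return [f"{buffer_name}.reset();"]
```

With this sketch, `codegen_free("buf0", aot_mode=False, is_cached=True)` still emits the `reset` line, so non-AOT mode frees buffers as it did before the caching change.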

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx peterbell10 ipiszy yf225 chenyang78 kadeng muchulee8 aakhundov ColinPeppler

[ghstack-poisoned]
pytorchmergebot pushed a commit that referenced this pull request Nov 30, 2023
chunyuan-w added a commit that referenced this pull request Nov 30, 2023
chunyuan-w added a commit that referenced this pull request Nov 30, 2023
chunyuan-w added a commit that referenced this pull request Nov 30, 2023
chunyuan-w added a commit that referenced this pull request Nov 30, 2023
pytorchmergebot pushed a commit that referenced this pull request Nov 30, 2023
Pull Request resolved: #114741
Approved by: https://github.com/jgong5, https://github.com/desertfire
dmenig pushed a commit to dmenig/pytorch that referenced this pull request Dec 21, 2023
Pull Request resolved: pytorch#114741
Approved by: https://github.com/jgong5, https://github.com/desertfire