add back in unsafe view decomp by eellison · Pull Request #109663 · pytorch/pytorch · GitHub

@pytorch-bot

pytorch-bot bot commented Sep 19, 2023

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/109663

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 25b6bed with merge base 518308a:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

This decomposition makes pattern matching easier, and it was only recently excluded from the decomposition set.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 ipiszy ngimel yf225 chenyang78 kadeng muchulee8 aakhundov

[ghstack-poisoned]
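
To illustrate why a decomposition can make pattern matching easier, here is a minimal toy sketch. It assumes the decomposition rewrites the unsafe view op into an ordinary view, which this thread does not spell out. The op names are plain strings and `DECOMPOSITIONS`, `decompose`, and `matches` are hypothetical helpers, not PyTorch APIs.

```python
from typing import Callable, Dict, List

# Hypothetical decomposition table: op name -> function returning the op
# names it is rewritten into. Strings stand in for real ATen overloads.
DECOMPOSITIONS: Dict[str, Callable[[], List[str]]] = {
    "_unsafe_view": lambda: ["view"],
}


def decompose(graph: List[str]) -> List[str]:
    """Rewrite every op that has a registered decomposition."""
    out: List[str] = []
    for op in graph:
        out.extend(DECOMPOSITIONS[op]() if op in DECOMPOSITIONS else [op])
    return out


def matches(graph: List[str], pattern: List[str]) -> bool:
    """Return True if `pattern` occurs as a contiguous run of ops in `graph`."""
    n = len(pattern)
    return any(graph[i:i + n] == pattern for i in range(len(graph) - n + 1))


# A single pattern written against `view` ...
PATTERN = ["matmul", "view", "softmax"]

# ... misses a graph that happens to contain `_unsafe_view` ...
traced = ["matmul", "_unsafe_view", "softmax"]
missed = matches(traced, PATTERN)

# ... but matches once the decomposition has normalized the graph.
found = matches(decompose(traced), PATTERN)
```

With the decomposition in place, one pattern covers both spellings of the op, so no second pattern variant is needed.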
@eellison added the `ciflow/trunk` and `topic: not user facing` labels on Sep 20, 2023
@eellison
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here

@eellison
Contributor Author

@pytorchbot merge -f "not related to rocm, taking too long"

@pytorchmergebot
Collaborator

The merge job was canceled. If you believe this is a mistake, you can retrigger it through pytorch-bot.

@eellison
Contributor Author

@pytorchbot merge -f "not related to rocm, taking too long"

@pytorchmergebot
Collaborator

Merge started

Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Please use -f as last resort and instead consider -i/--ignore-current to continue the merge ignoring current failures. This will allow currently pending tests to finish and report signal before the merge.


pytorchmergebot pushed a commit that referenced this pull request Sep 20, 2023
Adds a Python pretty printer to the pattern matcher that serializes patterns as Python. Generating our fuse attention patterns was taking 4 seconds of compile time, which will only get worse as we add more variants (which I will do in the rest of this stack). To write out patterns, build PyTorch, then run `gen_attention_patterns.py`.

Since there is a line limit for PRs, I'm only including `_sdpa_pattern1` in this first diff.

Pull Request resolved: #108894
Approved by: https://github.com/yanboliang
ghstack dependencies: #109663
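
The idea of serializing patterns as Python source, so they can be loaded instead of regenerated at compile time, can be sketched roughly as follows. This is a toy model under stated assumptions: `Arg`, `Call`, `build_pattern`, `to_source`, and `load` are hypothetical stand-ins, not the pattern matcher's real classes or the output format of `gen_attention_patterns.py`.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Arg:
    """Toy placeholder for a named pattern input."""
    name: str


@dataclass(frozen=True)
class Call:
    """Toy pattern node: an op name applied to sub-patterns."""
    target: str
    args: Tuple[Union["Call", Arg], ...]


def build_pattern() -> Call:
    """Stand-in for the expensive step (e.g. tracing a function)."""
    scores = Call("matmul", (Arg("query"), Arg("key")))
    return Call("matmul", (Call("softmax", (scores,)), Arg("value")))


def to_source(node: Union[Call, Arg]) -> str:
    """Emit a Python expression that rebuilds `node` when evaluated."""
    if isinstance(node, Arg):
        return f"Arg({node.name!r})"
    inner = "".join(to_source(a) + ", " for a in node.args)
    return f"Call({node.target!r}, ({inner}))"


def load(source: str, name: str) -> Call:
    """Rebuild a pattern from generated source without calling build_pattern."""
    namespace = {"Call": Call, "Arg": Arg}
    exec(source, namespace)
    return namespace[name]


# Generate once (offline), then load the generated source at startup.
source = f"toy_pattern = {to_source(build_pattern())}\n"
restored = load(source, "toy_pattern")
```

In this sketch the generated text would be written to a file and checked in, so the costly `build_pattern` step only runs when the patterns change.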
pytorchmergebot pushed a commit that referenced this pull request Sep 20, 2023
Serializes the remaining traced patterns.

Pull Request resolved: #108917
Approved by: https://github.com/davidberard98
ghstack dependencies: #109663, #108894
pytorchmergebot pushed a commit that referenced this pull request Sep 20, 2023
`aten.softmax` generates a different decomposition for fp16/bf16 than for fp32, because when invoked in lower precision it upcasts the inputs to fp32 and downcasts the result afterward. This has been causing us to miss bf16 patterns. For example, Camembert improves 20% with this PR (as, I'm sure, do many other models).

Pull Request resolved: #109142
Approved by: https://github.com/yanboliang
ghstack dependencies: #109663, #108894, #108917
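
A rough sketch of why the upcast and downcast cause missed matches: the low-precision trace contains extra conversion ops, so a pattern generated from an fp32 trace no longer lines up. The op names below are simplified and hypothetical, and registering one pattern variant per dtype is an assumed remedy for illustration, not necessarily what the PR does.

```python
from typing import Dict, List, Optional

# Simplified, hypothetical op names; not the real ATen decomposition.
LOW_PRECISION = {"float16", "bfloat16"}


def softmax_decomp(dtype: str) -> List[str]:
    """Return the op sequence a toy softmax decomposition would produce."""
    core = ["amax", "sub", "exp", "sum", "div"]
    if dtype in LOW_PRECISION:
        # Low-precision inputs are upcast first and downcast at the end.
        return ["convert_to_float32", *core, f"convert_to_{dtype}"]
    return core


def find_match(graph: List[str], patterns: Dict[str, List[str]]) -> Optional[str]:
    """Return the name of the first pattern equal to `graph`, if any."""
    for name, pattern in patterns.items():
        if graph == pattern:
            return name
    return None


# Patterns generated only from an fp32 trace.
fp32_only = {"softmax_fp32": softmax_decomp("float32")}

# Assumed remedy: also generate a variant per low-precision dtype.
with_variants = {
    **fp32_only,
    "softmax_bf16": softmax_decomp("bfloat16"),
    "softmax_fp16": softmax_decomp("float16"),
}

bf16_graph = softmax_decomp("bfloat16")
missed = find_match(bf16_graph, fp32_only)       # no fp32 pattern fits
matched = find_match(bf16_graph, with_variants)  # the bf16 variant fits
```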
pytorchmergebot pushed a commit that referenced this pull request Sep 20, 2023
Adds a 3D pattern that improves perf of HF Whisper from 1.3 to 4.1. We could be matching more generally on 3D, but I'll leave that for another PR.

Thanks to @drisspg for helping me write the pattern.

Pull Request resolved: #109156
Approved by: https://github.com/yanboliang
ghstack dependencies: #109663, #108894, #108917, #109142
pytorchmergebot pushed a commit that referenced this pull request Sep 20, 2023
The pretty print is faster and more concise because it memoizes objects.

Pull Request resolved: #109066
Approved by: https://github.com/yanboliang
ghstack dependencies: #109663, #108894, #108917, #109142, #109156
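
One way memoization can make serialized output both faster to produce and more concise is to emit each distinct sub-object once, bind it to a variable, and refer to that variable afterward. The sketch below is a toy under stated assumptions: `Arg`, `Call`, and `emit` are hypothetical names, and the real pretty printer may memoize differently.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple, Union


@dataclass(frozen=True)
class Arg:
    """Toy placeholder for a named pattern input."""
    name: str


@dataclass(frozen=True)
class Call:
    """Toy pattern node: an op name applied to sub-patterns."""
    target: str
    args: Tuple[Union["Call", Arg], ...]


def emit(root: Union[Call, Arg]) -> str:
    """Serialize `root` as Python statements, emitting each node only once."""
    lines: List[str] = []
    memo: Dict[object, str] = {}  # node -> variable name already assigned

    def visit(node: Union[Call, Arg]) -> str:
        if node in memo:
            return memo[node]  # reuse the variable instead of re-emitting
        if isinstance(node, Arg):
            expr = f"Arg({node.name!r})"
        else:
            inner = "".join(visit(a) + ", " for a in node.args)
            expr = f"Call({node.target!r}, ({inner}))"
        var = f"v{len(memo)}"
        memo[node] = var
        lines.append(f"{var} = {expr}")
        return var

    lines.append(f"result = {visit(root)}")
    return "\n".join(lines)


# The scaled sub-pattern appears twice but is serialized a single time.
shared = Call("mul", (Arg("x"), Arg("scale")))
root = Call("add", (shared, shared))
source = emit(root)
```

Because the frozen dataclasses hash by value, structurally identical sub-patterns share one variable even when they are separate objects.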