Add Python serialization to Pattern Matcher patterns #108894
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/108894. Note: links to docs will display an error until the docs builds have been completed.
✅ No failures as of commit 1dd4628 with merge base 518308a. This comment was automatically generated by Dr. CI and updates every 15 minutes.
Adds a Python pretty printer to the pattern matcher that serializes patterns as Python. Generating our fuse attention patterns was taking 4 seconds of compile time, which will only get worse as we add more variants (which I will do in the rest of this stack). To write out patterns, build PyTorch, then run `gen_attention_patterns.py`.
Since there is a line limit for PRs, I'm only including `_sdpa_pattern1` in this first diff. Then I will include all inference patterns, then all training patterns.
Example Serialized Pattern:
```
tmp_0 = CallFunction(
aten.view.default,
CallFunction(aten.expand.default, KeywordArg("query"), Ignored()),
Ignored(),
_users=2,
)
tmp_1 = CallFunction(
aten.view.default,
CallFunction(
aten.expand.default,
CallFunction(aten.permute.default, KeywordArg("key"), Ignored()),
Ignored(),
),
Ignored(),
_users=2,
)
tmp_2 = CallFunction(
aten.div.Tensor,
CallFunction(
aten.view.default, CallFunction(aten.bmm.default, tmp_0, tmp_1), Ignored()
),
KeywordArg("inv_scale"),
_users=2,
)
tmp_3 = CallFunction(
aten.exp.default,
CallFunction(
aten.sub.Tensor, tmp_2, CallFunction(aten.amax.default, tmp_2, Ignored(), True)
),
_users=2,
)
tmp_4 = CallFunction(
aten.div.Tensor,
tmp_3,
CallFunction(aten.sum.dim_IntList, tmp_3, Ignored(), True),
_users=3,
)
tmp_5 = CallFunction(
aten.view.default,
CallFunction(aten.expand.default, tmp_4, Ignored()),
Ignored(),
_users=2,
)
tmp_6 = CallFunction(
aten.view.default,
CallFunction(aten.expand.default, KeywordArg("value"), Ignored()),
Ignored(),
_users=2,
)
tmp_7 = CallFunction(aten.view.default, KeywordArg("tangents_1"), Ignored(), _users=2)
tmp_8 = CallFunction(
aten.mul.Tensor,
CallFunction(
aten.view.default,
CallFunction(
aten.bmm.default,
tmp_7,
CallFunction(aten.permute.default, tmp_6, Ignored()),
),
Ignored(),
),
tmp_4,
_users=2,
)
tmp_9 = CallFunction(
aten.view.default,
CallFunction(
aten.div.Tensor,
CallFunction(
aten.sub.Tensor,
tmp_8,
CallFunction(
aten.mul.Tensor,
tmp_4,
CallFunction(aten.sum.dim_IntList, tmp_8, Ignored(), True),
),
),
KeywordArg("inv_scale"),
),
Ignored(),
_users=2,
)
```
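For a feel of how output like this can be produced, here is a minimal, self-contained sketch of the idea. It is not the pretty printer this PR adds to `torch/_inductor/pattern_matcher.py`; the classes below are simplified stand-ins for the real `CallFunction`/`KeywordArg`/`Ignored` pattern classes, and `fn` is a plain string rather than an `OpOverload`. The point it illustrates is that any subpattern referenced more than once is hoisted into a `tmp_N` variable, which is where the `_users` counts above come from.
```
class Ignored:
    # Stand-in for pattern_matcher.Ignored: matches anything, prints as Ignored().
    def pp(self, memo, lines):
        return "Ignored()"


class KeywordArg:
    # Stand-in for pattern_matcher.KeywordArg: captures a named argument.
    def __init__(self, name):
        self.name = name

    def pp(self, memo, lines):
        return f'KeywordArg("{self.name}")'


class CallFunction:
    # Stand-in for pattern_matcher.CallFunction; `fn` is a string here,
    # e.g. "aten.view.default", instead of a real OpOverload.
    def __init__(self, fn, *args, _users=1):
        self.fn, self.args, self.users = fn, args, _users

    def pp(self, memo, lines):
        if self in memo:                 # shared subtree: reuse its tmp_N name
            return memo[self]
        args = ", ".join(a.pp(memo, lines) for a in self.args)
        users = f", _users={self.users}" if self.users > 1 else ""
        expr = f"CallFunction({self.fn}, {args}{users})"
        if self.users > 1:               # hoist multi-user nodes into variables
            name = f"tmp_{len(lines)}"
            lines.append(f"{name} = {expr}")
            memo[self] = name
            return name
        return expr


def serialize(root):
    lines, memo = [], {}
    final = root.pp(memo, lines)
    lines.append(f"pattern = {final}")
    return "\n".join(lines)


# Tiny made-up example: the shared `view` node is emitted once as tmp_0.
view = CallFunction(
    "aten.view.default",
    CallFunction("aten.expand.default", KeywordArg("query"), Ignored()),
    Ignored(),
    _users=2,
)
print(serialize(CallFunction("aten.bmm.default", view, view)))
```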
cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng Xia-Weiwen wenzhe-nrv jiayisunx peterbell10 ipiszy ngimel yf225 chenyang78 kadeng muchulee8 aakhundov
Do we need a test to ensure the pattern file is up to date?
torch/_inductor/fx_passes/serialized_attention_patterns/_sfdp_pattern_1.py
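For what it's worth, a hedged sketch of what such a freshness check could look like is below. Everything here is hypothetical: the `--out` flag and the helper names are assumptions, and the only detail taken from the PR description is that `gen_attention_patterns.py` regenerates the serialized files.
```
# Hypothetical staleness test (the --out flag and all names are assumptions):
# regenerate the serialized patterns into a temporary directory and compare
# them byte-for-byte against the files checked into the repo.
import filecmp
import pathlib
import subprocess
import tempfile


def test_serialized_patterns_up_to_date():
    checked_in = pathlib.Path(
        "torch/_inductor/fx_passes/serialized_attention_patterns"
    )
    with tempfile.TemporaryDirectory() as tmp:
        subprocess.check_call(
            ["python", "gen_attention_patterns.py", "--out", tmp]  # assumed flag
        )
        for ref in checked_in.glob("*.py"):
            regenerated = pathlib.Path(tmp) / ref.name
            assert filecmp.cmp(ref, regenerated, shallow=False), (
                f"{ref.name} is stale; re-run gen_attention_patterns.py"
            )
```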
Merge started. Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Serializes the remaining traced patterns. Pull Request resolved: #108917 Approved by: https://github.com/davidberard98 ghstack dependencies: #108894
aten.softmax generates a different decomposition for fp16/bf16 than for fp32, because when invoked in lower precision it upcasts the inputs to fp32 and downcasts the result afterwards. This has been causing us to miss bf16 patterns. For example, Camembert improves 20% with this PR (as, I'm sure, do many other models). Pull Request resolved: #109142 Approved by: https://github.com/yanboliang ghstack dependencies: #108894, #108917
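As a rough way to see the difference (not from the PR; the exact decomposition table and graph contents vary across PyTorch versions), one can trace softmax through the decompositions once with an fp32 input and once with a bf16 input and compare the graphs. In recent versions the bf16 graph carries extra dtype-conversion nodes around the amax/sub/exp/sum chain, so a pattern traced only in fp32 will not match it.
```
import torch
from torch._decomp import core_aten_decompositions
from torch.fx.experimental.proxy_tensor import make_fx


def softmax_fn(x):
    return torch.softmax(x, dim=-1)


decomps = core_aten_decompositions()

# The fp32 trace decomposes into the amax/sub/exp/sum/div chain directly;
# the bf16 trace should additionally upcast to fp32 before that chain and
# downcast afterwards, which is why fp32-only patterns miss bf16 graphs.
g_fp32 = make_fx(softmax_fn, decomposition_table=decomps)(torch.randn(4, 4))
g_bf16 = make_fx(softmax_fn, decomposition_table=decomps)(
    torch.randn(4, 4, dtype=torch.bfloat16)
)
print(g_fp32.graph)
print(g_bf16.graph)
```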
@pytorchbot revert -m "land race" -c landrace
@pytorchbot successfully started a revert job. Check the current status here.
@eellison your PR has been successfully reverted.
This reverts commit 7db175b. Reverted #108894 on behalf of https://github.com/eellison due to land race.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Serializes the remaining traced patterns. Pull Request resolved: #108917 Approved by: https://github.com/davidberard98 ghstack dependencies: #109663, #108894
aten.softmax generates a different decomposition for fp16/bf16 than for fp32, because when invoked in lower precision it upcasts the inputs to fp32 and downcasts the result afterwards. This has been causing us to miss bf16 patterns. For example, Camembert improves 20% with this PR (as, I'm sure, do many other models). Pull Request resolved: #109142 Approved by: https://github.com/yanboliang ghstack dependencies: #109663, #108894, #108917
Adds a 3d pattern that improves perf of HF Whisper from 1.3 -> 4.1. We could be matching more generally on 3d, but I'll leave that for another PR. Thanks to @drisspg for helping me write the pattern. Pull Request resolved: #109156 Approved by: https://github.com/yanboliang ghstack dependencies: #109663, #108894, #108917, #109142