[export] Modify SDPA decomposition to decompose _scaled_dot_product_flash_attention_for_cpu by larryliu0820 · Pull Request #117097 · pytorch/pytorch · GitHub

Conversation

@larryliu0820 (Contributor) commented Jan 10, 2024

Stack from ghstack (oldest at bottom):

Summary: As titled. #115913 added
`_scaled_dot_product_flash_attention_for_cpu`, and the export result of
`scaled_dot_product_attention` now includes this op. Add a decomposition
for it so that it is decomposed the same way as
`_scaled_dot_product_attention_math`.

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
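
For illustration, here is a minimal sketch of what decomposing the CPU flash-attention op into the math reference could look like. This is not the PR's exact code; the op signatures, return values, and the registration hookup are assumptions based on this discussion.

```python
import torch

aten = torch.ops.aten

def sdpa_flash_for_cpu_decomp(query, key, value, dropout_p=0.0,
                              is_causal=False, *, attn_mask=None, scale=None):
    """Sketch: express _scaled_dot_product_flash_attention_for_cpu in terms of
    _scaled_dot_product_attention_math (signatures are assumptions)."""
    # Reuse the math reference for the actual attention computation.
    output, _attn = aten._scaled_dot_product_attention_math(
        query, key, value, attn_mask=attn_mask, dropout_p=dropout_p,
        is_causal=is_causal, scale=scale)
    # The CPU flash op also returns a logsumexp tensor; compute it from the
    # scaled scores (ignoring masking/causality for brevity in this sketch)
    # instead of leaving it uninitialized as the removed decomposition did.
    d = query.size(-1)
    s = scale if scale is not None else d ** -0.5
    scores = torch.matmul(query, key.transpose(-2, -1)) * s
    logsumexp = torch.logsumexp(scores, dim=-1)
    return output, logsumexp

# In PyTorch this would be registered for the op via
# torch._decomp.register_decomposition (usage assumed; shown only as a
# comment to keep the sketch self-contained).
```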

@pytorch-bot (bot) commented Jan 10, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/117097

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 6a34f62 with merge base 19e93b8:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

…t_product_flash_attention_for_cpu"

Summary: As titled. #115913 added
`_scaled_dot_product_flash_attention_for_cpu` and the export result of
`scaled_dot_product_attention` includes this op. Adding this
decomposition so that it's being decomposed the same way as
`_scaled_dot_product_attention_math`.

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
larryliu0820 added a commit that referenced this pull request Jan 10, 2024

…lash_attention_for_cpu

Test Plan: python test/test_decomp.py -k test_aten_core_operators

ghstack-source-id: 8651762
Pull Request resolved: #117097
@lezcano (Collaborator) left a comment

Don't we want to keep the decomp of _scaled_dot_product_flash_attention?

@kimishpatel (Contributor)

I have been wanting to fix this in a more correct way, which is probably non-trivial. The correct way is really to define a decomp for the aten sdpa op. I remember discussing this with you, but I still think we should look into decomposing aten sdpa.

One not-so-nice ExecuTorch-specific workaround lives inside to_edge: before calling run_decompositions, we manually replace instances of aten sdpa with its decomposed version (a rough sketch follows below).
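
A graph-level swap of that kind might look like the following. `decomposed_sdpa` is a made-up stand-in for a hand-written decomposition, and the pass itself is illustrative, not ExecuTorch's actual to_edge code.

```python
import torch

def replace_sdpa_with_decomp(gm: torch.fx.GraphModule, decomposed_sdpa) -> None:
    """Illustrative pass: retarget aten SDPA call nodes to a decomposed
    implementation before running decompositions. `decomposed_sdpa` is a
    hypothetical callable with the same (q, k, v, ...) calling convention."""
    sdpa = torch.ops.aten.scaled_dot_product_attention.default
    for node in gm.graph.nodes:
        if node.op == "call_function" and node.target == sdpa:
            # Assumes args/kwargs line up one-to-one with the replacement.
            node.target = decomposed_sdpa
    gm.graph.lint()
    gm.recompile()
```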

@kimishpatel (Contributor)

> Don't we want to keep the decomp of _scaled_dot_product_flash_attention?

Why flash attention?

@lezcano (Collaborator) commented Jan 10, 2024

This PR removes one decomposition and registers a decomposition in one of the functions inside it. Why don't we keep the original decomposition in terms of this new decomposition?

@kimishpatel (Contributor)

> This PR removes one decomposition and registers a decomposition in one of the functions inside it. Why don't we keep the original decomposition in terms of this new decomposition?

Oh I see. Sorry, I didn't follow the first time around. Yeah, makes sense. Basically you are suggesting keeping definitions for both decomps, with one of them just calling the other one. @larryliu0820 ?

@lezcano (Collaborator) commented Jan 10, 2024

Yep, although looking at the removed code, the previous decomposition was completely wrong, with variables like

logsumexp = torch.empty([batchSize, qSize, num_head, headSize], dtype=torch.float)

that are never filled up, so it may be alright to straight up remove it.

@larryliu0820 (Contributor, Author)

> Yep, although looking at the removed code, the previous decomposition was completely wrong, with variables like
>
> logsumexp = torch.empty([batchSize, qSize, num_head, headSize], dtype=torch.float)
>
> that are never filled up, so it may be alright to straight up remove it.

Yeah, the previous decomposition was pretty awkward because `_scaled_dot_product_flash_attention` returns way more things than `_scaled_dot_product_attention_math` (see the rough comparison below). The current decomp makes much more sense.
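
For context, a rough comparison of the three ops' outputs as discussed in this thread; the return lists are paraphrased from the discussion and may not match the current native_functions.yaml exactly.

```python
# Illustration only, not the authoritative schemas:
#
#   _scaled_dot_product_flash_attention(...)
#       -> (output, logsumexp, cum_seq_q, cum_seq_k, max_q, max_k,
#           philox_seed, philox_offset, debug_attn_mask)
#   _scaled_dot_product_flash_attention_for_cpu(...)
#       -> (output, logsumexp)
#   _scaled_dot_product_attention_math(...)
#       -> (output, attn_weights)
```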

@larryliu0820 (Contributor, Author)

@pytorchbot merge

@lezcano (Collaborator) commented Jan 10, 2024

@larryliu0820 can you please deactivate the test for CUDA?

@larryliu0820 (Contributor, Author)

> @larryliu0820 can you please deactivate the test for CUDA?

Oh sorry, I'm on it.

…t_product_flash_attention_for_cpu"

Summary: As titled. #115913 added
`_scaled_dot_product_flash_attention_for_cpu` and the export result of
`scaled_dot_product_attention` includes this op. Adding this
decomposition so that it's being decomposed the same way as
`_scaled_dot_product_attention_math`.

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:

[ghstack-poisoned]
larryliu0820 added a commit that referenced this pull request Jan 10, 2024

…lash_attention_for_cpu

Test Plan: python test/test_decomp.py -k test_aten_core_operators

ghstack-source-id: 45bd1d2
Pull Request resolved: #117097
@larryliu0820 (Contributor, Author)

@pytorchbot merge

pytorch-bot added the ciflow/trunk label (Trigger trunk jobs on your pull request) Jan 10, 2024
@pytorchmergebot (Collaborator)

Merge failed

Reason: This PR needs a release notes: label
If your changes are user facing and intended to be a part of release notes, please use a label starting with release notes:.

If not, please add the topic: not user facing label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.


@larryliu0820 (Contributor, Author)

@pytorchbot merge

@pytorchmergebot (Collaborator)

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.

larryliu0820 added a commit that referenced this pull request Jan 12, 2024

Summary:

A follow-up to #117097. In that PR I didn't add
`_scaled_dot_product_attention_for_cpu` to the core_aten_decomposition
table. This PR does that and also adds a unit test.

Test Plan: python test/export/test_export.py -k test_scaled_dot_product_attention

[ghstack-poisoned]
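
A rough way to exercise this end to end (a hypothetical sanity check, not the PR's actual test):

```python
import torch
from torch.export import export

class Attn(torch.nn.Module):
    def forward(self, q, k, v):
        return torch.nn.functional.scaled_dot_product_attention(q, k, v)

q = k = v = torch.randn(2, 4, 8, 16)
# Export and apply the default (core ATen) decompositions; with the SDPA
# decomposition present in the table, no SDPA variant should remain.
ep = export(Attn(), (q, k, v)).run_decompositions()
assert not any(
    "scaled_dot_product" in str(node.target)
    for node in ep.graph_module.graph.nodes
    if node.op == "call_function"
)
```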
larryliu0820 added a commit that referenced this pull request Jan 12, 2024

ghstack-source-id: 8c7f1ab
Pull Request resolved: #117390
pytorchmergebot pushed a commit that referenced this pull request Jan 14, 2024

Pull Request resolved: #117390
Approved by: https://github.com/drisspg
facebook-github-bot deleted the gh/larryliu0820/43/head branch January 14, 2024 15:23
suo added a commit that referenced this pull request Jan 16, 2024
suo added a commit that referenced this pull request Jan 16, 2024

Pull Request resolved: #117390
Approved by: https://github.com/drisspg

Differential Revision: [D52788012](https://our.internmc.facebook.com/intern/diff/D52788012/)
ghstack-source-id: 212131226

Labels: ciflow/inductor, ciflow/trunk, Merged, topic: not user facing

4 participants