For addmm and bmm, check if config.autotune_fallback_to_aten before using aten as a fallback. Also fix bmm cutlass backend #147148
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/147148
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 unrelated failure) As of commit a0dd187 with merge base be0df96. FLAKY: the following job failed but was likely due to flakiness present on trunk.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
torch/_inductor/kernel/bmm.py
Outdated
    len(choices) == 0
    and not use_aten_gemm_kernels()
    and inductor_config.autotune_fallback_to_aten
):
Super nit: since we use the same condition multiple times, can we make a utility function for it? E.g., we could add one in mm_common.py.
I agree, will add
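For reference, a minimal sketch of what such a helper in mm_common.py could look like, based on the condition from the diff above (import paths and the warning message are assumptions, not the exact PR code):

```python
import logging

from torch._inductor import config as inductor_config
from torch._inductor.utils import use_aten_gemm_kernels

log = logging.getLogger(__name__)


def should_fallback_to_aten(choices) -> bool:
    # Fall back only when autotuning produced no candidates, ATen was
    # not already among the requested backends, and the config opts in.
    if (
        len(choices) == 0
        and not use_aten_gemm_kernels()
        and inductor_config.autotune_fallback_to_aten
    ):
        log.warning("No GEMM choices left; falling back to ATen")
        return True
    return False
```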
if should_fallback_to_aten(choices):
    choices = [aten__int_mm.bind((mat1, mat2), layout)]

try:
@chenyang78 I think this try/catch is not needed anymore. I plan on removing it in a separate PR, since those changes are higher risk imo.
torch/_inductor/kernel/mm_common.py
Outdated
log = logging.getLogger(__name__)


def should_fallback_to_aten(choices) -> bool:
What's the type of the input here? At least `Sized` or `Sequence`, right?
-    return X.get_size()[1] == W.get_size()[0]
+    X_size, W_size = X.get_size(), W.get_size()
+    if len(X_size) != len(W_size):
+        log.info("X and W have different ranks")
| log.info("X and W have different ranks") | |
| log.info("X and W have different ranks: %d, %d", len(X_size), len(W_size)) |
This would be effectively free if you cached the lengths anyway?
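For context, %-style logging arguments are only interpolated when the record is actually emitted, so passing the cached lengths costs essentially nothing when INFO is disabled. A standalone illustration (not PR code):

```python
import logging

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger(__name__)

X_size, W_size = [16, 32], [8, 32, 64]

# The %d arguments are only formatted if INFO is enabled for this logger;
# here INFO is disabled, so no string formatting happens at all.
log.info("X and W have different ranks: %d, %d", len(X_size), len(W_size))

# An f-string, by contrast, is always evaluated before the call,
# even when the log record is ultimately discarded.
log.info(f"X and W have different ranks: {len(X_size)}, {len(W_size)}")
```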
fallback_to_aten: bool = (
    len(choices) == 0
    and not use_aten_gemm_kernels()
    and inductor_config.autotune_fallback_to_aten
Do we need `autotune_fallback_to_aten` when `max_autotune_gemm_backends` exists? Isn't this duplicative?
> Do we need `autotune_fallback_to_aten` when `max_autotune_gemm_backends` exists? Isn't this duplicative?
I think the intent is to be super safe. Even if users specify only "CUTLASS" and CUTLASS fails, it will still keep things running. I think this safety logic predates the `autotune_fallback_to_aten` config.
But yeah, it has been pretty painful for me when working on CUTLASS, since a bunch of stuff fails silently.
If the user wants to fall back, they should set `max_autotune_gemm_backends="CUTLASS,ATEN"`. I don't think we should have two ways of doing the exact same thing.
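For reference, opting into an explicit ATen fallback is a one-line config change (a sketch; the backend-list spelling follows torch/_inductor/config.py):

```python
import torch
import torch._inductor.config as inductor_config

# Try CUTLASS kernels during autotuning and keep ATen as an explicit
# fallback, instead of relying on the implicit autotune_fallback_to_aten.
inductor_config.max_autotune_gemm_backends = "CUTLASS,ATEN"

compiled_mm = torch.compile(torch.mm, mode="max-autotune")
```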
@eellison I agree with the idea. I can commit to removing `autotune_fallback_to_aten` altogether and removing the silent fallback logic. But it will take a few PRs and a while. Does that sound good?
if len(X_size) == 2:
    return X_size[1] == W_size[0]
Why do we need this check? This should already be a precondition of lowering.
> Why do we need this check? This should already be a precondition of lowering.
The `_shape_match` in CUTLASS 2x is a bit different for sparse, so I guess that was the intention.
Let me know what you think. I am fine with removing it here for CUTLASS 3x.
EDIT: changed it to always return True.
@eellison updated
    return X_size[1] == W_size[0]
if len(X_size) == 3:
    # for bmm
    return X_size[2] == W_size[1]
Same here.
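For context, the rank-3 branch mirrors the `torch.bmm` shape contract, where the inner dimensions must match:

```python
import torch

A = torch.randn(8, 16, 32)  # (B, M, K)
B = torch.randn(8, 32, 64)  # (B, K, N)
C = torch.bmm(A, B)         # (B, M, N); valid because A.size(2) == B.size(1),
                            # i.e. X_size[2] == W_size[1] in the check above
```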
torch/_inductor/kernel/mm.py
Outdated
# The only difference between the two templates is M >= BLOCK_M and N >= BLOCK_N checking.
# See more details in https://github.com/pytorch/pytorch/pull/146293
else r"""
if torch.version.hip is None
Unintended format changes?
> Unintended format changes?

Yeah, let me remove that.
Please add a test for the newly passing bmm case.
There is an existing test.
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: New commits were pushed while merging. Please rerun the merge command. (Details for Dev Infra team: raised by workflow job.)
@pytorchbot merge -i
Merge started. Your change will be merged while ignoring the following 1 check: trunk / macos-py3-arm64-mps / test (mps, 1, 1, macos-m1-13). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
…sing aten as a fallback. Also fix bmm cutlass backend (pytorch#147148)
This PR also fixes BMM, which was silently failing for a while.
Pull Request resolved: pytorch#147148
Approved by: https://github.com/eellison
Stack from ghstack (oldest at bottom):
This PR also fixes BMM, which was silently failing for a while.
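A minimal way to exercise the fixed bmm path (a sketch, assuming a CUDA build with CUTLASS support; shapes, dtypes, and tolerances are illustrative):

```python
import torch
import torch._inductor.config as inductor_config

# Restrict autotuning to CUTLASS so problems in that backend surface
# instead of being hidden by a silent ATen fallback.
inductor_config.max_autotune_gemm_backends = "CUTLASS"

a = torch.randn(4, 128, 64, device="cuda", dtype=torch.float16)
b = torch.randn(4, 64, 128, device="cuda", dtype=torch.float16)

compiled_bmm = torch.compile(torch.bmm, mode="max-autotune")
torch.testing.assert_close(compiled_bmm(a, b), torch.bmm(a, b), rtol=1e-2, atol=1e-2)
```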
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov