
[RFC] Deprecate silent fallback to aten logic in Inductor #147479

@henrylhtsang

Description

🚀 The feature, motivation and pitch

This proposal was suggested by @eellison. See #147148 for more context.

tl;dr

This proposal isn't going to affect the majority of users. You only need to worry about it if both of the following are true (a quick check follows below):

- You set a custom max_autotune_gemm_backends.
- You do not have ATen in max_autotune_gemm_backends.
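
A quick way to check whether your setup falls into that bucket, as a sketch assuming the usual torch._inductor.config layout where max_autotune_gemm_backends is a comma-separated string:

```python
import torch._inductor.config as inductor_config

# You are only in scope for this RFC if GEMM autotuning is on and "ATEN" is
# absent from your backend list.
backends = [b.strip().upper() for b in inductor_config.max_autotune_gemm_backends.split(",")]
affected = inductor_config.max_autotune_gemm and "ATEN" not in backends
print(f"GEMM backends: {backends}; affected by this RFC: {affected}")
```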

Problem

Currently, when max_autotune_gemm is true, GEMM kernels are generated using backends specified in max_autotune_gemm_backends. If these backends fail to produce a valid kernel, Inductor silently falls back to ATen, even when ATen is not included in max_autotune_gemm_backends. This silent fallback behavior is what we want to remove.
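
For illustration, a minimal sketch of the configuration in question, assuming a CUDA build with Triton available; the shapes and dtypes below are arbitrary:

```python
import torch
import torch._inductor.config as inductor_config

# Opt into GEMM autotuning, but deliberately exclude "ATEN" from the backend list.
inductor_config.max_autotune_gemm = True
inductor_config.max_autotune_gemm_backends = "TRITON"

@torch.compile
def mm(a, b):
    return a @ b

a = torch.randn(128, 64, device="cuda", dtype=torch.float16)
b = torch.randn(64, 256, device="cuda", dtype=torch.float16)

# Current behavior: if the Triton templates cannot produce a valid kernel for
# this matmul, Inductor quietly lowers it to the ATen kernel anyway, despite
# "ATEN" not being listed above.
out = mm(a, b)
```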

Additionally, there is autotune_fallback_to_aten, which attempts to control this silent fallback behavior. However, the correct approach should be to respect the user's choice of backends specified in max_autotune_gemm_backends.

Expected behavior
In the expected behavior, we respect the user's choice of max_autotune_gemm_backends. If a user intentionally excludes ATen from max_autotune_gemm_backends and the specified backends fail to generate a valid kernel, an error will be raised.
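
A sketch of what that looks like from the user's side; the CUTLASS choice below is just an example of a backend list that excludes ATen:

```python
import torch._inductor.config as inductor_config

# After this RFC: with the configuration below, a backend failure surfaces as a
# compilation error rather than a silent ATen lowering.
inductor_config.max_autotune_gemm = True
inductor_config.max_autotune_gemm_backends = "CUTLASS"

# Users who do want the fallback opt back in explicitly by listing ATen:
# inductor_config.max_autotune_gemm_backends = "CUTLASS,ATEN"
```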

Proposal

We want to deprecate the fallback in three steps.

Step 1: Gate the behavior with autotune_fallback_to_aten

- Set up autotune_fallback_to_aten to control the silent fallback behavior for the remaining ops (addmm, bmm, mixed mm, int mm, etc.).
- Remove excess fallback logic that is not gated by autotune_fallback_to_aten, for example, link.
- Add an environment variable to control autotune_fallback_to_aten as a kill switch (a sketch follows below).

In this step, we don’t expect any change in behavior.
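
A hypothetical sketch of what the Step 1 gating could look like in torch/_inductor/config.py; the environment variable name below is illustrative only, since this RFC does not pin it down:

```python
import os

# Hypothetical kill switch: defaulting to "1" keeps the fallback enabled, so
# there is no behavior change in this step.
autotune_fallback_to_aten: bool = (
    os.environ.get("TORCHINDUCTOR_AUTOTUNE_FALLBACK_TO_ATEN", "1") == "1"
)
```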

Step 2: Turn off autotune_fallback_to_aten

This should be a one-line change that flips the default and changes the behavior.
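
Continuing the same hypothetical sketch, the change would just flip the default of the gate:

```python
import os

# Step 2 as a one-line change: the default flips from "1" to "0", so the silent
# fallback is off unless the kill-switch environment variable re-enables it.
autotune_fallback_to_aten: bool = (
    os.environ.get("TORCHINDUCTOR_AUTOTUNE_FALLBACK_TO_ATEN", "0") == "1"
)
```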

Step 3: Cleanup

We would clean up the logic after three weeks to one month, assuming nothing breaks.

Alternatives

No response

Additional context

No response

cc @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @amjames @aakhundov

Labels

module: inductor, oncall: pt2, triaged (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
