Description
🚀 The feature, motivation and pitch
This proposal was suggested by @eellison. See #147148 for more context.
TL;DR
This proposal isn’t going to affect the majority of users. You only need to worry about it if both of the following are true:
You set a custom value for max_autotune_gemm_backends
That value does not include ATen
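As a quick way to check the second condition, here is a minimal sketch. It assumes the setting is a comma-separated string of backend names (for example "ATEN,TRITON,CPP"); the helper name excludes_aten is hypothetical and is not part of Inductor.

```python
def excludes_aten(backends: str) -> bool:
    """Return True if a comma-separated backend string does not list ATen.

    Hypothetical helper for illustration only. It assumes backend names are
    separated by commas and compares them case-insensitively.
    """
    names = {name.strip().upper() for name in backends.split(",") if name.strip()}
    return "ATEN" not in names


# A custom setting without ATen is the case this proposal affects.
print(excludes_aten("TRITON"))
# A setting that lists ATen is unaffected.
print(excludes_aten("ATEN,TRITON,CPP"))
```

In a real session, the string to pass would presumably be the value of torch._inductor.config.max_autotune_gemm_backends.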
Problem
Currently, when max_autotune_gemm is true, GEMM kernels are generated using backends specified in max_autotune_gemm_backends. If these backends fail to produce a valid kernel, Inductor silently falls back to ATen, even when ATen is not included in max_autotune_gemm_backends. This silent fallback behavior is what we want to remove.
Additionally, there is a config option, autotune_fallback_to_aten, that attempts to control this silent fallback. However, the correct approach is to respect the backends the user specified in max_autotune_gemm_backends.
Expected behavior
Inductor should respect the user's choice of max_autotune_gemm_backends. If a user intentionally excludes ATen from max_autotune_gemm_backends and the specified backends fail to generate a valid kernel, an error is raised instead of silently falling back to ATen.
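The difference between the current and the expected behavior can be sketched as follows. This is an illustrative model, not Inductor's actual code: select_gemm_choices, NoValidGemmKernelError, and ATEN_FALLBACK are hypothetical names, and kernel choices are represented as plain strings.

```python
from typing import List

# Hypothetical placeholder standing in for an ATen kernel choice.
ATEN_FALLBACK = "aten_fallback_kernel"


class NoValidGemmKernelError(RuntimeError):
    """Hypothetical error raised when no requested backend produced a kernel."""


def select_gemm_choices(
    generated: List[str], backends: str, fallback_to_aten: bool
) -> List[str]:
    """Pick the kernel choices to autotune over.

    generated: kernels produced by the backends the user requested.
    backends: the user's comma-separated backend string (used in the error).
    fallback_to_aten: True models the current silent fallback,
        False models the expected behavior described in this issue.
    """
    if generated:
        return generated
    if fallback_to_aten:
        # Current behavior: silently use ATen even if the user excluded it.
        return [ATEN_FALLBACK]
    # Expected behavior: respect the user's choice and fail loudly.
    raise NoValidGemmKernelError(
        f"No valid GEMM kernel was generated by backends: {backends!r}"
    )
```

With fallback_to_aten=True and an empty list of generated kernels, the function returns the ATen placeholder; with fallback_to_aten=False, the same call raises.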
Proposal
We want to deprecate the silent fallback in three steps.
Step 1: Gate the behavior with autotune_fallback_to_aten
Set up autotune_fallback_to_aten to control the silent fallback behavior for the remaining ops (addmm, bmm, mixed mm, int mm, etc.)
Remove excess fallback logic that is not gated by autotune_fallback_to_aten, for example, link.
Add an environment variable that controls autotune_fallback_to_aten, to serve as a kill switch
In this step, we don’t expect any change in behavior.
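One way such a kill switch might look is sketched below. The environment variable name is an assumption that follows the TORCHINDUCTOR_ prefix convention; the issue does not specify it, and read_fallback_flag is a hypothetical helper.

```python
import os
from typing import Mapping

# Assumed variable name, not confirmed by this issue.
ENV_VAR = "TORCHINDUCTOR_AUTOTUNE_FALLBACK_TO_ATEN"


def read_fallback_flag(env: Mapping[str, str] = os.environ, default: bool = True) -> bool:
    """Resolve the fallback flag from the environment.

    When the variable is unset, the default applies. Step 1 would keep
    default=True (no behavior change); Step 2 would flip it to False.
    When the variable is set, only the value "1" enables the fallback.
    """
    raw = env.get(ENV_VAR)
    if raw is None:
        return default
    return raw == "1"


# Step 1: the fallback stays on unless the user switches it off explicitly.
print(read_fallback_flag({}, default=True))
# After Step 2, a user could restore the old behavior via the kill switch.
print(read_fallback_flag({ENV_VAR: "1"}, default=False))
```

Keeping the default as a single argument is what would make Step 2 a one-line change in this sketch.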
Step 2: Turn off autotune_fallback_to_aten
This should be a one-line change that flips the default and thereby changes the behavior.
Step 3: Clean up
We would remove the gating logic after three weeks to one month, assuming nothing breaks.
Alternatives
No response
Additional context
No response
cc @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @amjames @aakhundov