
[RFC] Deprecate silent fallback to aten logic in Inductor #147479

@henrylhtsang

Description

🚀 The feature, motivation and pitch

This proposal was suggested by @eellison. See #147148 for more context.

tl;dr

This proposal isn't going to affect the majority of users. You only need to worry about it if both of the following are true (a quick check follows below):

- You set a custom max_autotune_gemm_backends.
- You do not have ATen in max_autotune_gemm_backends.
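
A quick way to check whether your setup falls into that bucket, as a sketch assuming the usual torch._inductor.config layout where max_autotune_gemm_backends is a comma-separated string:

```python
import torch._inductor.config as inductor_config

# You are only in scope for this RFC if GEMM autotuning is on and "ATEN" is
# absent from your backend list.
backends = [b.strip().upper() for b in inductor_config.max_autotune_gemm_backends.split(",")]
affected = inductor_config.max_autotune_gemm and "ATEN" not in backends
print(f"GEMM backends: {backends}; affected by this RFC: {affected}")
```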

Problem

Currently, when max_autotune_gemm is true, GEMM kernels are generated using backends specified in max_autotune_gemm_backends. If these backends fail to produce a valid kernel, Inductor silently falls back to ATen, even when ATen is not included in max_autotune_gemm_backends. This silent fallback behavior is what we want to remove.
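
For illustration, a minimal sketch of the configuration in question, assuming a CUDA build with Triton available; the shapes and dtypes below are arbitrary:

```python
import torch
import torch._inductor.config as inductor_config

# Opt into GEMM autotuning, but deliberately exclude "ATEN" from the backend list.
inductor_config.max_autotune_gemm = True
inductor_config.max_autotune_gemm_backends = "TRITON"

@torch.compile
def mm(a, b):
    return a @ b

a = torch.randn(128, 64, device="cuda", dtype=torch.float16)
b = torch.randn(64, 256, device="cuda", dtype=torch.float16)

# Current behavior: if the Triton templates cannot produce a valid kernel for
# this matmul, Inductor quietly lowers it to the ATen kernel anyway, despite
# "ATEN" not being listed above.
out = mm(a, b)
```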

Additionally, there is autotune_fallback_to_aten, which attempts to control this silent fallback behavior. However, the correct approach should be to respect the user's choice of backends specified in max_autotune_gemm_backends.

Expected behavior
In the expected behavior, we respect the user's choice of max_autotune_gemm_backends. If a user intentionally excludes ATen from max_autotune_gemm_backends and the specified backends fail to generate a valid kernel, an error will be raised.
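
A sketch of what that looks like from the user's side; the CUTLASS choice below is just an example of a backend list that excludes ATen:

```python
import torch._inductor.config as inductor_config

# After this RFC: with the configuration below, a backend failure surfaces as a
# compilation error rather than a silent ATen lowering.
inductor_config.max_autotune_gemm = True
inductor_config.max_autotune_gemm_backends = "CUTLASS"

# Users who do want the fallback opt back in explicitly by listing ATen:
# inductor_config.max_autotune_gemm_backends = "CUTLASS,ATEN"
```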

Proposal

We want to deprecate the fallback in three steps.

Step 1: Gate the behavior with autotune_fallback_to_aten

- Set up autotune_fallback_to_aten to control the silent fallback behavior for the remaining ops (addmm, bmm, mixed mm, int mm, etc.).
- Remove excess fallback logic that is not gated by autotune_fallback_to_aten, for example, link.
- Add an environment variable to control autotune_fallback_to_aten as a kill switch (a sketch follows below).

In this step, we don’t expect any change in behavior.
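
A hypothetical sketch of what the Step 1 gating could look like in torch/_inductor/config.py; the environment variable name below is illustrative only, since this RFC does not pin it down:

```python
import os

# Hypothetical kill switch: defaulting to "1" keeps the fallback enabled, so
# there is no behavior change in this step.
autotune_fallback_to_aten: bool = (
    os.environ.get("TORCHINDUCTOR_AUTOTUNE_FALLBACK_TO_ATEN", "1") == "1"
)
```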

Step 2: Turn off autotune_fallback_to_aten

This should be a one-line change that flips the default and changes the behavior.
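
Continuing the same hypothetical sketch, the change would just flip the default of the gate:

```python
import os

# Step 2 as a one-line change: the default flips from "1" to "0", so the silent
# fallback is off unless the kill-switch environment variable re-enables it.
autotune_fallback_to_aten: bool = (
    os.environ.get("TORCHINDUCTOR_AUTOTUNE_FALLBACK_TO_ATEN", "0") == "1"
)
```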

Step 3: Cleanup

We would clean up the logic after three weeks to one month, assuming nothing breaks.

Alternatives

No response

Additional context

No response

cc @chauhang @penguinwu @voznesenskym @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @amjames @aakhundov

Labels

module: inductor, oncall: pt2, triaged (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
