[rfc] aot precompile with custom backend api #161383
This PR adds a new feature to `torch.compile(fullgraph=True)` that ahead-of-time compiles ("aot_compile") a function with given example inputs.
On the user side it looks like this:
```
def foo(x, y):
    return x + y

compiled_fn = torch.compile(foo, fullgraph=True).aot_compile(
    ((torch.randn(3, 4), torch.randn(3, 4)), {})
)
```
This differs from the traditional `torch.compile` workflow, where the compiled object is a drop-in replacement for the original eager model:
```
tensor input -> torch.compile() -> tensor output (and populates the cache entry)
```
`aot_compile` instead returns a compiled function as its result; it is purely functional and does not populate dynamo's compile cache:
```
tensor input -> aot_compile() -> compiled function
```
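To make the contrast concrete, here is a side-by-side sketch using `foo` from above. The `aot_compile` argument layout (a tuple of positional args plus a kwargs dict) is taken from this PR's description; the comments about cache behavior summarize the two diagrams:
```
# Traditional JIT workflow: compilation happens on the first call with
# real tensors and populates dynamo's cache entry on foo.
jit_fn = torch.compile(foo, fullgraph=True)
out = jit_fn(torch.randn(3, 4), torch.randn(3, 4))  # traces + compiles here

# AOT workflow (this PR): compilation happens eagerly at aot_compile()
# time; the result is a standalone compiled function.
aot_fn = torch.compile(foo, fullgraph=True).aot_compile(
    ((torch.randn(3, 4), torch.randn(3, 4)), {})
)
out = aot_fn(torch.randn(3, 4), torch.randn(3, 4))  # no tracing, no cache entry
```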
The aot-compiled function can also be saved to and loaded from disk:
```
torch.compile(foo, fullgraph=True).aot_compile(...).save_compiled_function("my/path")
compiled_fn = torch.compiler.load_compiled_function("my/path")
```
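Putting this together, a full save/load round trip might look like the sketch below (the path is arbitrary, and whether the loaded function revalidates guards at call time is not specified in this PR):
```
# Process A: compile ahead of time and persist the artifact.
torch.compile(foo, fullgraph=True).aot_compile(
    ((torch.randn(3, 4), torch.randn(3, 4)), {})
).save_compiled_function("my/path")

# Process B: load and call, with no recompilation.
loaded_fn = torch.compiler.load_compiled_function("my/path")
out = loaded_fn(torch.randn(3, 4), torch.randn(3, 4))  # same call shape as foo
```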
Right now we treat the compiler backend as a black box; it needs to implement the following interface to make its compile artifacts serializable:
```
class SerializableCallable:
    def save_compile_artifacts(): ...
    def load_compile_artifacts(): ...
```
We haven't implemented this for inductor yet, but that shouldn't be an issue, since the feature is gated behind `torch._dynamo.config.aot_compile` (which defaults to `False`); inductor support is left to a follow-up PR.
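For illustration, a custom backend satisfying this contract could look like the minimal sketch below. Only the two method names come from the interface above; the signatures, the pickle-based encoding, and the backend entry point `my_backend` are assumptions made for the example, not the actual API:
```
import pickle

class MyBackend:
    """Toy serializable backend: runs the captured FX graph unchanged."""

    def __init__(self, gm):
        self.gm = gm  # the torch.fx.GraphModule handed over by dynamo

    def __call__(self, *args):
        return self.gm(*args)

    def save_compile_artifacts(self):
        # Hypothetical signature: reduce the compiled result to bytes.
        return pickle.dumps(self.gm)

    @classmethod
    def load_compile_artifacts(cls, data):
        # Hypothetical signature: rebuild the callable from saved bytes.
        return cls(pickle.loads(data))

def my_backend(gm, example_inputs):
    # Standard dynamo custom-backend entry point, returning a
    # serializable callable instead of a plain function.
    return MyBackend(gm)
```
The point of the black-box contract is that dynamo never needs to understand the artifact format; the save/load round trip is delegated entirely to the backend.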
Differential Revision: [D80914270](https://our.internmc.facebook.com/intern/diff/D80914270/)
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @chenyang78 @kadeng @chauhang @amjames @Lucaskabela
This pull request was exported from Phabricator. Differential Revision: D80914270
@pytorchbot rebase

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict.
@pytorchbot rebase -b main

@pytorchbot started a rebase job onto refs/remotes/origin/main.
Successfully rebased
Landing to unblock ongoing work for vLLM; open to changing the code later.
@pytorchbot merge (Initiating merge automatically since Phabricator Diff has merged)
Merge started: your change will be merged once all checks pass (ETA 0-4 hours).
Merge failed. Reason: 1 job has failed: trunk / macos-py3-arm64 / test (mps, 1, 1, macos-m1-14).
@pytorchbot merge -i
Merge started: your change will be merged while ignoring the following 2 checks: trunk / macos-py3-arm64 / test (mps, 1, 1, macos-m1-14), trunk / macos-py3-arm64 / test (mps, 1, 1, macos-m2-15).
Pull Request resolved: pytorch#161383
Approved by: https://github.com/tugsbayasgalan