[aoti] Add cpp loader #135374

angelayi · 2024-09-06T17:34:22Z

Stack from ghstack (oldest at bottom):

-> [aoti] Add cpp loader #135374
* Added a cpp loader, `AOTIModelPackageLoader`, which can load the .pt2, build the .so, and create a runner. From Python, users call the `run` function directly; in cpp, users can also access `runner_` directly if they are more familiar with that. I couldn't figure out how to bind the `get_runner()` function to Python...
* Added a new config, `aot_inductor.package_cpp_only`, which will **not** package the .so. This means that whenever the package is loaded, we will need to build the .so. This is turned off by default so that new environments do not need to rebuild their .so. `package_cpp_only` is a feature that torchchat intends to use to provide flexibility to users.
* Added a new config, `aot_inductor.metadata`, which stores user-provided metadata, serialized to the .pt2 as a JSON file. It also stores the device used when exporting, "cuda" or "cpu", so that at load time we can use it to determine which `AOTIModelContainerRunner` to use. The metadata can be accessed through `loader.get_metadata()`. TODO: move this metadata to the top-level `package_aoti` function so that we can remove the metadata as a config.
* Separated out `package_aoti` as a standalone function, instead of it automatically being called in inductor. This prepares for the case where users compile multiple models and want to bundle them in one package. The specific use case is in torchchat, where we want to package the separately exported encoder and decoder layers. An example of how to use this is in `test_multiple_methods`.
* `load_package` will load a single model, given the model name.
* The loader doesn't support Windows for now. I think I need to add some more casing to make the build commands work on Windows?
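To make the packaging ideas above concrete, here is a minimal sketch using only the Python standard library. It is not the PR's implementation and does not call any torch API: the archive layout (`<model_name>/metadata.json`), the helper names `write_package` and `load_model`, and the `RUNNERS` table are all assumptions made for illustration. It shows several models bundled into one zip archive, per-model metadata stored as JSON, one model loaded by name, and the recorded device used to pick a runner.

```python
import io
import json
import zipfile

# Hypothetical stand-ins for device-specific runners. In the PR, the device
# recorded in the metadata decides which AOTIModelContainerRunner is created.
RUNNERS = {"cpu": "cpu-runner", "cuda": "cuda-runner"}


def write_package(models):
    """Bundle several models into one in-memory zip archive.

    `models` maps a model name to (metadata dict, {filename: bytes}).
    The layout <model_name>/<file> is an assumption for this sketch.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, (metadata, files) in models.items():
            zf.writestr(f"{name}/metadata.json", json.dumps(metadata))
            for filename, payload in files.items():
                zf.writestr(f"{name}/{filename}", payload)
    return buf.getvalue()


def load_model(package_bytes, model_name):
    """Load a single model by name and choose a runner from its metadata."""
    prefix = f"{model_name}/"
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        metadata = json.loads(zf.read(prefix + "metadata.json"))
        files = sorted(
            n[len(prefix):] for n in zf.namelist() if n.startswith(prefix)
        )
    device = metadata["device"]
    if device not in RUNNERS:
        raise ValueError(f"unsupported device in metadata: {device!r}")
    return metadata, RUNNERS[device], files


package = write_package(
    {
        "encoder": ({"device": "cpu", "version": "1"}, {"model.cpp": b"// encoder"}),
        "decoder": ({"device": "cuda", "version": "1"}, {"model.cpp": b"// decoder"}),
    }
)
metadata, runner, files = load_model(package, "decoder")
```

Keeping the device next to the user metadata means the loader needs no extra argument to decide between a CPU and a CUDA runner, which matches the motivation given in the description.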

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang

Differential Revision: D62329906

[ghstack-poisoned]

pytorch-bot · 2024-09-06T17:34:25Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/135374

📄 Preview Python docs built from this PR
📄 Preview C++ docs built from this PR
❓ Need help or want to give feedback on the CI? Visit the bot commands wiki or our office hours

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (1 Unrelated Failure)

As of commit 51be90c with merge base eb38ee2:

FLAKY - The following job failed but was likely due to flakiness present on trunk:

docker-builds / docker-build (linux.12xlarge, pytorch-linux-jammy-py3.9-gcc11-inductor-benchmarks) (gh) (detected as infra flaky with no log or failing log classifier)

This comment was automatically generated by Dr. CI and updates every 15 minutes.


angelayi · 2024-09-06T23:33:58Z

@angelayi has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

desertfire · 2024-09-07T21:07:10Z

torch/csrc/inductor/aoti_runner/model_container_runner.h

 };

-using CreateAOTIModelRunnerFunc = std::shared_ptr<AOTIModelContainerRunner> (*)(
+using CreateAOTIModelRunnerFunc = std::unique_ptr<AOTIModelContainerRunner> (*)(


cc @EikanWang, we are updating this to return `unique_ptr`.

desertfire · 2024-09-07T21:40:31Z

torch/csrc/inductor/aoti_package/model_package_loader.cpp

+}
+
+std::tuple<std::string, std::string> AOTIModelPackageLoader::
+    get_cpp_compile_command(


This works, but I wonder why we didn't store the whole compile option as one string in the first place?

Nikita mentioned it would be better to store it as a set of flags instead of one string, for better debuggability and hackability.
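The trade-off in this exchange can be illustrated with a small sketch. It is written in Python rather than the PR's C++, and only the `include_dirs` key comes from the diff under review; the other keys (`compiler`, `cflags`, `definitions`) and the function name `build_compile_command` are hypothetical. Storing flags as structured lists lets a reader inspect or override one group at a time, and the final command line is assembled only at the last step.

```python
import shlex


def build_compile_command(options, source, output):
    """Assemble an argv list from structured compile options.

    Only "include_dirs" mirrors a key seen in the PR diff; the remaining
    keys are assumptions made for this sketch.
    """
    cmd = [options["compiler"]]
    cmd += options.get("cflags", [])
    cmd += [f"-D{name}" for name in options.get("definitions", [])]
    cmd += [f"-I{path}" for path in options.get("include_dirs", [])]
    cmd += ["-c", source, "-o", output]
    return cmd


options = {
    "compiler": "g++",
    "cflags": ["-O3", "-fPIC"],
    "definitions": ["NDEBUG"],
    "include_dirs": ["/opt/include", "/tmp/my headers"],
}

cmd = build_compile_command(options, "model.cpp", "model.o")

# shlex.join quotes arguments that contain spaces, so the printed command
# can be pasted into a shell while debugging.
command_line = shlex.join(cmd)
```

Because each flag group stays a list until the end, dropping or replacing a single include directory is a list edit rather than string surgery, which is the debuggability argument made above.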

desertfire · 2024-09-07T21:42:27Z

torch/csrc/inductor/aoti_package/model_package_loader.cpp

+
+  std::string include_dirs_args = "";
+  for (auto& arg : compile_options["include_dirs"]) {
+    include_dirs_args += "-I" + arg.get<std::string>() + " ";


Are the directories stored as absolute paths or relative paths? This seems problematic if the compilation platform and deployment platform are different...

Discussed offline. The compiling-from-cpp work needs more thinking and will not be used in this PR.

Ah, I didn't consider this. I think I need to take a closer look at the cpp build side. For now this is gated by the flag `package_cpp_only`.
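One possible way to address the path concern, shown only as a sketch and not as what the PR does (the thread leaves this open): store include directories that live inside the package as paths relative to the package root, and resolve them against the extraction directory at load time. The function names `relativize` and `resolve` and all example paths are hypothetical. `PurePosixPath` is used so the example behaves the same on any host.

```python
from pathlib import PurePosixPath


def relativize(include_dirs, package_root):
    """At packaging time, rewrite dirs under package_root as relative paths.

    Directories outside the package (for example system headers) are left
    absolute, so they still depend on the deployment machine.
    """
    root = PurePosixPath(package_root)
    result = []
    for directory in include_dirs:
        path = PurePosixPath(directory)
        try:
            result.append(str(path.relative_to(root)))
        except ValueError:  # not located under package_root
            result.append(str(path))
    return result


def resolve(include_dirs, extract_root):
    """At load time, anchor relative dirs at the extraction directory."""
    root = PurePosixPath(extract_root)
    resolved = []
    for directory in include_dirs:
        path = PurePosixPath(directory)
        resolved.append(str(path if path.is_absolute() else root / path))
    return resolved


stored = relativize(["/build/pkg/include", "/usr/include"], "/build/pkg")
flags = ["-I" + d for d in resolve(stored, "/deploy/tmp/extracted")]
```

This only helps for headers shipped inside the package; absolute paths to system or toolchain headers remain a portability problem, which is consistent with the conclusion above that the cpp build side needs more thought.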


angelayi · 2024-09-09T20:42:48Z

@angelayi has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.


angelayi · 2024-09-10T23:30:11Z

@angelayi has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

angelayi · 2024-09-11T02:52:48Z

@pytorchbot merge

pytorchmergebot · 2024-09-11T02:54:32Z

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging

Check the merge workflow status here

Differential Revision: [D62329906](https://our.internmc.facebook.com/intern/diff/D62329906)
Pull Request resolved: pytorch#135374
Approved by: https://github.com/desertfire, https://github.com/malfet

* Apply changes from #135374
* Fix dependency on filesystem on Linux (#137209)

Similar to #134494. We are seeing #133437 come back due to the use of filesystem on Linux.

Pull Request resolved: #137209
Approved by: https://github.com/kit1980, https://github.com/malfet

Co-authored-by: atalman <atalman@fb.com>

[aoti] Add cpp loader

1b15c32

[ghstack-poisoned]

pytorch-bot bot added the ciflow/inductor and module: inductor labels Sep 6, 2024

angelayi mentioned this pull request Sep 6, 2024

[aoti] Save additional sources #135375

Closed

Update on "[aoti] Add cpp loader"

45a5958


angelayi requested review from desertfire and malfet September 6, 2024 17:37

angelayi mentioned this pull request Sep 6, 2024

[aoti] Add cpp loader #134865

Closed

angelayi added the release notes: export label Sep 6, 2024

desertfire reviewed Sep 6, 2024

torch/csrc/inductor/aoti_package/model_package_loader.cpp (outdated)

desertfire reviewed Sep 6, 2024

torch/_inductor/__init__.py (outdated)

angelayi requested a review from jeffdaily as a code owner September 6, 2024 21:18

angelayi added a commit that referenced this pull request Sep 6, 2024

[aoti] Add cpp loader

fec5fd4

ghstack-source-id: 46ce8d1 Pull Request resolved: #135374

angelayi requested a review from desertfire September 6, 2024 23:15

desertfire reviewed Sep 7, 2024

desertfire reviewed Sep 8, 2024

desertfire approved these changes Sep 9, 2024

pytorch-bot bot added the ciflow/trunk label Sep 9, 2024

malfet approved these changes Sep 9, 2024

test/inductor/test_aot_inductor_package.py

test/inductor/test_aot_inductor_package.py (outdated)

torch/csrc/inductor/aoti_package/model_package_loader.cpp

angelayi requested a review from a team as a code owner September 9, 2024 23:47

angelayi added a commit that referenced this pull request Sep 9, 2024

[aoti] Add cpp loader

7ba4ae7

ghstack-source-id: 55a1c31 Pull Request resolved: #135374

angelayi added a commit that referenced this pull request Sep 10, 2024

[aoti] Add cpp loader

9c10a09

ghstack-source-id: 332b89c Pull Request resolved: #135374

pytorchmergebot added the merging label Sep 11, 2024

pytorchmergebot added the Merged label Sep 11, 2024

pytorchmergebot closed this in cd9ee49 Sep 11, 2024

pytorchmergebot removed the merging label Sep 11, 2024

This was referenced Oct 3, 2024

[RELEASE-ONLY CHANGES] Fix dependency on filesystem on Linux #137241

Closed

[RELEASE-ONLY CHANGES] Fix dependency on filesystem on Linux #137242

Merged

github-actions bot deleted the gh/angelayi/54/head branch October 12, 2024 02:07

seemethere mentioned this pull request Jan 23, 2025

seg fault in aot_inductor_package on arm GPU with 2.6.0 RC #145441

Closed
