[AOTI] Add a multi_arch_kernel_binary option by desertfire · Pull Request #154413 · pytorch/pytorch · GitHub

Conversation

@desertfire
Contributor

@desertfire desertfire commented May 27, 2025

Stack from ghstack (oldest at bottom):

Summary: CUDA supports multi-arch device code via the fatbin format. Add a multi_arch_kernel_binary option so the compiled model binary can run across different GPU archs.

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov

Differential Revision: D75452094
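For context, a fatbin bundles compiled device code (SASS) for several concrete architectures, typically plus PTX for forward compatibility, into one binary. A minimal sketch of how such a multi-arch compile command could be assembled; this is illustrative only, not the actual Inductor codegen, and `fatbin_compile_cmd` is a hypothetical helper (the `nvcc` flags follow NVIDIA's documented `--generate-code` syntax):

```python
def fatbin_compile_cmd(src: str, out: str, archs: list[str]) -> list[str]:
    """Build an nvcc command that embeds SASS for each requested arch,
    plus PTX for the newest one so the fatbin can JIT on future GPUs.
    Hypothetical helper for illustration; not PyTorch's actual codegen."""
    cmd = ["nvcc", "-fatbin", src, "-o", out]
    for arch in archs:
        # SASS (cubin) for each concrete architecture
        cmd.append(f"--generate-code=arch=compute_{arch},code=sm_{arch}")
    # Forward-compatible PTX for the newest requested arch
    newest = max(archs, key=int)
    cmd.append(f"--generate-code=arch=compute_{newest},code=compute_{newest}")
    return cmd

print(" ".join(fatbin_compile_cmd("kernel.cu", "kernel.fatbin", ["80", "90"])))
```

A single fatbin produced this way can be loaded on, say, both A100 (sm_80) and H100 (sm_90) hardware, which is what lets the compiled model binary run across GPU archs.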

[ghstack-poisoned]
@pytorch-bot

pytorch-bot bot commented May 27, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/154413

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (3 Unrelated Failures)

As of commit 73ff440 with merge base ef6306e:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

[ghstack-poisoned]
@desertfire
Contributor Author

@desertfire has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@pytorch-bot pytorch-bot bot added the ciflow/trunk Trigger trunk jobs on your pull request label May 27, 2025
[ghstack-poisoned]

@pytorchmergebot
Collaborator

Starting merge as part of PR stack under #154414


pytorchmergebot pushed a commit that referenced this pull request May 28, 2025
Summary: Add support for multi_arch_kernel_binary in package_cpp_only mode. More specifically, generate CMake targets that compile .ptx to .fatbin and embed them in the final shared library or binary.

Differential Revision: [D75452096](https://our.internmc.facebook.com/intern/diff/D75452096)
Pull Request resolved: #154414
Approved by: https://github.com/angelayi
ghstack dependencies: #154412, #154413
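To make the .ptx-to-.fatbin step in the follow-up PR concrete, here is a hedged sketch of what such a generated CMake rule could look like. The emitter function and all target/file names are hypothetical, this is not the code AOTInductor actually generates; it only illustrates the idea of a custom command invoking nvcc on a .ptx input with multiple `--generate-code` targets:

```python
def ptx_to_fatbin_rule(kernel: str, archs: list[str]) -> str:
    """Emit an add_custom_command snippet compiling <kernel>.ptx into a
    multi-arch <kernel>.fatbin. Illustrative sketch only; names and
    structure are assumptions, not AOTInductor's actual generated CMake."""
    gencode = " ".join(
        f"--generate-code=arch=compute_{a},code=sm_{a}" for a in archs
    )
    return (
        "add_custom_command(\n"
        f"  OUTPUT {kernel}.fatbin\n"
        f"  COMMAND nvcc -fatbin {gencode} {kernel}.ptx -o {kernel}.fatbin\n"
        f"  DEPENDS {kernel}.ptx)\n"
    )

print(ptx_to_fatbin_rule("triton_kernel_0", ["80", "90"]))
```

The resulting .fatbin files can then be embedded (e.g. as object files or byte arrays) into the final shared library, which is the packaging step the commit message describes.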
desertfire added a commit that referenced this pull request May 28, 2025
ghstack-source-id: 55d13e3
etaf added a commit that referenced this pull request Jun 3, 2025
…l_binary option for XPU."


Following the design of #154413, this PR adds XPU support for generating kernel binary files that support multiple archs.

cc voznesenskym penguinwu EikanWang jgong5 Guobing-Chen XiaobingSuper zhuhaozhe blzheng wenzhe-nrv jiayisunx ipiszy chenyang78 kadeng muchulee8 amjames chauhang aakhundov

[ghstack-poisoned]
pytorchmergebot pushed a commit that referenced this pull request Jun 3, 2025
…154514)

Following the design of #154413, this PR adds XPU support for generating kernel binary files that support multiple archs.

Fixes #154682, Fixes #154683, Fixes #154689, Fixes #154685, Fixes #154690, Fixes #154681

Pull Request resolved: #154514
Approved by: https://github.com/desertfire, https://github.com/EikanWang
iupaikov-amd pushed a commit to ROCm/pytorch that referenced this pull request Jun 4, 2025
angelayi pushed a commit to angelayi/pytorch that referenced this pull request Jun 5, 2025
@github-actions github-actions bot deleted the gh/desertfire/578/head branch June 27, 2025 02:22
