Use -compress-mode=size for CUDA 13 build for binary size reduction by tinglvv · Pull Request #161316 · pytorch/pytorch · GitHub

@tinglvv
Collaborator

@tinglvv tinglvv commented Aug 22, 2025

#159779

CUDA 13 adds support for the nvcc --compress-mode flag across all drivers of the CUDA 13.X toolkits, which makes it possible to use --compress-mode=size for a significant size reduction (for example, ~71% smaller for the CUDA Math APIs). https://developer.nvidia.com/blog/whats-new-and-important-in-cuda-toolkit-13-0/

Why this is added for CUDA 13 only, quoting @ptrblck: Any usage of --compress-mode=size/balance will drop the support of older CUDA drivers and will bump the min. driver requirement to CUDA 12.4. #157791 (comment)

The default for CUDA 13 is --compress-mode=balance, which already gives smaller binaries than the LZ4 speed mode used in previous CUDA versions.
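
The version gating described above could look roughly like the following. This is a minimal sketch, not the actual change in this PR: the helper name nvcc_compress_flags and the idea of returning a flag list are assumptions made for illustration; only the --compress-mode=size flag and the "CUDA 13 only" rule come from the description.

```python
def nvcc_compress_flags(cuda_version: str) -> list[str]:
    """Return extra nvcc flags for a CUDA toolkit version string such as "13.0".

    Hypothetical helper: size compression is enabled only for CUDA 13 and
    newer, because on older toolkits it would raise the minimum driver
    requirement (see the quote from @ptrblck above).
    """
    major = int(cuda_version.split(".")[0])
    if major >= 13:
        return ["--compress-mode=size"]
    # Older toolkits keep their default compression behavior.
    return []


if __name__ == "__main__":
    for version in ("12.6", "12.8", "13.0"):
        print(version, nvcc_compress_flags(version))
```

Keeping the check on the major version only means any future 13.X toolkit picks up the flag without further changes, which matches the "all drivers of CUDA 13.X toolkits" wording.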

Related - #157791

cc @ptrblck @nWEIdia @atalman @malfet

@tinglvv tinglvv requested a review from a team as a code owner August 22, 2025 22:06
@pytorch-bot

pytorch-bot bot commented Aug 22, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/161316

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures, 3 Unrelated Failures

As of commit be55214 with merge base c8bb0e4 (image):

NEW FAILURES - The following jobs have failed:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@tinglvv tinglvv changed the title from "Use -compress-mode=size for CUDA 13 build" to "Use -compress-mode=size for CUDA 13 build for binary size reduction" Aug 22, 2025
@tinglvv tinglvv added the ciflow/binaries (Trigger all binary build and upload jobs on the PR) and topic: not user facing (topic category) labels Aug 22, 2025
@tinglvv tinglvv self-assigned this Aug 22, 2025
@tinglvv tinglvv mentioned this pull request Aug 22, 2025
15 tasks
@tinglvv
Collaborator Author

tinglvv commented Aug 24, 2025

@pytorchbot merge -i

@pytorch-bot pytorch-bot bot added the ciflow/trunk (Trigger trunk jobs on your pull request) label Aug 24, 2025
@pytorchmergebot
Collaborator

Merge started

Your change will be merged while ignoring the following 5 checks: windows-arm64-binary-libtorch-release / libtorch-cpu-shared-with-deps-release-build, windows-arm64-binary-libtorch-debug / libtorch-cpu-shared-with-deps-debug-build, macos-arm64-binary-wheel / wheel-py3_14-cpu-build, windows-binary-wheel / wheel-py3_14-xpu-build, windows-arm64-binary-wheel / wheel-py3_12-cpu-build

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team


@tinglvv
Collaborator Author

tinglvv commented Aug 24, 2025

Wheel size saving comparison, using [manywheel-py3_9-cuda13_0] as an example:
after: 579 MB in https://github.com/pytorch/pytorch/actions/runs/17167063966/job/48709750243
before: 725 MB in https://github.com/pytorch/pytorch/actions/runs/17172898848

146 MB saved, ~20.1% smaller
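
The saving quoted above follows directly from the two wheel sizes. A small check of the arithmetic, where the helper name size_reduction is an assumption made for illustration:

```python
def size_reduction(before_mb: float, after_mb: float) -> tuple[float, float]:
    """Return (MB saved, percent smaller) relative to the 'before' size."""
    saved = before_mb - after_mb
    percent = saved / before_mb * 100
    return saved, percent


if __name__ == "__main__":
    # Wheel sizes reported in the comment above.
    saved, percent = size_reduction(725, 579)
    print(f"{saved:.0f} MB saved, ~{percent:.1f}% smaller")
```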

markc-614 pushed a commit to markc-614/pytorch that referenced this pull request Sep 17, 2025
…ytorch#161316)

Pull Request resolved: pytorch#161316
Approved by: https://github.com/nWEIdia, https://github.com/Skylion007
@atalman atalman removed this from PyTorch + CUDA Sep 26, 2025

Labels

ciflow/binaries (Trigger all binary build and upload jobs on the PR), ciflow/trunk (Trigger trunk jobs on your pull request), Merged, open source, topic: not user facing (topic category)


5 participants