-
Notifications
You must be signed in to change notification settings - Fork 1.8k
[#6529][feat] CMake option to link statically with cublas/curand #7178
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 WalkthroughWalkthroughAdds two public CMake options Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant CMake
participant Options as "CUBLAS/CURAND options"
participant Platform as "OS check"
participant LinkTargets as "CUDA link targets"
CMake->>Options: read CUBLAS_DYNAMIC_LINKING / CURAND_DYNAMIC_LINKING
CMake->>Platform: check OS
alt dynamic linking selected
CMake->>LinkTargets: select CUDA::cublas / CUDA::cublasLt / CUDA::curand
Note right of LinkTargets #a9d18e: Link dynamic libs
else static linking selected
Platform->>CMake: is Windows?
alt Windows
CMake->>CMake: message(FATAL_ERROR "Static ... not available on windows")
else Non-Windows
CMake->>LinkTargets: select CUDA::cublas_static / CUDA::cublasLt_static / CUDA::curand_static
Note right of LinkTargets #f6d19b: message(DEBUG "Linking with static ...")
end
end
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Possibly related issues
Suggested reviewers
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. 📜 Recent review detailsConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro 💡 Knowledge Base configuration:
You can enable these sources in your CodeRabbit configuration. 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
✨ Finishing Touches🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
Status, Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (3)
cpp/CMakeLists.txt (3)
159-161: Guard the option as Linux-only to match the help textThe help says “(linux only)” but there’s no platform guard. On Windows this is likely to fail (static cublas/curand are typically unsupported there), and the mismatch between docs and behavior is confusing.
Add an explicit guard:
option(PREFER_STATIC_CUDA_LIBS "Link with static cuda libs (cublas, curand) (linux only)" OFF) +if(PREFER_STATIC_CUDA_LIBS AND WIN32) + message(FATAL_ERROR "PREFER_STATIC_CUDA_LIBS is supported on Linux only.") +endif()
159-171: Log the resolved CUDA libs for easier triageA short status print of the chosen libs helps debug CI matrix variants and user reports.
endif() +message(STATUS "Using CUDA libs: " + "CUBLAS_LIB=${CUBLAS_LIB}, " + "CUBLASLT_LIB=${CUBLASLT_LIB}, " + "CURAND_LIB=${CURAND_LIB}")
159-171: CI follow-up: add a job that toggles PREFER_STATIC_CUDA_LIBSGiven the option defaults to OFF, it can silently rot. Add a CI axis that configures with -DPREFER_STATIC_CUDA_LIBS=ON to validate configuration and linking paths.
Happy to draft a minimal CMake configure-only CI job (e.g., a Jenkins pipeline tweak or a GitHub Actions matrix entry) that exercises this flag without running heavy tests. Want me to open a follow-up PR?
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
cpp/CMakeLists.txt(1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Pre-commit Check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please follow the variable for NVRTC:
TensorRT-LLM/cpp/CMakeLists.txt
Line 47 in 71eef0e
| option(NVRTC_DYNAMIC_LINKING "Link against the dynamic NVRTC libraries" OFF) |
And please address the bot's comments. (The issues are valid, but the suggested changes are not good enough to be accepted as is)
Ok, but do you mean: CUBLAS_DYNAMIC_LINKING and CURAND_DYNAMIC_LINKING ? |
Yes, please. |
71eef0e to
b7eecbc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
cpp/CMakeLists.txt (2)
163-184: Verify downstream link lines include pthread/dl/rt when using static CUDA libsWhen static CUDA libs are selected, transitive deps (via culibos) typically require Threads::Threads, dl, and librt. This file only finds RT_LIB, and the actual target_link_libraries calls live in subdirectories. Please confirm those targets link Threads::Threads, dl, and ${RT_LIB} when static is selected to avoid undefined reference errors.
Run from repo root:
#!/bin/bash set -euo pipefail echo "1) Where are CUDA vars used on link lines?" rg -nP -C2 'target_link_libraries\([^)]*\$\{(CUBLAS_LIB|CUBLASLT_LIB|CURAND_LIB)\}' -- glob='!**/build/**' echo echo "2) Do we link pthread/dl/rt anywhere?" rg -nP -C2 'target_link_libraries\([^)]*(Threads::Threads|(\s|^)dl(\s|$)|\$\{RT_LIB\}|[^\w]rt[^\w])' -- glob='!**/build/**' echo echo "3) If missing, suggested pattern to add (example):" cat <<'EOF' find_package(Threads REQUIRED) # find_library(RT_LIB rt) already present at top-level target_link_libraries(<your_target> PUBLIC ${CUBLAS_LIB} ${CURAND_LIB} Threads::Threads dl ${RT_LIB}) EOF
160-162: Make CUDAToolkit component resolution conditional so static targets are discovered/fail fastRight now, find_package unconditionally requests the shared components (cublas/cublasLt/curand). If users turn OFF the dynamic-linking flags, CMake may still configure without the static imported targets present, leading to confusing link-time errors. Build the component list from the options and request cublas_static/cublasLt_static/curand_static when needed; add a target-existence check to fail early with a clear message.
Proposed diff:
-find_package(CUDAToolkit 11.2 REQUIRED COMPONENTS cudart_static cuda_driver - cublas cublasLt curand nvml) +set(_cuda_components cudart_static cuda_driver nvml) +if(CUBLAS_DYNAMIC_LINKING) + list(APPEND _cuda_components cublas cublasLt) +else() + list(APPEND _cuda_components cublas_static cublasLt_static) +endif() +if(CURAND_DYNAMIC_LINKING) + list(APPEND _cuda_components curand) +else() + list(APPEND _cuda_components curand_static) +endif() +find_package(CUDAToolkit 11.2 REQUIRED COMPONENTS ${_cuda_components}) + +# Fail fast if requested static targets are missing +if(NOT CUBLAS_DYNAMIC_LINKING) + foreach(_t IN ITEMS CUDA::cublas_static CUDA::cublasLt_static) + if(NOT TARGET ${_t}) + message(FATAL_ERROR + "${_t} not found. Install CUDA static libraries or enable CUBLAS_DYNAMIC_LINKING.") + endif() + endforeach() +endif() +if(NOT CURAND_DYNAMIC_LINKING) + if(NOT TARGET CUDA::curand_static) + message(FATAL_ERROR + "CUDA::curand_static not found. Install CUDA static libraries or enable CURAND_DYNAMIC_LINKING.") + endif() +endif()
🧹 Nitpick comments (2)
cpp/CMakeLists.txt (2)
48-51: Options added match the NVRTC naming pattern; defaults preserve existing behaviorUsing CUBLAS_DYNAMIC_LINKING/CURAND_DYNAMIC_LINKING mirrors NVRTC_DYNAMIC_LINKING and keeps dynamic linking as the default. This satisfies the prior review guidance while keeping current builds unchanged. Consider clarifying the PR description and user docs that “static linking is opt-in (OFF for dynamic-linking flags)” to avoid confusion. Also consider marking these options as ADVANCED on Windows since static isn’t supported there.
Apply to docs (no code change required here).
163-184: Selection logic looks correct and no longer overridden later — small messaging/UX tweaks
- The previous unconditional resets of CUBLAS_LIB/CUBLASLT_LIB/CURAND_LIB are gone — the option now takes effect. LGTM.
- message(DEBUG …) is easy to miss in most builds; consider STATUS so users see which mode was picked.
- Optional: on Windows, instead of FATAL_ERROR at use-time, you could force the options ON or mark them ADVANCED to prevent toggling them in the cache GUI on that platform.
Suggested tweak to visibility only:
- message(DEBUG "Linking with static cublas libs") + message(STATUS "Linking with static cublas/cublasLt libraries") ... - message(DEBUG "Linking with static curand lib") + message(STATUS "Linking with static curand library")
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
cpp/CMakeLists.txt(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Pre-commit Check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Thank you @WilliamTambellini !
10d60be to
e880ae5
Compare
|
Let's clean up the logic a little bit, so we don't have two I can't push to your branch, so please apply the patch below. Thanks. Index: cpp/CMakeLists.txt
===================================================================
diff --git a/cpp/CMakeLists.txt b/cpp/CMakeLists.txt
--- a/cpp/CMakeLists.txt (revision 54afbca0582e610adb3bbee15d1e854762576800)
+++ b/cpp/CMakeLists.txt (date 1756866837547)
@@ -157,65 +157,57 @@
# after that CMake handles it just fine.
setup_cuda_architectures()
+set(CUDA_DRV_LIB CUDA::cuda_driver)
+set(CUDA_NVML_LIB CUDA::nvml)
+set(CUDA_RT_LIB CUDA::cudart_static)
+set(NVPTX_LIB CUDA::nvptxcompiler_static)
+
+set(CUDA_TOOLKIT_COMPONENTS cudart_static cuda_driver nvml nvptxcompiler_static)
+
if(CUBLAS_DYNAMIC_LINKING)
- set(CUBLAS_COMP cublas cublasLt)
+ set(CUBLAS_LIB CUDA::cublas)
+ set(CUBLASLT_LIB CUDA::cublasLt)
+ list(APPEND CUDA_TOOLKIT_COMPONENTS cublas cublasLt)
else()
if(WIN32)
message(FATAL_ERROR "Static cublas not available on windows")
endif()
message(DEBUG "Linking with static cublas libs")
- set(CUBLAS_COMP cublas_static cublasLt_static)
+
+ set(CUBLAS_LIB CUDA::cublas_static)
+ set(CUBLASLT_LIB CUDA::cublasLt_static)
+ list(APPEND CUDA_TOOLKIT_COMPONENTS cublas_static cublasLt_static)
endif()
if(CURAND_DYNAMIC_LINKING)
- set(CURAND_COMP curand)
+ set(CURAND_LIB CUDA::curand)
+ list(APPEND CUDA_TOOLKIT_COMPONENTS curand)
else()
if(WIN32)
message(FATAL_ERROR "Static curand not available on windows")
endif()
message(DEBUG "Linking with static curand lib")
- set(CURAND_COMP curand_static)
-endif()
-
-if(NVRTC_DYNAMIC_LINKING)
- set(NVRTC_COMP nvrtc)
-else()
- set(NVRTC_COMP nvrtc_static)
-endif()
-find_package(
- CUDAToolkit 11.2 REQUIRED
- COMPONENTS cudart_static cuda_driver nvml ${CUBLAS_COMP} ${CURAND_COMP}
- ${NVRTC_COMP})
-
-if(CUBLAS_DYNAMIC_LINKING)
- set(CUBLAS_LIB CUDA::cublas)
- set(CUBLASLT_LIB CUDA::cublasLt)
-else()
- set(CUBLAS_LIB CUDA::cublas_static)
- set(CUBLASLT_LIB CUDA::cublasLt_static)
-endif()
-
-if(CURAND_DYNAMIC_LINKING)
- set(CURAND_LIB CUDA::curand)
-else()
set(CURAND_LIB CUDA::curand_static)
+ list(APPEND CUDA_TOOLKIT_COMPONENTS curand_static)
endif()
-set(CUDA_DRV_LIB CUDA::cuda_driver)
-set(CUDA_NVML_LIB CUDA::nvml)
-set(CUDA_RT_LIB CUDA::cudart_static)
-set(NVPTX_LIB CUDA::nvptxcompiler_static)
-set(CMAKE_CUDA_RUNTIME_LIBRARY Static)
-
if(NVRTC_DYNAMIC_LINKING)
set(NVRTC_LIB CUDA::nvrtc)
set(NVRTC_BUILTINS_LIB CUDA::nvrtc_builtins)
+ list(APPEND CUDA_TOOLKIT_COMPONENTS nvrtc nvrtc_builtins)
else()
set(NVRTC_LIB CUDA::nvrtc_static)
set(NVRTC_BUILTINS_LIB CUDA::nvrtc_builtins_static)
+ list(APPEND CUDA_TOOLKIT_COMPONENTS nvrtc_static nvrtc_builtins_static)
endif()
+find_package(
+ CUDAToolkit 11.2 REQUIRED
+ COMPONENTS ${CUDA_TOOLKIT_COMPONENTS})
+
+set(CMAKE_CUDA_RUNTIME_LIBRARY Static)
+
resolve_dirs(CUDAToolkit_INCLUDE_DIRS "${CUDAToolkit_INCLUDE_DIRS}")
message(STATUS "CUDA library status:") |
…th cublas/curand Add 2 CMake options (CUBLAS/CURAND_DYNAMIC_LINKING) in order to link statically with cublas/curand. Linux only. OFF by default. Signed-off-by: William Tambellini <wtambellini@sdl.com>
54afbca to
0d4b298
Compare
|
done |
|
/bot run |
|
PR_Github #17474 [ run ] triggered by Bot |
|
PR_Github #17474 [ run ] completed with state |
|
/bot run |
1 similar comment
|
/bot run |
|
PR_Github #17593 [ run ] triggered by Bot |
|
PR_Github #17593 [ run ] completed with state |
|
/bot run |
|
PR_Github #18126 [ run ] triggered by Bot |
|
PR_Github #18126 [ run ] completed with state |
NVIDIA#7178) Close NVIDIA#6529. Signed-off-by: William Tambellini <wtambellini@sdl.com>
Add a CMake option in order to link statically with cublas/curand.
OFF by default.
Description
Issue: no option to link statically with cublas/curand
Solution: add a cmake option and set the lib to cublas/curand_static
Test Coverage
Would need another CI job with this option on.
GitHub Bot Help
/bot [-h] ['run', 'kill', 'skip', 'reuse-pipeline'] ...Provide a user friendly way for developers to interact with a Jenkins server.
Run
/bot [-h|--help]to print this help message.See details below for each supported subcommand.
run [--reuse-test (optional)pipeline-id --disable-fail-fast --skip-test --stage-list "A10-PyTorch-1, xxx" --gpu-type "A30, H100_PCIe" --test-backend "pytorch, cpp" --add-multi-gpu-test --only-multi-gpu-test --disable-multi-gpu-test --post-merge --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx" --detailed-log --debug(experimental)]Launch build/test pipelines. All previously running jobs will be killed.
--reuse-test (optional)pipeline-id(OPTIONAL) : Allow the new pipeline to reuse build artifacts and skip successful test stages from a specified pipeline or the last pipeline if no pipeline-id is indicated. If the Git commit ID has changed, this option will be always ignored. The DEFAULT behavior of the bot is to reuse build artifacts and successful test results from the last pipeline.--disable-reuse-test(OPTIONAL) : Explicitly prevent the pipeline from reusing build artifacts and skipping successful test stages from a previous pipeline. Ensure that all builds and tests are run regardless of previous successes.--disable-fail-fast(OPTIONAL) : Disable fail fast on build/tests/infra failures.--skip-test(OPTIONAL) : Skip all test stages, but still run build stages, package stages and sanity check stages. Note: Does NOT update GitHub check status.--stage-list "A10-PyTorch-1, xxx"(OPTIONAL) : Only run the specified test stages. Examples: "A10-PyTorch-1, xxx". Note: Does NOT update GitHub check status.--gpu-type "A30, H100_PCIe"(OPTIONAL) : Only run the test stages on the specified GPU types. Examples: "A30, H100_PCIe". Note: Does NOT update GitHub check status.--test-backend "pytorch, cpp"(OPTIONAL) : Skip test stages which don't match the specified backends. Only support [pytorch, cpp, tensorrt, triton]. Examples: "pytorch, cpp" (does not run test stages with tensorrt or triton backend). Note: Does NOT update GitHub pipeline status.--only-multi-gpu-test(OPTIONAL) : Only run the multi-GPU tests. Note: Does NOT update GitHub check status.--disable-multi-gpu-test(OPTIONAL) : Disable the multi-GPU tests. Note: Does NOT update GitHub check status.--add-multi-gpu-test(OPTIONAL) : Force run the multi-GPU tests in addition to running L0 pre-merge pipeline.--post-merge(OPTIONAL) : Run the L0 post-merge pipeline instead of the ordinary L0 pre-merge pipeline.--extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx"(OPTIONAL) : Run the ordinary L0 pre-merge pipeline and specified test stages. Examples: --extra-stage "H100_PCIe-TensorRT-Post-Merge-1, xxx".--detailed-log(OPTIONAL) : Enable flushing out all logs to the Jenkins console. This will significantly increase the log volume and may slow down the job.--debug(OPTIONAL) : Experimental feature. Enable access to the CI container for debugging purpose. Note: Specify exactly one stage in thestage-listparameter to access the appropriate container environment. Note: Does NOT update GitHub check status.For guidance on mapping tests to stage names, see
docs/source/reference/ci-overview.mdand the
scripts/test_to_stage_mapping.pyhelper.kill
killKill all running builds associated with pull request.
skip
skip --comment COMMENTSkip testing for latest commit on pull request.
--comment "Reason for skipping build/test"is required. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.reuse-pipeline
reuse-pipelineReuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break.
Summary by CodeRabbit
New Features
Documentation