-
Notifications
You must be signed in to change notification settings - Fork 1.8k
[None][fix] Clean up linking to CUDA stub libraries in build_wheel.py #6823
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[None][fix] Clean up linking to CUDA stub libraries in build_wheel.py #6823
Conversation
📝 WalkthroughWalkthroughConsolidates OpenCV steps in the Dockerfile; refactors Python binding stub generation into OS-specific helper functions with CUDA stub symlink handling and typing/signature updates in scripts/build_wheel.py; and adds CUDA driver linkage while removing CUDA stubs rpath entries in two CMakeLists. Changes
Sequence Diagram(s)sequenceDiagram
participant Dev as Developer (invoke build)
participant BW as build_wheel.py (main)
participant Venv as setup_venv
participant GenL as generate_python_stubs_linux
participant GenW as generate_python_stubs_windows
participant CUDA as create_cuda_stub_links
participant Stub as Stub Generator
Dev->>BW: start build
BW->>Venv: setup_venv() -> (venv_python, lib_dir)
alt Linux
BW->>GenL: generate_python_stubs_linux(binding_type, venv_python, deep_ep)
GenL->>CUDA: create_cuda_stub_links(missing_libs)
GenL->>Stub: install/run nanobind & pybind11-stubgen and generate stubs
else Windows
BW->>GenW: generate_python_stubs_windows(binding_type, venv_python, pkg_dir, lib_dir)
GenW->>Stub: install pybind11-stubgen, preload Torch/TRT, run stubgen, cleanup
end
BW-->>Dev: build completes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
Status, Documentation and Community
|
|
/bot run --skip-multigpu-tests |
|
/bot run --disable-multi-gpu-test |
|
PR_Github #14946 Bot args parsing error: usage: /bot [-h] |
|
PR_Github #14950 [ run ] triggered by Bot |
|
PR_Github #14950 [ run ] completed with state |
|
/bot -h |
GitHub Bot Help
Provide a user friendly way for developers to interact with a Jenkins server. Run See details below for each supported subcommand.
Launch build/test pipelines. All previously running jobs will be killed.
kill
Kill all running builds associated with pull request. skip
Skip testing for latest commit on pull request. reuse-pipeline
Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break. |
|
/bot run --disable-multi-gpu-test --disable-fail-fast |
|
PR_Github #14978 [ run ] triggered by Bot |
113cab8 to
bdfbeb5
Compare
|
/bot kill |
|
PR_Github #14979 [ kill ] triggered by Bot |
|
PR_Github #14978 [ run ] completed with state |
|
PR_Github #14979 [ kill ] completed with state |
|
/bot run --disable-multi-gpu-test --disable-fail-fast |
|
PR_Github #14980 [ run ] triggered by Bot |
|
PR_Github #14980 [ run ] completed with state |
|
/bot run --disable-multi-gpu-test --disable-fail-fast |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
♻️ Duplicate comments (1)
scripts/build_wheel.py (1)
461-464: Windows mutation of extra_cmake_vars may TypeError with tuple defaultSee prior comment. Using += on a tuple will raise a TypeError if main() is invoked with its default. Apply the fix to coerce to list before concatenation.
🧹 Nitpick comments (1)
scripts/build_wheel.py (1)
123-124: Good: add return type annotation; minor docstring nitAdding -> tuple[Path, Path] improves clarity. Consider aligning the docstring’s “Returns:” type with the annotation (tuple[Path, Path]) for consistency with Google-style docstrings.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
cpp/tensorrt_llm/nanobind/CMakeLists.txt(2 hunks)cpp/tensorrt_llm/pybind/CMakeLists.txt(2 hunks)scripts/build_wheel.py(5 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)
**/*.py: Python code must target Python 3.8+
Python indentation: 4 spaces, no tabs
Maintain module namespace in imports (from package.subpackage import foo; then use foo.SomeClass())
Python file names use snake_case
Python class names use PascalCase
Python functions/methods and local variables use snake_case; variables starting with a number get k_ prefix (e.g., k_99th_percentile)
Global variables use G_ prefixed UPPER_SNAKE_CASE (e.g., G_MY_GLOBAL)
Constants use UPPER_SNAKE_CASE in Python
Avoid shadowing variables from outer scopes in Python
Initialize all externally visible members of a Python class in init
Prefer docstrings for interfaces used outside a file; comments for local code
Use Google-style docstrings for classes and functions (Sphinx-parsable)
Document attributes/variables inline with short docstrings
Avoid reflection when simple alternatives exist (e.g., prefer explicit parameters over dict(**locals()))
In try/except, catch the narrowest exceptions possible
For duck-typing with try/except, keep try body minimal and put logic in else
Files:
scripts/build_wheel.py
**/*.{cpp,cxx,cc,cu,h,hpp,hxx,hh,cuh,py}
📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)
Prepend NVIDIA copyright header (current year) to all source files
Files:
scripts/build_wheel.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Pre-commit Check
🔇 Additional comments (6)
cpp/tensorrt_llm/pybind/CMakeLists.txt (2)
47-47: Correct: link against CUDA driver explicitlyAdding CUDA::cuda_driver makes the dependency explicit and removes reliance on the Toolkit stubs at runtime. This aligns with the PR goal.
59-60: RPATH cleanup looks good; verify loader behavior in stubgen environmentsRemoving the CUDA stubs rpath is appropriate if the build-time stubs aren’t needed at runtime. Given scripts/build_wheel.py now manages LD_LIBRARY_PATH for stubs, this is consistent. Please verify that stub generation succeeds in environments without a host driver or system libcuda/libnvidia-ml by relying solely on the updated script logic.
cpp/tensorrt_llm/nanobind/CMakeLists.txt (2)
46-46: Correct: link against CUDA driver explicitlyMirroring pybind, adding CUDA::cuda_driver is the right move for consistency and to avoid Toolkit stub rpaths.
58-60: RPATH cleanup aligns with pybind; confirm stubgen workflows cover nanobindThe rpath cleanup mirrors the pybind change. Ensure that the Linux stub generation path for nanobind (nanobind.stubgen) still works in environments lacking the real driver, relying on the new LD_LIBRARY_PATH setup.
scripts/build_wheel.py (2)
30-30: Typing import change is fineSwitching to Sequence is appropriate for the updated signature in main().
804-813: Good refactor: centralize stub generation per-OSSwitching to generate_python_stubs_windows/Linux improves readability and maintainability. After fixing the Windows quoting and LD_LIBRARY_PATH handling on Linux, this flow should be robust.
|
PR_Github #15151 [ run ] triggered by Bot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (4)
scripts/build_wheel.py (4)
284-317: Avoid usingsudoin build scripts and handle user-writable symlink directoryCreating symlinks in system directories using
sudocan fail in CI environments and is a security risk. The function should create symlinks in a user-writable temporary directory when permission is denied.Apply this refactor to avoid
sudoand return the symlink directory:-def create_cuda_stub_links(cuda_stub_dir: str, missing_libs: list[str]): +def create_cuda_stub_links(cuda_stub_dir: str, missing_libs: list[str]) -> Path: """ Creates symbolic links for CUDA stub libraries in the provided directory. Args: cuda_stub_dir (str): Path to the directory containing CUDA stubs. missing_libs: Versioned names of the missing libraries. + + Returns: + Path: Directory containing the symlinks (either cuda_stub_dir or a temp dir) """ cuda_stub_path = Path(cuda_stub_dir) if not cuda_stub_path.exists(): raise RuntimeError( f"CUDA stub directory '{cuda_stub_dir}' does not exist.") + link_dir = cuda_stub_path version_pattern = r'\.\d+$' - for missing_lib in filter(lambda x: re.search(version_pattern, x), - missing_libs): - # Define `so` as the first part of `missing_lib` with trailing '.' and digits removed - so = cuda_stub_path / re.sub(version_pattern, '', missing_lib) - so_versioned = cuda_stub_path / missing_lib - - # Check if the library exists and the versioned link does not. - if so.exists() and not so_versioned.exists(): - try: - # Attempt to create the symbolic link. - so_versioned.symlink_to(so) - except PermissionError: - # Handle permission errors by attempting to use `sudo` to create the link. - try: - build_run(f"sudo ln -s {str(so)} {str(so_versioned)}") - except CalledProcessError as sudo_error: - print( - f"Failed to create symbolic link even with sudo: {sudo_error}" - ) + + try: + # Try creating symlinks in the original directory + for missing_lib in filter(lambda x: re.search(version_pattern, x), + missing_libs): + so = cuda_stub_path / re.sub(version_pattern, '', missing_lib) + so_versioned = cuda_stub_path / missing_lib + + if so.exists() and not so_versioned.exists(): + so_versioned.symlink_to(so) + except PermissionError: + # Fallback to user-writable temp directory + import tempfile + link_dir = Path(tempfile.mkdtemp(prefix="tllm_cuda_stubs_")) + print(f"Creating symlinks in temporary directory: {link_dir}") + + for missing_lib in filter(lambda x: re.search(version_pattern, x), + missing_libs): + so = cuda_stub_path / re.sub(version_pattern, '', missing_lib) + target = link_dir / missing_lib + + if so.exists() and not target.exists(): + # Also copy the original .so file to the temp dir + import shutil + shutil.copy2(so, link_dir / so.name) + target.symlink_to(so.name) + + return link_dir
333-367: Use the returned symlink directory in LD_LIBRARY_PATHThe function now needs to use the directory returned by
create_cuda_stub_linksinstead of always using the originalcuda_stub_dir.Apply this fix:
def generate_python_stubs_linux(binding_type: str, venv_python: Path, deep_ep: bool): is_nanobind = binding_type == "nanobind" if is_nanobind: build_run(f"\"{venv_python}\" -m pip install nanobind") build_run(f"\"{venv_python}\" -m pip install pybind11-stubgen") env_stub_gen = os.environ.copy() cuda_home_dir = env_stub_gen.get("CUDA_HOME") or env_stub_gen.get( "CUDA_PATH") or "/usr/local/cuda" missing_libs = check_missing_libs("bindings") cuda_stub_dir = f"{cuda_home_dir}/lib64/stubs" if missing_libs and Path(cuda_stub_dir).exists(): # Create symbolic links for the CUDA stubs - create_cuda_stub_links(cuda_stub_dir, missing_libs) + link_dir = create_cuda_stub_links(cuda_stub_dir, missing_libs) ld_library_path = env_stub_gen.get("LD_LIBRARY_PATH") env_stub_gen[ - "LD_LIBRARY_PATH"] = f"{ld_library_path}:{cuda_stub_dir}" if ld_library_path else cuda_stub_dir + "LD_LIBRARY_PATH"] = f"{ld_library_path}:{link_dir}" if ld_library_path else str(link_dir) if is_nanobind: - build_run(f"\"{venv_python}\" -m nanobind.stubgen -m bindings -O .", + build_run(f"\"{venv_python}\" -m nanobind.stubgen -m bindings -O . --exit-code", env=env_stub_gen) else: build_run( f"\"{venv_python}\" -m pybind11_stubgen -o . bindings --exit-code", env=env_stub_gen)
369-397: Fix invalid Python string escaping and usesys.exitThe Windows stub generation has several issues:
- Uses
exit(1)instead ofsys.exit(1)- Invalid escaping
r\"{lib_dir}\"in the generated Python code- Should use an f-string for cleaner interpolation
Apply this fix:
def generate_python_stubs_windows(binding_type: str, venv_python: Path, pkg_dir: Path, lib_dir: Path): if binding_type == "nanobind": print("Windows not yet supported for nanobind stubs") - exit(1) + sys.exit(1) else: build_run(f"\"{venv_python}\" -m pip install pybind11-stubgen") stubgen = "stubgen.py" - stubgen_contents = """ + stubgen_contents = f""" # Loading torch, trt before bindings is required to avoid import errors on windows. # isort: off import torch import tensorrt as trt # isort: on import os import platform from pybind11_stubgen import main if __name__ == "__main__": # Load dlls from `libs` directory before launching bindings. if platform.system() == "Windows": - os.add_dll_directory(r\"{lib_dir}\") + os.add_dll_directory(r"{lib_dir}") main() - """.format(lib_dir=lib_dir) + """ (pkg_dir / stubgen).write_text(dedent(stubgen_contents)) - build_run(f"\"{venv_python}\" {stubgen} -o . bindings") + build_run(f"\"{venv_python}\" {stubgen} -o . bindings --exit-code") (pkg_dir / stubgen).unlink()
406-407: Ensure safe mutation ofextra_cmake_varsparameterThe
main()function signature was changed to useSequence[str] = tuple()forextra_cmake_vars, but line 481 attempts to mutate it with+=. This will fail if a tuple is passed.Apply this fix at the beginning of the
main()function:def main(*, build_type: str = "Release", generator: str = "", build_dir: Path = None, dist_dir: Path = None, cuda_architectures: str = None, job_count: int = None, extra_cmake_vars: Sequence[str] = tuple(), extra_make_targets: str = "", # ... other parameters ... ): + # Convert to list to allow mutation + extra_cmake_vars = list(extra_cmake_vars) if clean: clean_wheel = TrueOr alternatively, use a safer approach at line 481:
if on_windows: # Windows does not support multi-device currently. - extra_cmake_vars += ["ENABLE_MULTI_DEVICE=0"] + extra_cmake_vars = list(extra_cmake_vars) + ["ENABLE_MULTI_DEVICE=0"]Also applies to: 481-481
🧹 Nitpick comments (1)
scripts/build_wheel.py (1)
319-331: Consider error handling improvements incheck_missing_libsThe function correctly identifies missing libraries but could benefit from better error handling if the
lddcommand fails or if the.sofile doesn't exist.Consider adding error handling:
def check_missing_libs(so_prefix: str) -> list[str]: + import glob + so_files = glob.glob(f"{so_prefix}.cpython*.so") + if not so_files: + print(f"Warning: No shared library found matching pattern {so_prefix}.cpython*.so") + return [] + + try: result = build_run(f"ldd {so_prefix}.cpython*.so", capture_output=True, text=True) + except CalledProcessError as e: + print(f"Warning: ldd command failed: {e}") + return [] + missing = [] for line in result.stdout.splitlines(): if "not found" in line: lib_name = line.split()[ 0] # Extract the library name before "=> not found" if lib_name not in missing: missing.append(lib_name) return missing
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
scripts/build_wheel.py(6 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)
**/*.py: Python code must target Python 3.8+
Python indentation: 4 spaces, no tabs
Maintain module namespace in imports (from package.subpackage import foo; then use foo.SomeClass())
Python file names use snake_case
Python class names use PascalCase
Python functions/methods and local variables use snake_case; variables starting with a number get k_ prefix (e.g., k_99th_percentile)
Global variables use G_ prefixed UPPER_SNAKE_CASE (e.g., G_MY_GLOBAL)
Constants use UPPER_SNAKE_CASE in Python
Avoid shadowing variables from outer scopes in Python
Initialize all externally visible members of a Python class in init
Prefer docstrings for interfaces used outside a file; comments for local code
Use Google-style docstrings for classes and functions (Sphinx-parsable)
Document attributes/variables inline with short docstrings
Avoid reflection when simple alternatives exist (e.g., prefer explicit parameters over dict(**locals()))
In try/except, catch the narrowest exceptions possible
For duck-typing with try/except, keep try body minimal and put logic in else
Files:
scripts/build_wheel.py
**/*.{cpp,cxx,cc,cu,h,hpp,hxx,hh,cuh,py}
📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)
Prepend NVIDIA copyright header (current year) to all source files
Files:
scripts/build_wheel.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Pre-commit Check
|
review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (4)
scripts/build_wheel.py (4)
284-317: Address previous review concerns and add error handlingThis implementation still has the issues identified in the previous review:
- Uses
sudowhich can be fragile in CI environments- Doesn't return a directory path for the caller to use
- Missing proper error handling for edge cases
Additionally, new issues:
- Line 298: The
filterandre.searchusage is inefficient- Line 315-316: The print statement spans multiple lines unnecessarily
Apply this refactored approach to address the previous review concerns:
-def create_cuda_stub_links(cuda_stub_dir: str, missing_libs: list[str]): +def create_cuda_stub_links(cuda_stub_dir: str, missing_libs: list[str]) -> Path: """ - Creates symbolic links for CUDA stub libraries in the provided directory. + Creates symbolic links for CUDA stub libraries, returning the effective directory. Args: cuda_stub_dir (str): Path to the directory containing CUDA stubs. - missing_libs: Versioned names of the missing libraries. - """ + missing_libs: Versioned names of the missing libraries. + + Returns: + Path: Directory containing the symlinks (may be original or fallback). + """ cuda_stub_path = Path(cuda_stub_dir) if not cuda_stub_path.exists(): raise RuntimeError( f"CUDA stub directory '{cuda_stub_dir}' does not exist.") version_pattern = r'\.\d+$' - for missing_lib in filter(lambda x: re.search(version_pattern, x), - missing_libs): - # Define `so` as the first part of `missing_lib` with trailing '.' and digits removed - so = cuda_stub_path / re.sub(version_pattern, '', missing_lib) - so_versioned = cuda_stub_path / missing_lib - - # Check if the library exists and the versioned link does not. - if so.exists() and not so_versioned.exists(): - try: - # Attempt to create the symbolic link. - so_versioned.symlink_to(so) - except PermissionError: - # Handle permission errors by attempting to use `sudo` to create the link. - try: - build_run(f"sudo ln -s {str(so)} {str(so_versioned)}") - except CalledProcessError as sudo_error: - print( - f"Failed to create symbolic link even with sudo: {sudo_error}" - ) + versioned_libs = [lib for lib in missing_libs if re.search(version_pattern, lib)] + + try: + # Try in-place links inside the stubs directory first. + for missing_lib in versioned_libs: + so = cuda_stub_path / re.sub(version_pattern, '', missing_lib) + so_versioned = cuda_stub_path / missing_lib + if so.exists() and not so_versioned.exists(): + so_versioned.symlink_to(so) + return cuda_stub_path + except PermissionError: + # Fall back to a per-user temp directory (no sudo). + import tempfile + link_dir = Path(tempfile.mkdtemp(prefix="tllm_cuda_stubs_")) + for missing_lib in versioned_libs: + so = cuda_stub_path / re.sub(version_pattern, '', missing_lib) + target = link_dir / missing_lib + if so.exists() and not target.exists(): + target.symlink_to(so) + return link_dir
369-397: Fix invalid string escaping and use sys.exitThe Windows stub generation has the same issues identified in previous reviews:
- Line 373:
exit(1)should besys.exit(1)- Line 391:
r\"{lib_dir}\"is invalid Python syntax- Line 393: Using
.format()with a multi-line string is error-proneApply this diff to fix the issues:
if binding_type == "nanobind": print("Windows not yet supported for nanobind stubs") - exit(1) + sys.exit(1) else: build_run(f"\"{venv_python}\" -m pip install pybind11-stubgen") stubgen = "stubgen.py" - stubgen_contents = """ + stubgen_contents = f""" # Loading torch, trt before bindings is required to avoid import errors on windows. # isort: off import torch import tensorrt as trt # isort: on import os import platform from pybind11_stubgen import main if __name__ == "__main__": # Load dlls from `libs` directory before launching bindings. if platform.system() == "Windows": - os.add_dll_directory(r\"{lib_dir}\") + os.add_dll_directory(r"{lib_dir}") main() - """.format(lib_dir=lib_dir) + """ (pkg_dir / stubgen).write_text(dedent(stubgen_contents)) - build_run(f"\"{venv_python}\" {stubgen} -o . bindings") + build_run(f"\"{venv_python}\" {stubgen} -o . bindings --exit-code") (pkg_dir / stubgen).unlink()
406-406: Fix type annotation and mutation safetyThe function signature uses
Sequence[str] = tuple()but Line 481 tries to mutate it withlist(extra_cmake_vars) + [...]. This is the exact issue identified in previous reviews.The current code on Line 481 already handles this correctly with
list(extra_cmake_vars) + [...], so the implementation is actually correct. The type annotation properly reflects that the function accepts an immutable sequence.
343-351: Update to use the returned path from create_cuda_stub_linksThe current implementation doesn't use the return value from
create_cuda_stub_links, which means it may setLD_LIBRARY_PATHto a directory where symlink creation failed.Apply this diff to use the returned path:
if missing_libs and Path(cuda_stub_dir).exists(): - # Create symbolic links for the CUDA stubs - create_cuda_stub_links(cuda_stub_dir, missing_libs) + # Create symbolic links for the CUDA stubs + effective_stub_dir = create_cuda_stub_links(cuda_stub_dir, missing_libs) ld_library_path = env_stub_gen.get("LD_LIBRARY_PATH") env_stub_gen[ - "LD_LIBRARY_PATH"] = f"{ld_library_path}:{cuda_stub_dir}" if ld_library_path else cuda_stub_dir + "LD_LIBRARY_PATH"] = f"{ld_library_path}:{effective_stub_dir}" if ld_library_path else str(effective_stub_dir)Note: This change depends on first implementing the return value fix in
create_cuda_stub_links.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
scripts/build_wheel.py(6 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)
**/*.py: Python code must target Python 3.8+
Python indentation: 4 spaces, no tabs
Maintain module namespace in imports (from package.subpackage import foo; then use foo.SomeClass())
Python file names use snake_case
Python class names use PascalCase
Python functions/methods and local variables use snake_case; variables starting with a number get k_ prefix (e.g., k_99th_percentile)
Global variables use G_ prefixed UPPER_SNAKE_CASE (e.g., G_MY_GLOBAL)
Constants use UPPER_SNAKE_CASE in Python
Avoid shadowing variables from outer scopes in Python
Initialize all externally visible members of a Python class in init
Prefer docstrings for interfaces used outside a file; comments for local code
Use Google-style docstrings for classes and functions (Sphinx-parsable)
Document attributes/variables inline with short docstrings
Avoid reflection when simple alternatives exist (e.g., prefer explicit parameters over dict(**locals()))
In try/except, catch the narrowest exceptions possible
For duck-typing with try/except, keep try body minimal and put logic in else
Files:
scripts/build_wheel.py
**/*.{cpp,cxx,cc,cu,h,hpp,hxx,hh,cuh,py}
📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)
Prepend NVIDIA copyright header (current year) to all source files
Files:
scripts/build_wheel.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Pre-commit Check
🔇 Additional comments (6)
scripts/build_wheel.py (6)
19-19: LGTM - Necessary import for regex pattern matchingThe
reimport is required for theversion_patternregex operations increate_cuda_stub_links.
31-31: LGTM - Better type annotationUsing
Sequenceinstead ofListfor theextra_cmake_varsparameter provides more flexibility while maintaining type safety.
124-125: LGTM - Type annotation improvementThe function signature now uses proper tuple type annotation instead of the previous
Tupleimport.
333-367: LGTM - Well-structured Linux stub generationThe Linux stub generation function is well-implemented with proper:
- Environment setup
- CUDA stub directory detection
- Conditional stub generation based on binding type
- Support for optional deep_ep module
The logic correctly handles both nanobind and pybind11 stub generation paths.
481-481: LGTM - Proper immutable sequence handlingThe code correctly converts the
Sequenceto a list before adding new elements, avoiding mutation of the original parameter. This addresses the potential issue with the immutabletuple()default.
825-830: LGTM - Clean function extractionThe platform-specific stub generation has been properly extracted into helper functions, improving code organization and maintainability. The function calls pass the correct parameters for each platform.
|
PR_Github #15151 [ run ] completed with state |
|
PR_Github #15441 [ run ] triggered by Bot |
38eca31 to
eb598a3
Compare
|
/bot kill |
|
PR_Github #15445 [ kill ] triggered by Bot |
|
PR_Github #15441 [ run ] completed with state |
|
PR_Github #15445 [ kill ] completed with state |
|
/bot run |
|
@coderabbitai full review |
✅ Actions performedFull review triggered. |
|
PR_Github #15447 [ run ] triggered by Bot |
|
PR_Github #15447 [ run ] completed with state |
This reverts commit 8845e0f. Signed-off-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>
Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com> Signed-off-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>
Signed-off-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>
Signed-off-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>
Signed-off-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>
fe1a1aa to
8777356
Compare
|
/bot skip --comment "Passed before DCO fix" |
|
PR_Github #15627 [ skip ] triggered by Bot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (4)
scripts/build_wheel.py (4)
285-329: Use the returned link directory and avoid filesystem operations without error handlingThe function
create_cuda_stub_linksreturns a link directory but the caller doesn't consistently use it. Also, the function could fail silently in some edge cases.Current issues:
- The caller at lines 361-364 uses
link_dirbut also includescuda_stub_dirin LD_LIBRARY_PATH- No validation that temporary directory creation succeeded
- Missing cleanup on partial failures
Apply this improvement:
- env_stub_gen["LD_LIBRARY_PATH"] = ":".join( - filter(None, [link_dir, cuda_stub_dir, ld_library_path])) + env_stub_gen["LD_LIBRARY_PATH"] = ":".join( + filter(None, [link_dir, ld_library_path]))The function should only add the
link_dir(where versioned symlinks exist) to LD_LIBRARY_PATH, not both directories.
332-344: Add error handling for ldd command failureThe
check_missing_libsfunction doesn't handle the case where the glob pattern matches no files or whenlddfails.Apply this fix to handle errors gracefully:
def check_missing_libs(so_prefix: str) -> list[str]: - result = build_run(f"ldd {so_prefix}.cpython*.so", - capture_output=True, - text=True) + import glob + so_files = glob.glob(f"{so_prefix}.cpython*.so") + if not so_files: + return [] + + missing = [] + for so_file in so_files: + try: + result = build_run(f"ldd {so_file}", + capture_output=True, + text=True) + except CalledProcessError: + continue - missing = [] - for line in result.stdout.splitlines(): - if "not found" in line: - lib_name = line.split()[ - 0] # Extract the library name before "=> not found" - if lib_name not in missing: - missing.append(lib_name) + for line in result.stdout.splitlines(): + if "not found" in line: + parts = line.split() + if parts: # Ensure line has content + lib_name = parts[0] + if lib_name not in missing: + missing.append(lib_name) return missing
388-416: Fix invalid raw string escaping and use sys.exitThe Windows stub generation has incorrect string escaping that will produce invalid Python code.
Apply this fix:
def generate_python_stubs_windows(binding_type: str, venv_python: Path, pkg_dir: Path, lib_dir: Path): if binding_type == "nanobind": print("Windows not yet supported for nanobind stubs") - exit(1) + sys.exit(1) else: build_run(f"\"{venv_python}\" -m pip install pybind11-stubgen") stubgen = "stubgen.py" - stubgen_contents = """ + stubgen_contents = f""" # Loading torch, trt before bindings is required to avoid import errors on windows. # isort: off import torch import tensorrt as trt # isort: on import os import platform from pybind11_stubgen import main if __name__ == "__main__": # Load dlls from `libs` directory before launching bindings. if platform.system() == "Windows": - os.add_dll_directory(r\"{lib_dir}\") + os.add_dll_directory(r"{lib_dir}") main() - """.format(lib_dir=lib_dir) + """ (pkg_dir / stubgen).write_text(dedent(stubgen_contents)) - build_run(f"\"{venv_python}\" {stubgen} -o . bindings") + build_run(f"\"{venv_python}\" {stubgen} -o . bindings --exit-code") (pkg_dir / stubgen).unlink()
425-426: Safely handle sequence mutation on WindowsThe function signature accepts
extra_cmake_vars: Sequence[str]which could be a tuple, but line 500 attempts to mutate it on Windows.Apply this fix to ensure safe mutation:
if on_windows: # Windows does not support multi-device currently. - extra_cmake_vars = list(extra_cmake_vars) + ["ENABLE_MULTI_DEVICE=0"] + extra_cmake_vars = list(extra_cmake_vars) + extra_cmake_vars.append("ENABLE_MULTI_DEVICE=0")
🧹 Nitpick comments (2)
scripts/build_wheel.py (2)
305-309: Simplify regex pattern extraction logicThe regex operations for extracting library names could be more robust and clearer.
Consider this improvement for better readability:
- version_pattern = r'\.\d+' - for missing_lib in filter(lambda x: re.search(version_pattern, x), - missing_libs): - # Define `so` as the first part of `missing_lib` with trailing '.' and digits removed - so = cuda_stub_path / re.sub(version_pattern, '', missing_lib) + # Match library names ending with version numbers (e.g., .1, .12) + version_pattern = re.compile(r'(lib\w+\.so)(\.\d+)+$') + for missing_lib in missing_libs: + match = version_pattern.match(missing_lib) + if not match: + continue + base_lib_name = match.group(1) + so = cuda_stub_path / base_lib_name
369-382: Add consistent exit-code handling for nanobind stubgenThe nanobind stubgen call doesn't use
--exit-codewhile pybind11_stubgen calls do, leading to inconsistent error reporting.For consistency with pybind11_stubgen calls, consider adding exit code handling:
if is_nanobind: - build_run(f"\"{venv_python}\" -m nanobind.stubgen -m bindings -O .", + build_run(f"\"{venv_python}\" -m nanobind.stubgen -m bindings -O . --exit-code", env=env_stub_gen)Note: This assumes nanobind.stubgen supports the
--exit-codeflag. If not supported, this suggestion can be ignored.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
cpp/tensorrt_llm/nanobind/CMakeLists.txt(2 hunks)cpp/tensorrt_llm/pybind/CMakeLists.txt(2 hunks)docker/Dockerfile.multi(1 hunks)scripts/build_wheel.py(6 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- cpp/tensorrt_llm/nanobind/CMakeLists.txt
- cpp/tensorrt_llm/pybind/CMakeLists.txt
- docker/Dockerfile.multi
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)
**/*.py: Python code must target Python 3.8+
Python indentation: 4 spaces, no tabs
Maintain module namespace in imports (from package.subpackage import foo; then use foo.SomeClass())
Python file names use snake_case
Python class names use PascalCase
Python functions/methods and local variables use snake_case; variables starting with a number get k_ prefix (e.g., k_99th_percentile)
Global variables use G_ prefixed UPPER_SNAKE_CASE (e.g., G_MY_GLOBAL)
Constants use UPPER_SNAKE_CASE in Python
Avoid shadowing variables from outer scopes in Python
Initialize all externally visible members of a Python class in init
Prefer docstrings for interfaces used outside a file; comments for local code
Use Google-style docstrings for classes and functions (Sphinx-parsable)
Document attributes/variables inline with short docstrings
Avoid reflection when simple alternatives exist (e.g., prefer explicit parameters over dict(**locals()))
In try/except, catch the narrowest exceptions possible
For duck-typing with try/except, keep try body minimal and put logic in else
Files:
scripts/build_wheel.py
**/*.{cpp,cxx,cc,cu,h,hpp,hxx,hh,cuh,py}
📄 CodeRabbit Inference Engine (CODING_GUIDELINES.md)
Prepend NVIDIA copyright header (current year) to all source files
Files:
scripts/build_wheel.py
|
PR_Github #15627 [ skip ] completed with state |
Summary by CodeRabbit
Refactor
Chores