-
Notifications
You must be signed in to change notification settings - Fork 1.8k
[https://nvbugs/5547434][fix] Fix Qwen2.5-VL device_path error #8057
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[https://nvbugs/5547434][fix] Fix Qwen2.5-VL device_path error #8057
Conversation
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
|
/bot run |
|
PR_Github #20241 [ run ] triggered by Bot |
📝 WalkthroughWalkthroughRemoved mrope-related entries from multimodal_data_device_paths in Qwen2VL and Qwen2_5_VL model implementations, returning only "multimodal_embedding" for device-bound multimodal data. No changes to method signatures, load_weights, or other control flow. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests
Tip 👮 Agentic pre-merge checks are now available in preview!Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.
Please see the documentation for more information. Example: reviews:
pre_merge_checks:
custom_checks:
- name: "Undocumented Breaking Changes"
mode: "warning"
instructions: |
Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).Please share your feedback with us on this Discord post. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
tensorrt_llm/_torch/models/modeling_qwen2vl.py (2)
861-877: Make host→device move explicit before assignment.Today you rely on implicit cross-device copy when assigning CPU mrope_position_ids into a CUDA buffer. Make it explicit to avoid surprises and enable non_blocking copies.
Apply this minimal change:
@@ - mrope_position_ids = multimodal_param.multimodal_data[ - 'mrope_config']['mrope_position_ids'] + mrope_position_ids = multimodal_param.multimodal_data[ + 'mrope_config']['mrope_position_ids'] + # Explicit H2D to match buffer device and allow async copy + mrope_position_ids = mrope_position_ids.to('cuda', non_blocking=True) @@ - self.mrope_position_ids_padding_cuda[:, :, : - mrope_position_ids. - shape[ - -1]] = mrope_position_ids + self.mrope_position_ids_padding_cuda[:, :, : mrope_position_ids.shape[-1]] = mrope_position_ids
280-282: Python 3.8+ typing compatibility.Use typing.Dict/Any (or add from future import annotations) and avoid lowercase any.
As per coding guidelines
- def _preprocess(self, text: dict[str, any], mm_data: dict[str, any], + def _preprocess(self, text: Dict[str, Any], mm_data: Dict[str, Any], mm_processor_kwargs: Dict[str, Any]): @@ - second_per_grid_ts: torch.Tensor = None) -> dict[str, torch.Tensor]: + second_per_grid_ts: torch.Tensor = None) -> Dict[str, torch.Tensor]:Also applies to: 306-313
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tensorrt_llm/_torch/models/modeling_qwen2vl.py(2 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{h,hpp,hh,hxx,cpp,cxx,cc,cu,cuh,py}
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
Use only spaces, no tabs; indent with 4 spaces.
Files:
tensorrt_llm/_torch/models/modeling_qwen2vl.py
**/*.py
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
**/*.py: Python code must target Python 3.8+.
Indent Python code with 4 spaces; do not use tabs.
Maintain module namespace when importing; prefer 'from package.subpackage import foo' then 'foo.SomeClass()' instead of importing the class directly.
Python filenames should be snake_case (e.g., some_file.py).
Python classes use PascalCase names.
Functions and methods use snake_case names.
Local variables use snake_case; prefix 'k' for variables that start with a number (e.g., k_99th_percentile).
Global variables use upper SNAKE_CASE prefixed with 'G' (e.g., G_MY_GLOBAL).
Constants use upper SNAKE_CASE (e.g., MY_CONSTANT).
Avoid shadowing variables from an outer scope.
Initialize all externally visible members of a class in the constructor.
Prefer docstrings for interfaces that may be used outside a file; comments for in-function or file-local interfaces.
Use Google-style docstrings for classes and functions (Sphinx-parsable).
Document attributes and variables inline so they render under the class/function docstring.
Avoid reflection when a simpler, explicit approach suffices (e.g., avoid dict(**locals()) patterns).
In try/except, catch the most specific exceptions possible.
For duck-typing try/except, keep the try body minimal and use else for the main logic.
Files:
tensorrt_llm/_torch/models/modeling_qwen2vl.py
**/*.{cpp,cxx,cc,h,hpp,hh,hxx,cu,cuh,py}
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
Prepend the NVIDIA Apache-2.0 copyright header with current year to the top of all source files (e.g., .cpp, .h, .cu, .py).
Files:
tensorrt_llm/_torch/models/modeling_qwen2vl.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Pre-commit Check
🔇 Additional comments (3)
tensorrt_llm/_torch/models/modeling_qwen2vl.py (3)
1031-1037: SM≥100: device paths look right.Keeping pixel values and grid_thw on device and dropping mrope paths is consistent with HF vision path. No issues spotted.
Please sanity-check image-only, video-only, and mixed prompts on SM 100+ to ensure there are no KeyErrors from the device mover.
1038-1041: SM<100: device paths simplified; OK.For the TRT-LLM vision path, moving only pixel values (plus embeddings) and excluding mrope is appropriate.
Validate text-only prompts still work (mrope stays CPU, embeddings absent) and that video/image flows don’t regress.
977-983: Approve removal of mrope entries from multimodal_data_device_paths
mrope tensors are transferred via dedicated logic (e.g., in model_engine and functional layers), so they no longer need to be included in the generic device paths.
|
PR_Github #20241 [ run ] completed with state |
|
/bot help |
GitHub Bot Help
Provide a user friendly way for developers to interact with a Jenkins server. Run See details below for each supported subcommand.
Launch build/test pipelines. All previously running jobs will be killed.
kill
Kill all running builds associated with pull request. skip
Skip testing for latest commit on pull request. reuse-pipeline
Reuse a previous pipeline to validate current commit. This action will also kill all currently running builds associated with the pull request. IMPORTANT NOTE: This is dangerous since lack of user care and validation can cause top of tree to break. |
…A#8057) Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
…A#8057) Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com> Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
…A#8057) Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
…A#8057) Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com> Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
…A#8057) Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com> Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
…A#8057) Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com> Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
…A#8057) Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com> Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
…A#8057) Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com> Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com> Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
…A#8057) Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com> Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
…A#8057) Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com> Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Summary by CodeRabbit