[None][fix] Fix mm_placeholder_counts extraction issue. #7118
Conversation
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
📝 Walkthrough

Adjusts apply_chat_template in tensorrt_llm/inputs/utils.py to pass a list containing mm_placeholder_counts to handle_placeholder_exceptions, enabling proper per-message placeholder handling for models in PLACEHOLDER_EXCEPTIONS (specifically the llava_next path). No public APIs changed.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
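For context, a minimal runnable sketch of the change; the stand-in function below is simplified and the real handle_placeholder_exceptions signature may differ:

```python
# Simplified, hypothetical stand-ins for the real code in
# tensorrt_llm/inputs/utils.py; only the argument shape matters here.
PLACEHOLDER_EXCEPTIONS = {"llava_next"}

def handle_placeholder_exceptions(model_type, conversation, mm_placeholder_counts):
    # Expects list[dict[str, int]]: one counts dict per message.
    assert model_type in PLACEHOLDER_EXCEPTIONS
    for conv, counts in zip(conversation, mm_placeholder_counts):
        conv["content"] = [{"type": "text", "text": conv["content"]},
                           *[{"type": "image"} for _ in range(counts["<image>"])]]
    return conversation

conversation = [{"role": "user", "content": "Describe the image."}]
mm_placeholder_counts = {"<image>": 1}

# The fix: pass [mm_placeholder_counts] rather than the bare dict, so zip()
# pairs each message with its counts dict instead of with a dict key.
result = handle_placeholder_exceptions("llava_next", conversation,
                                       [mm_placeholder_counts])
print(result[0]["content"])
# [{'type': 'text', 'text': 'Describe the image.'}, {'type': 'image'}]
```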
/bot run --disable-fail-fast

PR_Github #16026 [ run ] triggered by Bot
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tensorrt_llm/inputs/utils.py (1)
1-2: Add SPDX-style NVIDIA header to Python source
File: tensorrt_llm/inputs/utils.py

Prepend the canonical two-line SPDX header (as used by other Python modules such as tensorrt_llm/commands/eval.py) at the very top, before any imports:

```diff
+ # SPDX-FileCopyrightText: Copyright (c) 2025 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+ # SPDX-License-Identifier: Apache-2.0
  import asyncio
  import base64
```

This aligns with the repository's coding guidelines and the headers already present in Python files under tensorrt_llm/commands/ and tensorrt_llm/_torch/modules/.
🧹 Nitpick comments (2)
tensorrt_llm/inputs/utils.py (2)
393-405: Add unit tests for llava_next placeholder reconstruction (0/1/N images).

Recommend tests that assert:
- Content shape is [{"type": "text", ...}, {"type": "image"}, ...] with the correct image count.
- No KeyError when there are zero images.
- Works when the placeholder is <image> or <|image|> (parametrize via registry stub/mocking).

I can draft pytest cases targeting handle_placeholder_exceptions and the default_multimodal_input_loader → apply_chat_template path. Want me to open a test PR? A sketch of such tests follows.
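A hedged sketch of those tests, assuming handle_placeholder_exceptions is importable from tensorrt_llm.inputs.utils with the signature shown in the diff below and mutates the conversation in place; verify both before use:

```python
import pytest

# Assumed import path and call signature, mirroring the diff in this review.
from tensorrt_llm.inputs.utils import handle_placeholder_exceptions

@pytest.mark.parametrize("image_count", [0, 1, 3])
def test_llava_next_placeholder_reconstruction(image_count):
    conversation = [{"role": "user", "content": "hello"}]
    counts = [{"<image>": image_count}]

    handle_placeholder_exceptions("llava_next", conversation, counts)

    content = conversation[0]["content"]
    # Text chunk first, then one {"type": "image"} entry per image.
    assert content[0] == {"type": "text", "text": "hello"}
    assert content[1:] == [{"type": "image"}] * image_count
```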
396-404: Use the registry-provided placeholder and safe .get access.

To avoid hard-coding "<image>" (and risking a KeyError if it ever changes or is missing), pull the placeholder string from the multimodal registry and fall back to 0 when absent. Also replace the backslash continuation with parentheses:
```diff
-    if model_type == "llava_next":
-        # we need to convert the flattened content back to conversation format
-        for conv, mm_placeholder_count in zip(conversation,
-                                              mm_placeholder_counts):
-            conv["content"] = [{"type": "text", "text": conv["content"]}, \
-                *[{"type": "image"} for _ in range(mm_placeholder_count['<image>'])]]
+    if model_type == "llava_next":
+        # convert the flattened content back to conversation format
+        for conv, mm_placeholder_count in zip(conversation, mm_placeholder_counts):
+            # lookup the exact image placeholder (e.g. "<image>") and default to 0
+            placeholder = MULTIMODAL_PLACEHOLDER_REGISTRY.get_placeholder(model_type, "image")
+            image_count = mm_placeholder_count.get(placeholder, 0)
+            conv["content"] = (
+                [{"type": "text", "text": conv["content"]}]
+                + [{"type": "image"} for _ in range(image_count)]
+            )
```
This keeps the code in sync with whatever placeholder the registry defines for llava_next and avoids a KeyError when counts are empty.
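A tiny illustration of the safe-access behavior the suggestion relies on (hypothetical values):

```python
# Hypothetical per-message counts where no image placeholder was parsed.
counts = {}

# counts["<image>"] would raise KeyError; the suggested fallback does not.
print(counts.get("<image>", 0))  # prints 0
```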
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tensorrt_llm/inputs/utils.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
**/*.py: Python code must target Python 3.8+
Python indentation: 4 spaces, no tabs
Maintain module namespace in imports (from package.subpackage import foo; then use foo.SomeClass())
Python file names use snake_case
Python class names use PascalCase
Python functions/methods and local variables use snake_case; variables starting with a number get k_ prefix (e.g., k_99th_percentile)
Global variables use G_ prefixed UPPER_SNAKE_CASE (e.g., G_MY_GLOBAL)
Constants use UPPER_SNAKE_CASE in Python
Avoid shadowing variables from outer scopes in Python
Initialize all externally visible members of a Python class in __init__
Prefer docstrings for interfaces used outside a file; comments for local code
Use Google-style docstrings for classes and functions (Sphinx-parsable)
Document attributes/variables inline with short docstrings
Avoid reflection when simple alternatives exist (e.g., prefer explicit parameters over dict(**locals()))
In try/except, catch the narrowest exceptions possible
For duck-typing with try/except, keep try body minimal and put logic in else
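A short illustration of the two try/except guidelines, with a hypothetical helper:

```python
# Hypothetical helper: catch only the narrow duck-typing exceptions, keep
# the try body minimal, and put the follow-on logic in the else clause.
def first_as_upper(seq):
    try:
        item = seq[0]                # only the guarded access goes in try
    except (TypeError, IndexError):  # narrowest exceptions that can occur
        return None
    else:
        return str(item).upper()     # real logic lives outside the try body

print(first_as_upper(["abc"]))  # 'ABC'
print(first_as_upper(None))     # None
```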
Files:
tensorrt_llm/inputs/utils.py
**/*.{cpp,cxx,cc,cu,h,hpp,hxx,hh,cuh,py}
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
Prepend NVIDIA copyright header (current year) to all source files
Files:
tensorrt_llm/inputs/utils.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Pre-commit Check
🔇 Additional comments (2)
tensorrt_llm/inputs/utils.py (2)
433-437: Correct arg shape fix for placeholder exceptions (llava_next) — good catch.

Wrapping mm_placeholder_counts in a list aligns with handle_placeholder_exceptions(..., mm_placeholder_counts: list[dict[str, int]]) and prevents zipping a dict (which iterates its keys) with conversation. This unblocks correct per-message handling for the llava_next path.
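For reference, a minimal snippet (not from the repository) showing the failure mode this prevents:

```python
conversation = [{"role": "user", "content": "hi"}]
mm_placeholder_counts = {"<image>": 2}

# Bare dict: zip() iterates the dict's keys, pairing the message
# with the placeholder *string* instead of its counts.
for conv, item in zip(conversation, mm_placeholder_counts):
    print(type(item), item)   # <class 'str'> <image>

# Wrapped in a list: each message pairs with its counts dict.
for conv, item in zip(conversation, [mm_placeholder_counts]):
    print(type(item), item)   # <class 'dict'> {'<image>': 2}
```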
433-437: It looks like we're unable to locate and inspect parse_chat_messages_coroutines in tensorrt_llm/serve/chat_utils.py. Without confirming how mm_placeholder_counts is structured (dict vs. list) after parsing, we can't be certain our padding/length-guard logic will cover all scenarios or might even break existing behavior. Please verify manually that:

- parse_chat_messages_coroutines indeed returns mm_placeholder_counts as a single dict (not a list) for PLACEHOLDER_EXCEPTIONS models.
- Multi-message conversations are never passed to apply_chat_template when model_type is in PLACEHOLDER_EXCEPTIONS (currently only "llava_next").
- If lists are possible, adapt the guard accordingly to avoid wrapping an already-list in another list.

Once confirmed, you can safely apply the suggested non-breaking padding guard; a sketch of such a guard follows.
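Purely as an illustration of that last point, such a guard might look like this (hypothetical helper, not the actual patch):

```python
# Hypothetical guard, sketched from the review's wording: wrap a bare
# dict, but pass an existing list through so nothing is double-wrapped.
def normalize_counts(mm_placeholder_counts):
    if isinstance(mm_placeholder_counts, dict):
        return [mm_placeholder_counts]
    return mm_placeholder_counts

print(normalize_counts({"<image>": 1}))    # [{'<image>': 1}]
print(normalize_counts([{"<image>": 1}]))  # unchanged
```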
PR_Github #16026 [ run ] completed with state

/bot run

PR_Github #16089 [ run ] triggered by Bot

PR_Github #16089 [ run ] completed with state