feat(alibaba): Added alibaba vision model and omni model support #6292
@Little-LittleProgrammer is attempting to deploy a commit to the NextChat Team on Vercel. A member of the Team first needs to authorize it.
Walkthrough
The pull request introduces support for multimodal content handling for Alibaba. A new interface, …
Sequence Diagram(s)
sequenceDiagram
participant Client
participant QwenApi
participant Utils
participant AlibabaService
Client->>QwenApi: chat(options)
alt Vision Model Check
QwenApi->>Utils: preProcessImageContentForAlibabaDashScope(content)
Utils-->>QwenApi: processed content (array of multimodal items)
else Non-Vision Model
QwenApi-->>QwenApi: Process messages using standard mapping
end
QwenApi->>AlibabaService: Request via dynamic ChatPath (based on model)
AlibabaService-->>QwenApi: Response with content array
QwenApi-->>Client: Return joined text from content array
Actionable comments posted: 0
🧹 Nitpick comments (3)
app/utils/chat.ts (1)
95-115: LGTM! Well-implemented image content preprocessing for Alibaba. The function correctly handles image content conversion and error cases.
Consider refactoring to reduce code duplication with preProcessImageContent. Here's a suggested approach:

+async function preProcessImageContentBase(
+  content: RequestMessage["content"],
+  transformImageUrl: (url: string) => Promise<{[key: string]: string}>,
+) {
+  if (typeof content === "string") {
+    return content;
+  }
+  const result = [];
+  for (const part of content) {
+    if (part?.type == "image_url" && part?.image_url?.url) {
+      try {
+        const url = await cacheImageToBase64Image(part?.image_url?.url);
+        result.push(await transformImageUrl(url));
+      } catch (error) {
+        console.error("Error processing image URL:", error);
+      }
+    } else {
+      result.push({ ...part });
+    }
+  }
+  return result;
+}

-export async function preProcessImageContent(content: RequestMessage["content"]) {
+export async function preProcessImageContent(content: RequestMessage["content"]) {
+  return preProcessImageContentBase(content, async (url) => ({
+    type: "image_url",
+    image_url: { url },
+  }));
+}

-export async function preProcessImageContentForAlibabaDashScope(content: RequestMessage["content"]) {
+export async function preProcessImageContentForAlibabaDashScope(content: RequestMessage["content"]) {
+  return preProcessImageContentBase(content, async (url) => ({
+    image: url,
+  }));
+}

app/client/platforms/alibaba.ts (2)
107-119: Consider refactoring nested ternaries and type safety.
The nested ternary, plus the as any cast, can reduce readability and forfeit type guarantees. A clearer flow or explicit type could improve maintainability.
227-229: Review the comma delimiter usage.
Joining multiple text items with commas may be confusing if any item contains commas. Also confirm each array item has a text field.
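The delimiter concern above can be illustrated with a small sketch. The TextItem type and the joinText helper are hypothetical names introduced here for illustration; they are not taken from the PR. The sketch skips items without a text field and makes the delimiter an explicit choice.

```typescript
// Hypothetical item shape: text may be missing on some items.
type TextItem = { text?: string };

// Join only the items that actually carry a string `text` field.
// The delimiter defaults to "" so that commas inside the text stay unambiguous.
function joinText(items: TextItem[], delimiter: string = ""): string {
  return items
    .map((item) => item.text)
    .filter((t): t is string => typeof t === "string")
    .join(delimiter);
}

// With a comma delimiter, a comma inside an item is indistinguishable
// from a separator between items.
const ambiguous = joinText([{ text: "a, b" }, { text: "c" }], ",");

// With the default delimiter, items without text are simply skipped.
const clean = joinText([{ text: "Hello" }, {}, { text: " world" }]);
```

Whether an empty string, a newline, or something else is the right delimiter depends on how the response is split into items, which the review text does not specify.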
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
app/client/api.ts (1 hunks)
app/client/platforms/alibaba.ts (5 hunks)
app/constant.ts (2 hunks)
app/utils/chat.ts (1 hunks)
🔇 Additional comments (8)
app/client/api.ts (1)
43-46: LGTM! Well-structured interface for Alibaba's multimodal content. The interface is properly designed with optional properties and follows TypeScript best practices.
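For readers without the diff open, an interface of the kind described might look like the sketch below. The name MultimodalContentForAlibaba appears elsewhere in this review, but the field names text and image are assumptions here, and describeItem is a hypothetical helper added only to show how optional properties are checked.

```typescript
// Assumed shape: both properties are optional, so an item may carry
// text, an image reference, both, or neither.
interface MultimodalContentForAlibaba {
  text?: string;
  image?: string;
}

// Hypothetical helper: classify an item by which optional field is set.
function describeItem(item: MultimodalContentForAlibaba): string {
  if (item.image !== undefined) return "image";
  if (item.text !== undefined) return "text";
  return "empty";
}
```

Because every property is optional, callers need to handle the empty case explicitly rather than assume one field is always present.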
app/constant.ts (2)
224-229: LGTM! Well-implemented dynamic path selection based on model type. The function correctly determines the appropriate endpoint based on whether the model is vision-enabled ("vl") or omni-enabled.
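A minimal sketch of this kind of path selection follows, assuming the check is a substring match on the model name. The function name chatPath and both path constants are illustrative assumptions, not values confirmed by the PR text.

```typescript
// Assumed endpoint paths, for illustration only.
const TEXT_PATH = "v1/services/aigc/text-generation/generation";
const MULTIMODAL_PATH = "v1/services/aigc/multimodal-generation/generation";

// Pick the multimodal endpoint when the model name marks it as
// vision ("vl") or omni; otherwise use the text endpoint.
function chatPath(modelName: string): string {
  const isMultimodal = modelName.includes("vl") || modelName.includes("omni");
  return isMultimodal ? MULTIMODAL_PATH : TEXT_PATH;
}
```

One caveat with substring matching is that any future model whose name happens to contain "vl" or "omni" would also be routed to the multimodal endpoint, so an explicit allow-list may be safer if model names become less regular.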
576-578: LGTM! Added new model entries for vision and omni capabilities. The new entries qwen-omni-turbo, qwen-vl-plus, and qwen-vl-max align with the multimodal support being added.
app/client/platforms/alibaba.ts (5)
10-13: All good on the new imports.
No issues found; the import statements align well with the introduced functionalities.
28-28: Import looks fine.
Ensure there are no conflicts or duplicates with similar utility functions named isVisionModel.
105-106: Validate the model selection logic.
The approach is straightforward, but confirm that isVisionModel(options.config.model) covers all possible model variants.
144-144: Double-check dynamic path resolution.
Confirm that Alibaba.ChatPath(modelConfig.model) yields valid endpoints for both vision and non-vision models.
177-177: Handle empty or invalid multimodal content arrays.
Ensure the new union type (string | null | MultimodalContentForAlibaba[]) is safely processed when array elements are missing or invalid.
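One way to process that union defensively is sketched below. The extractText helper and the MessageContent alias are hypothetical, and the text and image fields on MultimodalContentForAlibaba are assumed; the sketch narrows each branch of the union and ignores array elements that are missing or malformed.

```typescript
// Assumed item shape; field names are not confirmed by the PR text.
interface MultimodalContentForAlibaba {
  text?: string;
  image?: string;
}

type MessageContent = string | null | MultimodalContentForAlibaba[];

// Narrow the union step by step and fall back to "" for anything unusable.
function extractText(content: MessageContent | undefined): string {
  if (content == null) return ""; // covers both null and undefined
  if (typeof content === "string") return content;
  if (!Array.isArray(content)) return "";
  return content
    // Runtime guard: responses may contain null or non-object entries.
    .filter(
      (item): item is MultimodalContentForAlibaba =>
        item != null && typeof item === "object",
    )
    .map((item) => (typeof item.text === "string" ? item.text : ""))
    .join("");
}
```

The runtime checks matter because the static type only describes what the response is expected to contain, not what the service actually returns.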
💻 Change Type
🔀 Description of Change
[b709ee3] -- Added support for Alibaba image understanding (vl) and omni-modal (omni) models
📝 Additional Information