Fix Qwen3-Next MTP Loading after model update by hebiao064 · Pull Request #10255 · sgl-project/sglang · GitHub

Conversation

@hebiao064
Collaborator

@hebiao064 hebiao064 commented Sep 10, 2025

Motivation


from sglang.srt.entrypoints.http_server import launch_server
from sglang.srt.server_args import prepare_server_args

if __name__ == "__main__":
    # Simulate CLI arguments (excluding the script name)
    args = [
        "--model-path",
        "Qwen-SGlang/Qwen3-Next-80B-A3B-Instruct",
        "--tp",
        "8",
        "--speculative-algo",
        "NEXTN",
        "--speculative-num-steps",
        "3",
        "--speculative-eagle-topk",
        "1",
        "--speculative-num-draft-tokens",
        "4",
        "--attention-backend",
        "hybrid_linear_attn",
        "--trust-remote-code"
    ]
    server_args = prepare_server_args(args)
    launch_server(server_args)

Modifications

Accuracy Tests

Benchmarking and Profiling

Checklist


@hebiao064 hebiao064 merged commit bc3de03 into qwen3_next Sep 10, 2025
1 check failed
@hebiao064 hebiao064 deleted the bhe/fix_mtp_loading branch September 10, 2025 05:46
