[Feature] Mamba SSM state supports selecting the data type (fp32 or bf16) via environment variables by byjiang1996 · Pull Request #10258 · sgl-project/sglang · GitHub

byjiang1996
Contributor

Motivation

Closes https://github.com/Qwen-SGLang/sglang-qwen3.5/issues/55

Modifications

from sglang.srt.entrypoints.http_server import launch_server
from sglang.srt.server_args import prepare_server_args

if __name__ == "__main__":
    # Simulate CLI arguments (excluding the script name)
    args = [
        "--model-path",
        "/shared/public/sharing/sglanglearning/Qwen-SGlang/Qwen3-Next-80B-A3B-Instruct",
        "--tp",
        "8",
        "--attention-backend",
        "hybrid_linear_attn",
        "--trust-remote-code",
        # Data type for the Mamba SSM state (float32 or bfloat16)
        "--mamba-ssm-dtype",
        "bfloat16",
    ]
    server_args = prepare_server_args(args)
    launch_server(server_args)
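
To compare the two SSM state data types, only the value passed to `--mamba-ssm-dtype` needs to change. The sketch below builds the same argument list as the script above for either dtype. `build_launch_args` and `ALLOWED_SSM_DTYPES` are hypothetical names introduced for illustration and are not part of sglang; the allowed values are assumed from the PR title and the accuracy table, and the model path is a placeholder.

```python
# Sketch only: build_launch_args is a hypothetical helper, not an sglang API.
# The two allowed values are assumed from the PR title (fp32 or bf16).
ALLOWED_SSM_DTYPES = ("float32", "bfloat16")


def build_launch_args(model_path: str, ssm_dtype: str, tp: int = 8) -> list[str]:
    """Return CLI-style arguments mirroring the launch script in this PR."""
    if ssm_dtype not in ALLOWED_SSM_DTYPES:
        raise ValueError(
            f"unsupported SSM dtype {ssm_dtype!r}; expected one of {ALLOWED_SSM_DTYPES}"
        )
    return [
        "--model-path", model_path,
        "--tp", str(tp),
        "--attention-backend", "hybrid_linear_attn",
        "--trust-remote-code",
        "--mamba-ssm-dtype", ssm_dtype,
    ]


if __name__ == "__main__":
    # "/path/to/model" is a placeholder, not a real checkpoint location.
    for dtype in ALLOWED_SSM_DTYPES:
        print(build_launch_args("/path/to/model", dtype))
```

The returned list can be handed to `prepare_server_args` exactly as the script above does with its hard-coded `args`.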

Accuracy Tests

GSM8k:

| Setup | Accuracy |
|---|---|
| TP8; float32 | 0.955 |
| TP8; bfloat16 | 0.945 |
| TP8; MTP; float32 | 0.945 |
| TP8; MTP; bfloat16 | 0.950 |
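
As a rough reading aid for the GSM8k numbers, the sketch below computes the accuracy change when switching from float32 to bfloat16 in each setup, plus the binomial standard error of a single accuracy figure. The PR does not state how many questions were evaluated, so `num_questions` is left as a parameter, and the value 200 in the usage lines is purely a hypothetical placeholder. The helper names are illustrative.

```python
import math

# Accuracy figures copied from the GSM8k results in this PR.
results = {
    ("TP8", "float32"): 0.955,
    ("TP8", "bfloat16"): 0.945,
    ("TP8; MTP", "float32"): 0.945,
    ("TP8; MTP", "bfloat16"): 0.950,
}


def dtype_delta(setup: str) -> float:
    """Accuracy with bfloat16 minus accuracy with float32 for one setup."""
    return round(results[(setup, "bfloat16")] - results[(setup, "float32")], 3)


def binomial_std_error(accuracy: float, num_questions: int) -> float:
    """Standard error of an accuracy measured on num_questions independent items."""
    return math.sqrt(accuracy * (1.0 - accuracy) / num_questions)


if __name__ == "__main__":
    for setup in ("TP8", "TP8; MTP"):
        print(setup, dtype_delta(setup))
    # 200 is a hypothetical sample size; the PR does not report the real one.
    print(binomial_std_error(0.955, 200))
```

Comparing each delta against the standard error for the actual evaluation size gives a sense of whether a difference between the two dtypes is larger than sampling noise.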

Benchmarking and Profiling

Checklist


zhyncs merged commit dfa3853 into sgl-project:qwen3_next on Sep 10, 2025
yizhang2077 pushed a commit that referenced this pull request Sep 11, 2025