Qwen/Qwen3-Next-80B-A3B-Thinking
# TP 2
python3 -m sglang.launch_server --model Qwen/Qwen3-Next-80B-A3B-Thinking --tp 2 --reasoning-parser deepseek-r1
# TP 4
python3 -m sglang.launch_server --model Qwen/Qwen3-Next-80B-A3B-Thinking --tp 4 --reasoning-parser deepseek-r1
# TP 8
python3 -m sglang.launch_server --model Qwen/Qwen3-Next-80B-A3B-Thinking --tp 8 --reasoning-parser deepseek-r1
# TP 4 DP 4
python3 -m sglang.launch_server --model Qwen/Qwen3-Next-80B-A3B-Thinking --tp 4 --dp 4 --enable-dp-attention --reasoning-parser deepseek-r1
# TP 4 DP 4 EP 4
python3 -m sglang.launch_server --model Qwen/Qwen3-Next-80B-A3B-Thinking --tp 4 --dp 4 --enable-dp-attention --enable-ep-moe --reasoning-parser deepseek-r1
# TP 4 + NEXTN
python3 -m sglang.launch_server --model Qwen/Qwen3-Next-80B-A3B-Thinking --tp 4 --speculative-num-steps 3 --speculative-eagle-topk 1 --speculative-num-draft-tokens 4 --speculative-algo NEXTN --reasoning-parser deepseek-r1
# TP 4 DP 4 + NEXTN
python3 -m sglang.launch_server --model Qwen/Qwen3-Next-80B-A3B-Thinking --tp 4 --dp 4 --enable-dp-attention --speculative-num-steps 3 --speculative-eagle-topk 1 --speculative-num-draft-tokens 4 --speculative-algo NEXTN --reasoning-parser deepseek-r1
Qwen/Qwen3-Next-80B-A3B-Instruct
# TP 2
python3 -m sglang.launch_server --model Qwen/Qwen3-Next-80B-A3B-Instruct --tp 2
# TP 4
python3 -m sglang.launch_server --model Qwen/Qwen3-Next-80B-A3B-Instruct --tp 4
# TP 8
python3 -m sglang.launch_server --model Qwen/Qwen3-Next-80B-A3B-Instruct --tp 8
# TP 4 DP 4
python3 -m sglang.launch_server --model Qwen/Qwen3-Next-80B-A3B-Instruct --tp 4 --dp 4 --enable-dp-attention
# TP 4 DP 4 EP 4
python3 -m sglang.launch_server --model Qwen/Qwen3-Next-80B-A3B-Instruct --tp 4 --dp 4 --enable-dp-attention --enable-ep-moe
# TP 4 + NEXTN
python3 -m sglang.launch_server --model Qwen/Qwen3-Next-80B-A3B-Instruct --tp 4 --speculative-num-steps 3 --speculative-eagle-topk 1 --speculative-num-draft-tokens 4 --speculative-algo NEXTN
# TP 4 DP 4 + NEXTN
python3 -m sglang.launch_server --model Qwen/Qwen3-Next-80B-A3B-Instruct --tp 4 --dp 4 --enable-dp-attention --speculative-num-steps 3 --speculative-eagle-topk 1 --speculative-num-draft-tokens 4 --speculative-algo NEXTN
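The TP and TP + DP commands in this list differ only in the model variant, the `--tp`/`--dp` values, and the reasoning parser used for the Thinking model. As a rough sketch, they could be generated from a small helper instead of being copied by hand. `build_launch_cmd` is a hypothetical name and not part of SGLang; it only prints the command (a dry run) and does not cover the EP or NEXTN variants.

```shell
#!/usr/bin/env bash
# Sketch only: build_launch_cmd is a hypothetical helper, not an SGLang tool.
# Usage: build_launch_cmd <Thinking|Instruct> <tp> [dp]
build_launch_cmd() {
  local variant="$1" tp="$2" dp="${3:-1}"
  local cmd="python3 -m sglang.launch_server --model Qwen/Qwen3-Next-80B-A3B-${variant} --tp ${tp}"

  # Add data-parallel attention flags only when a DP size above 1 is requested.
  if [ "$dp" -gt 1 ]; then
    cmd="${cmd} --dp ${dp} --enable-dp-attention"
  fi

  # The Thinking commands in this list pass the deepseek-r1 reasoning parser.
  if [ "$variant" = "Thinking" ]; then
    cmd="${cmd} --reasoning-parser deepseek-r1"
  fi

  # Print the command instead of running it, so it can be inspected first.
  printf '%s\n' "$cmd"
}
```

For example, `build_launch_cmd Thinking 4 4` prints the TP 4 DP 4 command for the Thinking model, and `build_launch_cmd Instruct 2` prints the TP 2 command for the Instruct model.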