Adding Support for Qwen3-Next by bozheng-hit · Pull Request #40771 · huggingface/transformers

Conversation

@bozheng-hit
Contributor

Adding Support for Qwen3-Next

This PR adds support for the upcoming Qwen3-Next models. For information about Qwen, please visit:
👉 https://github.com/QwenLM/Qwen3

Special thanks to @Cyrilvallez and @ArthurZucker for their valuable feedback and thorough review of this PR!
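
For reference, once checkpoints are published, loading should go through the standard transformers auto classes. A minimal sketch; the checkpoint id below is illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative checkpoint id; substitute whichever Qwen3-Next checkpoint is released.
model_id = "Qwen/Qwen3-Next-80B-A3B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",  # the PR adds lazy cache init to allow multi-GPU inference
)

inputs = tokenizer("Give me a short introduction to Qwen3-Next.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```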

@Cyrilvallez
Member

run-slow: qwen3_next

@github-actions
Contributor

github-actions bot commented Sep 9, 2025

This comment contains run-slow, running the specified jobs:

models: ['models/qwen3_next']
quantizations: [] ...


@github-actions
Contributor

github-actions bot commented Sep 9, 2025

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, qwen3_next
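
(For reference, `run-slow` triggers the slow test suite for the listed models on CI; locally the equivalent is typically `RUN_SLOW=1 python -m pytest tests/models/qwen3_next`, since tests marked `@slow` are skipped unless that environment variable is set.)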

@Cyrilvallez
Member

All good! Merging!

@Cyrilvallez Cyrilvallez merged commit b928235 into huggingface:main Sep 9, 2025
18 of 21 checks passed
@woct0rdho
Contributor

woct0rdho commented Sep 10, 2025

Hi @bozheng-hit, you mentioned it has high inference throughput, but currently the MoE layer in transformers is slow. Do you plan to replace the MoE layer with a fused kernel, as GPT-OSS did?

@surak

surak commented Sep 10, 2025

When will it come out? It isn't available anywhere yet.

@ArthurZucker
Collaborator

Efficient MoEs are planned for all MoEs in transformers! 🤗
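
For context on the throughput question above: an eager MoE layer typically dispatches tokens with a Python loop over experts, so each expert becomes a separate small matmul plus gather/scatter, which is what makes it slow on GPU. A minimal sketch of that eager pattern, assuming a top-k softmax router (illustrative, not the transformers implementation):

```python
import torch
import torch.nn.functional as F

def naive_moe(hidden, gate_logits, experts, top_k=2):
    # hidden: (tokens, dim); gate_logits: (tokens, num_experts)
    # experts: list of modules, each mapping (n, dim) -> (n, dim)
    weights, idx = torch.topk(F.softmax(gate_logits, dim=-1), top_k, dim=-1)
    weights = weights / weights.sum(dim=-1, keepdim=True)
    out = torch.zeros_like(hidden)
    for e, expert in enumerate(experts):
        token_ids, slot = torch.where(idx == e)  # tokens routed to expert e
        if token_ids.numel() == 0:
            continue
        # Each iteration is a separate (often tiny) GEMM: many kernel
        # launches and gathers/scatters dominate the runtime.
        out[token_ids] += weights[token_ids, slot].unsqueeze(-1) * expert(hidden[token_ids])
    return out
```

A fused approach (e.g. a grouped GEMM, as in the GPT-OSS kernels the comment mentions) instead sorts tokens by expert and processes all experts in a few large launches.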

```python
    conv_state,
    weight,
    bias=None,
    activation=None,
```
Contributor

The activation param seems to have no functionality; what is it good for?

Member

To keep the function signatures coherent between the fast (kernel) path and the torch path.

Contributor

I see – thanks! 👍🏻
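
For readers following along, the shape of that answer is roughly this (a hedged sketch, not the actual transformers code; the function name, shapes, and validation below are illustrative): the eager torch fallback keeps the same parameter list as the fused causal-conv update kernel, so call sites can dispatch to either path without branching on arguments.

```python
import torch

def causal_conv1d_update_torch(hidden_states, conv_state, weight, bias=None, activation=None):
    # hidden_states: (batch, dim); conv_state: (batch, dim, kernel_size)
    # weight: (dim, kernel_size) depthwise filter.
    # `activation` is accepted purely so this eager fallback matches the fused
    # kernel's signature; the nonlinearity itself is applied by the caller here.
    if activation not in (None, "silu", "swish"):
        raise NotImplementedError(f"activation {activation!r} not supported")
    conv_state.copy_(torch.roll(conv_state, shifts=-1, dims=-1))  # drop oldest step
    conv_state[:, :, -1] = hidden_states                          # append newest step
    out = (conv_state * weight).sum(dim=-1)                       # depthwise causal conv
    if bias is not None:
        out = out + bias
    return out
```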

vijayabhaskar-ev pushed a commit to vijayabhaskar-ev/transformers that referenced this pull request Oct 2, 2025
* Add Qwen3-Next.

* fix

* style

* doc

* simplify

* fix name

* lazy cache init to allow multi-gpu inference

* simplify

* fix config to support different hybrid ratio.

* remove last commit (redundant)

* tests

* fix test

---------

Co-authored-by: bozheng-hit <dsoul0621@gmail.com>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
yuchenxie4645 pushed a commit to yuchenxie4645/transformers that referenced this pull request Oct 4, 2025