Configuration to allow output of special tokens by blahblahasdf · Pull Request #970 · vllm-project/vllm

Conversation

@blahblahasdf
Contributor

I have a model that generates special tokens that carry important meaning for generation tasks. At present there is no way to get these special tokens back in the generated text, because of the hardcoded input to `detokenize_incrementally` in `llm_engine`.

This PR adds a startup option, `--keep-special-tokens`, which lifts this limitation.

I elected to resolve this with a `ModelConfig` change rather than a change to `SamplingParams`, but I could see an argument the other way.
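
For context, here is a minimal sketch of the behavior in question, using the Hugging Face tokenizer API directly rather than vLLM internals (`gpt2` stands in for any model with meaningful special tokens):

```python
# Sketch of why a hardcoded skip_special_tokens drops tokens at detokenization.
# Uses the Hugging Face tokenizers API that vLLM's detokenizer builds on.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
ids = tokenizer.encode("Hello") + [tokenizer.eos_token_id]

# skip_special_tokens=True silently drops the EOS token from the text.
print(tokenizer.decode(ids, skip_special_tokens=True))   # Hello

# skip_special_tokens=False keeps it, which is what this PR enables.
print(tokenizer.decode(ids, skip_special_tokens=False))  # Hello<|endoftext|>
```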

@blahblahasdf
Contributor Author

This is an approach to addressing #893.

@WoosukKwon
Collaborator

Hi @blahblahasdf, sorry for the late response. Could you elaborate on why you added it as a model parameter instead of a sampling parameter?

@blahblahasdf
Contributor Author

No worries, @WoosukKwon. For my use cases I want the special tokens for every request, so to me this felt like a property of the model configuration rather than of the individual request. That said, as I mentioned, I could easily see it the other way, and I'd be happy to change this to an option on `SamplingParams` instead if that fits better with your design.

@WoosukKwon
Collaborator

@blahblahasdf Thanks! I believe HF transformers also views it as a generation parameter, rather than a model hyperparameter. Could you update the code accordingly?
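
For reference, a sketch of how transformers scopes this: `skip_special_tokens` is a per-call decode argument, not anything stored on the model.

```python
# In HF transformers, whether special tokens appear in the output text is
# decided at decode time, per call, rather than fixed in the model config.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Hello", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=8)

# The caller chooses at decode time, which maps naturally onto a
# per-request SamplingParams flag in vLLM.
print(tokenizer.batch_decode(output_ids, skip_special_tokens=False))
```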

@blahblahasdf
Contributor Author

Great, will do!

@blahblahasdf
Contributor Author

I created a new PR, #1186, that takes the `SamplingParams` approach instead.
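
Assuming it lands in that form, per-request usage would look roughly like this (the exact flag name, `skip_special_tokens` here, is whatever #1186 settles on):

```python
# Hypothetical per-request usage of a SamplingParams-level flag.
from vllm import LLM, SamplingParams

llm = LLM(model="gpt2")
params = SamplingParams(max_tokens=16, skip_special_tokens=False)

outputs = llm.generate(["Hello"], params)
# Special tokens are preserved in the detokenized text for this request only.
print(outputs[0].outputs[0].text)
```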
