Configuration to allow output of special tokens by blahblahasdf · Pull Request #970 · vllm-project/vllm

Conversation

@blahblahasdf
Contributor

I have a model that generates special tokens that carry important meaning for generation tasks. At present there is no way to get these special tokens back in the generated text, because of the hardcoded input to `detokenize_incrementally` in `llm_engine`.

This PR adds a startup option, `--keep-special-tokens`, which lifts this limitation.

I elected to resolve this with a `ModelConfig` change rather than a change to `SamplingParams`, but I could see an argument the other way.
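
For context, here is a minimal sketch of the behavior in question, using the Hugging Face tokenizer API directly rather than vLLM internals (`gpt2` stands in for any model with meaningful special tokens):

```python
# Sketch of why a hardcoded skip_special_tokens drops tokens at detokenization.
# Uses the Hugging Face tokenizers API that vLLM's detokenizer builds on.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
ids = tokenizer.encode("Hello") + [tokenizer.eos_token_id]

# skip_special_tokens=True silently drops the EOS token from the text.
print(tokenizer.decode(ids, skip_special_tokens=True))   # Hello

# skip_special_tokens=False keeps it, which is what this PR enables.
print(tokenizer.decode(ids, skip_special_tokens=False))  # Hello<|endoftext|>
```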

@blahblahasdf
Contributor Author

This is an approach to addressing #893.

@WoosukKwon
Collaborator

Hi @blahblahasdf, sorry for the late response. Could you elaborate on why you added it as a model parameter instead of a sampling parameter?

@blahblahasdf
Contributor Author

No worries, @WoosukKwon. For my use cases I want the special tokens for every request, so to me this felt like a property of the model configuration rather than of the individual request. That said, as I mentioned, I could easily see it the other way, and I'd be happy to change this to an option on `SamplingParams` instead if that fits better with your design.

@WoosukKwon
Collaborator

@blahblahasdf Thanks! I believe HF transformers also views it as a generation parameter, rather than a model hyperparameter. Could you update the code accordingly?
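
For reference, a sketch of how transformers scopes this: `skip_special_tokens` is a per-call decode argument, not anything stored on the model.

```python
# In HF transformers, whether special tokens appear in the output text is
# decided at decode time, per call, rather than fixed in the model config.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Hello", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=8)

# The caller chooses at decode time, which maps naturally onto a
# per-request SamplingParams flag in vLLM.
print(tokenizer.batch_decode(output_ids, skip_special_tokens=False))
```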

@blahblahasdf
Contributor Author

Great, will do!

@blahblahasdf
Contributor Author

I created a new PR, #1186, that takes the `SamplingParams` approach instead.
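
Assuming it lands in that form, per-request usage would look roughly like this (the exact flag name, `skip_special_tokens` here, is whatever #1186 settles on):

```python
# Hypothetical per-request usage of a SamplingParams-level flag.
from vllm import LLM, SamplingParams

llm = LLM(model="gpt2")
params = SamplingParams(max_tokens=16, skip_special_tokens=False)

outputs = llm.generate(["Hello"], params)
# Special tokens are preserved in the detokenized text for this request only.
print(outputs[0].outputs[0].text)
```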
