-
Notifications
You must be signed in to change notification settings - Fork 30.9k
🚨 Remove Contrastive Search decoding strategy #40428
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
🚨 Remove Contrastive Search decoding strategy #40428
Conversation
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks mostly good to me, added a few minor nits :D
| ) | ||
| generation_config.cache_implementation = None | ||
|
|
||
| # assisted decoding and contrastive search need to roll-back the Cache, which is not supported if |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two things:
- related to our conversation today, here we force non-sliding windows for speculative decoding
- Contrastive search should pass
cache_implementation="dynamic_full", and we should create anotherifbelow. Otherwise, when we removeGenerationMode.CONTRASTIVE_SEARCH, we won't be able to instantiate the right type of cache.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added dynamic_full as suggested and a warning on the hub code: https://huggingface.co/transformers-community/contrastive-search/commit/1e37df04e0e3f4c0a36cadeb09b65eef59d884f1
Since dynamic_full is quite niche, we probably don't want to document it further for now right?
cc @Cyrilvallez as well as this is related to cache refactors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since dynamic_full is quite niche, we probably don't want to document it further for now right?
Yeah, I'm okay with it being an internal option for now 👍
|
build-doc |
|
[For maintainers] Suggested jobs to run (before merge) run-slow: bart, csm, gemma, gpt2, gpt_bigcode, gptj, idefics, idefics2, idefics3, kosmos2_5, lfm2, llama, mistral, opt, paligemma2, smolvlm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM :)
Removes Contrastive Search generation strategy from the codebase. Directs users to the
transformers-community/contrastive-searchrepository.It has been a warning for a few releases, but now
trust_remote_code=Trueis required to run contrastive search.