KEMBAR78
server : disable context shift by default by ggerganov · Pull Request #15416 · ggml-org/llama.cpp · GitHub
Skip to content

Conversation

@ggerganov
Copy link
Member

Context shift was a useful feature in the past with pre-trained models and the raw /completions API. But today, it is causing a lot of confusion, so it is better to disable it by default. Can be re-enabled with --context-shift CLI arg.

@ggerganov ggerganov requested a review from ngxson as a code owner August 19, 2025 07:11
@github-actions github-actions bot added examples python python script changes server labels Aug 19, 2025
@GuillaumeBruand
Copy link

@ggerganov I'm looking for ressources about the behaviour when context overflows. I was planning to conduct experiments using this --context-shift along with --keep N option (still not sure if this one is relevant) and --ctx-size smaller than training context.

What should I get from this change ? Is there a link with attention sink recently supported in llama.cpp ? Is this --context-shift option unrelevant for instruct fine-tuned model ?

@ngxson
Copy link
Collaborator

ngxson commented Aug 19, 2025

What should I get from this change ?

This only changes the default behavior, instead of having context shift on by default, it's now off by default.

You can manually enable it.

Comment on lines 28 to 30
server.enable_ctx_shift = True
server.start()
server.enable_ctx_shift = False
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ngxson I noticed that the server parameters are stateful - i.e. if we change a parameter in one test, it will remain changed for the rest of the tests. This is the reason I do it like this here.

Is there a better way to set the parameter just for the scope of the current test?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be possible that the scope=module is the problem. Could you try removing it? (While keeping auto_use)

I was a bit confused about the notion of scope in pytest

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - this seems to work.

@ggerganov ggerganov merged commit d2fcd91 into master Aug 19, 2025
50 checks passed
@ggerganov ggerganov deleted the gg/server-disable-context-shift-default branch August 19, 2025 13:46
@ggerganov
Copy link
Member Author

@GuillaumeBruand The context shift is difficult to handle with formatted endpoints such as /chat/completions because it can destroy the structure of the chat template, degrading the quality. So strongly recommend against using it in such cases.

@GuillaumeBruand
Copy link

Thanks for the insight, I'll go on with this PR and let it disabled for my experiments.

@DamonFool
Copy link
Contributor

Hi @ggerganov , the help msg about --context-shift seems incorrect?
Please see #15448 .
Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

examples python python script changes server

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants