server: add warning message for deprecated context field in /api/generate by jmorganca · Pull Request #7878 · ollama/ollama

Conversation

@jmorganca
Member

The context parameter in /api/generate has long since been superseded by functionality in the /api/chat endpoint. This PR adds a deprecation warning to the logs when it is used.

@jmorganca jmorganca force-pushed the jmorganca/deprecate-context branch 2 times, most recently from 2535ba4 to 3f17495 on November 30, 2024 at 21:38
@jmorganca jmorganca force-pushed the jmorganca/deprecate-context branch from 3f17495 to 16f353f on November 30, 2024 at 21:49
@jmorganca jmorganca merged commit d543b28 into main Nov 30, 2024
13 checks passed
@jmorganca jmorganca deleted the jmorganca/deprecate-context branch November 30, 2024 22:05
@WizardMiner

WizardMiner commented May 8, 2025

Can we revisit this, please? @jmorganca said:

The context parameter in /api/generate has long since been superseded by functionality in the /api/chat endpoint. This PR adds a deprecation warning to the logs when it is used.

How so? Are you talking about this?

POST /api/chat

What we need is the ability to create a context array, then use it to seed prompts over and over again. This isn't about streaming chat, if that's what you meant. It's about being able to restart the request/response chain from any prior point. For instance, maybe the prompt engineering was off and we want to rephrase the second question after we're 10 deep.

Can you demonstrate how we can have 10 turns, then go back up to number 2 and use the original context to go in different directions? Can we do that with /api/chat, and if so, how? We would all very much like to know how to do that.

Link back to the issue: #10576
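
For concreteness, here is a minimal sketch of the pattern being asked about, using the context field as it behaves today (request/response field names as documented for /api/generate; the model name, prompts, and local URL below are placeholders): run ten turns, keep the context array returned after each one, then rewind to the state after turn 2 and branch in a different direction.

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # default local endpoint

def generate(prompt, context=None, model="llama3.2"):
    """One non-streaming /api/generate turn; returns (response_text, new_context)."""
    body = {"model": model, "prompt": prompt, "stream": False}
    if context is not None:
        body["context"] = context
    r = requests.post(OLLAMA_URL, json=body, timeout=300)
    r.raise_for_status()
    data = r.json()
    return data["response"], data["context"]

# Run ten turns, checkpointing the context returned after each one.
checkpoints = []  # checkpoints[i] holds the context after turn i+1
ctx = None
for i in range(1, 11):
    _, ctx = generate(f"Question {i} ...", context=ctx)
    checkpoints.append(ctx)

# Rewind to the state after turn 2 and try a different phrasing of question 3.
alt_answer, _ = generate("Question 3, rephrased ...", context=checkpoints[1])
```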

@asterbini

Please undo this pull request, or at least leave the context parameter as is.
I am doing research on LLM-based assessment of student submissions against rubrics, which means I am optimizing prompts to get the best assessments and suggestions for improvement.
That involves trying many, many different ways to prompt the LLM, with multi-question chats.
The context parameter lets me cache the chat state and efficiently try many different second and third questions from exactly the same point in time.

Using the chat endpoint instead would be far less efficient, because the LLM would have to re-interpret the whole initial part of the chat.
Moreover, reproducing exactly the same context would not be guaranteed.

With the context parameter left as is, context caching is easy to implement, experiments run much faster, and they are repeatable.

@ioquatix

ioquatix commented Jul 25, 2025

My understanding is that context1 + input1 => output1 + context2 (=> means produces).

If the context parameter is removed from /api/generate, what is the correct way to implement the above? Are we supposed to embed the entire past conversation into the prompt? It's not clear how this should work. If you plan to deprecate a feature, please at least explain what the alternative is.

The /api/chat interface seems to provide an input for all the past messages, but does it have the same behaviour? For example, the current interface:

context1 + input1 => output1 + context2
context2 + input2 => output2 + context3
context3 + input3 => output3 + context4

Is it really the same as:

input1 => output1
input1 + input2 => output2
input1 + input2 + input3 => output3

It seems to me that the latter would be far less stable, as the interpretation of input1 + input2 + input3 => output3 seems likely to have significantly more variability than context3 + input3 => output3 + context4. In other words, context3 should have a more stable interpretation than input1 + input2. For situations where you want to fork the conversation, this stability is quite important. That is, this seems more stable:

context3 + input3a => output3a
context3 + input3b => output3b
...

vs

input1 + input2 + input3a => output3a
input1 + input2 + input3b => output3b
...
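
For comparison, a rough sketch of the same fork expressed against /api/chat, where the only state carried between turns is the message history itself (message roles per the documented chat API; the model name and inputs are placeholders):

```python
import requests

CHAT_URL = "http://localhost:11434/api/chat"  # default local endpoint

def chat(messages, model="llama3.2"):
    """One non-streaming /api/chat turn; returns the assistant message dict."""
    body = {"model": model, "messages": messages, "stream": False}
    r = requests.post(CHAT_URL, json=body, timeout=300)
    r.raise_for_status()
    return r.json()["message"]  # {"role": "assistant", "content": "..."}

# Build up the shared history: input1 -> output1, input2 -> output2.
history = []
for prompt in ("input1 ...", "input2 ..."):
    history.append({"role": "user", "content": prompt})
    history.append(chat(history))

# Fork: replay the same history verbatim with two variants of input3.
output3a = chat(history + [{"role": "user", "content": "input3a ..."}])["content"]
output3b = chat(history + [{"role": "user", "content": "input3b ..."}])["content"]
```

Every fork replays the full history, which is exactly the efficiency and stability concern raised above; whether the server rebuilds the same state that reusing context3 would have restored is the open question.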

