server : add [DONE] event to /chat/completions stream response by VoidIsVoid · Pull Request #9459 · ggml-org/llama.cpp · GitHub


@VoidIsVoid
Contributor

@ggerganov
Member

What is the purpose of this?

@VoidIsVoid
Contributor Author

VoidIsVoid commented Sep 13, 2024

@ggerganov
OpenAI's /chat/completions sends a final "data: [DONE]\n\n" event at the end of the stream.
Some OpenAI-API-compatible clients rely on it to detect the end of a streamed response.
I think it's better to follow OpenAI's behavior.
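
To illustrate the behavior described above, here is a minimal Python sketch of a stream that emits one `data:` event per chunk and then the terminating `data: [DONE]\n\n` event. The helper name `sse_events` and the chunk shape are assumptions made for this example; it is not the llama.cpp server code, which is written in C++.

```python
import json
from typing import Iterable, Iterator


def sse_events(chunks: Iterable[dict]) -> Iterator[str]:
    """Yield one SSE `data:` event per chunk, then the [DONE] sentinel.

    Hypothetical helper for illustration only.
    """
    for chunk in chunks:
        # Each chunk is serialized as JSON and terminated by a blank line.
        yield f"data: {json.dumps(chunk)}\n\n"
    # Terminating event that OpenAI-style clients look for.
    yield "data: [DONE]\n\n"


# Illustrative chunk; the exact fields are an assumption.
events = list(sse_events([{"choices": [{"delta": {"content": "Hi"}}]}]))
```

The last element of `events` is the sentinel, so a client that waits for it sees a clean end of stream.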

Collaborator

@ngxson ngxson left a comment

I can confirm that this is missing from our implementation. It's actually documented, and here is an example of a request whose stream contains this [DONE] message:

image

I tried adding this in the past but was having some issues with the test.

Currently, this test case will fail:

Failing scenarios:
  features/parallel.feature:56  Multi users OAI completions compatibility -- @1.2

So you need to update the test script to be able to handle this.

@ngxson ngxson changed the title server: add data: [DONE] to /chat/completions stream response server : add [DONE] event to /chat/completions stream response Sep 14, 2024
@ngxson ngxson merged commit dcdcee3 into ggml-org:master Sep 14, 2024
53 checks passed
@isaac-mcfadyen
Contributor

Wonder if we should add this to the README.md in some place?

This is breaking for many apps that use the stream raw without an OpenAI compatible client (i.e. via ReadableStream).
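
For apps that read the raw stream, one way to cope is to check for the sentinel before parsing each payload as JSON, since `[DONE]` is not a JSON object. The following is a sketch under assumptions: the function name `iter_chunks` and the sample lines are hypothetical, and the input is assumed to be already split into lines.

```python
import json
from typing import Iterable, Iterator


def iter_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Parse `data:` lines as JSON, stopping at the [DONE] sentinel.

    Hypothetical helper for illustration only.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end of stream; do not pass the sentinel to json.loads
        yield json.loads(payload)


# Illustrative raw stream, already split into lines.
raw = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
    "",
]
text = "".join(c["choices"][0]["delta"]["content"] for c in iter_chunks(raw))
```

The same guard works whether or not the server sends the sentinel, so a consumer written this way handles both the old and the new behavior.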

@ngxson
Collaborator

ngxson commented Sep 14, 2024

IMO this is not a breaking change, because it was broken before and now we have just fixed it. Also, /chat/completions is declared as OAI-compatible, so usage without OAI-compatible clients is not expected.


Labels

examples python python script changes server

4 participants