fix: Google: use maxTokens from params #734
Conversation
Thank you, @nicolasf, for catching this!
Let's add or modify some tests to avoid regressions in the future.
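A regression test could pin that behaviour down. A minimal sketch, assuming a stand-in helper (the real client internals will differ):

```kotlin
import kotlin.test.Test
import kotlin.test.assertEquals

// Hypothetical shape of such a regression test. resolveMaxOutputTokens is an
// illustrative stand-in for the client logic, not Koog's actual internals.
class GoogleMaxTokensTest {
    // Stand-in: prefer the caller-supplied value, fall back to the old default.
    private fun resolveMaxOutputTokens(requested: Int?): Int = requested ?: 2048

    @Test
    fun usesMaxTokensFromParams() {
        assertEquals(4096, resolveMaxOutputTokens(4096))
    }

    @Test
    fun fallsBackToDefaultWhenUnset() {
        assertEquals(2048, resolveMaxOutputTokens(null))
    }
}
```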
no tests
Force-pushed d5fea78 to 174924f.
Force-pushed 174924f to bd14e68.
Hi @kpavlov.
Thank you, @nicolasf.
I would also add a check that the max tokens value is positive, but in general this makes things better.
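A minimal sketch of such a guard, assuming a nullable maxTokens parameter (names are illustrative, not the actual Koog API):

```kotlin
// Forward maxTokens only when it is a positive value; otherwise treat it as
// unset so the backend default applies. Illustrative sketch, not Koog's API.
fun resolveMaxOutputTokens(maxTokens: Int?): Int? =
    maxTokens?.takeIf { it > 0 }
```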
Hi @kpavlov. Is it possible to get this merged before 0.5.0 is released?
Thanks for fixing this @nicolasf! @Ololoshechkin, are there plans to release 0.5.0 or 0.4.3 soon (or even a feature or SNAPSHOT build)? We're loving Koog so far, but we cannot migrate to it fully, as this issue prevents the use of Gemini models.
(cherry picked from commit 45012a3)
Current implementation has a hardcoded maxOutputTokens=2048. Use the value from prompt.params.maxTokens similar to the OpenAILLMClient.
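A minimal sketch of the described change, using simplified stand-in types (not the actual GoogleLLMClient code; the fallback when maxTokens is unset is an assumption):

```kotlin
// Simplified stand-ins for the real prompt types; illustrative only.
data class LLMParams(val maxTokens: Int? = null)
data class Prompt(val params: LLMParams)

// Before the fix, the Google client always sent maxOutputTokens = 2048.
// After: forward the caller-supplied value, mirroring OpenAILLMClient.
fun maxOutputTokens(prompt: Prompt): Int =
    prompt.params.maxTokens ?: 2048 // assumed fallback to the previous default
```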
Motivation and Context
During execution of structured output requests using Gemini 2.5, I encountered unexpected failures due to reaching MAX_TOKENS.
After checking the GoogleLLMClient implementation, I noticed a hardcoded maxOutputTokens = 2048. Checking the other clients, I'm replicating the same logic that exists in OpenAILLMClient, which applies the value from prompt.params.maxTokens.
Breaking Changes
No breaking changes.
Type of the changes
Checklist
develop as the base branch
Additional steps for pull requests adding a new feature