Fix get_max_num_running_seqs for waiting seq groups by Yard1 · Pull Request #1034 · vllm-project/vllm · GitHub

Yard1 (Collaborator) commented Sep 13, 2023

Currently, get_max_num_running_seqs always returns 0 for waiting sequence groups when best_of == self.num_seqs(). As a result, the scheduler behaves incorrectly and schedules more requests than max_num_seqs allows.
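
The reported behavior can be sketched with a toy model. Everything here is hypothetical and simplified: `ToySeqGroup`, the status labels, and the `admit` loop are illustrative stand-ins rather than vLLM's actual classes or scheduler, and the fallback branch that counts only running sequences is an assumption inferred from the description above.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical status labels for this sketch only.
WAITING, RUNNING = "waiting", "running"


@dataclass
class ToySeqGroup:
    """Toy stand-in for a sequence group (not vLLM's real class)."""
    best_of: int
    statuses: List[str]

    def num_seqs(self) -> int:
        return len(self.statuses)

    def get_max_num_running_seqs(self) -> int:
        # Non-beam-search path with the original strict comparison.
        if self.best_of > self.num_seqs():
            return self.best_of
        # Assumed fallback: count only sequences that are currently running.
        return sum(status == RUNNING for status in self.statuses)


def admit(waiting: List[ToySeqGroup], max_num_seqs: int) -> int:
    """Toy admission loop: admit groups while the reported budget allows."""
    used = 0
    admitted = 0
    for group in waiting:
        cost = group.get_max_num_running_seqs()
        if used + cost > max_num_seqs:
            break
        used += cost
        admitted += 1
    return admitted


waiting = [ToySeqGroup(best_of=1, statuses=[WAITING]) for _ in range(8)]
print(waiting[0].get_max_num_running_seqs())  # 0
print(admit(waiting, max_num_seqs=4))  # 8
```

Because each waiting group with best_of == num_seqs() reports a cost of 0, the toy loop admits all eight groups against a budget of four.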

Yard1 (Collaborator, Author) commented Sep 16, 2023

cc @zhuohan123

      return self.sampling_params.best_of
  else:
-     if self.sampling_params.best_of > self.num_seqs():
+     if self.sampling_params.best_of >= self.num_seqs():

zhuohan123 (Member) commented Sep 17, 2023

Thanks for reporting the issue! This change would make the if condition always evaluate to true. Please see #1068 for the correct fix.
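
A minimal sketch of why the comparison with >= never fails, assuming that outside beam search a group never holds more than best_of sequences (an assumption for this illustration, not something stated in the thread). The function `max_running_with_ge` and its `num_unfinished` parameter are hypothetical names.

```python
def max_running_with_ge(best_of: int, num_seqs: int, num_unfinished: int) -> int:
    """Toy version of the non-beam-search branch using the proposed `>=`."""
    if best_of >= num_seqs:
        return best_of
    # Hypothetical fallback; unreachable whenever num_seqs <= best_of.
    return num_unfinished


# Under the assumption num_seqs <= best_of, the condition holds for every pair.
always_true = all(
    best_of >= num_seqs
    for best_of in range(1, 9)
    for num_seqs in range(1, best_of + 1)
)
print(always_true)  # True

# A group with 4 sequences of which only 1 is unfinished still reports 4.
print(max_running_with_ge(best_of=4, num_seqs=4, num_unfinished=1))  # 4
```

Under that assumption the fallback is dead code, so the function would keep reporting best_of even after most sequences in the group have finished.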

Yard1 closed this Sep 17, 2023