Update TensorRT-LLM by kaiyux · Pull Request #1427 · NVIDIA/TensorRT-LLM · GitHub


@kaiyux kaiyux commented Apr 9, 2024

  • Model Support
  • Bug fixes
    • Fix some unexpected behaviors in beam search and early stopping, so that the outputs are more accurate
  • Benchmark
    • Enable streaming and support “Time To First Token (TTFT)” latency and “Inter-Token Latency (ITL)” metrics
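
For readers unfamiliar with the two streaming metrics named above, here is a minimal sketch of how they are commonly derived from token arrival timestamps. It is not the benchmark code from this PR: the function name `streaming_latency_metrics` and its parameters are hypothetical, and it assumes TTFT is the delay from request start to the first token and ITL is the mean gap between consecutive tokens.

```python
from statistics import mean


def streaming_latency_metrics(request_start, token_times):
    """Return (ttft, mean_itl) in the same time unit as the inputs.

    Hypothetical helper for illustration only, not a TensorRT-LLM API.
    request_start: time the request was sent.
    token_times:   arrival time of each streamed token, in order.
    """
    if not token_times:
        raise ValueError("at least one token timestamp is required")

    # TTFT: how long the client waited before the first token arrived.
    ttft = token_times[0] - request_start

    # ITL: gaps between consecutive tokens, averaged over the stream.
    gaps = [later - earlier
            for earlier, later in zip(token_times, token_times[1:])]
    mean_itl = mean(gaps) if gaps else 0.0  # a single token has no gaps

    return ttft, mean_itl


if __name__ == "__main__":
    ttft, itl = streaming_latency_metrics(0.0, [0.5, 0.75, 1.0])
    print(f"TTFT={ttft:.3f}s  mean ITL={itl:.3f}s")
```

Both values depend on streaming being enabled, since without it the client sees only a single end-to-end latency and cannot separate the first-token delay from the per-token gaps.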

@kaiyux kaiyux merged commit 035b99e into main Apr 9, 2024
@kaiyux kaiyux deleted the kaiyu/update branch April 9, 2024 09:04
@kaiyux kaiyux mentioned this pull request Apr 9, 2024
wu1du2 pushed a commit to wu1du2/TensorRT-LLM that referenced this pull request May 11, 2025
* Update TensorRT-LLM

---------

Co-authored-by: meghagarwal <16129366+megha95@users.noreply.github.com>

2 participants