Add Model Revision Support · Pull Request #1014 · vllm-project/vllm · GitHub

Conversation


@ghost ghost commented Sep 11, 2023

This PR adds an additional attribute to the LLM engine: the `revision` option specifies the commit version of the model to download, for consistency and reliability.

For API server usage:

python3 -m vllm.entrypoints.api_server --model facebook/opt-125m --revision 507a3991d874042a92e7581eb6e7cc7074b0c77e
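
Once the server is running, it can be queried over HTTP. A minimal sketch, assuming the server listens on the default port 8000 and exposes the /generate endpoint:

import requests

# Send a prompt with sampling parameters; the pinned revision only
# affects which model weights the server loaded at startup.
response = requests.post(
    "http://localhost:8000/generate",
    json={"prompt": "hello, my name is ", "temperature": 0.8, "top_p": 0.95},
)
print(response.json())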

For LLM engine usage:

from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m", revision="507a3991d874042a92e7581eb6e7cc7074b0c77e")
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)
outputs = llm.generate(["hello, my name is "], sampling_params)
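
For reference, the revision string is the same kind of identifier that huggingface_hub accepts when fetching a model snapshot. A standalone sketch (illustrative, not vLLM internals) pinning the download with snapshot_download:

from huggingface_hub import snapshot_download

# revision may be a branch name, a tag, or a full commit hash; pinning
# a commit hash guarantees the exact same weights on every download.
path = snapshot_download(
    "facebook/opt-125m",
    revision="507a3991d874042a92e7581eb6e7cc7074b0c77e",
)
print(path)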

@ghost ghost (Author) commented Sep 11, 2023

Hi @WoosukKwon,

Could you review this PR?

Thanks.

@ghost ghost changed the title Add revision attribute Add Model Revision Support Sep 12, 2023
@zhuohan123 zhuohan123 (Member) left a comment


Thank you for your contribution! In general LGTM! I left a small comment on the default value; we can merge this branch after that is fixed. Also, please format your code with `format.sh`.
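
The default-value comment presumably concerns the new revision argument. A minimal sketch of how such an option is typically declared, assuming a None default so that omitting the flag falls back to the repository's default branch (usually "main") instead of a hard-coded commit:

import argparse

parser = argparse.ArgumentParser()
# Hypothetical declaration: with default=None, Hugging Face resolves
# the latest revision; passing a commit hash pins the exact weights.
parser.add_argument(
    "--revision",
    type=str,
    default=None,
    help="Model repo revision: branch name, tag, or commit hash.",
)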

@zhuohan123 zhuohan123 merged commit ab019ee into vllm-project:main Sep 13, 2023
hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request Feb 13, 2024
Co-authored-by: Jasmond Loh <Jasmond.Loh@hotmail.com>
Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
amy-why-3459 pushed a commit to amy-why-3459/vllm that referenced this pull request Sep 15, 2025
### What this PR does / why we need it?

Add benchmark workflows

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Run locally

---------

Signed-off-by: wangli <wangli858794774@gmail.com>
amy-why-3459 pushed a commit to amy-why-3459/vllm that referenced this pull request Sep 15, 2025
…#1039)

### What this PR does / why we need it?

This is a follow-up patch to vllm-project#1014 with several convenience optimizations:
- Set a cached dataset path for speed
- Use PyPI to install escli-tool
- Add a benchmark-results conversion script for developer-friendly output
- Patch `benchmark_dataset.py` to disable streaming loads over the internet
- Add more trigger modes for different purposes: `pr` for debugging, `schedule` for daily tests, and `dispatch` and `pr-labled` for manual testing of a single (current) commit
- Disable the latency test for `qwen-2.5-vl` (the script does not yet support multi-modal models)

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
CI passed

---------

Signed-off-by: wangli <wangli858794774@gmail.com>