[doc] Add RAG Integration example #17692

reidliu41 · 2025-05-06T03:46:35Z

RAG (Retrieval-Augmented Generation) enhances LLMs by retrieving relevant context from external sources,
improving factual accuracy and grounding. It's widely adopted in modern LLM applications.
This PR introduces a basic RAG example using:

vLLM + LangChain + Milvus
vLLM + LlamaIndex + Milvus
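
To make the retrieve-then-generate flow concrete, here is a minimal, dependency-free sketch. It is not the code added by this PR: the bag-of-words `embed`, the in-memory `DOCS` list, and the helper names are stand-ins (assumptions) for the embedding model, the Milvus vector store, and the LangChain/LlamaIndex pipeline that the real examples use.

```python
import math
import re
from collections import Counter

# Toy corpus standing in for documents indexed in a vector store.
DOCS = [
    "vLLM is an inference and serving engine for large language models.",
    "Milvus is a vector database used for similarity search.",
    "LangChain is a framework for composing LLM applications.",
]


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def build_prompt(query, context):
    """Ground the question in the retrieved context before generation."""
    joined = "\n".join(context)
    return f"Context:\n{joined}\n\nQuestion: {query}\nAnswer:"


question = "What is Milvus used for?"
prompt = build_prompt(question, retrieve(question, DOCS, k=1))
print(prompt)
```

In a real setup the final `prompt` would be sent to a chat model served by vLLM; that call is left out so the sketch runs without a server.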

Signed-off-by: reidliu41 <reid201711@gmail.com>

github-actions · 2025-05-06T03:46:45Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs would not trigger full CI run by default. Instead, it would only run fastcheck CI which starts running only a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build on Buildkite UI (linked in the PR checks section) and unblock them. If you do not have permission to unblock, ping simon-mo or khluu to add you in our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either: Add ready label to the PR or enable auto-merge.

🚀

DarkLight1337 · 2025-05-06T08:59:44Z

cc @hmellor there seem to be some indentation errors even though this PR doesn't change them

hmellor · 2025-05-06T09:07:36Z

Those indentation errors (which should be solved) are harmless for the build.

The main issue is that:

:::{argparse}
:module: examples.online_serving.retrieval_augmented_generation_with_langchain
:func: get_parser
:prog: retrieval_augmented_generation_with_langchain.py
:::

requires the imports from that module to either be mocked (docs/source/conf.py) or installed (requirements/docs.txt).
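
For context, the `argparse` directive quoted above imports the named module and calls the named function to obtain an `argparse.ArgumentParser`, which it then renders as help text. A minimal sketch of such a function follows; the `--top-k` flag and the description string are hypothetical and are not taken from the actual example script.

```python
import argparse


def get_parser():
    """Build the CLI parser that a docs directive could render as help text."""
    parser = argparse.ArgumentParser(
        prog="retrieval_augmented_generation_with_langchain.py",
        description="RAG example (hypothetical description).",
    )
    # Hypothetical flag, shown only to give the parser something to document.
    parser.add_argument(
        "--top-k",
        type=int,
        default=3,
        help="Number of retrieved chunks to pass to the model.",
    )
    return parser


if __name__ == "__main__":
    args = get_parser().parse_args()
    print(args.top_k)
```

Because the directive has to import the whole module to reach `get_parser`, any third-party import at the top of that module must succeed during the docs build, which is the constraint described above.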

reidliu41 · 2025-05-06T09:15:33Z

@hmellor yeah, that seems to be it, thanks
@DarkLight1337 it seems the module cannot be imported, maybe roll back to the previous command output? It doesn't seem good to change settings/configs just for the examples.

DarkLight1337 · 2025-05-06T09:24:48Z

We can add langchain and llamaindex to the dependencies to mock inside conf.py
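
As a rough illustration of what mocking a dependency means here, the sketch below registers placeholder modules in `sys.modules` so that an import succeeds without the real package being installed. This is a general Python technique, not necessarily how the project's conf.py does it (Sphinx also offers the `autodoc_mock_imports` option); the helper name and the `heavy_rag_dep` module names are hypothetical.

```python
import sys
from unittest.mock import MagicMock


def install_mocks(module_names):
    """Register a MagicMock under each name so importing it succeeds."""
    for name in module_names:
        # setdefault keeps the real module if it has already been imported.
        sys.modules.setdefault(name, MagicMock(name=name))


# Hypothetical dependency names. The parent and the submodule are both listed
# because a MagicMock has no __path__ for the import system to search.
install_mocks(["heavy_rag_dep", "heavy_rag_dep.vectorstores"])

# The import now resolves against the placeholders instead of failing.
from heavy_rag_dep.vectorstores import Store

print(type(Store).__name__)
```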

reidliu41 · 2025-05-06T13:05:55Z


[2025-05-06T13:03:22Z] Warning, treated as error:
[2025-05-06T13:03:22Z] /vllm-workspace/test_docs/docs/source/deployment/frameworks/retrieval_augmented_generation.md:42:Failed to import "get_parser" from "examples.online_serving.retrieval_augmented_generation_with_langchain".
[2025-05-06T13:03:22Z] No module named 'examples.online_serving'
[2025-05-06T13:03:45Z] make: *** [Makefile:20: html] Error 2
[2025-05-06T13:03:46Z] 🚨 Error: The command exited with status 2

still failed...
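
The log above reports `No module named 'examples.online_serving'`, which suggests the build could not locate the example module itself, so mocking its third-party imports alone would not resolve it. A small check along these lines can confirm whether a dotted module path is importable from the current environment; the helper name is hypothetical and this is only a diagnostic sketch.

```python
import importlib.util


def is_importable(dotted_name):
    """Return True if Python can locate the module on the current sys.path."""
    try:
        return importlib.util.find_spec(dotted_name) is not None
    except (ImportError, ValueError):
        # find_spec raises ModuleNotFoundError when a parent package is
        # missing, and ValueError for modules with an unset __spec__.
        return False


print(is_importable("json.decoder"))
print(is_importable("examples.online_serving"))
```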

reidliu41 · 2025-05-06T13:16:40Z

@DarkLight1337 maybe just simply remove it or just roll back??

DarkLight1337 · 2025-05-06T13:33:10Z

I prefer removing the help text if you can't get it to work, so we don't have to worry about the help text getting out of sync with the actual code

reidliu41 · 2025-05-06T14:20:14Z

ok, thanks

DarkLight1337

Thanks for your effort and patience!

[doc] Add RAG Integration example

7ab9942

Signed-off-by: reidliu41 <reid201711@gmail.com>

mergify bot added the `documentation` label (Improvements or additions to documentation) May 6, 2025

DarkLight1337 reviewed May 6, 2025

examples/online_serving/retrieval_augmented_generation_with_langchain.py (resolved)

update llamaindex with config

dbe5db5

Signed-off-by: reidliu41 <reid201711@gmail.com>

DarkLight1337 reviewed May 6, 2025

docs/source/deployment/frameworks/retrieval_augmented_generation.md (outdated, resolved)

auto generate help

0306fb6

Signed-off-by: reidliu41 <reid201711@gmail.com>

reidliu41 force-pushed the add-rag branch from e34397e to 0306fb6 Compare May 6, 2025 08:02

DarkLight1337 added the `ready` label (ONLY add when PR is ready to merge/full CI is needed) May 6, 2025

Merge remote-tracking branch 'upstream/main' into add-rag

d0ba49e

reidliu41 added 3 commits May 6, 2025 17:31

add mock imports

ef033cc

Signed-off-by: reidliu41 <reid201711@gmail.com>

add missing mock imports

7acaf4a

Signed-off-by: reidliu41 <reid201711@gmail.com>

correct the name

9370c3d

Signed-off-by: reidliu41 <reid201711@gmail.com>

remove help text

0ab9508

Signed-off-by: reidliu41 <reid201711@gmail.com>

DarkLight1337 approved these changes May 6, 2025

DarkLight1337 enabled auto-merge (squash) May 6, 2025 14:22

DarkLight1337 merged commit 7525d5f into vllm-project:main May 6, 2025
32 checks passed

mawong-amd pushed a commit to ROCm/vllm that referenced this pull request May 14, 2025

[doc] Add RAG Integration example (vllm-project#17692)

8692892

Signed-off-by: reidliu41 <reid201711@gmail.com> Co-authored-by: reidliu41 <reid201711@gmail.com>

zzzyq pushed a commit to zzzyq/vllm that referenced this pull request May 24, 2025

[doc] Add RAG Integration example (vllm-project#17692)

c3354ce

Signed-off-by: reidliu41 <reid201711@gmail.com> Co-authored-by: reidliu41 <reid201711@gmail.com> Signed-off-by: Yuqi Zhang <yuqizhang@google.com>
