AgentFlow: In-the-Flow Agentic System Optimization

📣 News

[2025.10.16] 🏆 Our paper has been accepted by NeurIPS 2025 Efficient Reasoning Workshop!
[2025.10.13] 📸 Excited to have a tutorial video for AgentFlow covered by Discover AI on YouTube!
[2025.10.10] 🚀 Our X post received 1K+ likes! Feel free to check out the post and join the discussion! 💬
[2025.10.08] 🔥 We are honored to be featured as 🤗 HuggingFace Daily Paper #2.

🌟 Why AgentFlow?

AgentFlow is a trainable, tool-integrated agentic framework designed to overcome the scalability and generalization limits of today’s tool-augmented reasoning approaches.

Unlike prevailing approaches such as Search-R1 which train a single LLM to interleave reasoning steps with tool calls, AgentFlow introduces a modular agentic system with four specialized modules: 🧭 Planner, 🛠 Executor, ✅ Verifier, and ✍️ Generator.

For effective planning and tool use, the framework directly optimizes the planner agent within the system in an online fashion using Flow-based Group Refined Policy Optimization (Flow-GRPO). This yields superior performance across diverse domains, with improved tool-calling reliability and long-horizon reasoning capabilities.

📺 YouTube Tutorial

Excited to have a tutorial video for AgentFlow covered by Discover AI on YouTube!

🚀 Key Features

🧩 Modular Agentic System – Four specialized agent modules (Planner, Executor, Verifier, Generator) that coordinate via evolving memory and integrated tools across multiple turns.
🔗 Multi-Tool Integration – Seamlessly connect with diverse tool ecosystems, including base_generator, python_coder, google_search, wikipedia_search, web_search, and more.
🎯 Flow-GRPO Algorithm – Enables in-the-flow agent optimization for long-horizon reasoning tasks with sparse rewards.
📈 Proven Results – AgentFlow (7B Backbone) beats top baselines on 10 benchmarks, with +14.9% search, +14.0% agentic, +14.5% math, +4.1% science, even outperforming ~200B-parameter GPT-4o.
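
To make the sparse-reward setting in the Flow-GRPO bullet more concrete, here is a minimal Python sketch of group-normalized advantages in the style of GRPO. It assumes that each rollout receives a single final outcome reward and that this reward is shared by every planner turn in the rollout. This is an illustration under those assumptions, not the repository's training code; `group_advantages`, `broadcast_to_turns`, and the sample numbers are hypothetical.

```python
from statistics import mean, pstdev

def group_advantages(rewards, eps=1e-6):
    """Normalize final rewards within one group of rollouts for the same query."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

def broadcast_to_turns(advantages, turns_per_rollout):
    """Give every turn of a rollout the same trajectory-level advantage."""
    return [[a] * n for a, n in zip(advantages, turns_per_rollout)]

# Hypothetical group: 4 rollouts, reward 1.0 if the final answer is correct, else 0.0
rewards = [1.0, 0.0, 0.0, 1.0]
turns = [3, 5, 2, 4]  # number of planner turns in each rollout

advantages = group_advantages(rewards)
per_turn = broadcast_to_turns(advantages, turns)
print(per_turn[0])
```

Rollouts that beat the group average get a positive advantage on all of their turns, and the rest get a negative one, so a single end-of-episode signal can still inform every planning step.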

🏆 Experiments

📊 Main Results

AgentFlow (Qwen-2.5-7B-Instruct Backbone) outperforms top baselines on 10 benchmarks:

+14.9% on search
+14.0% on agentic reasoning
+14.5% on math
+4.1% on science

💡 Even surpasses larger proprietary models like GPT-4o (~200B).

🔍 In-Depth Analysis

Improved planning and decision-making
Enhanced tool-calling reliability
Positive scaling trends with model size & reasoning turns

Explore more in our paper or project page.

⚙️ Setup

Installation

bash setup.sh
source .venv/bin/activate
# (Optional) Install `parallel` for running benchmark experiments in parallel:
sudo apt-get update
sudo apt-get install parallel

Setup Environment Variables

Copy the .env.template file from agentflow/.env.template and rename it to .env, then place it in the agentflow/ folder. Update the following variables with your own API keys:

OPENAI_API_KEY (for judging responses)
GOOGLE_API_KEY (for the Google Search tool)
DASHSCOPE_API_KEY (for calling Qwen-2.5-7B-Instruct as the engine for agents and tools)
TOGETHER_API_KEY (an alternative for calling Qwen-2.5-7B-Instruct as the engine for agents and tools; recommended for international users)
Alternatively, serve the Qwen-2.5-7B-Instruct model locally with vLLM (see serve_vllm_local.md for details).

Please check API Key Setup Guide for detailed instructions on how to obtain these keys.

cp agentflow/.env.template agentflow/.env
# Then edit agentflow/.env with your API keys
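
Before running anything, it can help to confirm that the keys listed above are actually filled in. The sketch below assumes the .env file uses plain KEY=VALUE lines, and it treats OPENAI_API_KEY and GOOGLE_API_KEY as always required, plus at least one of the two engine keys. That may be stricter than you need if you serve the model locally with vLLM. `parse_env` and `missing_keys` are hypothetical helpers, not part of AgentFlow.

```python
ALWAYS_REQUIRED = ["OPENAI_API_KEY", "GOOGLE_API_KEY"]
# Either provider can serve Qwen-2.5-7B-Instruct, so one of these is enough
ENGINE_KEYS = ["DASHSCOPE_API_KEY", "TOGETHER_API_KEY"]

def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env

def missing_keys(env):
    """Return the names of keys that are absent or empty."""
    missing = [k for k in ALWAYS_REQUIRED if not env.get(k)]
    if not any(env.get(k) for k in ENGINE_KEYS):
        missing.append(" or ".join(ENGINE_KEYS))
    return missing

# In-memory sample so the sketch runs without a real .env file
sample = """
OPENAI_API_KEY=sk-example
GOOGLE_API_KEY=
TOGETHER_API_KEY="example"
"""
print(missing_keys(parse_env(sample)))
```

To check your real file, pass the contents of agentflow/.env to `parse_env` instead of `sample`.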

⚡ Quick Start on AgentFlow Inference

AgentFlow provides a modular agentic system with four specialized modules (planner, executor, verifier, generator) that coordinate through evolving memory and a toolkit over multiple turns to solve complex reasoning tasks.
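
As a rough mental model of how the four modules described above might hand work to one another, here is a toy Python sketch of a multi-turn loop with a shared memory. Every name in it (`Memory`, `planner`, `executor`, `verifier`, `generator`, `solve`) and the "calculator" tool are hypothetical stand-ins written for illustration. The real modules are LLM-backed and live in the agentflow package.

```python
from dataclasses import dataclass, field

@dataclass
class Memory:
    """Evolving record of what each turn did."""
    steps: list = field(default_factory=list)

def planner(query, memory):
    # Toy policy: request the calculator once, then stop planning
    if memory.steps:
        return None
    return {"tool": "calculator", "input": query}

def executor(action):
    # Toy tool: evaluate a sum such as "2 + 3"
    a, b = action["input"].split("+")
    return int(a) + int(b)

def verifier(memory):
    # Toy check: the task is done once one result is recorded
    return len(memory.steps) > 0

def generator(query, memory):
    # Compose the final answer from the last recorded result
    return f"{query.strip()} = {memory.steps[-1]['result']}"

def solve(query, max_turns=3):
    memory = Memory()
    for _ in range(max_turns):
        action = planner(query, memory)
        if action is None:
            break
        result = executor(action)
        memory.steps.append({"action": action, "result": result})
        if verifier(memory):
            break
    return generator(query, memory)

print(solve("2 + 3"))  # prints "2 + 3 = 5"
```

The point of the sketch is the control flow: the planner reads the memory to choose the next action, the executor runs it, the verifier decides whether to continue, and the generator writes the answer from what the memory has accumulated.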

To quickly experience the system in action, run the command below (don’t forget to set up your API key):

python quick_start.py

Here is the content of quick_start.py:

# Import the solver
from agentflow.agentflow.solver import construct_solver

# Set the LLM engine name
llm_engine_name = "dashscope"

# Construct the solver
solver = construct_solver(llm_engine_name=llm_engine_name)

# Solve the user query
output = solver.solve("What is the capital of France?")
print(output["direct_output"])
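
If you want to run several queries, a thin wrapper around the same two calls is enough. The sketch below relies only on what quick_start.py shows, namely `solver.solve(query)` returning a dict with a "direct_output" entry. `solve_all` and `EchoSolver` are hypothetical names, and the stub exists only so the example runs without API keys.

```python
def solve_all(solver, queries):
    """Run each query through solver.solve and collect the direct outputs."""
    answers = {}
    for query in queries:
        output = solver.solve(query)
        answers[query] = output["direct_output"]
    return answers

class EchoSolver:
    """Hypothetical stand-in that mimics the shape of the solver's output."""
    def solve(self, query):
        return {"direct_output": f"(stub answer for: {query})"}

questions = ["What is the capital of France?", "What is 12 * 12?"]
answers = solve_all(EchoSolver(), questions)
for question, answer in answers.items():
    print(question, "->", answer)
```

With the real system, replace `EchoSolver()` with the object returned by `construct_solver(llm_engine_name="dashscope")`.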

💥 Quick Start on AgentFlow Flow-GRPO Training

For effective planning and tool use, the framework directly optimizes the planner agent within the system in an online fashion using Flow-GRPO. Below is a quick start for training.

(Optional) Test Your Environment

Before diving in, we recommend verifying that AgentFlow's tools, LLM engines, and network configuration are properly set up. See test_env.md for detailed testing instructions.

Dataset Preparation

We mix two datasets for training: NQ (Natural Questions) for agentic search and DeepMath-103K for mathematical reasoning.

# train data
python data/get_train_data.py
# validation data
python data/aime24_data.py

After that, the data directory should look like this:

data/
├── train/
│   └── combined_train.parquet (182,190 samples)
├── val/
│   └── aime24.parquet (30 samples)
├── aime24_data.py
└── get_train_data.py
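
A quick way to confirm that both scripts produced their output is to check for the two parquet files in the tree above. This minimal sketch uses only the standard library and checks file presence, not the sample counts. `missing_data_files` is a hypothetical helper, and it assumes you run it from the repository root.

```python
from pathlib import Path

# Relative paths taken from the directory tree above
EXPECTED_FILES = ["train/combined_train.parquet", "val/aime24.parquet"]

def missing_data_files(data_dir):
    """Return the expected files that do not exist under data_dir."""
    root = Path(data_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]

missing = missing_data_files("data")
if missing:
    print("Missing:", ", ".join(missing))
else:
    print("All expected data files are present.")
```

Verifying the row counts (182,190 and 30) would additionally require a parquet reader, which is left out here.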

Flow-GRPO Training

Start AgentFlow training with Flow-GRPO inside a tmux session:

# Create tmux session and start agentflow service (Window 0)
tmux new-session -s agentflow
bash train/serve_with_logs.sh

# Create new window (Ctrl+B then C) and start training (Window 1)
bash train/train_with_logs.sh

Configuration: All training hyperparameters are in train/config.yaml (model settings, tools, RL parameters, resources, etc.)

Logging: We provide comprehensive logging to monitor training. See logs.md for more details.

🎯 AgentFlow Benchmark

Serve the trained planner model with vLLM (here we deploy our 7B Flow-GRPO planner model):

bash scripts/serve_vllm.sh

Run inference on benchmark tasks:

cd test
bash exp/run_all_models_all_datasets.sh

You can find more benchmarking details in benchmark.md.

🧩 Use Your Own Model in AgentFlow

AgentFlow supports different LLM engines for each agent module. See llm_engine.md for supported models and factory.py for the corresponding model_string configuration:

Planner Agent:

Modify the llm_engine_name parameter in test/exp/run_all_models_all_datasets.sh

Other Agents (Executor, Verifier, Generator):

By default, these agents use a fixed LLM engine (Qwen-2.5-7B-Instruct via DashScope)
To use your own model, modify self.llm_engine_fixed in agentflow/agentflow/models/planner.py:19:

self.llm_engine_fixed = create_llm_engine(model_string="your-engine", is_multimodal=False, temperature=temperature)

and

Modify the llm_engine_name parameter in the Executor instantiation in agentflow/agentflow/solver.py:232:

# Instantiate Executor
executor = Executor(
    # llm_engine_name=llm_engine_name,
    llm_engine_name="dashscope",
    root_cache_dir=root_cache_dir,
    verbose=verbose,
    # base_url=base_url,
    temperature=temperature
)

For detailed information on supported engines and model_string formats, see llm_engine.md

🤝 Core Contributors

Zhuofeng Li

Haoxiang Zhang

Pan Lu

🎓 Advisors

James Zou

Yejin Choi

Yu Zhang

🙏 Acknowledgements

We thank the following open-source projects:

verl for the excellent RL framework design.
vLLM for fast LLM inference support.
Verl-Tool and agent-lightning for their early-stage exploration of agentic RL training.

We thank Lambda for GPU support!

🚀 Contributing

We are truly looking forward to open-source contributions to AgentFlow! If you’re interested in contributing, collaborating, or reporting issues, please feel free to open an issue or submit a pull request (PR). You can also reach us at zhuofengli12345@gmail.com, isaacpfino@gmail.com, lupantech@gmail.com or join our Slack community: AgentFlow.

We are also looking forward to your feedback and suggestions!

📚 Citation

@article{li2025flow,
  title={In-the-Flow Agentic System Optimization for Effective Planning and Tool Use},
  author={Li, Zhuofeng and Zhang, Haoxiang and Han, Seungju and Liu, Sheng and Xie, Jianwen and Zhang, Yu and Choi, Yejin and Zou, James and Lu, Pan},
  journal={arXiv preprint arXiv:2510.05592},
  year={2025}
}
