
AgentFlow

AgentFlow: In-the-Flow Agentic System Optimization

[arXiv] [Gradio Demo] [Hugging Face Paper] [Hugging Face Model] [Website] [X] [YouTube] [Slack] [WeChat]

📣 News

  • [2025.10.16] 🏆 Our paper has been accepted by the NeurIPS 2025 Efficient Reasoning Workshop!
  • [2025.10.13] 📸 Excited to share a tutorial video on AgentFlow by Discover AI on YouTube!
  • [2025.10.10] 🚀 Our X post received 1K+ likes! Feel free to check out the post and join the discussion! 💬
  • [2025.10.08] 🔥 We are honored to be featured as 🤗 Hugging Face Daily Paper #2.

🌟 Why AgentFlow?

AgentFlow is a trainable, tool-integrated agentic framework designed to overcome the scalability and generalization limits of today's tool-augmented reasoning approaches.

Unlike prevailing approaches such as Search-R1, which train a single LLM to interleave reasoning steps with tool calls, AgentFlow introduces a modular agentic system with four specialized modules: 🧭 Planner, 🛠 Executor, ✅ Verifier, and ✍️ Generator.

[Figure: AgentFlow framework overview]
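At a high level, the four modules coordinate through an evolving memory over multiple turns. The sketch below illustrates that loop with hypothetical interfaces; it is not the actual AgentFlow code (the real classes live under agentflow/agentflow/).

# Hypothetical sketch of the multi-turn flow (not the actual AgentFlow classes)
def agentic_flow(query, planner, executor, verifier, generator, max_turns=10):
    memory = []  # evolving memory shared by all modules across turns
    for _ in range(max_turns):
        action = planner.plan(query, memory)      # choose sub-goal and tool
        result = executor.execute(action)         # run the selected tool
        memory.append((action, result))           # memory evolves in the flow
        if verifier.verify(query, memory):        # stop once the goal is met
            break
    return generator.generate(query, memory)      # compose the final answer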

For effective planning and tool use, the framework directly optimizes the planner agent within the system in an online fashion using Flow-based Group Refined Policy Optimization (Flow-GRPO), achieving superior performance across diverse domains with improved tool-calling reliability and long-horizon reasoning capabilities.

[Figure: Flow-GRPO training pipeline]
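In spirit, Flow-GRPO handles the sparse reward by sampling a group of rollouts per query, normalizing each rollout's final outcome reward against the group, and broadcasting the resulting trajectory-level advantage to the planner's turns. Below is a minimal sketch of the group-normalized advantage; this is our paraphrase, not the actual training code.

import statistics

def group_advantages(rewards):
    # GRPO-style group baseline: normalize each rollout's outcome reward
    # against the other rollouts sampled for the same query.
    mean = statistics.mean(rewards)
    std = statistics.stdev(rewards) if len(rewards) > 1 else 1.0
    return [(r - mean) / (std + 1e-6) for r in rewards]

# e.g., 4 rollouts for one query; reward is 1.0 if the final answer is correct
print(group_advantages([1.0, 0.0, 1.0, 0.0]))  # -> approx. [0.87, -0.87, 0.87, -0.87]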

📺 YouTube Tutorial

We are excited to share a tutorial video on AgentFlow, covered by Discover AI on YouTube!

🚀 Key Features

  • 🧩 Modular Agentic System – Four specialized agent modules (Planner, Executor, Verifier, Generator) that coordinate via evolving memory and integrated tools across multiple turns.
  • 🔗 Multi-Tool Integration – Seamlessly connects with diverse tool ecosystems, including base_generator, python_coder, google_search, wikipedia_search, web_search, and more.
  • 🎯 Flow-GRPO Algorithm – Enables in-the-flow agent optimization for long-horizon reasoning tasks with sparse rewards.
  • 📈 Proven Results – AgentFlow (7B backbone) beats top baselines on 10 benchmarks, with average gains of +14.9% on search, +14.0% on agentic reasoning, +14.5% on math, and +4.1% on science, even outperforming the ~200B-parameter GPT-4o.

πŸ† Experiments

📊 Main Results

AgentFlow (Qwen-2.5-7B-Instruct Backbone) outperforms top baselines on 10 benchmarks:

  • +14.9% on search
  • +14.0% on agentic reasoning
  • +14.5% on math
  • +4.1% on science

💡 It even surpasses larger proprietary models such as GPT-4o (~200B parameters).

[Tables: main results across 10 benchmarks]

πŸ” In-Depth Analysis

  • Improved planning and decision-making
  • Enhanced tool-calling reliability
  • Positive scaling trends with model size & reasoning turns

Explore more in our paper or on the project page.

[Figure: tool-calling behavior analysis]



βš™οΈ Setup

Installation

bash setup.sh
source .venv/bin/activate
# (Optional) Install `parallel` for running benchmark experiments in parallel:
sudo apt-get update
sudo apt-get install parallel

Set Up Environment Variables

Copy the .env.template file from agentflow/.env.template, rename it to .env, and place it in the agentflow/ folder. Update the following variables with your own API keys:

  • OPENAI_API_KEY (for judging responses)
  • GOOGLE_API_KEY (for the Google Search tool)
  • DASHSCOPE_API_KEY (for calling Qwen-2.5-7B-Instruct as the engine for agents and tools)
  • TOGETHER_API_KEY (an alternative for calling Qwen-2.5-7B-Instruct as the engine for agents and tools; recommended for international users)
  • Alternatively, serve the Qwen-2.5-7B-Instruct model locally with vLLM (see serve_vllm_local.md for details).

Please check the API Key Setup Guide for detailed instructions on how to obtain these keys.

cp agentflow/.env.template agentflow/.env
# Then edit agentflow/.env with your API keys
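After editing, agentflow/.env should contain entries along these lines (placeholder values; substitute your real keys):

OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=...
DASHSCOPE_API_KEY=...
TOGETHER_API_KEY=...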

⚡ Quick Start on AgentFlow Inference

AgentFlow provides a modular agentic system with four specialized modules (planner, executor, verifier, generator) that coordinate through evolving memory and a toolkit over multiple turns to solve complex reasoning tasks.

To quickly experience the system in action, run the command below (don't forget to set up your API keys):

python quick_start.py

Here is the content of quick_start.py:

# Import the solver
from agentflow.agentflow.solver import construct_solver

# Set the LLM engine name
llm_engine_name = "dashscope"

# Construct the solver
solver = construct_solver(llm_engine_name=llm_engine_name)

# Solve the user query
output = solver.solve("What is the capital of France?")
print(output["direct_output"])
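The same solve() interface handles queries that require tools; for example, a computational question should exercise tools such as python_coder (the query below is only an illustration):

# Harder queries go through the same interface and trigger tool calls
output = solver.solve("What is the sum of the squares of the first 10 primes?")
print(output["direct_output"])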

💥 Quick Start on AgentFlow Flow-GRPO Training

For effective planning and tool use, the framework directly optimizes the planner agent within the system in an online fashion using Flow-GRPO. Below is a quick start for training.

(Optional) Test Your Environment

Before diving in, we recommend verifying that AgentFlow's tools, LLM engines, and network configuration are properly set up. See test_env.md for detailed testing instructions.

Dataset Preparation

We mix two datasets for training: NQ (Natural Questions) for agentic search and DeepMath-103K for mathematical reasoning.

# train data
python data/get_train_data.py
# validation data
python data/aime24_data.py

After that, the data directory should look like this:

data/
├── train/
│   └── combined_train.parquet (182,190 samples)
├── val/
│   └── aime24.parquet (30 samples)
├── aime24_data.py
└── get_train_data.py
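As a quick sanity check that the parquet files were written correctly, you can load them with pandas (assuming pandas and pyarrow are installed):

import pandas as pd

train = pd.read_parquet("data/train/combined_train.parquet")
val = pd.read_parquet("data/val/aime24.parquet")
print(len(train), len(val))  # expected: 182190 and 30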

Flow-GRPO Training

Start AgentFlow training using Flow-GRPO with tmux:

# Create tmux session and start agentflow service (Window 0)
tmux new-session -s agentflow
bash train/serve_with_logs.sh

# Create new window (Ctrl+B then C) and start training (Window 1)
bash train/train_with_logs.sh

Configuration: All training hyperparameters are in train/config.yaml (model settings, tools, RL parameters, resources, etc.).

Logging: We provide comprehensive logging to monitor training. See logs.md for more details.

🎯 AgentFlow Benchmark

Serve the trained planner model with vLLM (here we deploy our 7B Flow-GRPO planner model):

bash scripts/serve_vllm.sh
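Before launching the benchmark runs, you can smoke-test the served model. vLLM exposes an OpenAI-compatible endpoint, by default on port 8000; check scripts/serve_vllm.sh for the actual host, port, and model name:

# Assumes the openai Python package and the default vLLM server address
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
model_id = client.models.list().data[0].id  # whichever model the server loaded
reply = client.chat.completions.create(
    model=model_id,
    messages=[{"role": "user", "content": "Say hello."}],
)
print(reply.choices[0].message.content)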

Run inference on benchmark tasks:

cd test
bash exp/run_all_models_all_datasets.sh

You can find more benchmarking details in benchmark.md.

🧩 Use Your Own Model in AgentFlow

AgentFlow supports different LLM engines for each agent module. See llm_engine.md for supported models and factory.py for the corresponding model_string configuration:

Planner Agent: choose its model via the model_string configuration in factory.py (see llm_engine.md for supported models).

Other Agents (Executor, Verifier, Generator):

# Each non-planner module constructs a fixed (non-trainable) engine internally
self.llm_engine_fixed = create_llm_engine(model_string="your-engine", is_multimodal=False, temperature=temperature)

and

# Instantiate Executor
executor = Executor(
    # llm_engine_name=llm_engine_name,
    llm_engine_name="dashscope",
    root_cache_dir=root_cache_dir,
    verbose=verbose,
    # base_url=base_url,
    temperature=temperature
)
  • For detailed information on supported engines and model_string formats, see llm_engine.md.

🤝 Core Contributors

Zhuofeng Li
Haoxiang Zhang
Pan Lu

🎓 Advisors

James Zou
Yejin Choi
Yu Zhang

πŸ™ Acknowledgements

We thank the following open-source projects:

  • verl for the excellent RL framework design.
  • vLLM for fast LLM inference support.
  • Verl-Tool and agent-lightning for their early-stage exploration in agentic RL training.

We thank Lambda for GPU support!

🚀 Contributing

We truly look forward to open-source contributions to AgentFlow! If you're interested in contributing, collaborating, or reporting issues, please feel free to open an issue or submit a pull request (PR). You can also reach us at zhuofengli12345@gmail.com, isaacpfino@gmail.com, or lupantech@gmail.com, or join our Slack community: AgentFlow.

We are also looking forward to your feedback and suggestions!

📚 Citation

@article{li2025flow,
  title={In-the-Flow Agentic System Optimization for Effective Planning and Tool Use},
  author={Li, Zhuofeng and Zhang, Haoxiang and Han, Seungju and Liu, Sheng and Xie, Jianwen and Zhang, Yu and Choi, Yejin and Zou, James and Lu, Pan},
  journal={arXiv preprint arXiv:2510.05592},
  year={2025}
}

⭐ Star History

[Star History Chart]
