# FlowReasoner

This paper proposes a query-level meta-agent named FlowReasoner to automate the design of query-level multi-agent systems, i.e., one system per user query. Our core idea is to incentivize a reasoning-based meta-agent via external execution feedback. Concretely, by distilling DeepSeek R1, we first endow FlowReasoner with basic reasoning ability for generating multi-agent systems. We then further enhance it via reinforcement learning (RL) with external execution feedback. A multi-purpose reward is designed to guide the RL training along the aspects of performance, complexity, and efficiency. In this manner, FlowReasoner can generate a personalized multi-agent system for each user query via deliberative reasoning. Experiments on both engineering and competition code benchmarks demonstrate the superiority of FlowReasoner. Remarkably, it surpasses o1-mini by 10.52% accuracy across three benchmarks.
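As a rough illustration of the multi-purpose reward mentioned above, the sketch below combines the three aspects (performance, complexity, efficiency) as a weighted sum. The weights, argument names, and linear form are illustrative assumptions, not the reward actually used in the paper.

```python
# Illustrative sketch of a multi-purpose RL reward. All weights and the
# linear combination are assumptions for illustration only.

def multi_purpose_reward(pass_rate: float, num_agents: int, runtime_s: float,
                         w_perf: float = 1.0, w_comp: float = 0.1,
                         w_eff: float = 0.05) -> float:
    """Reward higher pass rates; penalize larger and slower systems."""
    performance = w_perf * pass_rate          # fraction of test cases passed
    complexity_penalty = w_comp * num_agents  # discourage needlessly large systems
    efficiency_penalty = w_eff * runtime_s    # discourage slow workflows
    return performance - complexity_penalty - efficiency_penalty

# Example: 80% pass rate, 3 agents, 12 s runtime.
r = multi_purpose_reward(0.8, 3, 12.0)
```

Under this toy form, a system that passes the same test cases with fewer agents or a shorter runtime receives a strictly higher reward.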

## Installation

We follow MetaGPT to install the required dependencies. Please run the following commands:

```shell
git clone https://github.com/sail-sg/FlowReasoner
cd code
pip install --upgrade -e .
```

All experiments are conducted on NVIDIA A100 GPUs with 80GB of memory.

## Configure optimization parameters

Configure LLM parameters in `config/config2.yaml` (see `examples/FlowReasoner/config2.example.yaml` for reference):

```yaml
models:
  "<model_name>":
    # model: "gpt-4-turbo"  # or gpt-3.5-turbo
    api_type: "openai"  # or azure / ollama / groq etc.
    base_url: "<your base url>"
    api_key: "<your api key>"
    temperature: 0
  "<model_name>":
    api_type: "openai"
    base_url: "<your base url>"
    api_key: "<your api key>"
    temperature: 0
CALC_USAGE: True
```

## Run the inference

Using default parameters:

```shell
python -m examples.FlowReasoner.optimize --dataset MATH
```

Or with custom parameters:

```shell
python -m examples.FlowReasoner.optimize --dataset MATH --sample n --optimized_path xxx ...
```

Note that the test cases of each dataset should be split into two parts, keyed `val` and `test` respectively. The `val` test cases are used as external execution feedback to optimize the workflow.
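One way to produce such a split is sketched below. The `val`/`test` keys follow this README, while the split ratio, seed, and function name are illustrative assumptions rather than the repository's actual preprocessing.

```python
import random

def split_cases(cases, val_ratio=0.2, seed=0):
    """Split a dataset's test cases into 'val' (used as external execution
    feedback during workflow optimization) and 'test' (held out for final
    evaluation). Ratio and seed are illustrative assumptions."""
    cases = list(cases)
    random.Random(seed).shuffle(cases)  # deterministic shuffle for reproducibility
    k = max(1, int(len(cases) * val_ratio))
    return {"val": cases[:k], "test": cases[k:]}

# Example: 10 cases -> 2 for validation feedback, 8 held out.
split = split_cases(range(10))
```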

## Training Stage

The SFT dataset is generated by the inference stage. SFT is conducted with the standard training process of LLaMA-Factory, while RL is based on EasyRL.
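The inference-to-SFT handoff could be sketched as below. The record schema (`instruction`/`input`/`output`) and the `<think>` wrapper are assumptions about the data format, not the repository's actual exporter.

```python
import json

def build_sft_record(query: str, reasoning: str, workflow_code: str) -> dict:
    """Pack one inference-stage trace into an Alpaca-style SFT record.
    The exact schema expected by LLaMA-Factory here is an assumption."""
    return {
        "instruction": query,
        "input": "",
        "output": f"<think>{reasoning}</think>\n{workflow_code}",
    }

# Hypothetical trace: a user query plus the meta-agent's reasoning and output.
records = [build_sft_record("Sort a list of integers.",
                            "A builtin suffices here.",
                            "sorted(xs)")]
sft_json = json.dumps(records, indent=2)
```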

## Acknowledgments

This repository is based on the codebases of MetaGPT, LLaMA-Factory, and EasyRL. Thanks for their impressive work!

## Citation

If you find our work helpful, please cite it as:

```bibtex
@misc{gao2025flowreasonerreinforcingquerylevelmetaagents,
      title={FlowReasoner: Reinforcing Query-Level Meta-Agents},
      author={Hongcheng Gao and Yue Liu and Yufei He and Longxu Dou and Chao Du and Zhijie Deng and Bryan Hooi and Min Lin and Tianyu Pang},
      year={2025},
      eprint={2504.15257},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2504.15257},
}
```
