AgentEvolver: Towards Efficient Self-Evolving Agent System

Yunpeng Zhai; Shuchang Tao; Cheng Chen; Anni Zou; Ziqian Chen; Qingxu Fu; Shinji Mai; Li Yu; Jiaji Deng; Zouying Cao; Zhaoyang Liu; Bolin Ding; Jingren Zhou

TL;DR

AgentEvolver introduces a self-evolving agent framework that uses LLMs to autonomously drive task generation, exploration, and fine-grained credit assignment, addressing data scarcity and sample inefficiency in long-horizon, tool-augmented tasks. It formalizes learning in an open-ended environment by separating the sandbox E from an unknown target task distribution p_target(g), and defines proxy functions F_task and F_reward to generate training tasks and rewards. The framework comprises three synergistic mechanisms: self-questioning (curiosity-driven task generation), self-navigating (experience-guided exploration), and self-attributing (step-wise credit assignment with LLM justification). These are coupled with a modular infrastructure (framework, context manager, environment service) that enables scalable, continual improvement. Empirical results on AppWorld and BFCL show substantial gains in exploration efficiency, sample utilization, and adaptation speed over PPO/GRPO baselines; larger models benefit more, and ablations demonstrate the individual and combined value of the mechanisms. The work advances a scalable paradigm for autonomous, data-efficient evolution of agent capabilities and outlines future directions toward larger models and LLM-level self-improvement.

Abstract

Autonomous agents powered by large language models (LLMs) have the potential to significantly enhance human productivity by reasoning, using tools, and executing complex tasks in diverse environments. However, current approaches to developing such agents remain costly and inefficient, as they typically require manually constructed task datasets and reinforcement learning (RL) pipelines with extensive random exploration. These limitations lead to prohibitively high data-construction costs, low exploration efficiency, and poor sample utilization. To address these challenges, we present AgentEvolver, a self-evolving agent system that leverages the semantic understanding and reasoning capabilities of LLMs to drive autonomous agent learning. AgentEvolver introduces three synergistic mechanisms: (i) self-questioning, which enables curiosity-driven task generation in novel environments, reducing dependence on handcrafted datasets; (ii) self-navigating, which improves exploration efficiency through experience reuse and hybrid policy guidance; and (iii) self-attributing, which enhances sample efficiency by assigning differentiated rewards to trajectory states and actions based on their contribution. By integrating these mechanisms into a unified framework, AgentEvolver enables scalable, cost-effective, and continual improvement of agent capabilities. Preliminary experiments indicate that AgentEvolver achieves more efficient exploration, better sample utilization, and faster adaptation compared to traditional RL-based baselines.
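
To make the roles of the proxy functions F_task and F_reward and of step-wise credit assignment more concrete, the following is a minimal toy sketch, not the authors' implementation. Every name in it (`Step`, `f_task`, `f_reward`, `attribute`) is hypothetical, the "sandbox" is just a list of tool names, and the proportional split of the outcome reward is an assumed stand-in for the LLM-based attribution the paper describes.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    action: str          # tool call the agent made at this step
    contribution: float  # hypothetical per-step score, e.g. from an LLM judge


def f_task(sandbox_tools: List[str], n: int, rng: random.Random) -> List[str]:
    # Toy stand-in for F_task: propose training tasks from what the sandbox exposes.
    return [f"call {rng.choice(sandbox_tools)}" for _ in range(n)]


def f_reward(task: str, trajectory: List[Step]) -> float:
    # Toy stand-in for F_reward: 1.0 if some step performed the requested call.
    wanted = task.split(" ", 1)[1]
    return 1.0 if any(s.action == wanted for s in trajectory) else 0.0


def attribute(trajectory: List[Step], outcome: float) -> List[float]:
    # Assumed credit rule: split the outcome reward in proportion to each
    # step's contribution score; use an even split if all scores are zero.
    if not trajectory:
        return []
    total = sum(s.contribution for s in trajectory)
    if total == 0:
        return [outcome / len(trajectory)] * len(trajectory)
    return [outcome * s.contribution / total for s in trajectory]


if __name__ == "__main__":
    rng = random.Random(0)
    tasks = f_task(["search", "send_email"], n=2, rng=rng)
    traj = [Step("search", 0.2), Step("send_email", 0.8)]
    reward = f_reward(tasks[0], traj)
    print(tasks[0], reward, attribute(traj, reward))
```

The point of the sketch is only the separation of concerns: tasks and rewards are generated from the sandbox rather than drawn from the unknown target distribution, and a single trajectory-level reward is redistributed so that steps judged more useful receive more credit.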
