Self-evolving Agents with reflective and memory-augmented abilities

Xuechen Liang, Yangfan He, Yinghui Xia, Xinyuan Song, Jianhui Wang, Meiling Tao, Li Sun, Xinhang Yuan, Jiayi Su, Keqin Li, Jiaqi Chen, Jinsong Yang, Siyuan Chen, Tianyu Shi

TL;DR

The work tackles the difficulty of sustaining decision-making and long-term information retention in LLM-based agents. It introduces SAGE, a framework that combines iterative feedback, reflective self-assessment, and a MemorySyntax module that leverages the Ebbinghaus forgetting curve to manage a dual STM/LTM memory system. The authors formalize a three-agent interaction (User, Assistant, Checker) and prove convergence to stable strategies via a Nash-equilibrium perspective, while demonstrating significant empirical gains across AgentBench, long-context tasks, and RAG-based QA benchmarks, particularly for smaller models. The results imply practical enhancements in multi-task autonomy, memory efficiency, and error reduction, suggesting broad applicability to real-world agent systems.

Abstract

Large language models (LLMs) have made significant advances in natural language processing, but they still struggle with tasks such as continuous decision-making. In this research, we propose a novel framework that integrates iterative feedback, reflective mechanisms, and a memory optimization mechanism based on the Ebbinghaus forgetting curve. The framework significantly enhances agents' capabilities in handling multi-tasking and long-span information.
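To make the memory mechanism concrete, here is a minimal Python sketch of forgetting-curve-based memory management. It uses the common exponential form of the Ebbinghaus curve, R = exp(-t / S). The class names (`MemoryItem`, `DualMemory`), the two thresholds, and the rule "high retention goes to LTM, medium stays in STM, low is discarded" are illustrative assumptions, not the paper's confirmed implementation.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class MemoryItem:
    content: str
    strength: float    # hypothetical stability S, in the same time unit as `age`
    age: float = 0.0   # time elapsed since the item was stored


def retention(age: float, strength: float) -> float:
    """Exponential forgetting curve: R = exp(-age / strength)."""
    return math.exp(-age / strength)


@dataclass
class DualMemory:
    promote_at: float = 0.7    # assumed threshold: items at or above it go to LTM
    forget_below: float = 0.2  # assumed threshold: items below it are discarded
    stm: List[MemoryItem] = field(default_factory=list)
    ltm: List[MemoryItem] = field(default_factory=list)

    def add(self, item: MemoryItem) -> None:
        """New information always enters short-term memory first."""
        self.stm.append(item)

    def step(self, dt: float) -> None:
        """Advance time by dt and reassign every item by its retention."""
        next_stm: List[MemoryItem] = []
        next_ltm: List[MemoryItem] = []
        for item in self.stm + self.ltm:
            item.age += dt
            r = retention(item.age, item.strength)
            if r >= self.promote_at:
                next_ltm.append(item)
            elif r >= self.forget_below:
                next_stm.append(item)
            # Items with r < forget_below are dropped.
        self.stm, self.ltm = next_stm, next_ltm
```

In this sketch a larger `strength` means slower decay, so an item with `strength=10.0` outlives one with `strength=1.0` under the same elapsed time. A fuller version would presumably also reset `age` or raise `strength` when an item is recalled.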

Paper Structure

This paper contains 30 sections, 6 theorems, 47 equations, 4 figures, 6 tables.

Key Result

Theorem 3.1

Let $\mathcal{U}, \mathcal{A}, \mathcal{C}$ denote the compact, convex strategy spaces of the user (U), assistant (A), and checker (C), respectively. Assume that the utility functions are continuous in each player’s strategy. Then, by the Debreu-Glicksberg-Fan fixed-point theorem, there exists a Nash equilibrium. Furthermore, suppose that the assistant’s policy $\pi_\theta$ is updated via policy ...
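For reference, the textbook definition of a Nash equilibrium in this three-player setting can be written as follows. The strategy symbols $s_U, s_A, s_C$ and utility symbols $u_U, u_A, u_C$ are assumed notation introduced here for illustration; the excerpt above does not show the paper's own symbols. Note also that the standard statement of the Debreu-Glicksberg-Fan result requires each utility to be quasi-concave in the player's own strategy, in addition to continuity.

```latex
% Assumed notation: s_U, s_A, s_C are strategies; u_U, u_A, u_C are utilities.
\[
(s_U^*, s_A^*, s_C^*) \in \mathcal{U} \times \mathcal{A} \times \mathcal{C}
\quad \text{is a Nash equilibrium if} \quad
\begin{aligned}
u_U(s_U^*, s_A^*, s_C^*) &\ge u_U(s_U, s_A^*, s_C^*) && \forall\, s_U \in \mathcal{U},\\
u_A(s_U^*, s_A^*, s_C^*) &\ge u_A(s_U^*, s_A, s_C^*) && \forall\, s_A \in \mathcal{A},\\
u_C(s_U^*, s_A^*, s_C^*) &\ge u_C(s_U^*, s_A^*, s_C) && \forall\, s_C \in \mathcal{C}.
\end{aligned}
\]
```

In words, no single player can improve its own utility by deviating unilaterally while the other two keep their equilibrium strategies.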

Figures (4)

  • Figure 1: An illustration of the SAGE: a user provides a description and instance to the assistant with short-term (STM) and long-term (LTM) memory. The assistant performs observation, action, reflection, and output, which the checker reviews. The retention rate curve on the right illustrates memory decay over time, with a self-evolving loop guiding continued updates.
  • Figure 2: An example of the assistant's iterative workflow, including checker evaluation, prompt templates for feedback, and reflection processes integrating short-term and long-term memory.
  • Figure 3: Execution results across six tasks (CLE: Context Limit Exceeded, TLE: Task Limit Exceeded). Task limits are the main cause of incomplete tasks, highlighting LLM agents' limitations under time constraints.
  • Figure 4: An illustration of a HotpotQA example with SAGE.
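The assistant-checker loop described in the captions for Figures 1 and 2 can be sketched roughly as below. This is a toy illustration under stated assumptions: `run_feedback_loop`, the callable signatures, the `max_rounds` cap, and the two stand-in functions are hypothetical, and a real system would call an LLM where the stand-ins return fixed strings.

```python
from typing import Callable, List, Tuple

# Hypothetical signatures: the assistant sees the task plus earlier reflections,
# and the checker returns (accepted, feedback).
Assistant = Callable[[str, List[str]], str]
Checker = Callable[[str, str], Tuple[bool, str]]


def run_feedback_loop(
    task: str,
    assistant: Assistant,
    checker: Checker,
    max_rounds: int = 3,
) -> Tuple[str, List[str]]:
    """Iterate answer -> check -> reflect until accepted or out of rounds."""
    reflections: List[str] = []
    answer = ""
    for _ in range(max_rounds):
        answer = assistant(task, reflections)
        accepted, feedback = checker(task, answer)
        if accepted:
            break
        # Store the checker's feedback so the next attempt can use it.
        reflections.append(feedback)
    return answer, reflections


def toy_assistant(task: str, reflections: List[str]) -> str:
    """Stand-in for an LLM call: answers wrongly until it has feedback."""
    return "4" if reflections else "5"


def toy_checker(task: str, answer: str) -> Tuple[bool, str]:
    """Stand-in for the checker: accepts only the answer '4'."""
    if answer == "4":
        return True, ""
    return False, "The sum is wrong; recompute 2 + 2."
```

Calling `run_feedback_loop("What is 2 + 2?", toy_assistant, toy_checker)` takes two rounds with these stand-ins: the first answer is rejected, the feedback is recorded as a reflection, and the second answer is accepted.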

Theorems & Definitions (12)

  • Theorem 3.1: Theory for the multi-agent iterative feedback system
  • Lemma A.1
  • proof
  • Lemma A.2
  • proof
  • Lemma A.3
  • proof
  • proof
  • Lemma A.4
  • proof
  • ...and 2 more