Towards Autonomous Memory Agents

Xinle Wu, Rui Zhang, Mustafa Anis Hussain, Yao Lu

TL;DR

U-Mem proposes autonomous memory agents that actively acquire, validate, and curate knowledge at minimal cost, and can surpass RL-based optimization on both verifiable and non-verifiable benchmarks.

Abstract

Recent memory agents improve LLMs by extracting experiences and conversation history into external storage. This enables low-overhead context assembly and online memory updates without expensive LLM training. However, existing solutions remain passive and reactive: memory growth is bounded by whatever information happens to be available, and memory agents seldom seek external input when they are uncertain. We propose autonomous memory agents that actively acquire, validate, and curate knowledge at minimal cost. U-Mem realizes this idea via (i) a cost-aware knowledge-extraction cascade that escalates from cheap self/teacher signals to tool-verified research and, only when needed, expert feedback, and (ii) semantic-aware Thompson sampling to balance exploration and exploitation over memories and mitigate cold-start bias. On both verifiable and non-verifiable benchmarks, U-Mem consistently beats prior memory baselines and can surpass RL-based optimization, improving HotpotQA (Qwen2.5-7B) by 14.6 points and AIME25 (Gemini-2.5-flash) by 7.33 points.
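
The abstract does not say how U-Mem implements its semantic-aware Thompson sampling, so the following is only a generic sketch of Thompson sampling over memory entries, not the authors' method. It assumes each memory keeps a Beta posterior over whether it helps, and that the sampled usefulness is weighted by a query-to-memory similarity score. The names `MemoryArm` and `select_memory`, the Beta(1, 1) prior, and the multiplicative similarity weighting are all hypothetical choices made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class MemoryArm:
    """One memory entry with a Beta(alpha, beta) posterior over its usefulness."""
    text: str
    alpha: float = 1.0  # 1 + number of times the memory helped (uniform prior)
    beta: float = 1.0   # 1 + number of times the memory did not help

    def sample(self, rng: random.Random) -> float:
        # Draw a plausible usefulness value from the current posterior.
        return rng.betavariate(self.alpha, self.beta)

    def update(self, helped: bool) -> None:
        # Bayesian update of the Beta posterior after one observed outcome.
        if helped:
            self.alpha += 1.0
        else:
            self.beta += 1.0


def select_memory(arms, similarities, rng):
    """Return the index of the memory with the highest similarity-weighted draw.

    `similarities` holds one non-negative score per memory (assumed to come
    from some embedding model, which is outside this sketch).
    """
    scores = [sim * arm.sample(rng) for arm, sim in zip(arms, similarities)]
    return max(range(len(arms)), key=scores.__getitem__)


# Usage: a new memory starts at Beta(1, 1), so its draws are spread widely
# and it still gets tried even though it has no track record yet.
rng = random.Random(0)
arms = [MemoryArm("cached proof strategy"), MemoryArm("new unverified hint")]
chosen = select_memory(arms, similarities=[0.8, 0.7], rng=rng)
arms[chosen].update(helped=True)
```

Because untested memories have wide posteriors, they are occasionally sampled above well-established ones, which is the usual way Thompson sampling trades exploration against exploitation and one plausible reading of how it could ease cold-start bias.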

Paper Structure (18 sections, 7 equations, 12 figures, 5 tables)

Figures (12)

  • Figure 1: Passive vs. autonomous memory agents.
  • Figure 2: Overview of U-Mem.
  • Figure 3: Scaling trend of U-Mem on HotpotQA.
  • Figure 4: Correlation between task similarity and memory benefit. Each point denotes a task; the x-axis is AMCS (task similarity), and the y-axis is U-Mem’s performance gain over the base model. Pearson correlation: r = 0.888.
  • Figure 5: Comparison of average token usage across U-Mem, ReasoningBank, ReMe, MemRL, and No Mem.
  • ...and 7 more figures