MemGen: Weaving Generative Latent Memory for Self-Evolving Agents

Guibin Zhang, Muxin Fu, Shuicheng Yan

TL;DR

MemGen proposes a dynamic latent-memory system that interleaves memory generation with reasoning in LLM agents, using a memory trigger to decide when to invoke a memory weaver that creates machine-native latent tokens. Trained via RL or supervised fine-tuning, the weaver augments the reasoning process without modifying the frozen backbone, and can integrate external retrieval when desired. Across eight benchmarks, MemGen outperforms parametric and retrieval-based memory baselines, demonstrates strong cross-domain generalization, and shows emergent human-like memory faculties such as planning, procedural, and working memory. The framework also exhibits continual learning benefits and maintains efficiency, indicating a promising direction toward self-evolving, cognitive-like AI systems. These results underscore the value of generative latent memory as a core component of future intelligent agents.

Abstract

Agent memory shapes how Large Language Model (LLM)-powered agents, akin to the human brain, progressively refine themselves through environment interactions. Existing paradigms remain constrained: parametric memory forcibly adjusts model parameters, and retrieval-based memory externalizes experience into structured databases, yet neither captures the fluid interweaving of reasoning and memory that underlies human cognition. To address this gap, we propose MemGen, a dynamic generative memory framework that equips agents with a human-esque cognitive faculty. It consists of a memory trigger, which monitors the agent's reasoning state to decide explicit memory invocation, and a memory weaver, which takes the agent's current state as stimulus to construct a latent token sequence as machine-native memory to enrich its reasoning. In this way, MemGen enables agents to recall and augment latent memory throughout reasoning, producing a tightly interwoven cycle of memory and cognition. Extensive experiments across eight benchmarks show that MemGen surpasses leading external memory systems such as ExpeL and AWM by up to $38.22\%$, exceeds GRPO by up to $13.44\%$, and exhibits strong cross-domain generalization ability. More importantly, we find that without explicit supervision, MemGen spontaneously evolves distinct human-like memory faculties, including planning memory, procedural memory, and working memory, suggesting an emergent trajectory toward more naturalistic forms of machine cognition.
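
The trigger-and-weaver cycle described in the abstract can be pictured as a control loop. The following is a minimal sketch under stated assumptions, not the authors' implementation: hidden states are plain lists of floats, and `generate_with_memory`, `toy_step`, `toy_trigger`, and `toy_weaver` are hypothetical names invented for illustration. In the paper both modules are learned; here they are fixed toy rules, so only the interleaving pattern is shown.

```python
from typing import Callable, List, Tuple

Vector = List[float]  # stand-in for one hidden state or one latent token


def generate_with_memory(
    prompt_states: List[Vector],
    step: Callable[[List[Vector]], Vector],
    trigger: Callable[[List[Vector]], bool],
    weaver: Callable[[List[Vector]], List[Vector]],
    max_steps: int,
) -> Tuple[List[Vector], int]:
    """Interleave reasoning steps with latent-memory insertion.

    `step` plays the role of the frozen backbone; it is only called,
    never modified. `trigger` and `weaver` are the memory modules.
    """
    context = list(prompt_states)
    invocations = 0
    for _ in range(max_steps):
        # The trigger inspects the current reasoning state.
        if trigger(context):
            # The weaver turns that state into latent memory tokens,
            # which are appended to the context before reasoning resumes.
            context.extend(weaver(context))
            invocations += 1
        context.append(step(context))
    return context, invocations


# Toy components (assumptions for illustration only).
def toy_step(context: List[Vector]) -> Vector:
    # The "reasoning state" drifts downward by 1 at every step.
    return [context[-1][0] - 1.0]


def toy_trigger(context: List[Vector]) -> bool:
    # Invoke memory once the latest state falls below zero.
    return context[-1][0] < 0.0


def toy_weaver(context: List[Vector]) -> List[Vector]:
    # Emit two fixed latent tokens; a real weaver would condition on context.
    return [[2.0], [2.0]]


if __name__ == "__main__":
    states, calls = generate_with_memory(
        [[1.0]], toy_step, toy_trigger, toy_weaver, max_steps=5
    )
    print(calls, [s[0] for s in states])
```

In this toy run the trigger fires once, at the third step, and the two latent tokens are spliced into the context before the next reasoning step. Keeping `step` as an opaque callable mirrors the claim that the backbone stays frozen while the memory modules carry the adaptation.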

Paper Structure

This paper contains 64 sections, 22 equations, 11 figures, 8 tables.

Figures (11)

  • Figure 1: The comparison among parametric memory, retrieval-based memory and MemGen. We drew inspiration from the layout presented in Figure 1 of li2025seekdarkreasoningtesttime.
  • Figure 2: The overview of our proposed MemGen.
  • Figure 3: The generalization study of MemGen. We train MemGen$_\mathsf{SFT}$ on one dataset (ALFWorld or TriviaQA) and evaluate it on four datasets (TriviaQA, ALFWorld, ScienceWorld, and FEVER).
  • Figure 4: Memory invocation frequency across benchmarks at inference (trained on MemGen$_\mathsf{SFT}$ + Qwen3-8B + GSM8K).
  • Figure 5: (Left) t-SNE visualization of latent memories generated by MemGen + Qwen3-8B across datasets; (Middle and Right) Latent memory visualization within the TriviaQA and GSM8K datasets, clustered using $K$-means. The text at each cluster center represents the common pattern shared by many memory sequences in the cluster, such as Cluster 0 in GSM8K, where many sequences end with "_check".
  • ...and 6 more figures