NextMem: Towards Latent Factual Memory for LLM-based Agents

Zeyu Zhang, Rui Li, Xiaoyan Zhao, Yang Zhang, Wenjie Wang, Xu Chen, Tat-Seng Chua

Abstract

Memory is critical for LLM-based agents to preserve past observations for future decision-making, and factual memory serves as its foundation. However, existing approaches to constructing factual memory face several limitations. Textual methods impose heavy context and indexing burdens, while parametric methods suffer from catastrophic forgetting and high costs. To address these challenges, we introduce NextMem, a latent factual memory framework that utilizes an autoregressive autoencoder to efficiently construct latent memory while ensuring accurate reconstruction. For better optimization, we propose a two-stage training process consisting of autoregressive reconstruction alignment and progressive latent substitution. We also incorporate quantization to reduce storage overhead. Extensive experiments demonstrate that NextMem achieves superior performance and excels in retrieval, robustness, and extensibility. We release our code and model checkpoints at https://github.com/nuster1128/NextMem.

Paper Structure

This paper contains 33 sections, 24 equations, 7 figures, and 4 tables.

Figures (7)

  • Figure 1: Comparison between task-oriented and factual memory.
  • Figure 2: Overview of NextMem framework.
  • Figure 3: Results under varying compression ratios.
  • Figure 4: Robustness results of latent representations under varying levels of Gaussian noise ($\sigma$) and NF4 quantization.
  • Figure 5: Semantic assignment analysis of latent memory.
  • ...and 2 more figures