MemFly: On-the-Fly Memory Optimization via Information Bottleneck

Zhenyuan Zhang; Xianzhang Jia; Zhiqin Yang; Zhenbo Song; Wei Xue; Sirui Han; Yike Guo

TL;DR

MemFly reframes agent memory as an online information bottleneck problem to compress history while preserving task-relevant evidence, tackling the trade-off between memory fidelity and compactness. It introduces a Note-Keyword-Topic memory hierarchy and a gradient-free, LLM-guided approach to online memory consolidation via merge, link, and append operations. Retrieval is performed through a tri-pathway hybrid mechanism (macro-semantic topics, micro-symbolic keywords, and topological expansion) with iterative evidence refinement to handle complex, multi-hop queries. Empirical results on LoCoMo show MemFly outperforming state-of-the-art baselines across diverse backbone models in memory coherence, response fidelity, and reasoning accuracy, demonstrating the approach’s robustness and scalability.
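The merge, link, and append operations mentioned above can be illustrated with a toy sketch. This is not the paper's implementation: MemFly uses an LLM-guided optimizer, whereas the snippet below substitutes a simple keyword Jaccard overlap as the redundancy score. The names `Note`, `jaccard`, `consolidate`, and the thresholds `merge_at` and `link_at` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Note:
    text: str
    keywords: set[str]
    links: set[int] = field(default_factory=set)  # indices of associated notes


def jaccard(a: set[str], b: set[str]) -> float:
    """Keyword overlap; a crude stand-in for an LLM-judged redundancy score."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def consolidate(memory: list[Note], new: Note,
                merge_at: float = 0.8, link_at: float = 0.3) -> str:
    """Apply one of merge / link / append and return the chosen operation.

    The thresholds are assumptions for this sketch, not values from the paper.
    """
    if memory:
        best = max(range(len(memory)),
                   key=lambda i: jaccard(memory[i].keywords, new.keywords))
        score = jaccard(memory[best].keywords, new.keywords)
        if score >= merge_at:
            # Near-duplicate: fold the new evidence into the existing note.
            memory[best].text += " " + new.text
            memory[best].keywords |= new.keywords
            return "merge"
        if score >= link_at:
            # Related but distinct: keep both notes and add an associative edge.
            memory.append(new)
            idx = len(memory) - 1
            memory[best].links.add(idx)
            new.links.add(best)
            return "link"
    # Nothing sufficiently related: store the note as a new entry.
    memory.append(new)
    return "append"
```

Calling `consolidate` once per incoming observation keeps the store compact (merge), preserves relations between distinct facts (link), and retains novel information (append). In a closer approximation of the paper's design, the Jaccard score would be replaced by an LLM judgment.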

Abstract

Long-term memory enables large language model agents to tackle complex tasks through historical interactions. However, existing frameworks encounter a fundamental dilemma between compressing redundant information efficiently and maintaining precise retrieval for downstream tasks. To bridge this gap, we propose MemFly, a framework grounded in information bottleneck principles that facilitates on-the-fly memory evolution for LLMs. Our approach minimizes compression entropy while maximizing relevance entropy via a gradient-free optimizer, constructing a stratified memory structure for efficient storage. To fully leverage MemFly, we develop a hybrid retrieval mechanism that seamlessly integrates semantic, symbolic, and topological pathways, incorporating iterative refinement to handle complex multi-hop queries. Comprehensive experiments demonstrate that MemFly substantially outperforms state-of-the-art baselines in memory coherence, response fidelity, and accuracy.
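For orientation, the classical information bottleneck objective that this family of methods builds on can be written as follows. The symbols are assumptions made here for illustration: $X$ stands for the raw interaction history, $T$ for the compressed memory representation, $Y$ for the task-relevant variable, and $\beta > 0$ for the trade-off weight. The paper's own notation and exact objective may differ.

```latex
\min_{p(t \mid x)} \; \mathcal{L}_{\mathrm{IB}}
  = \underbrace{I(X; T)}_{\text{compression}}
  \; - \; \beta \, \underbrace{I(T; Y)}_{\text{relevance}}
```

Minimizing $I(X;T)$ discards redundant detail from the history, while the $-\beta\, I(T;Y)$ term rewards retaining information that is predictive of the downstream task. A larger $\beta$ favors fidelity over compactness.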

Paper Structure (43 sections, 20 equations, 2 figures, 5 tables)

Introduction
Related Work
Retrieval-Centric Systems
Memory-Augmented Agents
The MemFly Framework
Problem Formulation
Notation.
The Optimization Objective.
Online Approximation via Greedy Agglomeration.
LLM as JS-Divergence Approximator.
Structural Prior
Layer 1: Notes $\mathcal{N}$ (Fidelity Layer).
Layer 2: Keywords $\mathcal{K}$ (Anchoring Layer).
Layer 3: Topics $\mathcal{T}$ (Navigation Layer).
Memory Construction
...and 28 more sections

Figures (2)

Figure 1: Overview of the MemFly framework. Left: Memory construction processes incoming observations through semantic ingestion and gated structural update, where an LLM-based optimizer performs Merge, Link, or Append operations to minimize the IB objective. Center: The memory state is organized as a stratified Note-Keyword-Topic hierarchy with associative edges following the double clustering principle. Right: Memory retrieval employs tri-pathway search via Topics, Keywords, and topological expansion, followed by iterative evidence refinement for complex queries.
Figure 2: Category-wise F1 scores (%) for ablation variants on LoCoMo (Qwen3-8B). (a) Ablations on memory construction components. (b) Ablations on retrieval pathways and iterative refinement.
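The tri-pathway search described in the Figure 1 caption (Topics, Keywords, topological expansion) can be sketched in a few lines. This is a minimal toy under stated assumptions, not the authors' retrieval code: term overlap stands in for semantic similarity on the topic pathway, iterative evidence refinement is omitted, and the function name `tri_pathway_retrieve` and all of its parameters are hypothetical.

```python
def tri_pathway_retrieve(query: set[str],
                         note_keywords: dict[int, set[str]],
                         topic_members: dict[str, set[int]],
                         topic_terms: dict[str, set[str]],
                         edges: dict[int, set[int]],
                         hops: int = 1) -> set[int]:
    """Return the ids of notes reached by any of the three pathways."""
    hits: set[int] = set()

    # Pathway 1 (macro): pick the topic whose descriptor overlaps the query most.
    # Term overlap is an assumed stand-in for embedding similarity.
    if topic_terms:
        best_topic = max(topic_terms, key=lambda t: len(topic_terms[t] & query))
        if topic_terms[best_topic] & query:
            hits |= topic_members.get(best_topic, set())

    # Pathway 2 (micro): notes that share at least one exact keyword with the query.
    hits |= {n for n, kws in note_keywords.items() if kws & query}

    # Pathway 3 (topological): expand along associative edges for `hops` steps.
    frontier = set(hits)
    for _ in range(hops):
        frontier = {m for n in frontier for m in edges.get(n, set())} - hits
        hits |= frontier
    return hits
```

Raising `hops` widens the topological expansion, which is one simple way to reach evidence that is only indirectly connected to the query, as multi-hop questions require.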

Theorems & Definitions (3)

Remark 3.1: Generality of the Framework
Remark 3.2: Information-Theoretic Interpretation of Link
Remark 3.3: Extension Beyond Classical AIB
