Bounded State in an Infinite Horizon: Proactive Hierarchical Memory for Ad-Hoc Recall over Streaming Dialogues

Bingbing Wang; Jing Li; Ruifeng Xu

TL;DR

ProStream is a proactive hierarchical memory framework for streaming dialogues. It supports ad-hoc memory recall on demand by reasoning over continuous streams with multi-granular distillation, and it maintains a bounded knowledge state that lowers inference latency without sacrificing reasoning fidelity.

Abstract

Real-world dialogue usually unfolds as an infinite stream, so memory mechanisms must keep a bounded state while operating over an infinite horizon. Existing read-then-think memory is fundamentally misaligned with this setting because it cannot support ad-hoc memory recall while the stream unfolds. To explore this challenge, we introduce STEM-Bench, the first benchmark for STreaming Evaluation of Memory. It comprises over 14K QA pairs in dialogue streams that assess perception fidelity, temporal reasoning, and global awareness under infinite-horizon constraints. A preliminary analysis on STEM-Bench reveals a critical fidelity-efficiency dilemma: retrieval-based methods reason over fragmented context, while full-context models incur unbounded latency. To resolve this, we propose ProStream, a proactive hierarchical memory framework for streaming dialogues. It enables ad-hoc memory recall on demand by reasoning over continuous streams with multi-granular distillation. It also employs Adaptive Spatiotemporal Optimization to dynamically optimize retention based on expected utility, which yields a bounded knowledge state and lower inference latency without sacrificing reasoning fidelity. Experiments show that ProStream outperforms baselines in both accuracy and efficiency.

Paper Structure (35 sections, 3 theorems, 8 equations, 8 figures, 3 tables, 1 algorithm)

Introduction
Related Work
Dialogue Benchmarks for Long-Term Memory.
Our STEM-Bench
Problem Formulation
STEM-Bench Construction
Data Construction Pipeline
Dataset Statistics
Evaluation Metric
Preliminary Analysis
The ProStream Framework
Proactive Semantic Stream Perception
Hierarchical Multi-Granular Distillation
Adaptive Spatiotemporal Optimization
The ProStream Policy.
...and 20 more sections

Key Result

Proposition 1.3

According to Anderson's Rational Analysis of Memory [anderson2013adaptive], the probability $P$ that a memory trace $v$ is needed depends on its usage history and its current context. By setting $u_{v,t}$ to a linear combination of frequency (History) and temporal proximity (Context), ProStream maximizes the Expected Recall Probability under a strict resource constraint.
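The idea behind the proposition can be illustrated with a minimal sketch: score each memory trace by a linear combination of a frequency term and a temporal-proximity term, then keep only the highest-scoring traces that fit a fixed budget. Everything specific here is an assumption for illustration and is not taken from the paper: the `Trace` fields, the log-frequency and exponential-decay forms, the weights `alpha`, `beta`, and `decay`, and the simple top-k selection, which stands in for whatever retention procedure ProStream actually uses.

```python
import heapq
import math
from dataclasses import dataclass


@dataclass
class Trace:
    """A hypothetical memory trace (field names are illustrative only)."""
    key: str
    hits: int       # how many times the trace has been used (History)
    last_turn: int  # dialogue turn at which it was last touched (Context)


def utility(trace: Trace, now: int, alpha: float = 0.5,
            beta: float = 0.5, decay: float = 0.1) -> float:
    """Linear combination of a frequency term and a temporal-proximity term.

    The log and exponential forms are assumed, not the paper's definitions.
    """
    frequency = math.log1p(trace.hits)
    proximity = math.exp(-decay * (now - trace.last_turn))
    return alpha * frequency + beta * proximity


def retain(traces: list[Trace], now: int, budget: int) -> list[Trace]:
    """Keep at most `budget` traces, highest utility first."""
    return heapq.nlargest(budget, traces, key=lambda t: utility(t, now))


if __name__ == "__main__":
    memory = [
        Trace("a", hits=5, last_turn=9),   # frequent and recent
        Trace("b", hits=1, last_turn=0),   # rare and old
        Trace("c", hits=1, last_turn=10),  # rare but just seen
    ]
    kept = retain(memory, now=10, budget=2)
    print([t.key for t in kept])
```

Because the retained set never exceeds `budget`, the state stays bounded no matter how long the stream runs; the weights control how strongly recency is traded against frequency.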

Figures (8)

Figure 1: Comparison of the read-then-think paradigm (left) and the streaming memory paradigm (right) based on TBBT dialogues.
Figure 2: Overview of STEM-Bench Benchmark Curation. (a) Taxonomy of cognitive dimensions and tasks, where QA pairs are categorized by cognitive challenges (HFP, SLR, DGA) and distinct task types. (b) Data construction pipeline of the STEM-Bench dataset.
Figure 3: Preliminary analysis of RAG and full-context performance. (Top) The average accuracy of all performance metrics across evidence distances. (Bottom) Inference latency over dialogue turns. The dashed lines indicate the overall average results.
Figure 4: Overview of our ProStream framework with four components, discussed in turn from the buffering subsection to the reasoning subsection.
Figure 5: Average performance metrics (left) and latency (right) of varying LLM backbones. The numbers above are the percentage improvement of ProStream over the Full-Context baseline.
...and 3 more figures

Theorems & Definitions (7)

Definition 1.1: Memory State and Budget
Definition 1.2: The Optimization Objective
Proposition 1.3: Connection to Rational Analysis of Memory
Theorem 1.4: Approximation Ratio
proof
Theorem 1.5: Bounded Time Complexity
proof
