Evoking User Memory: Personalizing LLM via Recollection-Familiarity Adaptive Retrieval

Yingyi Zhang; Junyi Li; Wenlin Zhang; Penyue Jia; Xianneng Li; Yichao Wang; Derong Xu; Yi Wen; Huifeng Guo; Yong Liu; Xiangyu Zhao

TL;DR

RF-Mem (Recollection-Familiarity Memory Retrieval), a familiarity uncertainty-guided dual-path memory retriever, embeds human-like dual-process recognition into the retriever, avoiding full-context overhead and enabling scalable, adaptive personalization.

Abstract

Personalized large language models (LLMs) rely on memory retrieval to incorporate user-specific histories, preferences, and contexts. Existing approaches either overload the LLM by feeding all the user's past memory into the prompt, which is costly and unscalable, or simplify retrieval into a one-shot similarity search, which captures only surface matches. Cognitive science, however, shows that human memory operates through a dual process: Familiarity, offering fast but coarse recognition, and Recollection, enabling deliberate, chain-like reconstruction for deeply recovering episodic content. Current systems lack both the ability to perform recollection retrieval and mechanisms to adaptively switch between the dual retrieval paths, leading to either insufficient recall or the inclusion of noise. To address this, we propose RF-Mem (Recollection-Familiarity Memory Retrieval), a familiarity uncertainty-guided dual-path memory retriever. RF-Mem measures the familiarity signal through the mean score and entropy. High familiarity leads to the direct top-K Familiarity retrieval path, while low familiarity activates the Recollection path. In the Recollection path, the system clusters candidate memories and applies alpha-mix with the query to iteratively expand evidence in embedding space, simulating deliberate contextual reconstruction. This design embeds human-like dual-process recognition into the retriever, avoiding full-context overhead and enabling scalable, adaptive personalization. Experiments across three benchmarks and corpus scales demonstrate that RF-Mem consistently outperforms both one-shot retrieval and full-context reasoning under fixed budget and latency constraints. Our code can be found in the Reproducibility Statement.
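The gating and expansion steps described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the thresholds `tau_mean` and `tau_entropy`, the softmax `temperature`, the spherical k-means clustering, and the exact form of the alpha-mix (a convex combination of the query and a cluster centroid) are all hypothetical choices made for this example.

```python
import numpy as np

def familiarity_signal(query, memory, k=5, temperature=10.0):
    """Probe retrieval: top-k indices, mean score, and normalized entropy."""
    sims = memory @ query                    # cosine similarity for unit-norm rows
    top = np.argsort(-sims)[:k]
    s = sims[top]
    logits = temperature * s                 # assumed softmax form over probe scores
    p = np.exp(logits - logits.max())
    p /= p.sum()
    entropy = -np.sum(p * np.log(p + 1e-12)) / np.log(max(len(top), 2))
    return top, float(s.mean()), float(entropy)

def spherical_kmeans(points, n_clusters, iters=10):
    """Tiny deterministic k-means on unit vectors (assumed clustering choice)."""
    centroids = points[:n_clusters].copy()
    for _ in range(iters):
        labels = np.argmax(points @ centroids.T, axis=1)
        for c in range(len(centroids)):
            members = points[labels == c]
            if len(members):
                m = members.mean(axis=0)
                centroids[c] = m / max(np.linalg.norm(m), 1e-12)
    return centroids

def recollect(query, memory, rounds=2, fanout=10, n_clusters=2, alpha=0.5, budget=5):
    """Iteratively expand evidence by mixing the query with cluster centroids."""
    probes, found = [query], {}              # found: memory index -> best probe score
    for _ in range(rounds):
        next_probes = []
        for probe in probes:
            sims = memory @ probe
            cand = np.argsort(-sims)[:fanout]
            for i in cand:
                found[int(i)] = max(found.get(int(i), -np.inf), float(sims[i]))
            for c in spherical_kmeans(memory[cand], n_clusters):
                mixed = alpha * query + (1.0 - alpha) * c   # assumed alpha-mix
                next_probes.append(mixed / max(np.linalg.norm(mixed), 1e-12))
        probes = next_probes
    return sorted(found, key=found.get, reverse=True)[:budget]

def rf_retrieve(query, memory, k=5, tau_mean=0.5, tau_entropy=0.9, **kwargs):
    """Gate on the familiarity signal, then take one of the two paths."""
    top, mean_score, entropy = familiarity_signal(query, memory, k)
    if mean_score >= tau_mean and entropy <= tau_entropy:
        return "familiarity", [int(i) for i in top]
    return "recollection", recollect(query, memory, **kwargs)
```

Here a confident probe (high mean score, low entropy) returns the top-K results directly, while a flat or weak score distribution triggers the iterative path. How the paper combines the two signals into a single gate is not specified in this excerpt, so the conjunction of two thresholds above is only one possible reading.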

Paper Structure (52 sections, 6 theorems, 22 equations, 19 figures, 17 tables, 3 algorithms)

Introduction
Method: Recollection–Familiarity Memory Retrieval
Familiarity Uncertainty-Driven Retrieval Selection
Familiarity Retrieval
Recollection Retrieval
Experiments
Experimental Setup
Overall performance in Personalized Generation
Overall performance in Personalized Retrieval
Adaptive Experiment
Adaptive to Index Building Method
Adaptive to Query Expansion Method
Adaptive to Iterative RAG Method
Related Works
Conclusion
...and 37 more sections

Key Result

Lemma 1

Assume the similarity distribution admits a monotone likelihood ratio in $s_i$ between relevant and nonrelevant items, and that the probe softmax temperature $\lambda$ is fixed. Then $\mathcal{E}(\text{Familiarity}\mid q)$ is nonincreasing in $\bar{s}$ and nondecreasing in $H(p)$, while $C(\text{Fam}\ldots$
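
The excerpt does not define the symbols in Lemma 1. One plausible reading, stated here as an assumption rather than as the paper's definition, takes the probe similarities $s_1, \dots, s_K$ of the top-$K$ candidates and sets

$$\bar{s} = \frac{1}{K}\sum_{i=1}^{K} s_i, \qquad p_i = \frac{\exp(\lambda s_i)}{\sum_{j=1}^{K}\exp(\lambda s_j)}, \qquad H(p) = -\sum_{i=1}^{K} p_i \log p_i.$$

Whether $\lambda$ multiplies or divides the scores is also an assumption. As a worked example under this reading with $K = 2$ and $\lambda = 10$: scores $(0.9, 0.1)$ give $p \approx (0.9997, 0.0003)$ and $H(p) \approx 0.003$ nats, a peaked distribution that favors the Familiarity path, whereas scores $(0.5, 0.5)$ give the uniform $p = (0.5, 0.5)$ and $H(p) = \log 2 \approx 0.693$ nats, the maximum for two items.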

Figures (19)

Figure 1: Comparison between standard familiarity-based retrieval and recollection-based retrieval in user health narratives. The brain illustration is motivated by rugg2007event and yonelinas2024role.
Figure 2: The overall architecture of RF-Mem. A dual-process memory retrieval system dynamically switches between the Familiarity and the Recollection paths.
Figure 3: Illustration of the adaptive study setup. Offline indexes (e.g., MemoryBank or the original memory) provide different storage, while RF-Mem serves as an online retrieval layer that adapts to them.
Figure 4: Illustration of the adaptive study setup. Nearline query expansion (e.g., HyDE) enriches the query representation, and RF-Mem operates as the online retrieval layer.
Figure 5: Illustration of the adaptive study setup. Iterative RAG (e.g., Search-o1) provides multi-turn retrieval for answer generation, while RF-Mem serves as the retrieval layer that adapts to it.
...and 14 more figures

Theorems & Definitions (6)

Lemma 1: Monotonicity of proxy signals
Theorem 1: Threshold optimality within monotone policies
Lemma 2: Entropy certificate
Proposition 1: Bound on familiarity error under low entropy
Proposition 2: Gating error bound via concentration
Proposition 3: Complexity bounds
