
AssoMem: Scalable Memory QA with Multi-Signal Associative Retrieval

Kai Zhang, Xinyuan Zhang, Ejaz Ahmed, Hongda Jiang, Caleb Kumar, Kai Sun, Zhaojiang Lin, Sanat Sharma, Shereen Oraby, Aaron Colak, Ahmed Aly, Anuj Kumar, Xiaozhong Liu, Xin Luna Dong

TL;DR

AssoMem tackles memory recall in large-scale, similarity-dense memory stores by introducing an associative memory graph that links utterances to automatically extracted clues. It integrates relevance, importance, and temporal signals through a mutual-information–driven fusion in the RITRanker, and further enhances QA generation with targeted fine-tuning. The framework is validated on LongMemEval and MeetingQA, showing consistent SOTA improvements across retrieval and generation, with robust gains across memory sizes and diverse question types. The work demonstrates the practical impact of structuring memories associatively to enable context-aware, scalable memory recall for memory-augmented AI systems.
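The associative memory graph described above can be pictured as a bipartite mapping between clues and the utterances anchored to them. The following is a minimal sketch under stated assumptions, not the authors' implementation: clues are assumed to be already extracted and passed in as plain strings, and the names `build_clue_graph` and `importance_scores`, along with the neighbor-count importance heuristic, are hypothetical choices made only for illustration.

```python
from collections import defaultdict


def build_clue_graph(utterance_clues):
    """Map each clue to the set of utterance ids anchored to it."""
    clue_to_utts = defaultdict(set)
    for utt_id, clues in utterance_clues.items():
        for clue in clues:
            clue_to_utts[clue].add(utt_id)
    return clue_to_utts


def importance_scores(utterance_clues, clue_to_utts):
    """Hypothetical importance heuristic (an assumption, not from the paper):
    count the other utterances that share at least one clue, then scale
    the counts to [0, 1] by dividing by the largest count."""
    raw = {}
    for utt_id, clues in utterance_clues.items():
        neighbors = set()
        for clue in clues:
            neighbors |= clue_to_utts[clue]
        neighbors.discard(utt_id)  # an utterance is not its own neighbor
        raw[utt_id] = len(neighbors)
    top = max(raw.values(), default=0)
    return {u: (v / top if top else 0.0) for u, v in raw.items()}


# Toy memory: utterance id -> clues assumed to be extracted upstream.
utterance_clues = {
    "u1": ["budget", "q3"],
    "u2": ["budget"],
    "u3": ["hiring"],
}
graph = build_clue_graph(utterance_clues)
scores = importance_scores(utterance_clues, graph)
```

In this toy memory, `u1` and `u2` are linked through the shared clue `budget`, while `u3` is isolated and receives the lowest importance. Any graph-based centrality measure could stand in for the neighbor count.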

Abstract

Accurate recall from large-scale memories remains a core challenge for memory-augmented AI assistants performing question answering (QA), especially in similarity-dense scenarios where existing methods rely mainly on semantic distance to the query for retrieval. Inspired by how humans link information associatively, we propose AssoMem, a novel framework that constructs an associative memory graph anchoring dialogue utterances to automatically extracted clues. This structure provides a rich organizational view of the conversational context and facilitates importance-aware ranking. Further, AssoMem integrates multi-dimensional retrieval signals (relevance, importance, and temporal alignment) using an adaptive mutual information (MI)-driven fusion strategy. Extensive experiments across three benchmarks and a newly introduced dataset, MeetingQA, demonstrate that AssoMem consistently outperforms SOTA baselines, verifying its superiority in context-aware memory recall.
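To make the idea of MI-driven fusion concrete, the sketch below weights each retrieval signal by its empirical mutual information with binary relevance labels on a small labeled sample, then combines the signals linearly. This is one plausible reading, offered as an assumption: the abstract does not specify how MI is estimated or applied, and the equal-width binning, the linear combination, and the function names `mutual_information`, `discretize`, `mi_weights`, and `fuse` are all hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) of two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log(p(x,y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def discretize(scores, bins=4):
    """Bucket scores in [0, 1] into equal-width bins (assumed scheme)."""
    return [min(int(s * bins), bins - 1) for s in scores]


def mi_weights(signals, labels, bins=4):
    """Weight each signal by its MI with the labels, normalized to sum to 1."""
    mi = {name: mutual_information(discretize(vals, bins), labels)
          for name, vals in signals.items()}
    total = sum(mi.values())
    if total == 0:
        # No signal is informative on this sample: fall back to uniform weights.
        return {name: 1.0 / len(mi) for name in mi}
    return {name: v / total for name, v in mi.items()}


def fuse(candidate, weights):
    """Linear fusion of one candidate's per-signal scores."""
    return sum(weights[name] * candidate[name] for name in weights)


# Toy labeled sample: four candidates, label 1 means "answers the query".
signals = {
    "relevance":  [0.9, 0.8, 0.1, 0.2],
    "importance": [0.5, 0.5, 0.5, 0.5],
    "temporal":   [0.9, 0.1, 0.9, 0.1],
}
labels = [1, 1, 0, 0]
weights = mi_weights(signals, labels)
fused = fuse({"relevance": 0.7, "importance": 0.2, "temporal": 0.4}, weights)
```

On this toy sample only the relevance signal separates the labels, so it receives all of the weight. With real data the weights would shift toward whichever signals are most informative for the query at hand, which is the intuition behind an adaptive fusion.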

Paper Structure

This paper contains 31 sections, 4 equations, 4 figures, 12 tables.

Figures (4)

  • Figure 1: An example showing the limitations of relevance-only retrieval. Our AssoMem consistently outperforms SOTA baselines on three datasets.
  • Figure 2: Overview of the proposed AssoMem framework. A topic–utterance graph is constructed from historical dialogues, enabling the integration of relevance, importance, and temporal signals. These are adaptively fused to guide accurate memory retrieval for question answering.
  • Figure 3: Radar chart showing performance across different question types.
  • Figure 4: Supplementary results