Mnemis: Dual-Route Retrieval on Hierarchical Graphs for Long-Term LLM Memory

Zihao Tang, Xin Yu, Ziyu Xiao, Zengxuan Wen, Zelin Li, Jiaxi Zhou, Hualei Wang, Haohua Wang, Haizhen Huang, Weiwei Deng, Feng Sun, Qi Zhang

TL;DR

The paper addresses the challenge of long-horizon memory for LLMs by proposing Mnemis, a dual-route retrieval framework that fuses System-1 similarity search on a refined base graph with System-2 global selection over a hierarchical graph. The base graph stores granular memory components (Episodes, Entities, Edges, Episodic Edges) while the hierarchical graph enables top-down, layer-wise reasoning through Category Nodes and Category Edges under principled constraints. Through embedding+BM25 retrieval, RRF re-ranking, and a top-down global search, Mnemis achieves state-of-the-art performance on LoCoMo (93.9) and LongMemEval-S (91.6) with GPT-4.1-mini, outperforming RAG, Graph-RAG, and several memory baselines. The work demonstrates that combining complementary retrieval routes yields superior coverage and structural relevance for long-term memory, with practical implications for persistent AI agents and memory-intensive tasks; future work includes multimodal extensions and more flexible global-traversal planning.
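The RRF re-ranking step mentioned above can be illustrated with a minimal sketch of standard reciprocal rank fusion, which merges the ranked lists from embedding search and BM25 into one ordering. This is not the authors' implementation: the function name, the item ids, and the smoothing constant `k=60` (a common default) are assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of item ids into a single ranking.

    Each item scores sum(1 / (k + rank)) over the lists that contain it,
    with rank starting at 1. k=60 is a commonly used default and is an
    assumption here, not a value taken from the paper.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item in enumerate(ranking, start=1):
            scores[item] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by item id for determinism.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical ids returned by the two retrievers for one query.
embedding_hits = ["e3", "e1", "e7"]
bm25_hits = ["e1", "e9", "e3"]

fused = reciprocal_rank_fusion([embedding_hits, bm25_hits])
print(fused)
```

Items that appear near the top of both lists (here `e1` and `e3`) rise above items found by only one retriever, which is the usual reason for fusing ranks rather than comparing raw scores from different retrievers.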

Abstract

AI memory, specifically how models organize and retrieve historical messages, is becoming increasingly valuable to Large Language Models (LLMs), yet existing methods (RAG and Graph-RAG) primarily retrieve memory through similarity-based mechanisms. While efficient, such System-1-style retrieval struggles with scenarios that require global reasoning or comprehensive coverage of all relevant information. In this work, we propose Mnemis, a novel memory framework that integrates System-1 similarity search with a complementary System-2 mechanism, termed Global Selection. Mnemis organizes memory into a base graph for similarity retrieval and a hierarchical graph that enables top-down, deliberate traversal over semantic hierarchies. By combining the complementary strengths of both retrieval routes, Mnemis retrieves memory items that are both semantically and structurally relevant. Mnemis achieves state-of-the-art performance across all compared methods on long-term memory benchmarks, scoring 93.9 on LoCoMo and 91.6 on LongMemEval-S using GPT-4.1-mini.
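The top-down traversal described above can be pictured with a small sketch. It is a simplified illustration, not the paper's algorithm: the dictionary layout, the function name `global_selection`, and the `is_relevant` callback (a stand-in for whatever selector decides which nodes to keep at each layer) are all assumptions. The "Sports Events" category and the tournament item echo the paper's Figure 5 example; the other node names are invented for the demo.

```python
def global_selection(hierarchy, roots, is_relevant):
    """Walk a category hierarchy top-down, one layer at a time.

    hierarchy maps a category name to its list of children; any name not
    present as a key is treated as a leaf memory item. The hierarchy is
    assumed to be acyclic. Only nodes accepted by is_relevant are expanded
    or returned, so irrelevant branches are pruned early.
    """
    selected = []
    frontier = [node for node in roots if is_relevant(node)]
    while frontier:
        next_frontier = []
        for node in frontier:
            children = hierarchy.get(node)
            if children is None:
                selected.append(node)  # leaf: a retrievable memory item
            else:
                next_frontier.extend(c for c in children if is_relevant(c))
        frontier = next_frontier
    return selected


# Toy hierarchy; only "Sports Events" and the tournament come from the paper.
hierarchy = {
    "Events": ["Sports Events", "Work Meetings"],
    "Sports Events": ["Company's Annual Charity Soccer Tournament"],
    "Work Meetings": ["Quarterly Planning Review"],
}
keep = {"Events", "Sports Events", "Company's Annual Charity Soccer Tournament"}

result = global_selection(hierarchy, ["Events"], lambda name: name in keep)
print(result)
```

Because selection happens per layer, an item can be reached through its category even when its own text shares little surface similarity with the query, which is the kind of coverage the similarity route alone may miss.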

Paper Structure (18 sections, 5 figures, 9 tables)

Figures (5)

  • Figure 1: Framework of Mnemis together with the workflow of base graph ingestion, hierarchical graph ingestion and search. Left is a real case from LoCoMo.
  • Figure 2: Mnemis Hierarchical Graph Overview.
  • Figure 3: Mnemis win case on the LoCoMo benchmark. While similarity search fixates on the surface-level cause, gastritis, Mnemis successfully identifies the underlying root cause, namely that Sam is overweight. This condition leads to the gastritis and motivates him to change his lifestyle.
  • Figure 4: LoCoMo result across different top-$k$ settings.
  • Figure 5: Mnemis win case on the LongMemEval-S benchmark, where Mnemis successfully retrieves "Company's Annual Charity Soccer Tournament" from "Sports Events".