Hierarchical Memory for High-Efficiency Long-Term Reasoning in LLM Agents

Haoran Sun, Shaoning Zeng

TL;DR

Problem: Long-term reasoning in LLM agents is hampered by inefficient memory organization and retrieval.

Approach: The authors propose H-MEM, a four-layer hierarchical memory with position-index routing to enable structured, layer-wise memory retrieval and grounding.

Contributions: Four-layer storage (Domain, Category, Memory Trace, Episode), memory update via user feedback, top-down retrieval with FAISS, and extensive evaluation on LoCoMo showing consistent improvements over baselines and strong efficiency.

Significance: The method improves long-term dialogue reasoning and scalability, with potential extensions to multimodal memory.

Abstract

Long-term memory is one of the key factors influencing the reasoning capabilities of Large Language Model Agents (LLM Agents). Incorporating a memory mechanism that effectively integrates past interactions can significantly enhance the decision-making and contextual coherence of LLM Agents. While recent work has made progress in memory storage and retrieval, such as encoding memory into dense vectors for similarity-based search or organizing knowledge as graphs, these approaches often fall short in structured memory organization and efficient retrieval. To address these limitations, we propose a Hierarchical Memory (H-MEM) architecture for LLM Agents that organizes and updates memory in a multi-level fashion based on the degree of semantic abstraction. Each memory vector carries a positional index encoding that points to its semantically related sub-memories in the next layer. During the reasoning phase, an index-based routing mechanism enables efficient, layer-by-layer retrieval without performing exhaustive similarity computations. We evaluate our method on five task settings from the LoCoMo dataset. Experimental results show that our approach consistently outperforms five baseline methods, demonstrating its effectiveness in long-term dialogue scenarios.
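The index-based routing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `MemoryNode` class, the `retrieve` function, the toy vectors, and the memory texts are all hypothetical, and plain cosine similarity in pure Python stands in for the FAISS search mentioned in the TL;DR. Each node stores the positions of its children in the next layer, so only the children of the selected nodes are scored at each step.

```python
from dataclasses import dataclass, field
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class MemoryNode:
    # Hypothetical node layout: text, embedding, and a position index
    # listing the positions of related sub-memories in the next layer.
    text: str
    vector: list
    child_ids: list = field(default_factory=list)


def retrieve(layers, query, k=1):
    """Top-down retrieval over layers ordered from most to least abstract.

    Returns the top-k nodes of the last layer and the number of
    similarity computations performed along the way.
    """
    candidates = list(range(len(layers[0])))
    comparisons = 0
    for depth, layer in enumerate(layers):
        comparisons += len(candidates)
        ranked = sorted(
            candidates,
            key=lambda i: cosine(query, layer[i].vector),
            reverse=True,
        )
        top = ranked[:k]
        if depth == len(layers) - 1:
            return [layer[i] for i in top], comparisons
        # Route only to the children named by the position index.
        candidates = [c for i in top for c in layer[i].child_ids]
    return [], comparisons


# Toy four-layer memory (all contents invented for illustration).
domains = [
    MemoryNode("work", [1, 0, 0, 0], [0, 1]),
    MemoryNode("hobbies", [0, 1, 0, 0], [2, 3]),
]
categories = [
    MemoryNode("meetings", [1, 0, 1, 0], [0]),
    MemoryNode("deadlines", [1, 0, 0, 1], [1]),
    MemoryNode("cooking", [0, 1, 1, 0], [2]),
    MemoryNode("hiking", [0, 1, 0, 1], [3]),
]
traces = [
    MemoryNode("weekly sync", [1, 0, 1, 0], [0, 1]),
    MemoryNode("release crunch", [1, 0, 0, 1], [2, 3]),
    MemoryNode("pasta nights", [0, 1, 1, 0], [4, 5]),
    MemoryNode("coastal hikes", [0, 1, 0, 1], [6, 7]),
]
episodes = [
    MemoryNode("sync moved to Tuesday", [1, 0, 1, 0.1]),
    MemoryNode("sync cancelled", [1, 0, 1, 0.3]),
    MemoryNode("release slipped a week", [1, 0, 0.1, 1]),
    MemoryNode("hotfix shipped", [1, 0, 0.3, 1]),
    MemoryNode("made carbonara", [0, 1, 1, 0.1]),
    MemoryNode("tried fresh gnocchi", [0, 1, 1, 0.3]),
    MemoryNode("hiked the coastal trail in May", [0, 1, 0.2, 1]),
    MemoryNode("bought new hiking boots", [0, 1, 0.9, 0.5]),
]
layers = [domains, categories, traces, episodes]

hits, comparisons = retrieve(layers, [0, 1, 0, 1], k=1)
print(hits[0].text, comparisons)
```

In this toy setup the query is routed through "hobbies", "hiking", and "coastal hikes", scoring 7 vectors in total, whereas a flat search would score all 8 episodes. The gap is small here only because the example is tiny; it widens as each branch holds more episodes.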

Paper Structure

This paper contains 12 sections, 2 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Memory architecture comparison. Top: the traditional memory mechanism (MemoryBank), which computes the similarity between the query and every stored memory and selects the top-k related memories. Bottom: H-MEM, which uses hierarchical memory and a position index to search layer by layer, so irrelevant memories are excluded from the similarity computation.
  • Figure 2: H-MEM architecture. (a) shows the hierarchical memory structure of H-MEM, divided into four memory layers: Domain Layer, Category Layer, Memory Trace Layer, and Episode Layer. (b) shows the memory extraction workflow of H-MEM. The question is encoded into a semantic vector and compared with H-MEM memory by similarity; the top-k most relevant memories and the user profile are selected, annotated with their memory weights as a confidence-level reference for the LLM, and passed to the LLM together with the question for long-term dialogue reasoning.
  • Figure 3: Memory retrieval calculation comparison. Top: the traditional memory retrieval method (MemoryBank), which computes the similarity between the query and every stored memory and selects the top-k related memories. Bottom: H-MEM, which uses the position index to search layer by layer.
  • Figure 4: Comparative analysis of computational efficiency. We compare the amount of computation and the running time of H-MEM and the baseline (MemoryBank) when using Qwen-1.5b on five types of QA tasks, to verify the efficiency of H-MEM's memory retrieval. The end of each task type serves as a checkpoint for measuring the time taken to complete that task type, and the amount of computation is measured every ten tasks.
  • Figure 5: Ablation study results. H denotes the hierarchical memory storage of H-MEM, and R denotes the position-index retrieval in H-MEM.
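The last step of the workflow in the Figure 2(b) caption, where retrieved memories are passed to the LLM with their weights, might look roughly like the sketch below. The prompt layout, the `build_prompt` helper, and the assumption that weights lie in [0, 1] are illustrative choices, not details taken from the paper.

```python
def build_prompt(question, memories, user_profile):
    """Assemble an LLM prompt from weighted memories (hypothetical layout).

    memories: list of (text, weight) pairs; weight is assumed to be a
    confidence value in [0, 1], with higher meaning more trustworthy.
    """
    # List the most trusted memories first.
    ranked = sorted(memories, key=lambda m: m[1], reverse=True)
    lines = [
        "User profile: " + user_profile,
        "Relevant memories (confidence):",
    ]
    for text, weight in ranked:
        lines.append(f"- [{weight:.2f}] {text}")
    lines.append("Question: " + question)
    return "\n".join(lines)


prompt = build_prompt(
    "Where did I hike in May?",
    [("bought new hiking boots", 0.4), ("hiked the coastal trail in May", 0.9)],
    "enjoys hiking and cooking",
)
print(prompt)
```

Showing the weight next to each memory gives the model an explicit signal for preferring one memory over another when they conflict, which is the role the caption assigns to the confidence-level reference.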