LiCoMemory: Lightweight and Cognitive Agentic Memory for Efficient Long-Term Reasoning

Zhengjun Huang, Zhoujin Tian, Qintian Guo, Fangyuan Zhang, Yingli Zhou, Di Jiang, Xiaofang Zhou

TL;DR

LiCoMemory tackles the memory bottleneck of LLM agents by introducing CogniGraph, a lightweight hierarchical graph that serves as a semantic index rather than a static store. Retrieval proceeds top-down through the hierarchy and applies a Weibull-based temporal decay, producing structured prompts for RAG-based generation and enabling coherent cross-session reasoning. On LoCoMo and LongMemEval, LiCoMemory achieves accuracy gains of up to roughly 23% over baselines while reducing update latency and token usage, with particularly strong temporal and multi-session reasoning. The architecture offers a scalable, interpretable memory solution for real-time long-term reasoning in agentic systems, with potential for multi-agent extensions and adaptive memory compression.

Abstract

Large Language Model (LLM) agents exhibit remarkable conversational and reasoning capabilities but remain constrained by limited context windows and the lack of persistent memory. Recent efforts address these limitations via external memory architectures, often employing graph-based representations, yet most adopt flat, entangled structures that intertwine semantics with topology, leading to redundant representations, unstructured retrieval, and degraded efficiency and accuracy. To resolve these issues, we propose LiCoMemory, an end-to-end agentic memory framework for real-time updating and retrieval, which introduces CogniGraph, a lightweight hierarchical graph that utilizes entities and relations as semantic indexing layers, and employs temporal and hierarchy-aware search with integrated reranking for adaptive and coherent knowledge retrieval. Experiments on long-term dialogue benchmarks, LoCoMo and LongMemEval, show that LiCoMemory not only outperforms established baselines in temporal reasoning, multi-session consistency, and retrieval efficiency, but also notably reduces update latency. Our official code and data are available at https://github.com/EverM0re/LiCoMemory.
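
To make the idea of a Weibull-based temporal decay concrete, the sketch below weights each retrieved candidate by the Weibull survival function exp(-(t/scale)^shape) of its age. This is a minimal illustration, not the paper's formula: the function names, the default `scale` and `shape` values, and the choice to multiply similarity by the decay are all assumptions made for this example.

```python
import math
from typing import List, Tuple


def weibull_decay(age: float, scale: float = 30.0, shape: float = 1.5) -> float:
    """Weibull survival function exp(-(age/scale)**shape), a weight in (0, 1].

    `scale` and `shape` are illustrative defaults, not values from the paper.
    """
    if age < 0:
        raise ValueError("age must be non-negative")
    return math.exp(-((age / scale) ** shape))


def rerank(
    candidates: List[Tuple[str, float, float]],
    scale: float = 30.0,
    shape: float = 1.5,
) -> List[Tuple[str, float]]:
    """Sort (text, similarity, age) candidates by similarity * decay.

    Multiplying the two terms is an assumption for this sketch; other
    combinations (for example a weighted sum) are equally plausible.
    """
    scored = [
        (text, similarity * weibull_decay(age, scale, shape))
        for text, similarity, age in candidates
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

With these defaults, a candidate with similarity 0.9 that is 90 time units old ranks below a candidate with similarity 0.6 that is 2 units old, because the older item's weight has decayed almost to zero.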

Paper Structure

This paper contains 14 sections, 2 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Motivation of LiCoMemory, illustrating how LiCoMemory resolves key challenges of existing memory frameworks. In the performance graph at the bottom, the radius of each circle represents the construction token consumption per dialogue.
  • Figure 2: Overview of the LiCoMemory workflow. Upon interaction, dialogue chunks are incrementally organized through the CogniGraph (Preliminary), a lightweight hierarchical graph linking session summaries, entity–relation triples, and dialogue chunks via cross-layer hyperlinks. New knowledge is continuously integrated and deduplicated to preserve structural consistency (Phase 1). At inference, entity extraction and hierarchical retrieval guide top-down search across graph layers (Phase 2), followed by hierarchy–temporal–semantic reranking to generate a structured prompt for retrieval-augmented generation (Phase 3).
  • Figure 3: Practical case study of LiCoMemory.
  • Figure 4: Accuracy breakdown of LiCoMemory and baselines on subsets of LoCoMo and LongMemEval.
  • Figure 5: Performance breakdown of LiCoMemory and baselines on LoCoMo and LongMemEval.
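
The Figure 2 caption describes three linked layers (session summaries, entity–relation triples, and dialogue chunks) with deduplication on update. A rough sketch of such a structure follows, assuming a simple dictionary-backed index. The class name `CogniGraphSketch`, its fields, and the case-insensitive deduplication rule are hypothetical and are not taken from the LiCoMemory codebase; the lookup shown walks from matching triples down to chunks and is only a stand-in for the paper's hierarchical retrieval.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)


def _normalize(triple: Triple) -> Triple:
    """Hypothetical dedup rule: triples match after trimming and lowercasing."""
    head, relation, tail = triple
    return (head.strip().lower(), relation.strip().lower(), tail.strip().lower())


@dataclass
class CogniGraphSketch:
    # Top layer: session_id -> summary text.
    summaries: Dict[str, str] = field(default_factory=dict)
    # Bottom layer: chunk_id -> raw dialogue text.
    chunks: Dict[str, str] = field(default_factory=dict)
    # Cross-layer link: chunk_id -> session_id.
    chunk_session: Dict[str, str] = field(default_factory=dict)
    # Middle layer: triple -> ids of the chunks that support it.
    triple_chunks: Dict[Triple, Set[str]] = field(default_factory=dict)

    def add_chunk(
        self, session_id: str, chunk_id: str, text: str, triples: List[Triple]
    ) -> None:
        """Store a chunk and link it to its triples, merging duplicate triples."""
        self.chunks[chunk_id] = text
        self.chunk_session[chunk_id] = session_id
        for triple in triples:
            self.triple_chunks.setdefault(_normalize(triple), set()).add(chunk_id)

    def retrieve(self, entities: List[str]) -> List[Tuple[str, str]]:
        """Return sorted (session_id, chunk_id) pairs for triples naming an entity."""
        wanted = {entity.strip().lower() for entity in entities}
        hits: Set[str] = set()
        for (head, _relation, tail), chunk_ids in self.triple_chunks.items():
            if head in wanted or tail in wanted:
                hits |= chunk_ids
        return sorted((self.chunk_session[cid], cid) for cid in hits)
```

Because the triples act only as an index, the dialogue text is stored once in `chunks`, and a repeated fact adds a chunk link to an existing triple rather than a new triple.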