LiCoMemory: Lightweight and Cognitive Agentic Memory for Efficient Long-Term Reasoning
Zhengjun Huang, Zhoujin Tian, Qintian Guo, Fangyuan Zhang, Yingli Zhou, Di Jiang, Xiaofang Zhou
TL;DR
LiCoMemory tackles the memory bottleneck of LLM agents by introducing CogniGraph, a lightweight hierarchical graph that serves as a semantic index rather than a static store. Retrieval proceeds top-down through the hierarchy and applies a Weibull-based temporal decay, producing structured prompts for RAG-based generation that support coherent cross-session reasoning. On LoCoMo and LongMemEval, LiCoMemory improves accuracy by up to about 23% over baselines while reducing update latency and token usage, with particular strength in temporal and multi-session reasoning. The architecture offers a scalable, interpretable memory solution for real-time long-term reasoning in agentic systems, with potential for multi-agent extensions and adaptive memory compression.
Abstract
Large Language Model (LLM) agents exhibit remarkable conversational and reasoning capabilities but remain constrained by limited context windows and the lack of persistent memory. Recent efforts address these limitations via external memory architectures, often employing graph-based representations, yet most adopt flat, entangled structures that intertwine semantics with topology, leading to redundant representations, unstructured retrieval, and degraded efficiency and accuracy. To resolve these issues, we propose LiCoMemory, an end-to-end agentic memory framework for real-time updating and retrieval, which introduces CogniGraph, a lightweight hierarchical graph that utilizes entities and relations as semantic indexing layers, and employs temporal and hierarchy-aware search with integrated reranking for adaptive and coherent knowledge retrieval. Experiments on long-term dialogue benchmarks, LoCoMo and LongMemEval, show that LiCoMemory not only outperforms established baselines in temporal reasoning, multi-session consistency, and retrieval efficiency, but also notably reduces update latency. Our official code and data are available at https://github.com/EverM0re/LiCoMemory.
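The TL;DR mentions a Weibull-based temporal decay in retrieval but gives no formula here. As a rough illustration only, the sketch below assumes the decay takes the form of the standard Weibull survival function, exp(-(age / scale)^shape), and that it multiplies a semantic similarity score during reranking. The function names, the multiplicative combination, and the parameter values (scale=30.0, shape=1.5) are all hypothetical and are not taken from the paper or its code.

```python
import math

def weibull_decay(age, scale=30.0, shape=1.5):
    """Weibull survival function: 1.0 at age 0, decreasing toward 0.

    `scale` and `shape` are illustrative values, not the paper's settings.
    """
    if age < 0:
        raise ValueError("age must be non-negative")
    return math.exp(-((age / scale) ** shape))

def rerank(candidates, now, scale=30.0, shape=1.5):
    """Rerank (item, similarity, timestamp) triples by decayed similarity.

    Assumption: the final score is similarity * decay(now - timestamp).
    Returns (score, item) pairs sorted from highest to lowest score.
    """
    scored = [
        (sim * weibull_decay(now - ts, scale, shape), item)
        for item, sim, ts in candidates
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

# Usage: an older memory with slightly higher similarity loses to a recent one.
memories = [("old fact", 0.9, 0.0), ("recent fact", 0.8, 90.0)]
ranked = rerank(memories, now=100.0)
```

With these assumed parameters, a shape above 1 makes the decay slow for fresh memories and steeper as they age, which is one reason a Weibull form might be preferred over a plain exponential.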
