Multiple Memory Systems for Enhancing the Long-term Memory of Agent
Gaoke Zhang, Bo Wang, Yunlong Ma, Dongming Zhao, Zifei Yu
TL;DR
This work introduces Multiple Memory Systems (MMS), a memory architecture for LLM-powered agents that transforms short-term memory into high-quality long-term memory fragments of five types (M_key, M_short, M_cog, M_epi, M_sem). From these fragments it constructs Retrieval Memory Units (MU_ret) for query matching and Contextual Memory Units (MU_cont) to enrich generation, with a one-to-one mapping between MU_ret and MU_cont and a cosine-similarity-based top-$k$ retrieval process. On the LoCoMo dataset, MMS outperforms MemoryBank, A-MEM, and Naive RAG on recall and generation metrics. Ablations confirm the value of the individual memory fragments, and robustness analyses examine performance across the number of selected memories and the storage overhead. The approach improves long-term memory recall and response quality at an acceptable computational cost, and it grounds memory design in cognitive psychology principles such as encoding specificity and levels of processing.
Abstract
Agents powered by large language models have achieved impressive results, but effectively handling the vast amounts of historical data generated during interactions remains a challenge. The current approach is to design a memory module for the agent to process these data. However, existing methods, such as MemoryBank and A-MEM, store memory content of poor quality, which degrades recall performance and response quality. To construct higher-quality long-term memory content, we design a multiple memory system (MMS) inspired by cognitive psychology theory. The system processes short-term memory into multiple long-term memory fragments and constructs retrieval memory units and contextual memory units from these fragments, with a one-to-one correspondence between the two. During the retrieval phase, MMS matches the most relevant retrieval memory units to the user's query. The corresponding contextual memory units are then supplied as context for the response stage, so that historical data is used effectively. Experiments on the LoCoMo dataset compare our method with three others and demonstrate its effectiveness. Ablation studies confirm the rationality of our memory units. We also analyze robustness with respect to the number of selected memory segments and the storage overhead, demonstrating the method's practical value.
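
The retrieval phase described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each retrieval memory unit is already represented by an embedding vector and is stored next to its paired contextual memory unit. The names `MemoryPair`, `cosine_similarity`, and `retrieve_context`, along with the toy three-dimensional embeddings, are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class MemoryPair:
    """Hypothetical container: one retrieval unit and its paired contextual unit."""
    ret_embedding: List[float]  # embedding of the retrieval memory unit (assumed precomputed)
    cont_text: str              # text of the corresponding contextual memory unit


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_context(query_embedding: List[float],
                     memory: List[MemoryPair],
                     k: int = 2) -> List[str]:
    """Rank retrieval units by similarity to the query and return the
    contextual units paired with the top-k matches."""
    ranked = sorted(
        memory,
        key=lambda pair: cosine_similarity(query_embedding, pair.ret_embedding),
        reverse=True,
    )
    return [pair.cont_text for pair in ranked[:k]]


# Toy data: the embeddings are made up, not produced by any real encoder.
memory = [
    MemoryPair([1.0, 0.0, 0.0], "context A"),
    MemoryPair([0.0, 1.0, 0.0], "context B"),
    MemoryPair([0.7, 0.7, 0.0], "context C"),
]
top = retrieve_context([0.9, 0.1, 0.0], memory, k=2)
print(top)
```

Keeping the retrieval embedding and the contextual text in one record is just one simple way to realize the one-to-one correspondence: matching is done on the retrieval side, and only the contextual side is passed on to the response stage.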
