Coarse-to-Fine Grounded Memory for LLM Agent Planning
Wei Yang, Jinwei Xiao, Hongming Zhang, Qingyang Zhang, Yanna Wang, Bo Xu
TL;DR
The paper tackles the challenge of memory quality and adaptability in LLM-based planning by introducing Coarse-to-Fine Grounded Memory (CFGM), which grounds memories at coarse, hybrid, and fine granularities using the LLM's internal knowledge. The approach combines coarse-grained focus points that steer experience collection, hybrid-grained tips distilled from trajectories, and fine-grained key-information reflection during online planning to handle anomalies; retrieved memories guide decisions at inference. Evaluations on AlfWorld, WebShop, and ScienceWorld show CFGM achieving state-of-the-art performance, and ablations demonstrate the contribution of each memory-grounding component. The work presents an integrated memory-grounding framework that improves exploration, memory diversity, and adaptive planning for complex interactive tasks, with evidence of cross-model generalization and practical efficiency gains.
Abstract
Recent advancements in Large Language Models (LLMs) have driven growing interest in LLM-based agents for complex planning tasks. To avoid costly agent training, many studies adopt memory mechanisms that enhance LLMs with offline experiences or online trajectory analysis. However, existing works focus on single-granularity memory derived from dynamic environmental interactions, which is inherently constrained by the quality of the collected experiences. This limitation, in turn, constrains the diversity of knowledge and the flexibility of planning. We propose Coarse-to-Fine Grounded Memory (CFGM), a novel framework that grounds coarse-to-fine memories with an LLM, thereby fully leveraging them for flexible adaptation to diverse scenarios. CFGM grounds environmental information into coarse-grained focus points to guide experience collection in training tasks, followed by grounding of actionable hybrid-grained tips from each experience. At inference, CFGM retrieves task-relevant experiences and tips to support planning. When facing environmental anomalies, the LLM grounds the current situation into fine-grained key information, enabling flexible self-QA reflection and plan correction.
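The three grounding stages described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `LLM` callable, the prompt wording, the `Experience` fields, and the word-overlap (Jaccard) retrieval are all assumptions made here for illustration, since the abstract does not specify how prompts are phrased or how relevance is scored.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical LLM interface (assumption): prompt string in, reply text out.
LLM = Callable[[str], str]


@dataclass
class Experience:
    """One collected episode plus the hybrid-grained tips grounded from it."""
    task: str
    trajectory: List[str]
    success: bool
    tips: List[str] = field(default_factory=list)


def _lines(reply: str) -> List[str]:
    """Split a reply into non-empty lines, dropping leading list dashes."""
    cleaned = (ln.strip().lstrip("-").strip() for ln in reply.splitlines())
    return [ln for ln in cleaned if ln]


def ground_focus_points(llm: LLM, env_description: str) -> List[str]:
    """Coarse-grained: turn environment information into focus points."""
    prompt = f"List focus points for acting in this environment:\n{env_description}"
    return _lines(llm(prompt))


def ground_tips(llm: LLM, exp: Experience) -> Experience:
    """Hybrid-grained: distill actionable tips from one experience."""
    steps = "\n".join(exp.trajectory)
    outcome = "succeeded" if exp.success else "failed"
    prompt = f"Task: {exp.task}\nThe attempt {outcome}.\nSteps:\n{steps}\nGive tips:"
    exp.tips = _lines(llm(prompt))
    return exp


def _jaccard(a: str, b: str) -> float:
    """Word-overlap similarity; a stand-in for whatever retriever is used."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def retrieve(memory: List[Experience], task: str, k: int = 2) -> List[Experience]:
    """Inference: return the k stored experiences most similar to the task."""
    ranked = sorted(memory, key=lambda e: _jaccard(e.task, task), reverse=True)
    return ranked[:k]


def reflect_on_anomaly(llm: LLM, observation: str, plan: str) -> str:
    """Fine-grained: extract key information, then self-QA to revise the plan."""
    key_info = llm(f"Extract the key information from this observation:\n{observation}")
    prompt = (
        f"Key information: {key_info}\nCurrent plan: {plan}\n"
        "Ask yourself what went wrong, answer it, and give a corrected plan."
    )
    return llm(prompt)
```

In this sketch, focus points would be produced once per environment before experience collection, `ground_tips` would run once per collected experience, and `retrieve` and `reflect_on_anomaly` would be called during online planning. Any real system would replace the word-overlap scoring with its own retriever.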
