Mindscape-Aware Retrieval Augmented Generation for Improved Long Context Understanding
Yuqing Li, Jiangnan Li, Zheng Lin, Ziyan Zhou, Junjie Wu, Weiping Wang, Jie Zhou, Mo Yu
TL;DR
MiA-RAG introduces a global mindscape, inspired by psychology and neuroscience, that conditions both retrieval and generation to improve long-context understanding. The framework builds a hierarchical global memory S from document summaries and uses two components, MiA-Emb and MiA-Gen, to align local evidence with this global scaffold. Across English and Chinese narratives and multiple benchmarks, MiA-RAG outperforms baselines and shows stronger integrative reasoning, a result supported by analyses of embedding geometry and attention patterns. The work highlights the value of global semantic guidance for robust, human-like long-context reasoning, and it names two limitations as avenues for future work: the reliance on precomputed mindscapes and the narrative-focused evaluation.
Abstract
Humans understand long and complex texts by relying on a holistic semantic representation of the content. This global view helps organize prior knowledge, interpret new information, and integrate evidence dispersed across a document, a capacity that psychology describes as the Mindscape-Aware Capability of humans. Current Retrieval-Augmented Generation (RAG) systems lack such guidance and therefore struggle with long-context tasks. In this paper, we propose Mindscape-Aware RAG (MiA-RAG), the first approach that equips LLM-based RAG systems with explicit global context awareness. MiA-RAG builds a mindscape through hierarchical summarization and conditions both retrieval and generation on this global semantic representation. This enables the retriever to form enriched query embeddings and the generator to reason over retrieved evidence within a coherent global context. We evaluate MiA-RAG across diverse long-context and bilingual benchmarks for evidence-based understanding and global sense-making. It consistently surpasses baselines, and further analysis shows that it aligns local details with a coherent global representation, enabling more human-like long-context retrieval and reasoning.
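The pipeline described in the abstract (summarize hierarchically into a mindscape, then condition retrieval and generation on it) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not the authors' implementation: `toy_summarize` stands in for an LLM summarizer, the bag-of-words `embed` stands in for a learned embedding model, and prepending the mindscape to the query and to the prompt stands in for however MiA-Emb and MiA-Gen actually consume it.

```python
import math
from collections import Counter
from typing import Callable, List


def toy_summarize(text: str, max_words: int = 12) -> str:
    """Hypothetical stand-in for an LLM summarizer: keep the first few words."""
    return " ".join(text.split()[:max_words])


def build_mindscape(
    chunks: List[str],
    summarize: Callable[[str], str] = toy_summarize,
    fan_in: int = 2,
) -> str:
    """Summarize each chunk, then merge groups of summaries until one remains."""
    if not chunks:
        raise ValueError("need at least one chunk")
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2 so each level shrinks")
    level = [summarize(c) for c in chunks]
    while len(level) > 1:
        level = [
            summarize(" ".join(level[i:i + fan_in]))
            for i in range(0, len(level), fan_in)
        ]
    return level[0]


def embed(text: str) -> Counter:
    """Toy bag-of-words vector (assumption: a real system uses a neural encoder)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, mindscape: str, chunks: List[str], k: int = 2) -> List[str]:
    """Rank chunks against a query vector enriched with the global summary."""
    q = embed(query + " " + mindscape)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, mindscape: str, evidence: List[str]) -> str:
    """Place the global summary ahead of the retrieved evidence and the question."""
    lines = "\n".join(f"- {e}" for e in evidence)
    return f"Global summary:\n{mindscape}\n\nEvidence:\n{lines}\n\nQuestion: {query}"


if __name__ == "__main__":
    chunks = [
        "Anna leaves the village after the flood.",
        "Years later Anna returns to rebuild the mill.",
        "The recipe calls for two cups of flour.",
    ]
    query = "Why does Anna return?"
    mindscape = build_mindscape(chunks)
    evidence = retrieve(query, mindscape, chunks, k=2)
    print(build_prompt(query, mindscape, evidence))
```

In this sketch the mindscape is computed once per document and reused for every query, which mirrors the precomputation requirement noted in the TL;DR. Swapping `toy_summarize` and `embed` for real models would leave the control flow unchanged.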
