Chronological Passage Assembling in RAG framework for Temporal Question Answering
Byeongjeong Kim, Jeonghyun Park, Joonho Yang, Hwanhee Lee
TL;DR
ChronoRAG tackles narrative QA where temporal ordering is crucial by building a two-layer offline graph that encodes events and their relations, then performing online hierarchical retrieval with neighborhood assembling to produce temporally coherent context. The offline stage chunks the document into fixed-length units of size $k$, clusters $l$ chunks for summarization, and extracts relations to form Layer 1, yielding a compact relational graph; the online stage retrieves Layer 1 relations and their adjacent Layer 0 chunks to preserve narrative flow. Across NarrativeQA and GutenQA, ChronoRAG yields strong improvements, particularly on Time Questions, while maintaining efficiency with lightweight graph construction and a single-generation pass for answers. The work highlights that explicitly modeling event-to-event relations and temporal order, rather than merely extracting entities or summarizing text, is key to robust narrative question answering, with practical implications for long-form comprehension tasks and temporal reasoning in AI systems. Here $k$ and $l$ denote the chunk size and cluster size in the offline graph, respectively. The approach demonstrates that temporal coherence can be achieved with a principled two-stage retrieval framework.
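The structural steps summarized above (fixed-length chunking, grouping chunks into clusters, and assembling retrieved passages with their neighbors in document order) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of whitespace tokens, the grouping of consecutive chunks into clusters, and the `window` parameter are all assumptions made for illustration, and the LLM-based summarization and relation extraction that produce Layer 1 are omitted.

```python
from typing import List


def chunk_tokens(tokens: List[str], k: int) -> List[List[str]]:
    """Split a token list into consecutive chunks of at most k tokens (Layer 0)."""
    if k <= 0:
        raise ValueError("k must be positive")
    return [tokens[i:i + k] for i in range(0, len(tokens), k)]


def cluster_chunk_ids(num_chunks: int, l: int) -> List[List[int]]:
    """Group consecutive chunk indices into clusters of at most l chunks.

    Assumption: clusters are formed from adjacent chunks; each cluster would
    then be summarized and mined for relations to build Layer 1 (not shown).
    """
    if l <= 0:
        raise ValueError("l must be positive")
    return [list(range(i, min(i + l, num_chunks))) for i in range(0, num_chunks, l)]


def assemble_chronologically(
    hit_clusters: List[int],
    clusters: List[List[int]],
    num_chunks: int,
    window: int = 1,
) -> List[int]:
    """Return Layer 0 chunk indices for the retrieved clusters, extended by
    `window` neighboring chunks on each side, de-duplicated and sorted into
    document (chronological) order."""
    selected = set()
    for c in hit_clusters:
        ids = clusters[c]
        lo = max(ids[0] - window, 0)
        hi = min(ids[-1] + window, num_chunks - 1)
        selected.update(range(lo, hi + 1))
    return sorted(selected)


# Hypothetical usage: 20 tokens, chunk size k=2, cluster size l=2.
tokens = [f"t{i}" for i in range(20)]
chunks = chunk_tokens(tokens, k=2)
clusters = cluster_chunk_ids(len(chunks), l=2)
# Retrieval order (cluster 3 first, then cluster 0) does not affect the output order.
context_ids = assemble_chronologically([3, 0], clusters, len(chunks), window=1)
```

Sorting the selected indices is what keeps the assembled context in narrative order regardless of the order in which the retriever ranked the hits.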
Abstract
Long-context question answering over narrative tasks is challenging because correct answers often hinge on reconstructing a coherent timeline of events while preserving contextual flow in a limited context window. Retrieval-augmented generation (RAG) methods aim to address this challenge by selectively retrieving only necessary document segments. However, narrative texts possess unique characteristics that limit the effectiveness of these existing approaches. Specifically, understanding narrative texts requires more than isolated segments, as the broader context and sequential relationships between segments are crucial for comprehension. To address these limitations, we propose ChronoRAG, a novel RAG framework specialized for narrative texts. This approach focuses on two essential aspects: refining dispersed document information into coherent and structured passages, and preserving narrative flow by explicitly capturing and maintaining the temporal order among retrieved passages. We empirically demonstrate the effectiveness of ChronoRAG through experiments on the NarrativeQA and GutenQA datasets, showing substantial improvements in tasks requiring both factual identification and comprehension of complex sequential relationships, underscoring that reasoning over temporal order is crucial in resolving narrative QA.
