VersionRAG: Version-Aware Retrieval-Augmented Generation for Evolving Documents
Daniel Huwiler, Kurt Stockinger, Jonathan Fürst
TL;DR
VersionRAG introduces a novel version-aware retrieval framework that explicitly models document evolution through a hierarchical graph to address two core challenges in evolving documents: version conflation and implicit changes. By routing queries via intent-aware retrieval paths and indexing changes as explicit or inferred graph elements, VersionRAG achieves 90% accuracy on a new VersionQA benchmark and substantially reduces indexing tokens by 97% compared with GraphRAG. The approach combines a five-level version-aware graph with a hybrid retrieval strategy and generation grounded in version-specific context, delivering strong performance across content, version listing, and change retrieval tasks. The work demonstrates practical deployment benefits, including improved efficiency, incremental updates, and broader applicability to domains with tracked document revisions, while also outlining limitations and directions for future research.
Abstract
Retrieval-Augmented Generation (RAG) systems fail when documents evolve through versioning-a ubiquitous characteristic of technical documentation. Existing approaches achieve only 58-64% accuracy on version-sensitive questions, retrieving semantically similar content without temporal validity checks. We present VersionRAG, a version-aware RAG framework that explicitly models document evolution through a hierarchical graph structure capturing version sequences, content boundaries, and changes between document states. During retrieval, VersionRAG routes queries through specialized paths based on intent classification, enabling precise version-aware filtering and change tracking. On our VersionQA benchmark-100 manually curated questions across 34 versioned technical documents-VersionRAG achieves 90% accuracy, outperforming naive RAG (58%) and GraphRAG (64%). VersionRAG reaches 60% accuracy on implicit change detection where baselines fail (0-10%), demonstrating its ability to track undocumented modifications. Additionally, VersionRAG requires 97% fewer tokens during indexing than GraphRAG, making it practical for large-scale deployment. Our work establishes versioned document QA as a distinct task and provides both a solution and benchmark for future research.
