Walk&Retrieve: Simple Yet Effective Zero-shot Retrieval-Augmented Generation via Knowledge Graph Walks
Martin Böckling, Heiko Paulheim, Andreea Iana
TL;DR
This work tackles the problem of LLM hallucinations and knowledge staleness in knowledge-intensive QA by grounding responses in a knowledge graph (KG) through a retrieval-augmented generation (RAG) framework. Walk&Retrieve is a two-stage pipeline: (i) corpus generation, in which walk-based traversal (random walks and BFS walks) followed by knowledge verbalization turns the KG into a structured, text-friendly corpus, and (ii) knowledge-enhanced answer generation, in which the LLM is prompted with the query plus the most relevant verbalized walks, enabling zero-shot RAG with any off-the-shelf LLM. Key contributions include: (a) a lightweight approach that adapts to dynamically updated KGs and requires no LLM fine-tuning, (b) an inference pipeline that needs only a single LLM call, with efficient retrieval of the most similar KG nodes and walks via cosine similarity, and (c) strong empirical results on MetaQA and CRAG showing improved accuracy and reduced hallucinations relative to vanilla and graph-based RAG baselines. The method also scales robustly to large KGs with favorable query latency, offering a practical, adaptable baseline for future KG-RAG research.
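The two stages summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy triples, the template-based `verbalize` function, the bag-of-words `embed` function, and all function names are assumptions made for illustration. A real system would presumably use a learned embedding model for retrieval, and the summary does not specify how walks are verbalized.

```python
import math
import random
import re
from collections import Counter, deque

# Toy KG as (head, relation, tail) triples; purely illustrative data.
TRIPLES = [
    ("Inception", "directed_by", "Christopher Nolan"),
    ("Inception", "starred_actors", "Leonardo DiCaprio"),
    ("Christopher Nolan", "directed", "Interstellar"),
    ("Interstellar", "release_year", "2014"),
]

def build_adjacency(triples):
    """Map each head entity to its outgoing (relation, tail) edges."""
    adj = {}
    for head, rel, tail in triples:
        adj.setdefault(head, []).append((rel, tail))
    return adj

def random_walk(adj, start, length, rng):
    """Follow up to `length` randomly chosen outgoing edges from `start`."""
    walk, node = [], start
    for _ in range(length):
        edges = adj.get(node)
        if not edges:
            break  # dead end: no outgoing edges
        rel, tail = rng.choice(edges)
        walk.append((node, rel, tail))
        node = tail
    return walk

def bfs_walks(adj, start, depth):
    """Enumerate every path of 1..depth hops from `start`, breadth-first."""
    walks, queue = [], deque([(start, [])])
    while queue:
        node, path = queue.popleft()
        if len(path) == depth:
            continue
        for rel, tail in adj.get(node, []):
            new_path = path + [(node, rel, tail)]
            walks.append(new_path)
            queue.append((tail, new_path))
    return walks

def verbalize(walk):
    """Template-based verbalization (an assumption, used here for simplicity)."""
    return ". ".join(f"{h} {r.replace('_', ' ')} {t}" for h, r, t in walk) + "."

def embed(text):
    """Bag-of-words stand-in for a real sentence embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, corpus, k=2):
    """Return the k verbalized walks most similar to the query."""
    q = embed(query)
    return sorted(corpus, key=lambda text: cosine(q, embed(text)), reverse=True)[:k]

def build_prompt(query, contexts):
    """Assemble the single prompt that would be sent to an off-the-shelf LLM."""
    facts = "\n".join(f"- {c}" for c in contexts)
    return f"Answer using only these facts:\n{facts}\n\nQuestion: {query}\nAnswer:"
```

In this sketch, the corpus would be built once by verbalizing the walks from each entity (for example `[verbalize(w) for w in bfs_walks(adj, "Inception", 2)]`), and at query time only `retrieve` and `build_prompt` run before the single LLM call. Because the corpus depends only on the graph, changed regions of the KG can be re-walked and re-verbalized without retraining anything.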
Abstract
Large Language Models (LLMs) have showcased impressive reasoning abilities, but often suffer from hallucinations or outdated knowledge. Knowledge Graph (KG)-based Retrieval-Augmented Generation (RAG) remedies these shortcomings by grounding LLM responses in structured external information from a knowledge base. However, many KG-based RAG approaches struggle with (i) aligning KG and textual representations, (ii) balancing retrieval accuracy and efficiency, and (iii) adapting to dynamically updated KGs. In this work, we introduce Walk&Retrieve, a simple yet effective KG-based framework that leverages walk-based graph traversal and knowledge verbalization to generate a corpus for zero-shot RAG. Built around efficient KG walks, our method does not require fine-tuning on domain-specific data, enabling seamless adaptation to KG updates, reducing computational overhead, and allowing integration with any off-the-shelf backbone LLM. Despite its simplicity, Walk&Retrieve performs competitively, often outperforming existing RAG systems in response accuracy and hallucination reduction. Moreover, it demonstrates lower query latency and robust scalability to large KGs, highlighting the potential of lightweight retrieval strategies as strong baselines for future RAG research.
