Walk&Retrieve: Simple Yet Effective Zero-shot Retrieval-Augmented Generation via Knowledge Graph Walks

Martin Böckling, Heiko Paulheim, Andreea Iana

TL;DR

This work tackles the problem of LLM hallucinations and knowledge staleness in knowledge-intensive QA by grounding responses in a knowledge graph (KG) through a retrieval-augmented generation (RAG) framework. Walk&Retrieve works in two stages: (i) corpus generation via walk-based traversal (random walks and BFS walks) with knowledge verbalization to create a structured, text-friendly corpus, and (ii) knowledge-enhanced answer generation, where the LLM is prompted with the query plus the most relevant verbalized walks, enabling zero-shot RAG with any off-the-shelf LLM. Key contributions include: (a) a lightweight approach that adapts to dynamically updated KGs and requires no LLM fine-tuning, (b) a single-LLM-call inference pipeline with efficient retrieval of nearby KG nodes and walks via cosine similarity, and (c) strong empirical results on MetaQA and CRAG showing improved accuracy and reduced hallucinations relative to vanilla and graph-based RAG baselines. The method demonstrates robust scalability to large KGs and favorable latency, offering a practical, adaptable baseline for future KG-RAG research.
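The two stages described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy triples, every function name, and the parameters `max_hops` and `k` are hypothetical. Two components are swapped for simple stand-ins so the example runs without external dependencies: a string template replaces the LLM-based verbalization, and a bag-of-words vector replaces a learned text embedding for the cosine-similarity retrieval.

```python
import math
import random
from collections import Counter, deque

# Toy KG as (head, relation, tail) triples; illustrative data only.
TRIPLES = [
    ("Inception", "directed_by", "Christopher Nolan"),
    ("Inception", "starred_actors", "Leonardo DiCaprio"),
    ("Inception", "release_year", "2010"),
    ("Christopher Nolan", "directed", "Interstellar"),
    ("Interstellar", "release_year", "2014"),
]

def build_adjacency(triples):
    """Map each head entity to its outgoing (relation, tail) edges."""
    adj = {}
    for h, r, t in triples:
        adj.setdefault(h, []).append((r, t))
    return adj

def random_walk(adj, start, max_hops, rng):
    """One random walk of at most max_hops edges, as a list of triples."""
    walk, node = [], start
    for _ in range(max_hops):
        edges = adj.get(node)
        if not edges:
            break
        r, t = rng.choice(edges)
        walk.append((node, r, t))
        node = t
    return walk

def bfs_walks(adj, start, max_hops):
    """All paths from start with 1..max_hops edges, enumerated breadth-first."""
    walks, queue = [], deque([(start, [])])
    while queue:
        node, path = queue.popleft()
        if len(path) == max_hops:
            continue
        for r, t in adj.get(node, []):
            new_path = path + [(node, r, t)]
            walks.append(new_path)
            queue.append((t, new_path))
    return walks

def verbalize(walk):
    """Template stand-in (assumption) for the LLM-based verbalization step."""
    return ". ".join(f"{h} {r.replace('_', ' ')} {t}" for h, r, t in walk) + "."

def embed(text):
    """Bag-of-words stand-in (assumption) for a text-embedding model."""
    return Counter(text.lower().replace(".", " ").replace("?", " ").split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query, corpus, k):
    """Return the k verbalized walks most similar to the query."""
    q = embed(query)
    return sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)[:k]

def build_prompt(query, contexts):
    """Assemble the single prompt that would be sent to the backbone LLM."""
    lines = "\n".join(f"- {c}" for c in contexts)
    return f"Answer using only the context.\nContext:\n{lines}\nQuestion: {query}"

# Stage 1: corpus generation (walks + verbalization), done once per KG state.
adj = build_adjacency(TRIPLES)
rng = random.Random(0)
walks = [w for node in adj for w in bfs_walks(adj, node, max_hops=2)]
walks += [random_walk(adj, node, 2, rng) for node in adj]
corpus = sorted({verbalize(w) for w in walks if w})

# Stage 2: retrieve the most similar walks and build the answer prompt.
query = "Who directed Inception?"
top = retrieve(query, corpus, k=3)
prompt = build_prompt(query, top)
print(prompt)
```

In this sketch, only the final prompt would reach the LLM, which mirrors the single-call inference described above. Updating the KG would only require regenerating and re-embedding the affected walks, with no model retraining.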

Abstract

Large Language Models (LLMs) have showcased impressive reasoning abilities, but often suffer from hallucinations or outdated knowledge. Knowledge Graph (KG)-based Retrieval-Augmented Generation (RAG) remedies these shortcomings by grounding LLM responses in structured external information from a knowledge base. However, many KG-based RAG approaches struggle with (i) aligning KG and textual representations, (ii) balancing retrieval accuracy and efficiency, and (iii) adapting to dynamically updated KGs. In this work, we introduce Walk&Retrieve, a simple yet effective KG-based framework that leverages walk-based graph traversal and knowledge verbalization for corpus generation for zero-shot RAG. Built around efficient KG walks, our method does not require fine-tuning on domain-specific data, enabling seamless adaptation to KG updates, reducing computational overhead, and allowing integration with any off-the-shelf backbone LLM. Despite its simplicity, Walk&Retrieve performs competitively, often outperforming existing RAG systems in response accuracy and hallucination reduction. Moreover, it demonstrates lower query latency and robust scalability to large KGs, highlighting the potential of lightweight retrieval strategies as strong baselines for future RAG research.

Paper Structure

This paper contains 7 sections, 3 equations, 4 figures, 2 tables.

Figures (4)

  • Figure 1: Overview of the Walk&Retrieve framework: (1) We combine walk-based graph traversal with knowledge verbalization for corpus generation; (2) The answer is generated with a prompt augmenting the query with the most similar verbalized walks.
  • Figure 2: Prompt templates used for knowledge verbalization and answer generation.
  • Figure 3: Missing vs. truthfulness rates over MetaQA subsets.
  • Figure 4: Truthfulness rates for different (i) walk approaches and (ii) backbone LLMs, over the MetaQA subsets.