FRAG: A Flexible Modular Framework for Retrieval-Augmented Generation based on Knowledge Graphs
Zengyi Gao, Yukun Cao, Hairu Wang, Ao Ke, Yuan Feng, Xike Xie, S Kevin Zhou
TL;DR
FRAG tackles LLM hallucinations by combining the flexibility of modular KG-RAG with improved retrieval quality through a reasoning-aware hop-range estimator. It introduces a three-module architecture (Reasoning-aware, Flexible-retrieval, and Reasoning) that classifies queries as Simple or Complex, adapts retrieval by preprocessing subgraphs and choosing BFS or Dijkstra-based path searches, and then augments prompts with concise reasoning paths for LLM inference. The approach achieves state-of-the-art results among modular KG-RAG methods while avoiding KG-specific fine-tuning, offering substantial efficiency gains and broad applicability across LLM backbones. Overall, FRAG advances practical, scalable KG-RAG by leveraging query-driven structural cues to guide retrieval and reasoning, reducing resource use without sacrificing accuracy.
Abstract
To mitigate hallucination and knowledge deficiency in large language models (LLMs), Knowledge Graph (KG)-based Retrieval-Augmented Generation (RAG) has shown promising potential by utilizing KGs as an external resource to enhance LLM reasoning. However, existing KG-RAG approaches struggle with a trade-off between flexibility and retrieval quality. Modular methods prioritize flexibility by avoiding the use of KG-fine-tuned models during retrieval, leading to fixed retrieval strategies and suboptimal retrieval quality. Conversely, coupled methods embed KG information within models to improve retrieval quality, but at the expense of flexibility. In this paper, we propose a novel flexible modular KG-RAG framework, termed FRAG, which synergizes the advantages of both approaches. FRAG estimates the hop range of reasoning paths based solely on the query and classifies the query as either simple or complex. To match the complexity of the query, tailored pipelines are applied to ensure efficient and accurate reasoning path retrieval, thus fostering the final reasoning process. By using the query text instead of the KG to infer the structural information of reasoning paths and employing adaptable retrieval strategies, FRAG improves retrieval quality while maintaining flexibility. Moreover, FRAG does not require extra LLM fine-tuning or calls, significantly boosting efficiency and conserving resources. Extensive experiments show that FRAG achieves state-of-the-art performance with high efficiency and low resource consumption.
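The query-adaptive retrieval idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy graph, the function names (`classify`, `bfs_paths`, `dijkstra_path`, `retrieve`), the hop threshold of 2, the edge weights, and the mapping of simple queries to BFS and complex queries to Dijkstra are all assumptions made for illustration. The predicted hop count is passed in directly, since the hop-range estimator itself is not specified here.

```python
import heapq
from collections import deque

# Hypothetical toy KG: head -> list of (relation, tail, weight).
KG = {
    "A": [("r1", "B", 1.0), ("r2", "C", 4.0)],
    "B": [("r3", "C", 1.0), ("r4", "D", 5.0)],
    "C": [("r5", "D", 1.0)],
    "D": [],
}

def classify(predicted_hops, threshold=2):
    """Label a query from its predicted hop count (threshold is an assumption)."""
    return "simple" if predicted_hops <= threshold else "complex"

def bfs_paths(kg, start, max_hops):
    """Enumerate every path of 1..max_hops edges starting at `start`."""
    paths = []
    queue = deque([(start, [])])
    while queue:
        node, path = queue.popleft()
        if len(path) == max_hops:
            continue  # hop budget exhausted, do not expand further
        for rel, tail, _weight in kg.get(node, []):
            new_path = path + [(node, rel, tail)]
            paths.append(new_path)
            queue.append((tail, new_path))
    return paths

def dijkstra_path(kg, start, goal):
    """Return (cost, path) for the cheapest path to `goal`, or None."""
    counter = 0  # tie-breaker so the heap never compares paths
    heap = [(0.0, counter, start, [])]
    settled = set()
    while heap:
        cost, _, node, path = heapq.heappop(heap)
        if node == goal:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for rel, tail, weight in kg.get(node, []):
            counter += 1
            step = path + [(node, rel, tail)]
            heapq.heappush(heap, (cost + weight, counter, tail, step))
    return None

def retrieve(kg, topic, predicted_hops, targets=(), threshold=2):
    """Pick a search strategy from the predicted query complexity."""
    if classify(predicted_hops, threshold) == "simple":
        return bfs_paths(kg, topic, predicted_hops)
    paths = []
    for target in targets:  # candidate entities are assumed to be given
        hit = dijkstra_path(kg, topic, target)
        if hit is not None:
            paths.append(hit[1])
    return paths
```

In this sketch, `retrieve(KG, "A", 1)` explores the one-hop neighborhood of `A` exhaustively, while `retrieve(KG, "A", 3, targets=["D"])` returns only the cheapest path from `A` to `D`, which keeps the number of candidate reasoning paths small when the hop budget would otherwise make exhaustive enumeration expensive.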
