Keqing: Knowledge-based Question Answering is a Nature Chain-of-Thought Mentor of LLM

Chaojie Wang; Yishi Xu; Zhong Peng; Chenxi Zhang; Bo Chen; Xinrun Wang; Lei Feng; Bo An

TL;DR

The paper tackles hallucination in LLM-based knowledge QA by introducing Keqing, a four-stage KBQA framework that retrieves structured knowledge from a knowledge graph and guides LLMs via interpretable, chain-of-thought–style reasoning paths. It replaces traditional embedding-based retrieval with symbolic KG triplet retrieval aligned to predefined question templates, followed by candidate reasoning over retrieved triplets and a final response generation that exposes the reasoning process. Keqing leverages LoRA-finetuned LLaMA for decomposition, RoBERTa-based template matching, and ChatGPT for final reasoning and explanation, achieving competitive results on MetaQA and WebQSP while improving transparency. The work demonstrates that knowledge-based question answering can serve as a natural CoT mentor for LLMs, enabling scalable, interpretable, and more reliable QA over large knowledge graphs.

Abstract

Large language models (LLMs) have exhibited remarkable performance on various natural language processing (NLP) tasks, especially question answering. However, when faced with questions beyond the scope of their knowledge, LLMs tend to produce confident but incorrect answers; a potential solution is to incorporate an Information Retrieval (IR) module and generate responses based on the retrieved knowledge. In this paper, we present a novel framework that assists LLMs, such as ChatGPT, in retrieving question-related structured information from a knowledge graph, and demonstrate that knowledge-based question answering (Keqing) can be a natural Chain-of-Thought (CoT) mentor that guides the LLM to sequentially find the answer entities of a complex question through interpretable logical chains. Specifically, the workflow of Keqing consists of decomposing a complex question according to predefined templates, retrieving candidate entities on the knowledge graph, reasoning over the answers to sub-questions, and finally generating a response with reasoning paths, which greatly improves the reliability of the LLM's responses. Experimental results on KBQA datasets show that Keqing achieves competitive performance and can illustrate the logic of answering each question.
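
The multi-hop retrieval and response steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy knowledge graph, the entity and relation names, and the helper functions `retrieve`, `answer_chain`, and `render` are all hypothetical. The LLM-driven stages (question decomposition and candidate reasoning) are replaced here by a hard-coded logical chain and by keeping every retrieved entity.

```python
from typing import Dict, List, Set, Tuple

# Toy knowledge graph: (subject, relation) -> set of objects.
# All entities and relations are invented for illustration.
KG: Dict[Tuple[str, str], Set[str]] = {
    ("Movie A", "directed_by"): {"Director X"},
    ("Director X", "directed"): {"Movie A", "Movie B"},
    ("Movie A", "starred_actors"): {"Actor P"},
    ("Movie B", "starred_actors"): {"Actor Q"},
}


def retrieve(seeds: Set[str], relation: str) -> Set[str]:
    """Follow one relation from every seed entity and collect the objects."""
    found: Set[str] = set()
    for entity in seeds:
        found |= KG.get((entity, relation), set())
    return found


def answer_chain(topic: str, chain: List[str]) -> List[Tuple[str, List[str]]]:
    """Walk a logical chain hop by hop, recording each intermediate answer set."""
    trace: List[Tuple[str, List[str]]] = []
    current = {topic}
    for relation in chain:
        current = retrieve(current, relation)
        trace.append((relation, sorted(current)))
    return trace


def render(topic: str, trace: List[Tuple[str, List[str]]]) -> str:
    """Turn the trace into a readable reasoning path (stand-in for response generation)."""
    lines = [f"Start from: {topic}"]
    for hop, (relation, entities) in enumerate(trace, start=1):
        lines.append(f"Step {hop}: follow '{relation}' -> {', '.join(entities) or 'no match'}")
    return "\n".join(lines)


# Hypothetical question: "Who acted in movies by the director of Movie A?"
trace = answer_chain("Movie A", ["directed_by", "directed", "starred_actors"])
print(render("Movie A", trace))
```

Recording every hop, rather than only the final entity set, is what lets the final response expose the reasoning path alongside the answer.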

Paper Structure (16 sections, 4 equations, 6 figures, 4 tables)

Introduction
Related Works
Retrieval-Augmented Language Generation
LLMs for Knowledge Based Question Answering
Method
Decompose Complex Questions through Slot Filling
Retrieve Candidate Entities on Knowledge Graph
Answer Questions with Retrieved Candidate Entities
Generate Response by Summarizing Question Answers
Experiments
Datasets & Baselines
Implementation Details
Qualitative Visualization
Quantitative Comparison
Ablation Study
...and 1 more section

Figures (6)

Figure 1: The workflow of Keqing applied for KBQA mainly consists of four stages: #1 Question Decomposition: decompose a complex question into several sub-questions according to predefined question templates; #2 Knowledge Retrieval: retrieve candidate entities on the knowledge graph by aligning decomposed sub-questions to pre-collected logical chains; #3 Candidate Reasoning: select the correct answer from the candidate answers to solve each sub-question; #4 Response Generation: generate response by summarizing multiple rounds of questions and answers.
Figure 2: The pipeline of aligning decomposed sub-questions to executable logical chains on KG, where each sub-question will be mapped to a set of logical chains of top-K relevant question templates.
Figure 3: Case study of evaluating Keqing on the testing samples of various KBQA benchmarks.
Figure 4: The compound value types (CVTs) of the Freebase dataset, where each triplet $(s,r,o)$ will be converted to text by serializing its text surface forms.
Figure 5: Performance of Keqing on WebQSP using different numbers of question templates to match each sub-question.
...and 1 more figure
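
The alignment step in the Figure 2 caption, where each sub-question is mapped to the logical chains of its top-K most relevant question templates, can be sketched as follows. This is a rough illustration only: the TL;DR says the paper uses RoBERTa-based template matching, whereas this sketch swaps in a plain token-overlap (Jaccard) score so that it runs without any model. The template strings, the relation names, and the function `top_k_templates` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical question templates, each paired with a logical chain of relations.
TEMPLATES: Dict[str, List[str]] = {
    "who directed [movie]": ["directed_by"],
    "who acted in [movie]": ["starred_actors"],
    "what movies did [person] direct": ["directed"],
}


def tokens(text: str) -> Set[str]:
    """Lowercase, drop question marks, and split on whitespace."""
    return set(text.lower().replace("?", "").split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Token-overlap similarity; a stand-in for a learned sentence matcher."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def top_k_templates(sub_question: str, k: int = 2) -> List[Tuple[str, List[str]]]:
    """Return the k most similar templates together with their logical chains."""
    q = tokens(sub_question)
    ranked = sorted(TEMPLATES, key=lambda t: jaccard(q, tokens(t)), reverse=True)
    return [(t, TEMPLATES[t]) for t in ranked[:k]]


for template, chain in top_k_templates("who directed [movie]?", k=2):
    print(template, "->", chain)
```

Keeping K above 1 means a sub-question still reaches the right logical chain when the best-scoring template is a near miss, at the cost of retrieving more candidate entities.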
