CoTKR: Chain-of-Thought Enhanced Knowledge Rewriting for Complex Knowledge Graph Question Answering
Yike Wu, Yi Huang, Nan Hu, Yuncheng Hua, Guilin Qi, Jiaoyan Chen, Jeff Z. Pan
TL;DR
This work tackles factuality and semantic alignment in KGQA by introducing CoTKR, a chain-of-thought enhanced knowledge rewriting method that interleaves reasoning traces with knowledge extraction. It pairs CoTKR with PAQAF, a preference-alignment framework that uses QA feedback and direct preference optimization to tailor rewrites to a given QA model. Across GrailQA and GraphQuestions, using diverse LLMs and retrieval methods, CoTKR and PAQAF yield consistent performance gains over traditional knowledge rewriting (KR) baselines, highlighting the benefit of structured, reasoning-driven knowledge representations. The results suggest that carefully crafted natural-language knowledge representations, when aligned with QA objectives, can significantly improve KGQA and may generalize to other retrieval-augmented reasoning tasks.
Abstract
Recent studies have explored the use of Large Language Models (LLMs) with Retrieval Augmented Generation (RAG) for Knowledge Graph Question Answering (KGQA). These approaches typically require rewriting retrieved subgraphs into natural language formats comprehensible to LLMs. However, when tackling complex questions, the knowledge rewritten by existing methods may include irrelevant information, omit crucial details, or fail to align with the question's semantics. To address these issues, we propose CoTKR (Chain-of-Thought Enhanced Knowledge Rewriting), a novel rewriting method that generates reasoning traces and the corresponding knowledge in an interleaved manner, thereby mitigating the limitations of single-step knowledge rewriting. Additionally, to bridge the preference gap between the knowledge rewriter and the question answering (QA) model, we propose PAQAF (Preference Alignment from Question Answering Feedback), a training strategy that leverages feedback from the QA model to further optimize the knowledge rewriter. We conduct experiments using various LLMs across several KGQA benchmarks. Experimental results demonstrate that, compared with previous knowledge rewriting methods, CoTKR generates the most beneficial knowledge representation for QA models, which significantly improves the performance of LLMs in KGQA.
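The preference-alignment idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the substring-based answer check, and the way candidate rewrites are paired are all assumptions made for illustration. The first function turns QA feedback into (preferred, dispreferred) rewrite pairs, and the second computes the standard direct preference optimization loss for one pair from sequence log-probabilities.

```python
import math
from typing import Callable, List, Tuple


def build_preference_pairs(
    question: str,
    gold_answers: List[str],
    candidates: List[str],
    qa_model: Callable[[str, str], str],
) -> List[Tuple[str, str]]:
    """Pair rewrites that led to a correct answer with ones that did not.

    `qa_model(question, knowledge)` is a hypothetical callable returning the
    QA model's answer text. Correctness is judged by a simple substring
    match, which is an assumption of this sketch.
    """
    helpful: List[str] = []
    unhelpful: List[str] = []
    for knowledge in candidates:
        answer = qa_model(question, knowledge).lower()
        hit = any(gold.lower() in answer for gold in gold_answers)
        (helpful if hit else unhelpful).append(knowledge)
    # Every helpful rewrite is preferred over every unhelpful one.
    return [(good, bad) for good in helpful for bad in unhelpful]


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """Standard DPO loss for one pair: -log(sigmoid(beta * margin))."""
    margin = beta * (
        (policy_chosen_logp - ref_chosen_logp)
        - (policy_rejected_logp - ref_rejected_logp)
    )
    # Numerically stable softplus(-margin), equal to -log(sigmoid(margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
```

In practice the log-probabilities would come from the knowledge rewriter being trained and a frozen reference copy of it; here they are plain floats so the sketch stays self-contained.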
