Knowledge-Augmented Question Error Correction for Chinese Question Answer System with QuestionRAG
Longpeng Qiu, Ting Li, Shuai Mao, Nan Yang, Xiaohui Yan
TL;DR
This work tackles input errors in Chinese QA by addressing two LLM failure modes: misinterpretation and over-correction. It introduces QuestionRAG, which uses retrieval-augmented generation to supply external context (web results, similar questions, and entity descriptions) that guides corrections. To prevent over-correction, it applies reinforcement learning with GRPO (Group Relative Policy Optimization), using a reward that combines format fidelity with accuracy measured by a CER-based distance. Experiments on QCSet, MCSCSet, and QSpell show that knowledge augmentation substantially improves performance and that GRPO-based post-training yields the best results, with knowledge augmentation sometimes surpassing the gains from simply scaling the model. The approach holds promise for robust question understanding and can extend to tasks such as question rewriting and search-query understanding, though its effectiveness hinges on retrieval quality and it incurs additional latency.
Abstract
Input errors in question-answering (QA) systems often lead to incorrect responses. Large language models (LLMs) struggle to correct such errors, frequently failing to interpret user intent (misinterpretation) or unnecessarily altering the original question's structure (over-correction). We propose QuestionRAG, a framework that tackles both problems. To address misinterpretation, it enriches the input with external knowledge (e.g., search results, related entities). To prevent over-correction, it uses reinforcement learning (RL) to align the model's objective with precise correction rather than paraphrasing. Our results demonstrate that knowledge augmentation is critical for understanding faulty questions. Furthermore, RL-based alignment proves significantly more effective than traditional supervised fine-tuning (SFT), boosting the model's ability to follow instructions and generalize. By integrating these two strategies, QuestionRAG unlocks the full potential of LLMs for the question correction task.
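
To make the reward mentioned in the TL;DR more concrete, the following is a minimal sketch of how a format term and a CER-based accuracy term might be combined. It is an illustration under stated assumptions, not the authors' implementation: the `<answer>...</answer>` output template, the `format_weight` value, the linear combination, and all function names are hypothetical. Only the general idea (format fidelity plus a character error rate distance) comes from the text above.

```python
import re


def edit_distance(a: str, b: str) -> int:
    """Character-level Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(prediction: str, reference: str) -> float:
    """Character error rate: edit distance normalized by reference length."""
    if not reference:
        return 0.0 if not prediction else 1.0
    return edit_distance(prediction, reference) / len(reference)


# Assumption: the model is asked to wrap its correction in <answer> tags.
ANSWER_PATTERN = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def reward(model_output: str, reference: str, format_weight: float = 0.2) -> float:
    """Hypothetical reward: a fixed format bonus plus a CER-based accuracy term."""
    match = ANSWER_PATTERN.fullmatch(model_output.strip())
    if match is None:
        return 0.0  # malformed output earns nothing
    accuracy = max(0.0, 1.0 - cer(match.group(1).strip(), reference))
    return format_weight + (1.0 - format_weight) * accuracy
```

Because CER grows with every character that differs from the reference, a fluent paraphrase that rewrites the question scores lower than a minimal fix under this sketch, which is the behavior the over-correction penalty is meant to encourage.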
