Knowledge-Augmented Question Error Correction for Chinese Question Answer System with QuestionRAG

Longpeng Qiu, Ting Li, Shuai Mao, Nan Yang, Xiaohui Yan

TL;DR

This work tackles input errors in Chinese QA by addressing two LLM failure modes: misinterpretation and over-correction. It introduces QuestionRAG, which uses retrieval-augmented generation to supply external context (web results, similar questions, and entity descriptions) that guides corrections. To prevent over-correction, it applies GRPO-based reinforcement learning with a reward that combines format fidelity and accuracy (measured with a CER-based distance). Experiments on QCSet, MCSCSet, and QSpell show that knowledge augmentation substantially improves performance and that RL-aligned post-training with GRPO yields the best results; in some cases knowledge augmentation helps more than simply scaling the model. The approach is promising for robust question understanding and can extend to tasks such as question rewriting and search-query understanding, though its effectiveness depends on retrieval quality and the retrieval step adds latency.
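
To make the CER-based accuracy term concrete, here is a minimal sketch of character error rate (CER): the character-level edit distance between a model output and the reference, divided by the reference length. The `toy_reward` function is purely hypothetical. The summary above says only that the reward combines format fidelity with a CER-based distance, so the format check and the `1 - CER` mapping below are illustrative assumptions, not the authors' formula.

```python
def edit_distance(hyp: str, ref: str) -> int:
    """Character-level Levenshtein distance (insert, delete, substitute all cost 1)."""
    prev = list(range(len(ref) + 1))  # distances from the empty hypothesis prefix
    for i, h in enumerate(hyp, start=1):
        curr = [i]  # distance from hyp[:i] to the empty reference prefix
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete h
                            curr[j - 1] + 1,      # insert r
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def cer(hyp: str, ref: str) -> float:
    """Character error rate: edit distance divided by the reference length."""
    if not ref:
        raise ValueError("reference must be non-empty")
    return edit_distance(hyp, ref) / len(ref)


def toy_reward(output: str, reference: str) -> float:
    """Hypothetical reward (assumption, not from the paper).

    Returns 0 when a simple format check fails, otherwise 1 - CER clipped to [0, 1].
    """
    text = output.strip()
    if not text or "\n" in text:  # assumed format rule: one non-empty line
        return 0.0
    return 1.0 - min(cer(text, reference), 1.0)
```

Under a reward of this shape, a paraphrase that preserves meaning but rewrites many characters scores lower than a minimal fix, which is the behavior that discourages over-correction.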

Abstract

Input errors in question-answering (QA) systems often lead to incorrect responses. Large language models (LLMs) struggle with this task, frequently failing to interpret user intent (misinterpretation) or unnecessarily altering the original question's structure (over-correction). We propose QuestionRAG, a framework that tackles these problems. To address misinterpretation, it enriches the input with external knowledge (e.g., search results, related entities). To prevent over-correction, it uses reinforcement learning (RL) to align the model's objective with precise correction, not just paraphrasing. Our results demonstrate that knowledge augmentation is critical for understanding faulty questions. Furthermore, RL-based alignment proves significantly more effective than traditional supervised fine-tuning (SFT), boosting the model's ability to follow instructions and generalize. By integrating these two strategies, QuestionRAG unlocks the full potential of LLMs for the question correction task.

Paper Structure

This paper contains 21 sections, 1 equation, 2 figures, and 6 tables.

Figures (2)

  • Figure 1: The workflow of QuestionRAG. A search stage collects relevant webpages, questions, and entities from external knowledge sources, using multi-facet search (n-grams for lexical similarity, embeddings for semantic similarity, entities for conceptual similarity, Pinyin for phonetic similarity, and so on) over multiple knowledge sources, either general or domain-specific. With the search results as augmented knowledge, an LLM trained with reinforcement learning generates the correction.
  • Figure 2: Impact of model size on CER(%) for Qwen3 models (1.7B to 32B) within the QuestionRAG framework, illustrating how increasing model size affects performance.
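
As a rough illustration of one facet of the multi-facet search in Figure 1, the sketch below ranks candidate questions by character n-gram overlap (lexical similarity) using Jaccard similarity. The function names, the choice of bigrams, and the Jaccard score are assumptions made for illustration; the captions do not specify how the n-gram facet is scored, and the embedding, entity, and Pinyin facets are omitted.

```python
def char_ngrams(text: str, n: int = 2) -> set[str]:
    """Set of character n-grams in text (empty if text is shorter than n)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |a & b| / |a | b|, defined as 0.0 for two empty sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_by_ngram(query: str, candidates: list[str], n: int = 2) -> list[tuple[float, str]]:
    """Score each candidate against the query and sort from most to least similar."""
    q = char_ngrams(query, n)
    scored = [(jaccard(q, char_ngrams(c, n)), c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Character n-grams suit Chinese text because they need no word segmentation, and a question with one or two wrong characters still shares most of its n-grams with the intended question, so the correct form can surface among the retrieved candidates.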