KSHSeek: Data-Driven Approaches to Mitigating and Detecting Knowledge-Shortcut Hallucinations in Generative Models

Zhongxin Liu, Zhiwei Wang, Jun Niu, Ying Li, Hongyu Sun, Meng Xu, He Wang, Gaofei Wu, Yuqing Zhang

TL;DR

Knowledge-shortcut hallucinations arise when LLMs rely on high-similarity, high-frequency data patterns rather than genuine factual understanding. The authors propose a data-level High Similarity Pruning method to mitigate these effects and a hybrid detection framework that combines semantic similarity with self-check uncertainty in Context-Question-Answer tasks. Across diverse datasets and model scales, the approach reduces knowledge-shortcut hallucinations—particularly during fine-tuning—without compromising correct-answer quality, and the methods are validated with reproducible experiments and open-source code. This work introduces a data-driven paradigm for enhancing the reliability of generative models in real-world QA applications.

Abstract

The emergence of large language models (LLMs) has significantly advanced the development of natural language processing (NLP), especially in text generation tasks like question answering. However, model hallucinations remain a major challenge in natural language generation (NLG) tasks due to their complex causes. We systematically expand on the causes of factual hallucinations from the perspective of knowledge shortcuts, analyzing hallucinations arising from correct and defect-free data and demonstrating that knowledge-shortcut hallucinations are prevalent in generative models. To mitigate this issue, we propose a high similarity pruning algorithm at the data preprocessing level to reduce spurious correlations in the data. Additionally, we design a specific detection method for knowledge-shortcut hallucinations to evaluate the effectiveness of our mitigation strategy. Experimental results show that our approach effectively reduces knowledge-shortcut hallucinations, particularly in fine-tuning tasks, without negatively impacting model performance in question answering. This work introduces a new paradigm for mitigating specific hallucination issues in generative models, enhancing their robustness and reliability in real-world applications.
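The idea of pruning highly similar samples at the data preprocessing level can be illustrated with a minimal sketch. This is not the authors' algorithm: the greedy keep-or-drop strategy, the whitespace tokenization, the function names `jaccard` and `prune_high_similarity`, and the `threshold` value are all assumptions made for illustration. Jaccard similarity is used only because it is one of the similarity metrics named in the paper's figures.

```python
from typing import List


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the lowercase word sets of two strings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0  # two empty texts are treated as identical
    return len(sa & sb) / len(sa | sb)


def prune_high_similarity(samples: List[str], threshold: float = 0.8) -> List[str]:
    """Greedy pruning (hypothetical): keep a sample only if it is not
    too similar to any sample that has already been kept."""
    kept: List[str] = []
    for sample in samples:
        if all(jaccard(sample, k) < threshold for k in kept):
            kept.append(sample)
    return kept


if __name__ == "__main__":
    data = [
        "the cat sat on the mat",
        "the cat sat on the mat today",  # near-duplicate of the first sample
        "dogs bark loudly at night",
    ]
    print(prune_high_similarity(data, threshold=0.8))
```

In this sketch the second sample is dropped because it shares five of six distinct words with the first, while the unrelated third sample is kept. A lower threshold would prune more aggressively, and the pairwise comparison is quadratic in the number of kept samples, so a real pipeline would likely need an approximate method for large corpora.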

Paper Structure

This paper contains 27 sections, 9 equations, 15 figures, 9 tables, 2 algorithms.

Figures (15)

  • Figure 1: An example of a knowledge-shortcut hallucination in CQA tasks.
  • Figure 2: Overview of detection and mitigation.
  • Figure 3: Evaluation of nanoGPT models under the fine-tuning method across three parameter scales: normal, medium, and large (from left to right). Each x-axis represents a different similarity metric: Jaccard similarity, TF-IDF similarity, and pre-trained model-based similarity, respectively.
  • Figure 4: Evaluation of nanoGPT models under the training method across three parameter scales: normal, medium, and large (from left to right). Each x-axis represents a different similarity metric: Jaccard similarity, TF-IDF similarity, and pre-trained model-based similarity, respectively.
  • Figure 5: The number of knowledge-shortcut hallucinations in CQA tasks before and after mitigation.
  • ...and 10 more figures