Large Language Models are Skeptics: False Negative Problem of Input-conflicting Hallucination

Jongyoon Song; Sangwon Yu; Sungroh Yoon

TL;DR

The paper investigates a bias in large language models: when judging whether a statement is correct given a context, they disproportionately produce false negatives, a form of input-conflicting hallucination. It introduces All-True and All-False prompts to isolate this bias and runs experiments on StrategyQA and BoolQ with Mistral, ChatGPT, and GPT-4, revealing a robust tendency to deny true statements and to be overconfident in incorrect responses. The study finds that the false negative problem is largely independent of model size and depends on whether the target answer is True, with higher confidence for incorrect False predictions under All-True prompts. Context and query rewriting partially mitigate the problem, offering a practical direction for improving reliability in context-grounded reasoning, though model-specific reactions (notably GPT-4's null responses in some rewriting scenarios) warrant further investigation.

Abstract

In this paper, we identify a new category of bias that induces input-conflicting hallucinations, where large language models (LLMs) generate responses inconsistent with the content of the input context. This issue, which we term the false negative problem, refers to the phenomenon where LLMs are predisposed to return negative judgments when assessing the correctness of a statement given the context. In experiments involving pairs of statements that contain the same information but have contradictory factual directions, we observe that LLMs exhibit a bias toward false negatives. Specifically, the models exhibit greater overconfidence when responding with False. Furthermore, we analyze the relationship between the false negative problem and context and query rewriting and observe that both effectively tackle false negatives in LLMs.
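The bias described above is measured as the share of false negatives among all samples. The following is a minimal sketch of that bookkeeping, not the authors' evaluation code. The `Sample` class, the `error_ratios` helper, and the toy predictions are all hypothetical, and the sketch assumes (as the prompt names suggest) that every statement in an All-True prompt has the gold label True and every statement in an All-False prompt has the gold label False.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    gold: bool       # label the statement should receive given the context
    predicted: bool  # label the model actually returned


def error_ratios(samples):
    """Return (false_positive_ratio, false_negative_ratio) over all samples."""
    n = len(samples)
    if n == 0:
        return 0.0, 0.0
    # False positive: model says True although the gold label is False.
    fp = sum(1 for s in samples if s.predicted and not s.gold)
    # False negative: model says False although the gold label is True.
    fn = sum(1 for s in samples if s.gold and not s.predicted)
    return fp / n, fn / n


# Toy predictions invented for illustration only (not results from the paper).
all_true_run = [Sample(True, True), Sample(True, False),
                Sample(True, False), Sample(True, True)]
all_false_run = [Sample(False, False), Sample(False, False),
                 Sample(False, True), Sample(False, False)]

fp_true, fn_true = error_ratios(all_true_run)
fp_false, fn_false = error_ratios(all_false_run)
print(f"All-True:  FP={fp_true:.2f} FN={fn_true:.2f}")
print(f"All-False: FP={fp_false:.2f} FN={fn_false:.2f}")
```

Under this assumption, every error in an All-True run is a false negative and every error in an All-False run is a false positive, so a gap between the two error ratios on statement pairs carrying the same information points to a directional bias rather than a lack of knowledge.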

Paper Structure (24 sections, 9 figures, 3 tables)

This paper contains 24 sections, 9 figures, 3 tables.

Introduction
False Negative Problems of Large Language Models
Experimental Setup
Dataset and Models
All-True and All-False Prompts
Discrepancy in Accuracy between All-True and All-False Prompts
Prediction Confidence Analysis
Rewriting and False Negative Problems
Related Work
Conclusion
Experiments on BoolQ Dataset
Case Study
Knowledge Conflict and False Negative Problems
Data Splitting
Experimental Results
...and 9 more sections

Figures (9)

Figure 1: An example of the false negative problem in ChatGPT. The ground truth of the question is True, yet the large language model responds False to both statements. The context and question are sampled from StrategyQA (Geva et al., 2021).
Figure 2: The ratio of false positives and false negatives among all samples in the Original, All-True, and All-False prompts.
Figure 3: Histograms of the LLMs' confidence in their predicted labels for all samples in the All-True (top) and All-False (bottom) prompts.
Figure 4: The ratio of false positives and false negatives among all samples in the Original, All-True, and All-False prompts on the BoolQ dataset.
Figure 5: Histograms of the LLMs' confidence in their predicted labels for all samples in the All-True (top) and All-False (bottom) prompts on the BoolQ dataset.
...and 4 more figures
