Explaining Sources of Uncertainty in Automated Fact-Checking

Jingyi Sun, Greta Warren, Irina Shklovski, Isabelle Augenstein

TL;DR

The paper tackles the problem of explaining uncertainty in automated fact-checking by grounding model uncertainty in explicit span-level interactions between a claim and multiple pieces of evidence. It introduces CLUE, a plug-and-play framework that (1) identifies, without supervision, conflict- and agreement-bearing spans across claim–evidence and evidence–evidence pairs, (2) quantifies predictive uncertainty as the entropy $u(X)$ of the distribution $P(y_i|X)$ over the candidate labels, and (3) generates uncertainty explanations through instruction-based prompting or attention steering centered on the extracted spans. Empirically, CLUE improves faithfulness to the model's uncertainty and consistency with fact-checking labels across three open-weight LLMs and two fact-checking datasets, and human evaluators find its explanations more helpful, more informative, less redundant, and more logically consistent with the input than those from a baseline prompt without span-interaction guidance. Because CLUE requires no fine-tuning or architectural changes, grounding explanations in concrete evidentiary conflicts offers practical support for fact-checking and may carry over to other tasks that require reasoning over complex information.
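As an illustration of step (2), the entropy $u(X)$ over candidate labels can be sketched as below. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `predictive_entropy`, the label names, and the probability values are hypothetical, and the natural logarithm is assumed as the base.

```python
import math


def predictive_entropy(label_probs):
    """Shannon entropy (in nats) of a label distribution P(y_i | X).

    `label_probs` maps each candidate label to a non-negative score.
    Scores are renormalised so they sum to 1 before the entropy is taken.
    """
    total = sum(label_probs.values())
    if total <= 0:
        raise ValueError("label scores must sum to a positive value")

    entropy = 0.0
    for score in label_probs.values():
        p = score / total
        if p > 0:  # 0 * log(0) is treated as 0
            entropy -= p * math.log(p)
    return entropy


# Hypothetical label distributions (illustrative values, not from the paper).
confident = {"SUPPORTS": 0.90, "REFUTES": 0.05, "NOT ENOUGH INFO": 0.05}
conflicted = {"SUPPORTS": 0.40, "REFUTES": 0.40, "NOT ENOUGH INFO": 0.20}

print(predictive_entropy(confident))
print(predictive_entropy(conflicted))
```

Under this sketch, a distribution split between competing labels yields a higher entropy than one concentrated on a single label, which is the kind of uncertainty the explanations are meant to account for.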

Abstract

Understanding sources of a model's uncertainty regarding its predictions is crucial for effective human-AI collaboration. Prior work proposes using numerical uncertainty or hedges ("I'm not sure, but ..."), which do not explain uncertainty that arises from conflicting evidence, leaving users unable to resolve disagreements or rely on the output. We introduce CLUE (Conflict-and-Agreement-aware Language-model Uncertainty Explanations), the first framework to generate natural language explanations of model uncertainty by (i) identifying relationships between spans of text that expose claim-evidence or inter-evidence conflicts and agreements that drive the model's predictive uncertainty in an unsupervised way, and (ii) generating explanations via prompting and attention steering that verbalize these critical interactions. Across three language models and two fact-checking datasets, we show that CLUE produces explanations that are more faithful to the model's uncertainty and more consistent with fact-checking decisions than prompting for uncertainty explanations without span-interaction guidance. Human evaluators judge our explanations to be more helpful, more informative, less redundant, and more logically consistent with the input than this baseline. CLUE requires no fine-tuning or architectural changes, making it plug-and-play for any white-box language model. By explicitly linking uncertainty to evidence conflicts, it offers practical support for fact-checking and generalises readily to other tasks that require reasoning over complex information.

Paper Structure

This paper contains 55 sections, 13 equations, 6 figures, 5 tables.

Figures (6)

  • Figure 1: Example of claim and evidence documents, alongside span interactions for uncertainty and generated natural language explanations.
  • Figure 2: Explanations produced by earlier systems, e-FEVER (Stammbach and Ash, 2020), Explain-MT (Atanasova et al., 2020), and JustiLM (Zeng and Gao, 2024), compared with those from our CLUE framework. CLUE is the only approach that explicitly traces model uncertainty to the conflicts and agreements between the claim and multiple evidence passages.
  • Figure 3: Prompt template for span interaction relation labelling.
  • Figure 4: Three-shot prompt for PromptBaseline (Shots 2–3 omitted) on the HealthVer and DRUID datasets.
  • Figure 5: Three-shot prompt for CLUE-Span and CLUE-Span+Steering (Shots 2–3 omitted) on the HealthVer and DRUID datasets.
  • ...and 1 more figure