Mapping Clinical Doubt: Locating Linguistic Uncertainty in LLMs

Srivarshinee Sridhar, Raghav Kaushik Ravi, Kripabandhu Ghosh

TL;DR

The paper investigates how linguistic uncertainty, encoded through epistemic modality, is internally represented in LLMs. It introduces Model Sensitivity to Uncertainty (MSU), a layerwise activation-difference metric, and a 3,114-pair dataset to probe how uncertainty cues shift internal representations. Results show MSU increases with depth across multiple models, with later layers encoding more epistemic information and PCA analyses revealing clustering and geometric inversions in deep layers. These findings suggest a distributed, late-emergent encoding of epistemic cues, with important implications for interpretability and reliability in clinical contexts.

Abstract

Large Language Models (LLMs) are increasingly used in clinical settings, where sensitivity to linguistic uncertainty can influence diagnostic interpretation and decision-making. Yet little is known about where such epistemic cues are internally represented within these models. Distinct from uncertainty quantification, which measures output confidence, this work examines input-side representational sensitivity to linguistic uncertainty in medical text. We curate a contrastive dataset of clinical statements varying in epistemic modality (e.g., 'is consistent with' vs. 'may be consistent with') and propose Model Sensitivity to Uncertainty (MSU), a layerwise probing metric that quantifies activation-level shifts induced by uncertainty cues. Our results show that LLMs exhibit structured, depth-dependent sensitivity to clinical uncertainty, suggesting that epistemic information is progressively encoded in deeper layers. These findings reveal how linguistic uncertainty is internally represented in LLMs, offering insight into their interpretability and epistemic reliability.
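The idea of a layerwise activation-shift score can be illustrated with a small sketch. This is not the paper's MSU equation, which is not reproduced in this summary; it assumes, purely for illustration, that the shift at each layer is the mean Euclidean distance between the activation vectors of the certain and uncertain member of each pair. The function name `layerwise_shift` and the toy inputs are hypothetical.

```python
import math

def layerwise_shift(certain, uncertain):
    """Mean L2 distance between paired activations, computed per layer.

    certain[l][i] and uncertain[l][i] are the activation vectors at layer l
    for the certain and uncertain version of pair i.
    ASSUMPTION: plain Euclidean distance averaged over pairs; the paper's
    actual MSU definition may normalize or aggregate differently.
    """
    scores = []
    for layer_c, layer_u in zip(certain, uncertain):
        dists = [
            math.sqrt(sum((u - c) ** 2 for c, u in zip(vec_c, vec_u)))
            for vec_c, vec_u in zip(layer_c, layer_u)
        ]
        scores.append(sum(dists) / len(dists))
    return scores

# Toy data: 2 layers, 2 sentence pairs, 2-dimensional activations.
toy_certain = [
    [[0.0, 0.0], [1.0, 1.0]],  # layer 0
    [[0.0, 0.0], [1.0, 1.0]],  # layer 1
]
toy_uncertain = [
    [[0.0, 0.0], [1.0, 1.0]],  # layer 0: identical to the certain inputs
    [[3.0, 4.0], [1.0, 1.0]],  # layer 1: first pair shifted by distance 5
]

scores = layerwise_shift(toy_certain, toy_uncertain)
print(scores)
```

In a real experiment the vectors would come from a model's hidden states (for example, the last-token activation at each layer), and a score that rises with layer index would correspond to the depth-dependent sensitivity described above.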

Paper Structure

This paper contains 13 sections, 1 equation, 10 figures, 2 tables.

Figures (10)

  • Figure 1: Although the prompt pairs differ only in epistemic modality ("should" vs. "could"), the responses vary: prompts ending with "could" tend to elicit a broader, more open-ended range of medical possibilities than prompts ending with "should". This may reflect how the model interprets linguistic uncertainty.
  • Figure 2: Paired inputs with linguistic certainty and uncertainty are passed through the model. Layerwise activation differences ($\Delta$) are used to compute MSU, capturing the model's sensitivity to uncertainty across layers.
  • Figure 3: PCA plots of the last-token activations at layers 10 and 17 for the Qwen2.5-0.5B-Instruct model. A geometric inversion can be observed between the projections of the uncertain and certain input activations.
  • Figure 4: Layer-wise MSU scores for Qwen2.5-0.5B-Instruct increase progressively across layers, suggesting that later layers are responsible for encoding linguistic uncertainty.
  • Figure 5: Layer-wise MSU scores for Qwen2.5-0.5B-Chat. Among all variants, the Chat model exhibits the highest sensitivity to linguistic uncertainty, with a steep increase in MSU across deeper layers.
  • ...and 5 more figures