Beyond Uncertainty Quantification: Learning Uncertainty for Trust-Informed Neural Network Decisions - A Case Study in COVID-19 Classification

Hassan Gharoun, Mohammad Sadegh Khorshidi, Fang Chen, Amir H. Gandomi

TL;DR

This paper tackles the trust deficit in AI-assisted medical diagnosis by addressing confidently incorrect predictions that arise under standard uncertainty quantification. It introduces Uncertainty-Aware Stacked Neural Networks (U-SNN), a two-tier framework where a base model provides predictions with uncertainty estimates and a meta-model learns to flag trustworthiness based on input features and the base uncertainty, using a trust flag defined by $z_i=1$ if the prediction is correct and confidently so. Uncertainty is quantified via Monte-Carlo Dropout, with prediction entropy $PE$ serving as the key uncertainty metric, and a meta-model trained on the base outputs plus $e_i$ to decide whether to trust a prediction. Across the COVIDx CXR-4 dataset, U-SNN consistently reduces confidently incorrect predictions (lower FCR and CE) and unnecessary referrals (lower RR) relative to a traditional threshold-based approach, especially at practical lower confidence thresholds, demonstrating improved trust and efficiency for high-stakes decision support in medical imaging. The study also shows that including PE as meta-input enhances performance, and suggests future work on further calibrating the base model to reduce referral burden while maintaining reliability.
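
The quantities named above can be illustrated with a minimal sketch. It assumes that Monte-Carlo Dropout yields a list of class-probability vectors (one per stochastic forward pass), that prediction entropy is computed on their mean, and that "confident" means the entropy is at or below a chosen threshold. The function names, the threshold value, and the probabilities are hypothetical and are not taken from the paper's code.

```python
import math


def predictive_entropy(mc_probs):
    """Return (mean class probabilities, entropy of that mean).

    mc_probs: list of T probability vectors, one per stochastic forward
    pass with dropout kept active (assumed input format).
    """
    n_passes = len(mc_probs)
    n_classes = len(mc_probs[0])
    # Average the class probabilities over the T passes.
    mean_probs = [sum(p[c] for p in mc_probs) / n_passes for c in range(n_classes)]
    # Entropy of the averaged distribution; zero-probability terms contribute 0.
    entropy = -sum(p * math.log(p) for p in mean_probs if p > 0.0)
    return mean_probs, entropy


def trust_flag(mean_probs, entropy, true_label, threshold):
    """z_i = 1 only if the prediction is correct AND confident.

    Assumption: 'confident' is read as entropy <= threshold.
    """
    predicted = max(range(len(mean_probs)), key=mean_probs.__getitem__)
    return int(predicted == true_label and entropy <= threshold)


# Illustrative, made-up samples: three passes over one binary example.
samples = [[0.97, 0.03], [0.99, 0.01], [0.98, 0.02]]
probs, pe = predictive_entropy(samples)
z = trust_flag(probs, pe, true_label=0, threshold=0.1)
```

In this reading, labels such as `z` would serve as training targets for the meta-model, which is then asked to predict the flag at inference time when the true label is unavailable.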

Abstract

Reliable uncertainty quantification is critical in high-stakes applications, such as medical diagnosis, where confidently incorrect predictions can erode trust in automated decision-making systems. Traditional uncertainty quantification methods rely on a predefined confidence threshold to classify predictions as confident or uncertain. However, this approach assumes that predictions exceeding the threshold are trustworthy, while those below it are uncertain, without explicitly assessing the correctness of high-confidence predictions. As a result, confidently incorrect predictions may still occur, leading to misleading uncertainty assessments. To address this limitation, this study proposes an uncertainty-aware stacked neural network, which extends conventional uncertainty quantification by learning when predictions should be trusted. The framework consists of a two-tier model: the base model generates predictions with uncertainty estimates, while the meta-model learns to assign a trust flag, distinguishing confidently correct cases from those requiring expert review. The proposed approach is evaluated against the traditional threshold-based method across multiple confidence thresholds and pre-trained architectures using the COVIDx CXR-4 dataset. Results demonstrate that the proposed framework significantly reduces confidently incorrect predictions, offering a more trustworthy and efficient decision-support system for high-stakes domains.

Paper Structure

This paper contains 16 sections, 3 equations, 5 figures, and 6 tables.

Figures (5)

  • Figure 1: Uncertainty-aware Stacked Neural Networks (U-SNN) Schema.
  • Figure 2: Example images from the COVIDx CXR-4 dataset.
  • Figure 3: Comparison of Uncertainty-Informed Criteria Across Pre-trained Models at a Confidence Threshold of 0.1, U-SNN and Traditional Threshold-based Method.
  • Figure 4: Trend Analysis of Uncertainty-Informed Criteria Across Various Pre-trained Models at Incremental Confidence Thresholds from 0.05 to 0.4.
  • Figure 5: Comparative Performance of U-SNN With and Without PE as Input at Confidence Threshold 0.1.