Expert-aware uncertainty estimation for quality control of neural-based blood typing

Ekaterina Zaychenkova; Dmitrii Iarchuk; Sergey Korchagin; Alexey Zaitsev; Egor Ershov

TL;DR

This work addresses uncertainty estimation for neural networks used as medical second-opinion systems by introducing expert-aware uncertainty quantification (EAUQ), which combines ground-truth labels with expert assessments of case complexity. Uncertainty is decomposed into aleatoric and epistemic parts, $UQ(x) = UQ_{\mathfrak{a}}(x) + UQ_{\mathfrak{e}}(x)$. The expert-derived metric $MP(x) = 1 - \max(\overline{e}(x), 1 - \overline{e}(x))$, where $\overline{e}(x)$ is the average expert response, captures the aleatoric component, and the ensemble standard deviation $\mathrm{STD}(x)$ captures the epistemic component. The authors implement a dual-path pipeline: a 20-net ensemble (CE) augmented by $MP$, and a deterministic Expert-Aware Network (EAN) trained to emulate average expert responses, which is extended to an Expert-Aware Ensemble (EAE). A new dataset, BloodyWell, contains 3139 serology images with six expert assessments each and supports evaluation of uncertainty estimation in blood typing (ABO, RH, KELL). Experiments show a 2.5× improvement in uncertainty prediction when expert labels are available and a 35% gain when a neural estimate of expert consensus is used instead, which illustrates the practical value of combining expert input with ensemble approaches for safe, reliable medical AI.
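The two quantities named above can be illustrated with a short NumPy sketch. It is a minimal illustration under stated assumptions, not the authors' implementation: expert votes are assumed to be binary (1 for agglutination, 0 for none), $\overline{e}(x)$ is taken as their mean, $\mathrm{STD}(x)$ is taken as the population standard deviation of the ensemble members' predicted probabilities, and the two terms are summed without rescaling. The function names `expert_mp`, `ensemble_std`, and `combined_uncertainty` are hypothetical.

```python
import numpy as np

def expert_mp(expert_votes):
    """MP(x) = 1 - max(e_bar(x), 1 - e_bar(x)).

    expert_votes: array of shape (n_samples, n_experts) with 0/1 entries
    (assumed encoding). Returns 0 when experts are unanimous and 0.5 when
    they are evenly split.
    """
    e_bar = np.asarray(expert_votes, dtype=float).mean(axis=1)
    return 1.0 - np.maximum(e_bar, 1.0 - e_bar)

def ensemble_std(member_probs):
    """STD(x): spread of the ensemble members' predicted probabilities.

    member_probs: array of shape (n_members, n_samples).
    Uses the population standard deviation (ddof=0), which is an assumption.
    """
    return np.asarray(member_probs, dtype=float).std(axis=0)

def combined_uncertainty(expert_votes, member_probs):
    """Unweighted sum of the expert-based and ensemble-based terms."""
    return expert_mp(expert_votes) + ensemble_std(member_probs)

# Three samples rated by six experts: unanimous, evenly split, 4-to-2.
votes = np.array([
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
])
# A toy ensemble of three members (the paper uses 20).
probs = np.array([
    [0.9, 0.5, 0.2],
    [0.9, 0.7, 0.2],
    [0.9, 0.3, 0.2],
])

mp = expert_mp(votes)
std = ensemble_std(probs)
total = combined_uncertainty(votes, probs)
```

With six experts, $MP$ can only take the values 0, 1/6, 1/3, and 1/2, so in practice the two terms may need rescaling before they are added; the plain sum here simply mirrors the additive decomposition.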

Abstract

In medical diagnostics, accurate uncertainty estimation for neural-based models is essential for complementing second-opinion systems. Although neural network ensembles perform well on this problem, a gap persists between actual uncertainties and predicted estimates. A major difficulty is the lack of labels on the hardness of examples: a typical dataset includes only ground-truth target labels, which makes uncertainty estimation almost an unsupervised problem. Our approach narrows this gap by integrating expert assessments of case complexity into the neural network's learning process, using both definitive target labels and supplementary complexity ratings. We validate the methodology on blood typing using a new dataset, BloodyWell, which is unique in augmenting labeled reaction images with complexity scores from six medical specialists. Experiments show that our approach improves uncertainty prediction, achieving a 2.5-fold improvement with expert labels and a 35% increase in performance with neural-based estimates of expert consensus.
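One possible reading of "integrating expert assessments into the learning process" is a training objective with two terms: one for the ground-truth label and one for the average expert response used as a soft target. The sketch below illustrates that reading only. The two-output model, the binary cross-entropy form, the `weight` parameter, and the function names are all assumptions made for illustration; the source does not specify the loss that is actually used.

```python
import numpy as np

def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean binary cross-entropy; targets may be soft (anywhere in [0, 1])."""
    pred = np.clip(np.asarray(pred, dtype=float), eps, 1.0 - eps)
    target = np.asarray(target, dtype=float)
    return float(-(target * np.log(pred)
                   + (1.0 - target) * np.log(1.0 - pred)).mean())

def expert_aware_loss(pred_label, pred_expert, gt_label, expert_votes, weight=1.0):
    """Hypothetical two-term objective (an assumption, not the paper's loss).

    pred_label:   predicted probability of the ground-truth class, (n_samples,)
    pred_expert:  predicted average expert response, (n_samples,)
    gt_label:     0/1 ground-truth labels, (n_samples,)
    expert_votes: 0/1 expert votes, (n_samples, n_experts)
    """
    e_bar = np.asarray(expert_votes, dtype=float).mean(axis=1)  # soft target
    return (binary_cross_entropy(pred_label, gt_label)
            + weight * binary_cross_entropy(pred_expert, e_bar))

# One sample where five of six experts see agglutination.
votes = np.array([[1, 1, 1, 1, 1, 0]])
gt = np.array([1.0])

loss_matched = expert_aware_loss([0.9], [5.0 / 6.0], gt, votes)
loss_mismatched = expert_aware_loss([0.9], [0.5], gt, votes)
```

Cross-entropy against a soft target is minimized when the prediction equals that target, so the second term pulls the expert output toward $\overline{e}(x)$ rather than toward a hard 0 or 1.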

Paper Structure (12 sections, 2 equations, 5 figures, 1 table)

Introduction
Literature review
Expert-aware uncertainty estimation
Uncertainty decomposition.
Ensemble uncertainty estimates.
Experts' assessment of uncertainty.
Combining expert-based and ensemble uncertainty estimates.
BloodyWell dataset
Experimental results
Experimental setup.
Results analysis.
Conclusions

Figures (5)

Figure 1: The feature space of a classification neural network, with uncertainty estimated by a classical approach and by the proposed expert-aware approach.
Figure 2: Proposed expert-aware uncertainty estimation training and inference pipeline.
Figure 3: Examples of classification errors with uncertainty values of the experts' ensemble and the NN ensemble. The plus and minus symbols denote the ground-truth (GT) agglutination label.
Figure 4: Structure of a prepared serological plate with agglutination reactions. Each dataset sample is a cropped image annotated from two sources: the agglutination type (presence or absence) recorded in the blood donor's medical record, and an independent assessment of agglutination by six medical experts.
Figure 5: Accuracy-rejection curves for the main methods used. $\mathrm{EXP_{MP}}$ and $\mathrm{EXP_{MP}} + \mathrm{CE_{STD}}$ use expert uncertainty labels during inference.
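An accuracy-rejection curve such as the one in Figure 5 can be computed as sketched below: the most uncertain samples are rejected first, and accuracy is measured on the samples that are kept. This is a generic sketch of the technique under the usual definition, not the authors' evaluation code. The function name `accuracy_rejection_curve` and the toy data are hypothetical.

```python
import numpy as np

def accuracy_rejection_curve(correct, uncertainty, rejection_rates):
    """Accuracy on retained samples after rejecting the most uncertain ones.

    correct:         0/1 array, 1 where the classifier's prediction is right
    uncertainty:     per-sample uncertainty scores (higher = less certain)
    rejection_rates: fractions in [0, 1] of samples to reject
    """
    correct = np.asarray(correct, dtype=float)
    order = np.argsort(uncertainty, kind="stable")  # most certain first
    sorted_correct = correct[order]
    n = len(correct)
    accuracies = []
    for rate in rejection_rates:
        n_keep = n - int(round(rate * n))
        # Accuracy is undefined when every sample is rejected.
        accuracies.append(sorted_correct[:n_keep].mean() if n_keep > 0 else np.nan)
    return np.array(accuracies)

# Toy data: the two errors carry the two highest uncertainty scores.
correct = np.array([1, 1, 0, 1, 0, 1, 1, 1, 1, 1])
uncertainty = np.array([0.10, 0.20, 0.90, 0.05, 0.80, 0.30, 0.15, 0.25, 0.40, 0.35])

curve = accuracy_rejection_curve(correct, uncertainty, [0.0, 0.1, 0.2])
```

A good uncertainty estimate ranks errors near the top, so the curve rises quickly as the rejection rate increases. A score unrelated to correctness leaves the curve roughly flat.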
