Medical Image Debiasing by Learning Adaptive Agreement from a Biased Council

Luyang Luo; Xin Huang; Minghao Wang; Zhuoyue Wan; Hao Chen


TL;DR

Ada-ABC introduces a one-stage debiasing framework that learns a biased council to capture dataset bias and a debiasing model that adaptively agrees or disagrees with the council, guided by an adaptive loss without requiring bias labels. The method is supported by theoretical analysis showing the debiasing model can learn target features when the bias model captures the bias, and it is validated on a novel medical debiasing benchmark with seven bias scenarios across four datasets. Empirically, Ada-ABC consistently outperforms competitors in mitigating bias while maintaining high overall accuracy, and qualitative saliency analyses suggest decisions are based on clinically relevant features rather than spurious cues. The work contributes a practical, label-agnostic debiasing framework and a standardized benchmark for trustworthy medical image analysis.

Abstract

Deep learning is prone to learning shortcuts that arise from dataset bias, which results in inaccurate, unreliable, and unfair models and impedes its adoption in real-world clinical applications. Despite the significance of this problem, little research in medical image classification addresses dataset bias. Furthermore, bias labels are often unavailable, as identifying biases can be laborious and depend on post-hoc interpretation. This paper proposes learning Adaptive Agreement from a Biased Council (Ada-ABC), a debiasing framework that does not rely on explicit bias labels to tackle dataset bias in medical images. Ada-ABC develops a biased council consisting of multiple classifiers optimized with generalized cross entropy loss to learn the dataset bias. A debiasing model is trained simultaneously under the guidance of the biased council. Specifically, the debiasing model is required to learn adaptive agreement with the biased council by agreeing on the samples the council predicts correctly and disagreeing on the samples it predicts wrongly. In this way, the debiasing model can learn the target attribute from the samples without spurious correlations while still exploiting the rich information in samples with spurious correlations. We theoretically demonstrate that the debiasing model can learn the target features when the biased model successfully captures dataset bias. Moreover, to the best of our knowledge, we construct the first medical debiasing benchmark, drawn from four datasets and containing seven different bias scenarios. Our extensive experiments show that Ada-ABC outperforms competitive approaches, verifying its effectiveness in mitigating dataset bias for medical image classification. The code and organized benchmark datasets will be made publicly available.
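To make the council's training objective concrete, the following is a minimal NumPy sketch of generalized cross entropy (GCE) in its commonly used form, (1 - p_y^q) / q, applied to several classification heads. The function names, the default q = 0.7, and the choice to average the loss over heads are assumptions made for illustration; this summary does not state how the paper configures or combines the heads.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def gce_loss(logits, labels, q=0.7):
    """Generalized cross entropy: mean over samples of (1 - p_y^q) / q.

    As q -> 0 this approaches standard cross entropy; at q = 1 it
    equals 1 - p_y. Relative to cross entropy, its gradient puts more
    weight on samples the model already fits well, which is why it is
    often used to amplify easy-to-learn (biased) patterns.
    """
    probs = softmax(logits)
    p_true = probs[np.arange(len(labels)), labels]
    return float(np.mean((1.0 - p_true ** q) / q))

def council_gce_loss(head_logits, labels, q=0.7):
    """Hypothetical council objective: average GCE over all heads.

    head_logits has shape (num_heads, num_samples, num_classes).
    Averaging over heads is an assumption, not taken from the paper.
    """
    return float(np.mean([gce_loss(h, labels, q) for h in head_logits]))
```

With uniform two-class predictions, `gce_loss` with q = 1 evaluates to 0.5, and with a very small q it approaches ln 2, the cross entropy of a uniform prediction.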


Paper Structure (25 sections, 1 theorem, 13 equations, 5 figures, 3 tables, 1 algorithm)


Introduction
Related Works
Dataset Bias in Medical Images
Deep Debiased Learning
Methodology
Problem Setup
Learning Adaptive Agreement
Learning the Bias Council
Holistic Training of Ada-ABC
Experiments
Medical Debiasing Benchmark
Source-biased Pneumonia Classification (SbP)
Gender-biased Pneumothorax Classification (GbP)
Chest Drain-biased Pneumothorax Classification (DbP)
Age-biased Ischemic Heart Disease Prognosis (OL3I)
...and 10 more sections

Key Result

Theorem 1

(Eq. loss:ad encourages learning the target pattern.) Given a joint data distribution $\mathcal{D}$ of triplets of random variables $(T, B, Y)$ taking values in $\{0,\ 1\}^3$, where $T$ represents the target feature and $B$ represents the bias feature. Assuming that an ERM model learned the poster

Figures (5)

Figure 1: Dataset bias in medical image classification can lead to inaccurate and untrustworthy results. Here, the source of the data and whether the patient has pneumonia are spuriously correlated. A biased model would make decisions based on the data source while ignoring the patterns of the lesions. Our goal is to learn a robust model that can make bias-invariant decisions from the biased training set.
Figure 2: The Framework of Ada-ABC. The goal is to develop a debiasing model which is robust to dataset biases (e.g., caused by spurious correlation between the data source and health condition). A bias council with multiple classification heads is trained with empirical risk minimization, e.g., minimization of generalized cross entropy loss. A second model is simultaneously trained and required to agree with the correct predictions made by the ERM model and disagree with the wrong predictions. Under such an adaptive agreement learning scheme, a different decision-making rule can be learned from the samples w/o spurious correlations, while rich information from the samples w/ spurious correlation can be preserved as well.
Figure 3: Effects of the hyper-parameters $\lambda$ and number of heads. The first row shows the results on SbP dataset with $\rho=99\%$: (a) The changes of aligned AUC and conflicting AUC w.r.t. the change of $\lambda$ (# heads = 16). (b) The changes of overall AUC and balanced AUC w.r.t. the change of $\lambda$ (# heads = 16). (c) The changes of overall AUC and balanced AUC w.r.t. the change of number of heads ($\lambda$ = 100). The second row shows the results on OL3I dataset: (d) The changes of aligned AUC and conflicting AUC w.r.t. the change of $\lambda$ (# heads = 8). (e) The changes of overall AUC and balanced AUC w.r.t. the change of $\lambda$ (# heads = 8). (f) The changes of overall AUC and balanced AUC w.r.t. the change of number of heads ($\lambda$ = 300).
Figure 4: The decision boundaries of (a) an ERM model that learns a simple solution, and of another model that learns to (b) purely agree with the ERM model, (c) purely disagree with the ERM model, or (d) adaptively agree or disagree with the ERM model. Details are best appreciated when enlarged.
Figure 5: The saliency maps of the ERM model (2nd row) and the debiasing model trained with Ada-ABC (3rd row). Columns 1-4 show samples from SbP, GbP, DbP, and OL3I, respectively. Both models made correct predictions but relied on different image evidence.
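The adaptive agreement scheme described in Figure 2, together with the weight $\lambda$ studied in Figure 3, can be illustrated with a small NumPy sketch. This is one plausible formulation and not the paper's exact loss, which does not appear in this summary: it assumes that agreement is measured by the inner product of the two models' class-probability vectors, that the council's output has already been reduced to a single probability vector per sample, and that $\lambda$ scales the disagreement term. All names are hypothetical.

```python
import numpy as np

def adaptive_agreement_loss(p_bias, p_debias, labels, lam=1.0, eps=1e-8):
    """Illustrative adaptive agreement loss (assumed form, see lead-in).

    p_bias, p_debias: arrays of shape (num_samples, num_classes) holding
    class probabilities from the biased council and the debiasing model.
    labels: integer ground-truth labels of shape (num_samples,).
    """
    # Inner product of the two probability vectors, a value in [0, 1]
    # that is high when both models favour the same class.
    agreement = np.sum(p_bias * p_debias, axis=1)

    # Samples on which the council's top prediction matches the label.
    council_correct = p_bias.argmax(axis=1) == labels

    # Council correct: reward agreement. Council wrong: reward disagreement.
    agree_term = -np.log(agreement + eps)
    disagree_term = -np.log(1.0 - agreement + eps)

    per_sample = np.where(council_correct, agree_term, lam * disagree_term)
    return float(per_sample.mean())
```

Under this form, a debiasing model that copies the council everywhere is penalized on the samples the council gets wrong, while one that follows the council only where it is correct incurs a lower loss. A larger `lam` shifts emphasis toward the samples the council misclassifies.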

Theorems & Definitions (2)

Theorem 1
proof
