MIMM-X: Disentangling Spurious Correlations for Medical Image Analysis

Louisa Fay, Hajer Reguigui, Bin Yang, Sergios Gatidis, Thomas Küstner

TL;DR

The paper tackles shortcut learning in medical imaging caused by dataset heterogeneity by introducing MIMM-X, a framework that disentangles a primary task from multiple spurious correlations via mutual information (MI) minimization. It extends prior MI-based disentanglement with a Confounder Attention Weighter and GradNorm-inspired dynamic loss scaling to handle several spurious correlations simultaneously. Validation on brain MRI (NAKO/UKB) and chest X-ray (CheXpert) shows improved generalization under induced and natural distribution shifts and effective disentanglement of causal and spurious features. The work advances causal representation learning in clinical imaging, offering a scalable approach to robust, fair predictions without altering data distributions.

Abstract

Deep learning models can excel on medical tasks, yet often exploit spurious correlations, a behavior known as shortcut learning, which leads to poor generalization in new environments. Particularly in medical imaging, where multiple spurious correlations can coexist, misclassifications can have severe consequences. We propose MIMM-X, a framework that disentangles causal features from multiple spurious correlations by minimizing their mutual information. It enables predictions based on true underlying causal relationships rather than dataset-specific shortcuts. We evaluate MIMM-X on three datasets (UK Biobank, NAKO, CheXpert) across two imaging modalities (MRI and X-ray). Results demonstrate that MIMM-X effectively mitigates shortcut learning of multiple spurious correlations.
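
The quantity the abstract asks to minimize, mutual information, can be illustrated on a toy example. The sketch below is not the MIMM-X estimator (this summary does not specify how MI is estimated on learned features); it only computes the empirical MI between two discrete variables to show why MI is a natural measure of shortcut potential. The function name `mutual_information` and the toy labels are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Empirical mutual information I(Y; Z) in nats from a list of (y, z) samples."""
    n = len(pairs)
    joint = Counter(pairs)                 # counts of each (y, z) combination
    count_y = Counter(y for y, _ in pairs)  # marginal counts of y
    count_z = Counter(z for _, z in pairs)  # marginal counts of z
    mi = 0.0
    for (y, z), c in joint.items():
        # p(y, z) * log( p(y, z) / (p(y) * p(z)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (count_y[y] * count_z[z]))
    return mi


# Hypothetical toy data: y is a primary-task label, z is a spurious factor
# such as the acquisition site. These samples are invented for illustration.
confounded = [(0, 0)] * 50 + [(1, 1)] * 50  # z perfectly predicts y
balanced = [(0, 0)] * 25 + [(0, 1)] * 25 + [(1, 0)] * 25 + [(1, 1)] * 25  # y independent of z

print(mutual_information(confounded))  # log(2): z is a perfect shortcut for y
print(mutual_information(balanced))    # 0.0: z carries no information about y
```

When the spurious factor fully determines the label, MI reaches its maximum for a balanced binary variable, $\log 2$ nats; when the two are independent, MI is zero. Driving the MI between task features and spurious features toward zero is therefore one way to express "no shortcut available" as a training objective.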

Paper Structure

This paper contains 14 sections, 3 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: (a) Causal graph with multiple spurious correlations present. The aim is to avoid predictions based on shortcuts introduced by spurious correlations. (b) Our MIMM-X model promotes causal feature learning by minimizing the MI between the desired primary task $y$ and $N$ spurious correlations $Z$.
  • Figure 2: (a) Summary of experiments, primary tasks $y$ and spurious correlations $(z_i)_{i=1}^{N}$. (b) Experiment 1: Data distributions (absolute sample counts). (c) Experiment 3: Training data composition with synthetic and natural correlations.
  • Figure 3: Experiment 1: t-SNE of $f_y$ colored by spurious factors $z_1$: sex, $z_2$: dataset. Ideally, $f_y$ is independent of $z_1$/$z_2$, meaning no visual class separation. While reference methods show clusters, MIMM-X is free from spurious information.