
Debiased Noise Editing on Foundation Models for Fair Medical Image Classification

Ruinan Jin, Wenlong Deng, Minghui Chen, Xiaoxiao Li

TL;DR

This work tackles bias in medical image classification with pre-trained foundation-model (FM) APIs, where spurious correlations between sensitive attributes (SA) and disease labels can be encoded in the embeddings. It introduces Debiased Noise Editing (DNE), a learnable perturbation $\epsilon$ added to input images to mask information related to the sensitive attribute $\mathcal{A}$ while preserving disease content. The noise is trained against a frozen SA classifier $g$ that operates on the FM embedding $z=\phi(x)$. For black-box FM APIs, the work proposes Greedy Zeroth-Order Optimization (GeZO), which updates $\epsilon$ using perturbation-based gradient estimates and a velocity term, without access to model gradients. Experiments on CheXpert across Pleural Effusion, Pneumonia, and Edema show that DNE substantially improves fairness metrics (e.g., equal opportunity and $|1-\text{DI}|$) while maintaining or improving accuracy, and that GeZO achieves competitive performance in black-box settings. Code is released to facilitate adoption. Overall, DNE offers a practical, scalable pathway to fairer medical imaging diagnostics when relying on pre-trained FM embeddings.
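To make the two fairness metrics named above concrete, here is a minimal NumPy sketch. It assumes the common textbook definitions: the equal opportunity gap as the absolute difference in true-positive rate between two groups, and DI as the ratio of positive-prediction rates between them. The paper's exact definitions (for instance, which group sits in the numerator) may differ, and the function names are illustrative only.

```python
import numpy as np

def equal_opportunity_gap(y_true, y_pred, group):
    """Absolute difference in true-positive rate between groups 0 and 1."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    tprs = []
    for g in (0, 1):
        # Select samples of group g that truly have the disease.
        positives = (group == g) & (y_true == 1)
        tprs.append(y_pred[positives].mean())
    return abs(tprs[0] - tprs[1])

def disparate_impact_gap(y_pred, group):
    """|1 - DI|, with DI taken as rate(group 0) / rate(group 1) (an assumption)."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    rate0 = y_pred[group == 0].mean()  # positive-prediction rate, group 0
    rate1 = y_pred[group == 1].mean()  # positive-prediction rate, group 1
    return abs(1.0 - rate0 / rate1)

# Toy labels: eight patients, four per group.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1, 1, 1, 0]
group = [0, 0, 0, 0, 1, 1, 1, 1]
eo = equal_opportunity_gap(y_true, y_pred, group)
di = disparate_impact_gap(y_pred, group)
```

Both quantities are zero for a perfectly fair classifier under these definitions, so lower is better.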

Abstract

In the era of Foundation Models' (FMs) rising prominence in AI, our study addresses the challenge of biases in medical images when the model operates as a black box (e.g., through an FM API), particularly spurious correlations between pixels and sensitive attributes. Traditional bias-mitigation methods are limited by restricted access to web-hosted FMs and by the difficulty of addressing the underlying bias encoded within the FM API. We propose a D(ebiased) N(oise) E(diting) strategy, termed DNE, which generates DNE noise to mask such spurious correlations. DNE is capable of mitigating bias both within the FM API embedding and in the images themselves. Furthermore, DNE is suitable for both white-box and black-box FM APIs: for black-box APIs, where the gradient is inaccessible, we introduce G(reedy) (Z)eroth-(O)rder (GeZO) optimization. Our whole pipeline enables fairness-aware image editing that can be applied across various medical contexts without requiring direct model manipulation or significant computational resources. Our empirical results demonstrate the method's effectiveness in maintaining fairness and utility across different patient groups and diseases. In the era of AI-driven medicine, this work contributes to making healthcare diagnostics more equitable, showcasing a practical solution for bias mitigation in pre-trained image FMs. Our code is provided at https://github.com/ubc-tea/DNE-foundation-model-fairness.
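The black-box setting can be illustrated with a generic zeroth-order update. The sketch below is not the authors' GeZO algorithm. It only combines the three ingredients the summary mentions (perturbation-based gradient estimates, greedy selection of the best perturbation, and a velocity term) on a toy quadratic loss. The function name `zeroth_order_edit`, the hyperparameters, and the toy loss are all assumptions. In the real pipeline the loss would presumably come from querying the FM API and the frozen SA classifier.

```python
import numpy as np

def zeroth_order_edit(loss_fn, eps, steps=300, n_candidates=8,
                      mu=0.01, lr=0.01, beta=0.9, seed=0):
    """Minimize loss_fn(eps) using only loss evaluations (no gradients)."""
    rng = np.random.default_rng(seed)
    eps = np.array(eps, dtype=float)
    velocity = np.zeros_like(eps)
    for _ in range(steps):
        base = loss_fn(eps)
        best_drop = 0.0
        best_grad = np.zeros_like(eps)
        for _ in range(n_candidates):
            # Probe a random direction with a small step of size mu.
            u = rng.standard_normal(eps.shape)
            drop = base - loss_fn(eps + mu * u)
            # Greedy step: keep only the probe that lowers the loss the most.
            if drop > best_drop:
                best_drop = drop
                # Forward-difference gradient estimate along u.
                best_grad = -(drop / mu) * u
        # Velocity: exponential moving average of the selected estimates.
        velocity = beta * velocity + (1.0 - beta) * best_grad
        eps = eps - lr * velocity
    return eps

# Toy stand-in for the black-box objective (hypothetical).
target = np.ones(16)
toy_loss = lambda e: float(np.sum((e - target) ** 2))
eps_final = zeroth_order_edit(toy_loss, np.zeros(16))
```

Each iteration costs `n_candidates + 1` loss evaluations, which in a real black-box setting would translate to API queries. That cost is the main trade-off against gradient-based white-box training.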

Paper Structure

This paper contains 18 sections, 2 equations, 3 figures, 3 tables, 1 algorithm.

Figures (3)

  • Figure 1: Overview of the debiased noise editing pipeline. (a) We eliminate the spurious correlation by breaking the connection between SA $\mathcal{A}$ and image $\mathcal{X}$, ensuring the model relies solely on disease-related information $\mathcal{Y}$. (b) DNE noise is trained to deceive the pre-trained SA classifier for $\mathcal{A}$, using a frozen FM API and a frozen SA classifier. This applies to both white-box and black-box API scenarios, depending on gradient accessibility. (c) We demonstrate the use of DNE noise: users augment their images with this noise, extract embeddings via the FM API, and proceed to train fair disease classifiers. (d) The (G)reedy (Z)eroth-(O)rder (GeZO) black-box editing method selects the optimal perturbation via the Greedy Gradient process when gradients are inaccessible. It tracks the best perturbation using velocity in each local iteration to update the DNE noise $\epsilon$.
  • Figure 2: Ablation study of our methods: (a) Effect of different $\lambda$; (b) Effect of different local epochs using GeZO.
  • Figure 3: Visualization of chest X-rays and DNE noise patterns (with Gaussian smoothing applied) to interpret gender-discriminative image regions. (a) The normalized DNE noise map, with larger noise highlighted by brighter color, reveals gender-discriminative features. The large noise circled in orange corresponds to the breast. The large noise circled in yellow reflects artifacts on the X-ray, such as text notations. (b) A female chest X-ray. (c) A male chest X-ray.
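The usage step described in Figure 1(c) (add the noise to images, then extract embeddings through the FM API) can be sketched as follows. This is an illustration under stated assumptions rather than the released code: it assumes a single noise map shared across all images, pixel values in $[0, 1]$, and a hypothetical `toy_fm_api` that stands in for the real frozen FM API with a fixed linear projection.

```python
import numpy as np

def apply_dne_noise(images, eps, low=0.0, high=1.0):
    """Add one shared noise map to every image, then clip to the pixel range."""
    return np.clip(images + eps, low, high)

def toy_fm_api(images, proj):
    """Hypothetical stand-in for a frozen FM API: flatten, then project."""
    return images.reshape(len(images), -1) @ proj

rng = np.random.default_rng(0)
images = rng.random((4, 8, 8))             # four toy 8x8 "X-rays" in [0, 1]
eps = 0.05 * rng.standard_normal((8, 8))   # placeholder for a trained noise map
proj = rng.standard_normal((64, 16))       # fixed weights of the toy "FM"

edited = apply_dne_noise(images, eps)
embeddings = toy_fm_api(edited, proj)      # features for a downstream classifier
```

A disease classifier would then be trained on `embeddings` instead of on embeddings of the unedited images. Neither the FM nor the noise is updated at this stage.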