Rethinking Bias in Generative Data Augmentation for Medical AI: a Frequency Recalibration Method

Chi Liu, Jincheng Liu, Congcong Zhu, Minghao Wang, Sheng Shen, Jia Gu, Tianqing Zhu, Wanlei Zhou

TL;DR

This work traces the unreliability of generative data augmentation (GDA) in medical imaging to frequency misalignment between real and AI-generated images. It introduces FreRec, a two-step post-processing pipeline: Statistical High-frequency Replacement (SHR) coarsely aligns high-frequency content, and Reconstructive High-frequency Mapping (RHM) reconstructs high-frequency details on a learned real frequency manifold, using a Restormer-based architecture trained with a spectral loss. Across brain MRI, chest X-ray, and fundus datasets, FreRec consistently improves downstream disease classification when paired with various generative models, and it remains model-agnostic and plug-and-play. The method reduces bias in both the frequency domain and the feature space, keeps inference overhead practical, and shows potential in non-medical domains as a general approach to stabilizing GDA.

Abstract

Developing medical AI relies on large datasets and is easily hampered by data scarcity. Generative data augmentation (GDA) using AI generative models offers a solution by synthesizing realistic medical images. However, the bias in GDA is often underestimated in medical domains, where AI-generated images risk introducing detrimental features that harm downstream tasks. This paper identifies the frequency misalignment between real and synthesized images as one of the key factors underlying unreliable GDA and proposes the Frequency Recalibration (FreRec) method to reduce the frequency distributional discrepancy and thus improve GDA. FreRec involves (1) Statistical High-frequency Replacement (SHR) to roughly align high-frequency components and (2) Reconstructive High-frequency Mapping (RHM) to enhance image quality and reconstruct high-frequency details. Extensive experiments were conducted on various medical datasets, including brain MRIs, chest X-rays, and fundus images. The results show that FreRec significantly improves downstream medical image classification performance compared to uncalibrated AI-synthesized samples. FreRec is a standalone post-processing step that is compatible with any generative model and can integrate seamlessly with common medical GDA pipelines.
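To make the idea of a rough high-frequency alignment concrete, the following is a minimal sketch and not the paper's SHR procedure, whose details are not given in this summary. It assumes grayscale 2-D images of equal size, a circular low-frequency region, and replacement of only the high-frequency Fourier amplitude with the average over a set of real images while keeping the synthetic phase. The function names and the `radius_frac` cutoff are hypothetical.

```python
import numpy as np

def high_freq_mask(shape, radius_frac=0.25):
    """Boolean mask, True outside a centred low-frequency disc.
    radius_frac is a hypothetical cutoff chosen for illustration."""
    h, w = shape
    yy, xx = np.ogrid[:h, :w]
    dist = np.hypot(yy - h // 2, xx - w // 2)
    return dist > radius_frac * min(h, w)

def replace_high_freq(synthetic, real_images, radius_frac=0.25):
    """Swap the high-frequency amplitude of `synthetic` for the mean
    amplitude of `real_images`; low frequencies and phase are kept.
    This is an assumed stand-in for a statistical replacement step."""
    mask = high_freq_mask(synthetic.shape, radius_frac)
    # Centred spectrum of the synthetic image (DC at the array centre).
    syn_spec = np.fft.fftshift(np.fft.fft2(synthetic))
    # Mean amplitude spectrum over the real reference images.
    real_amp = np.mean(
        [np.abs(np.fft.fftshift(np.fft.fft2(r))) for r in real_images], axis=0
    )
    # Real statistics in the high band, synthetic amplitude elsewhere.
    amp = np.where(mask, real_amp, np.abs(syn_spec))
    recal = amp * np.exp(1j * np.angle(syn_spec))
    # The spectrum stays conjugate-symmetric, so the inverse is real
    # up to floating-point error.
    return np.fft.ifft2(np.fft.ifftshift(recal)).real
```

In this sketch the low-frequency content, which carries most of the anatomy, is left untouched, and only the band where the abstract locates the real-versus-synthetic discrepancy is overwritten. A learned reconstruction stage such as RHM would then be needed to restore coherent detail, since averaged amplitudes alone do not produce image-specific textures.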

Paper Structure

This paper contains 27 sections, 9 equations, 11 figures, 5 tables.

Figures (11)

  • Figure 1: The average spectra of real and AI-synthetic images with different diseases and image types. The higher-frequency differences are discernible (e.g., as the red circles indicate). Synthetic images are generated by different models: MRIs from FastGAN, fundus images from VC-Diffusion, and X-rays from StyleGAN3.
  • Figure 2: The workflow of Statistical High-frequency Replacement.
  • Figure 3: A conceptual explanation of the Reconstructive High-frequency Mapping. Images are first transformed into the same starting space during the initial alignment by SHR. Then the synthetic images can be further calibrated by mapping onto the natural frequency manifold following the same reconstruction path learned from real images.
  • Figure 4: The details of the denoising auto-encoder used in Reconstructive High-frequency Mapping.
  • Figure 5: Examples of original real images (top) and synthetic images (bottom) from three datasets. The synthetic images maintain high visual quality and fidelity.
  • ...and 6 more figures
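
The comparison of average spectra described in the Figure 1 caption can be approximated with a few lines of NumPy. This is a sketch under assumptions: it uses log-amplitude spectra of equally sized grayscale images and a radially averaged profile, which may differ from how the paper's figure was actually produced. The function names are hypothetical.

```python
import numpy as np

def average_log_spectrum(images):
    """Mean log-amplitude spectrum of equally sized 2-D images,
    with the zero frequency shifted to the array centre."""
    spectra = [
        np.log1p(np.abs(np.fft.fftshift(np.fft.fft2(img)))) for img in images
    ]
    return np.mean(spectra, axis=0)

def radial_profile(spectrum):
    """Average a centred spectrum over rings of equal integer radius,
    giving one value per distance from the zero frequency."""
    h, w = spectrum.shape
    yy, xx = np.indices((h, w))
    r = np.hypot(yy - h // 2, xx - w // 2).astype(int)
    sums = np.bincount(r.ravel(), weights=spectrum.ravel())
    counts = np.bincount(r.ravel())
    # Guard against empty rings to avoid division by zero.
    return sums / np.maximum(counts, 1)
```

Calling `radial_profile(average_log_spectrum(real_set))` and the same for a synthetic set yields two curves whose tails (large radius, that is, high frequency) can be compared directly; a persistent gap in the tails would be one way to observe the misalignment the caption refers to.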