Rethinking Bias in Generative Data Augmentation for Medical AI: a Frequency Recalibration Method
Chi Liu, Jincheng Liu, Congcong Zhu, Minghao Wang, Sheng Shen, Jia Gu, Tianqing Zhu, Wanlei Zhou
TL;DR
This work investigates why generative data augmentation (GDA) can be unreliable in medical imaging and identifies frequency misalignment between real and AI-generated images as a key factor. It introduces FreRec, a two-step post-processing pipeline: Statistical High-frequency Replacement (SHR) coarsely aligns high-frequency content, and Reconstructive High-frequency Mapping (RHM) reconstructs high-frequency details on a learned manifold of real-image frequencies, using a Restormer-based architecture with a spectral loss. Across brain MRI, chest X-ray, and fundus datasets, FreRec consistently improves downstream disease classification when paired with various generative models, and it remains model-agnostic and plug-and-play. The method reduces both frequency-domain and feature-space bias, keeps inference overhead practical, and shows potential beyond medical imaging as a general approach to stabilizing GDA.
Abstract
Developing medical AI relies on large datasets and is therefore vulnerable to data scarcity. Generative data augmentation (GDA) addresses this by using generative models to synthesize realistic medical images. However, the bias in GDA is often underestimated in medical domains, where AI-generated images risk introducing detrimental features that harm downstream tasks. This paper identifies the frequency misalignment between real and synthesized images as one of the key factors underlying unreliable GDA and proposes the Frequency Recalibration (FreRec) method to reduce the frequency distributional discrepancy and thus improve GDA. FreRec involves (1) Statistical High-frequency Replacement (SHR) to roughly align high-frequency components and (2) Reconstructive High-frequency Mapping (RHM) to enhance image quality and reconstruct high-frequency details. Extensive experiments were conducted on various medical datasets, including brain MRIs, chest X-rays, and fundus images. The results show that FreRec significantly improves downstream medical image classification performance compared to uncalibrated AI-synthesized samples. FreRec is a standalone post-processing step that is compatible with any generative model and integrates seamlessly with common medical GDA pipelines.
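To make the idea of roughly aligning high-frequency components concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not specify how SHR computes or applies its statistics, so the sketch assumes one plausible reading: the high-frequency Fourier amplitude of a synthetic image is replaced by the mean amplitude over a set of real images, while the synthetic phase and low-frequency content are kept. The function names (`radial_mask`, `mean_amplitude`, `replace_high_freq`) and the `cutoff` parameter are hypothetical and chosen for illustration only. The learned RHM step is not sketched here.

```python
import numpy as np


def radial_mask(shape, cutoff):
    """Boolean mask, True where the radial frequency exceeds `cutoff`.

    `cutoff` is a fraction of the per-axis Nyquist frequency (assumed
    parameterization, not taken from the paper).
    """
    h, w = shape
    fy = np.fft.fftfreq(h)[:, None]  # cycles per pixel, in [-0.5, 0.5)
    fx = np.fft.fftfreq(w)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2) / 0.5  # 1.0 == Nyquist along one axis
    return radius > cutoff


def mean_amplitude(real_images):
    """Mean FFT amplitude over a stack of real images with shape (N, H, W)."""
    spectra = np.fft.fft2(real_images, axes=(-2, -1))
    return np.abs(spectra).mean(axis=0)


def replace_high_freq(synthetic, real_mean_amp, cutoff=0.25):
    """Swap the high-frequency amplitude of `synthetic` for the real mean.

    Phase and all frequencies at or below `cutoff` are left untouched.
    """
    spectrum = np.fft.fft2(synthetic)
    amplitude = np.abs(spectrum)
    phase = np.angle(spectrum)
    mask = radial_mask(synthetic.shape, cutoff)
    new_amplitude = np.where(mask, real_mean_amp, amplitude)
    recalibrated = np.fft.ifft2(new_amplitude * np.exp(1j * phase))
    # The modified spectrum stays conjugate-symmetric, so the imaginary
    # part is numerical noise and can be dropped.
    return recalibrated.real
```

In use, `mean_amplitude` would be computed once from a stack of same-sized real grayscale images and then passed to `replace_high_freq` for each synthetic image. Keeping the synthetic phase preserves the spatial layout of the generated image, which is why only the amplitude is swapped in this sketch.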
