BSDA: Bayesian Random Semantic Data Augmentation for Medical Image Classification
Yaoyao Zhu, Xiuding Cai, Xueyao Wang, Xiaoqing Chen, Yu Yao, Zhongliang Fu
TL;DR
Medical image datasets are often small, and traditional image-level data augmentation can require domain expertise and substantial compute. BSDA is a feature-space semantic augmentation method: it uses variational inference to estimate a label-preserving distribution $q_{\phi_m}(m \mid a)$ over the semantic magnitude m, then perturbs the deep feature a along random semantic directions scaled by a sample from that distribution. A reconstruction term keeps the estimated magnitudes faithful to the original features, and a masking mechanism leaves zero-valued feature components unchanged. Across nine 2D and five 3D MedMNIST+ datasets, BSDA consistently improves accuracy (ACC) and AUC over strong baselines, is compatible with both CNN and Transformer architectures, and adds only modest compute overhead. These results yield practical gains in medical image classification and suggest BSDA as a broadly applicable, efficient augmentation module for clinical AI pipelines.
Abstract
Data augmentation is a crucial regularization technique for deep neural networks, particularly in medical image classification. Mainstream data augmentation (DA) methods are usually applied at the image level. Because medical images are both specialized and diverse, expertise is often required to design effective DA strategies, and improper augmentation operations can degrade model performance. Although automatic augmentation methods exist, they are computationally intensive. Semantic data augmentation can be implemented by translating features in feature space. However, over-translation may change the semantics enough that the image's label no longer holds. To address these issues, we propose \emph{Bayesian Random Semantic Data Augmentation} (BSDA), a computationally efficient and handcraft-free feature-level DA method. BSDA uses variational Bayesian inference to estimate the distribution of the augmentable magnitudes, and a sample from this distribution is then added to the original features to perform semantic data augmentation. We performed experiments on nine 2D and five 3D medical image datasets. Experimental results show that BSDA outperforms current DA methods. Additionally, BSDA can be easily integrated into CNNs or Transformers as a plug-and-play module, improving the network's performance. The code is available online at \url{https://github.com/YaoyaoZhu19/BSDA}.
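The feature-level augmentation step described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). The function name `bsda_augment`, the Gaussian form of the magnitude distribution, the random-sign choice of direction, and the scaling factor `lam` are all assumptions made for illustration. In practice `mu` and `log_var` would come from a learned estimator network, which is omitted here.

```python
import numpy as np

def bsda_augment(a, mu, log_var, rng, lam=1.0):
    """Hypothetical sketch of feature-space semantic augmentation.

    a        : (batch, dim) deep features from the backbone
    mu       : (batch, dim) mean of the assumed Gaussian over magnitudes
    log_var  : (batch, dim) log-variance of that Gaussian
    rng      : numpy.random.Generator
    lam      : assumed scalar controlling augmentation strength
    """
    # Reparameterised sample of the magnitude: m = mu + sigma * eps
    eps = rng.standard_normal(a.shape)
    m = mu + np.exp(0.5 * log_var) * eps

    # Random semantic direction: one random sign per component (assumption)
    d = rng.choice([-1.0, 1.0], size=a.shape)

    # Mask so that zero-valued feature components are left untouched
    mask = (a != 0).astype(a.dtype)

    # Augmented feature: original feature plus masked, signed magnitude
    return a + lam * mask * d * m


rng = np.random.default_rng(0)
a = np.array([[0.0, 1.5, 0.0, 2.0]])
mu = np.full_like(a, 0.1)
log_var = np.zeros_like(a)
a_aug = bsda_augment(a, mu, log_var, rng)
```

In this sketch the augmented feature `a_aug` keeps the shape of `a`, and components that were exactly zero stay zero, which mirrors the masking mechanism mentioned in the summary. During training, both `a` and `a_aug` would typically be passed to the classifier head, but how the losses are combined is left out of this sketch.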
