BSDA: Bayesian Random Semantic Data Augmentation for Medical Image Classification

Yaoyao Zhu, Xiuding Cai, Xueyao Wang, Xiaoqing Chen, Yu Yao, Zhongliang Fu

TL;DR

Medical image datasets are often small, and conventional image-level data augmentation can demand both domain expertise and substantial compute. BSDA is a feature-space semantic augmentation method: it uses variational inference to estimate a label-preserving semantic magnitude with distribution $q_{\phi_m}(m \mid a)$ and applies it along random semantic directions to the deep feature $a$. A reconstruction term keeps the mapping faithful, and a masking mechanism avoids altering zero-valued feature components. Across nine 2D and five 3D MedMNIST+ datasets, BSDA consistently improves ACC and AUC over strong baselines, works with both CNN and Transformer architectures, and adds only modest compute overhead. This yields practical gains in medical image classification and suggests BSDA as a broadly applicable, efficient augmentation module for clinical AI pipelines.

Abstract

Data augmentation is a crucial regularization technique for deep neural networks, particularly in medical image classification. Mainstream data augmentation (DA) methods are usually applied at the image level. Because medical imaging is specialized and diverse, designing effective DA strategies often requires expertise, and improper augmentation operations can degrade model performance. Automatic augmentation methods exist, but they are computationally intensive. Semantic data augmentation can be implemented by translating features in feature space; however, over-translation may violate the image label. To address these issues, we propose Bayesian Random Semantic Data Augmentation (BSDA), a computationally efficient and handcraft-free feature-level DA method. BSDA uses variational Bayesian inference to estimate the distribution of the augmentable magnitudes, and a sample from this distribution is then added to the original features to perform semantic data augmentation. We performed experiments on nine 2D and five 3D medical image datasets. Experimental results show that BSDA outperforms current DA methods. Additionally, BSDA can be easily assembled into CNNs or Transformers as a plug-and-play module, improving the network's performance. The code is available online at https://github.com/YaoyaoZhu19/BSDA.
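
The augmentation step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that); it only mirrors the ideas stated above under several assumptions: the magnitude distribution is taken to be a per-component Gaussian sampled by reparameterization, the random direction is modeled as a Bernoulli selection with probability `lam`, and `mu` and `log_var` are placeholder inputs standing in for whatever a learned estimator would produce. The function name `bsda_augment` and all parameter names are hypothetical.

```python
import math
import random


def bsda_augment(feature, mu, log_var, lam=0.8, rng=None):
    """Return an augmented copy of `feature` (a list of floats).

    Hypothetical sketch: `mu` and `log_var` stand in for the parameters of
    the estimated magnitude distribution (assumed Gaussian here).
    """
    if rng is None:
        rng = random.Random()
    augmented = []
    for a_i, mu_i, lv_i in zip(feature, mu, log_var):
        # Reparameterized sample of the magnitude: m = mu + sigma * eps.
        eps = rng.gauss(0.0, 1.0)
        m_i = mu_i + math.exp(0.5 * lv_i) * eps
        # Random direction (assumption): select this component with probability lam.
        d_i = 1.0 if rng.random() < lam else 0.0
        # Mask: leave zero-valued feature components untouched.
        mask_i = 1.0 if a_i != 0.0 else 0.0
        # The sampled, masked translation is added to the original feature.
        augmented.append(a_i + d_i * m_i * mask_i)
    return augmented


if __name__ == "__main__":
    a = [0.0, 1.5, 0.0, 2.0]
    print(bsda_augment(a, mu=[0.1] * 4, log_var=[-2.0] * 4, rng=random.Random(0)))
```

In this sketch, zero-valued components stay at zero regardless of the sampled magnitude, and setting `lam=0.0` disables the translation entirely, which makes the masking and direction-selection behavior easy to check in isolation.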

Paper Structure (19 sections, 7 equations, 8 figures, 6 tables, 1 algorithm)

Figures (8)

  • Figure 1: Example of semantic data augmentation. One subfigure (semantic_dir) gives a non-exact illustration of different semantic directions in the feature space, along which moving changes the semantics. The other subfigure (semantic_magn) shows a non-exact example of semantic data augmentation in feature space.
  • Figure 2: Inexact illustration of augmentable semantic magnitude. For instance, in a tumor grading task, if the centermost value (white) is an input sample, the curly brackets indicate the range within which the original label is preserved; augmented samples beyond that range change the original label.
  • Figure 3: Illustration of BSDA as a plug-in module in a neural network.
  • Figure 4: Samples of MedMNIST+. For convenience, we removed the MNIST suffix from the dataset names.
  • Figure 5: Ablation study for BSDA on the BreastMNIST dataset using ResNet-18. Under the same hyper-parameters, if BSDA improves performance relative to the baseline (AUC=$89.6\%$, ACC=$81.0\%$), the heatmap displays red (AUC) or blue (ACC). In each subplot, the horizontal axis represents the sampling rate $U$ of BSDA, while the vertical axis denotes the probability $\lambda$ of random direction selection. Subfigures (a) and (b) additionally include the original features, whereas subfigures (c) and (d) do not.
  • ...and 3 more figures