Synthetic Poisoning Attacks: The Impact of Poisoned MRI Image on U-Net Brain Tumor Segmentation

Tianhao Li, Tianyu Zeng, Yujia Zheng, Chulong Zhang, Jingyu Lu, Haotian Huang, Chuangxin Chu, Fang-Fang Yin, Zhenyu Yang

TL;DR

This work addresses the risk that synthetic MRI data generated by GANs can contaminate training for brain tumor segmentation with U-Net. By generating synthetic T1-Ce MRIs from CT data and injecting them at varying proportions into training sets, the authors demonstrate that segmentation performance degrades as synthetic content increases, with the Dice coefficient dropping from about 0.89 to 0.75 when synthetic data rises from 33% to 83%. The findings emphasize the need for rigorous quality control and carefully regulated augmentation to maintain reliability in AI-driven medical imaging. The study provides practical guidance for developing safer, more trustworthy segmentation pipelines that minimize the adverse effects of synthetic data.

Abstract

Deep learning-based medical image segmentation models, such as U-Net, rely on high-quality annotated datasets to achieve accurate predictions. However, the increasing use of generative models for synthetic data augmentation introduces potential risks, particularly in the absence of rigorous quality control. In this paper, we investigate the impact of synthetic MRI data on the robustness and segmentation accuracy of U-Net models for brain tumor segmentation. Specifically, we generate synthetic T1-contrast-enhanced (T1-Ce) MRI scans using a GAN-based model with a shared encoding-decoding framework and shortest-path regularization. To quantify the effect of synthetic data contamination, we train U-Net models on progressively "poisoned" datasets, where synthetic data proportions range from 16.67% to 83.33%. Experimental results on a real MRI validation set reveal a significant performance degradation as synthetic data increases, with Dice coefficients dropping from 0.8937 (33.33% synthetic) to 0.7474 (83.33% synthetic). Accuracy and sensitivity exhibit similar downward trends, demonstrating the detrimental effect of synthetic data on segmentation robustness. These findings underscore the importance of quality control in synthetic data integration and highlight the risks of unregulated synthetic augmentation in medical image analysis. Our study provides critical insights for the development of more reliable and trustworthy AI-driven medical imaging systems.
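The two quantities the abstract relies on, the Dice coefficient and the synthetic share of the training set, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `dice_coefficient` and `build_poisoned_set`, the flat-list mask representation, and the fixed-size sampling scheme are all assumptions made for illustration. The reported proportions (16.67% to 83.33%) are consistent with multiples of one sixth, which is what the usage example assumes.

```python
import random


def dice_coefficient(pred, target):
    """Dice = 2|P ∩ T| / (|P| + |T|) for flat binary masks (sequences of 0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        # Both masks empty: treated as perfect agreement (a common convention,
        # assumed here; the paper does not state how it handles this case).
        return 1.0
    return 2.0 * intersection / total


def build_poisoned_set(real_items, synthetic_items, synthetic_fraction, size, seed=0):
    """Sample `size` training items with the requested share of synthetic data.

    Hypothetical helper: the paper does not describe its exact mixing procedure.
    """
    if not 0.0 <= synthetic_fraction <= 1.0:
        raise ValueError("synthetic_fraction must lie in [0, 1]")
    n_synthetic = round(size * synthetic_fraction)
    n_real = size - n_synthetic
    if n_synthetic > len(synthetic_items) or n_real > len(real_items):
        raise ValueError("not enough items to build the requested mix")
    rng = random.Random(seed)  # seeded for reproducible mixes
    mixed = rng.sample(synthetic_items, n_synthetic) + rng.sample(real_items, n_real)
    rng.shuffle(mixed)
    return mixed


if __name__ == "__main__":
    real = [("real", i) for i in range(60)]
    synthetic = [("synthetic", i) for i in range(60)]
    # Poisoning rates assumed to be k/6 for k = 1..5 (16.67% ... 83.33%).
    for k in range(1, 6):
        mix = build_poisoned_set(real, synthetic, k / 6, size=60)
        share = sum(1 for kind, _ in mix if kind == "synthetic") / len(mix)
        print(f"target {k}/6 -> synthetic share {share:.4f}")
```

Keeping the total training-set size fixed while varying the synthetic share, as in this sketch, isolates the effect of contamination from the effect of simply having more or less data; whether the authors controlled for size this way is not stated in the abstract.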

Paper Structure

This paper contains 14 sections, 4 equations, 4 figures, 1 table, 1 algorithm.

Figures (4)

  • Figure 1: Overall workflow.
  • Figure 2: Workflow of the generative model [xie2023unpaired].
  • Figure 3: A sample of segmentation results of the ET region from the same MRI scan using the U-Net model at varying poisoning rates.
  • Figure 4: Case studies of synthetic MRI from real CT.