
On Evaluating Adversarial Robustness of Volumetric Medical Segmentation Models

Hashmat Shadab Malik, Numan Saeed, Asif Hanif, Muzammal Naseer, Mohammad Yaqub, Salman Khan, Fahad Shahbaz Khan

TL;DR

This work empirically examines the adversarial robustness of current volumetric segmentation architectures, encompassing Convolutional, Transformer, and Mamba-based models, and finds that transformer-based models are more robust than convolution-based models, with Mamba-based models being the most vulnerable.

Abstract

Volumetric medical segmentation models have achieved significant success on organ and tumor-based segmentation tasks in recent years. However, their vulnerability to adversarial attacks remains largely unexplored, raising serious concerns regarding the real-world deployment of tools employing such models in the healthcare sector. This underscores the importance of investigating the robustness of existing models. In this context, our work aims to empirically examine the adversarial robustness of current volumetric segmentation architectures, encompassing Convolutional, Transformer, and Mamba-based models. We extend this investigation across four volumetric segmentation datasets, evaluating robustness under both white box and black box adversarial attacks. Overall, we observe that while both pixel and frequency-based attacks perform reasonably well in the white box setting, the latter perform significantly better under transfer-based black box attacks. Across our experiments, we observe that transformer-based models show higher robustness than convolution-based models, with Mamba-based models being the most vulnerable. Additionally, we show that large-scale training of volumetric segmentation models improves the model's robustness against adversarial attacks. The code and robust models are available at https://github.com/HashmatShadab/Robustness-of-Volumetric-Medical-Segmentation-Models.

Paper Structure (15 sections, 2 equations, 9 figures, 14 tables)

This paper contains 15 sections, 2 equations, 9 figures, 14 tables.

Figures (9)

  • Figure 1: LPIPS scores for adversarial examples crafted on different segmentation models.
  • Figure 2: Comparing multi-organ segmentation across various models under transfer-based black box attacks, where adversarial examples are generated on UNet and transferred to other unseen models.
  • Figure 3: Frequency Analysis (VAFA): Low-frequency components of the adversarial perturbation cause significant performance degradation (DSC score is reported).
  • Figure 4: Evaluating SAM-Med3D against transfer-based black box attacks.
  • Figure 5: White Box Attack Ablation: Evaluating robustness of volumetric segmentation models under white box attacks. For pixel-based attacks, results are reported for $\epsilon=\frac{4}{255}$ and $\epsilon=\frac{8}{255}$, indicated by attack names followed by the suffixes $-4$ or $-8$, respectively. For the frequency-based attack VAFA, results are reported with the constraint on $q_{\text{max}}$ set to $10$ and $30$, denoted as VAFA-10 and VAFA-30, respectively. DSC score (lower is better) and LPIPS score (higher is better) are reported on the generated adversarial examples.
  • ...and 4 more figures
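
The captions above refer to pixel-based attacks with budgets $\epsilon=\frac{4}{255}$ and $\epsilon=\frac{8}{255}$ and to the DSC score. The sketch below illustrates both quantities under stated assumptions: that the budget is an $L_\infty$ bound on voxel intensities scaled to $[0, 1]$, and that DSC is the Dice similarity coefficient computed on binary masks. The helper names `dice_score` and `project_linf` are hypothetical and are not taken from the paper's code; the random noise stands in for a perturbation and is not one of the attacks evaluated in the paper.

```python
import numpy as np


def dice_score(pred, target, eps=1e-8):
    """Dice similarity coefficient between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    # eps keeps the ratio defined when both masks are empty.
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)


def project_linf(x_adv, x_clean, epsilon):
    """Clip x_adv so it stays within epsilon of x_clean and inside [0, 1]."""
    x_adv = np.clip(x_adv, x_clean - epsilon, x_clean + epsilon)
    return np.clip(x_adv, 0.0, 1.0)


# Toy volume with intensities in [0, 1) (assumed scaling).
rng = np.random.default_rng(0)
volume = rng.random((8, 8, 8))

# Placeholder perturbation: Gaussian noise, NOT an actual attack.
noisy = volume + rng.normal(scale=0.1, size=volume.shape)

# Enforce the two budgets named in the Figure 5 caption.
adv_4 = project_linf(noisy, volume, 4 / 255)
adv_8 = project_linf(noisy, volume, 8 / 255)

# Toy binary mask to exercise the metric.
mask = volume > 0.5
```

Under these assumptions, a lower DSC between a model's prediction on the perturbed volume and the ground-truth mask indicates a stronger attack, which matches the "lower is better" reading in the caption.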