Prompt2Perturb (P2P): Text-Guided Diffusion-Based Adversarial Attacks on Breast Ultrasound Images

Yasamin Medghalchi, Moein Heidari, Clayton Allard, Leonid Sigal, Ilker Hacihaliloglu

TL;DR

The paper addresses the vulnerability of breast ultrasound classifiers to adversarial attacks in data-scarce medical settings. It introduces Prompt2Perturb (P2P), a text-guided attack that optimizes text embeddings within a frozen latent diffusion model to generate semantically meaningful perturbations, restricting optimization to the early denoising steps ($t \le 50$) for efficiency. Across three datasets, P2P achieves competitive attack success rates with better perceptual fidelity (lower FID and LPIPS) than FGSM, PGD, and Diff-PGD, while preserving ultrasound realism. By leveraging clinical vocabulary and avoiding diffusion-model retraining, P2P offers a practical, data-efficient way to generate realistic adversarial examples in medical imaging and can inform defenses against such attacks.
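The core idea, updating only a conditioning embedding while the generator and the classifier stay frozen, can be illustrated with a deliberately tiny toy. This is a minimal sketch and not the authors' implementation: the linear `generate` function merely stands in for a frozen text-conditioned diffusion model, `classify` is a toy logistic classifier, and every name and number below (`W`, `w`, `lam`, the 4-pixel "image") is an assumption made for illustration. The restriction to early denoising steps is not modeled here.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def generate(x, e, W):
    """Frozen toy 'generator' (assumption): adds an embedding-conditioned residual W @ e to x."""
    return [xi + sum(W[i][j] * e[j] for j in range(len(e))) for i, xi in enumerate(x)]

def classify(x, w, b):
    """Frozen toy classifier (assumption): probability of class 1, e.g. 'malignant'."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def attack_embedding(x, e0, W, w, b, target, steps=500, lr=0.02, lam=5.0):
    """Gradient descent on the embedding only; W, w and b are never updated."""
    e = list(e0)
    for _ in range(steps):
        x_adv = generate(x, e, W)
        p = classify(x_adv, w, b)
        # Derivative of binary cross-entropy w.r.t. the logit is (p - target).
        dlogit = p - target
        # Gradient w.r.t. the generated image: attack term plus a fidelity
        # penalty lam * ||x_adv - x||^2 that keeps the change small.
        gx = [dlogit * w[i] + 2.0 * lam * (x_adv[i] - x[i]) for i in range(len(x))]
        # Chain rule through the frozen generator: grad_e = W^T @ gx.
        ge = [sum(W[i][j] * gx[i] for i in range(len(x))) for j in range(len(e))]
        e = [ej - lr * gj for ej, gj in zip(e, ge)]
    return e

# Hypothetical 4-pixel "image" and frozen parameters.
x = [0.2, 0.6, 0.4, 0.2]
W = [[0.5, 0.0], [0.0, 0.5], [0.5, 0.0], [0.0, 0.5]]
w = [10.0, -10.0, 5.0, 5.0]
b = 0.0

e_adv = attack_embedding(x, [0.0, 0.0], W, w, b, target=1.0)
x_adv = generate(x, e_adv, W)

print("clean P(class 1):      ", round(classify(x, w, b), 3))
print("adversarial P(class 1):", round(classify(x_adv, w, b), 3))
print("max pixel change:      ", round(max(abs(a - c) for a, c in zip(x_adv, x)), 3))
```

In this toy the predicted class flips while the fidelity penalty keeps the pixel change modest. In a real setting the linear map would be replaced by a frozen diffusion model and its text encoder, and the gradient would be obtained by automatic differentiation rather than by hand.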

Abstract

Deep neural networks (DNNs) offer significant promise for improving breast cancer diagnosis in medical imaging. However, these models are highly susceptible to adversarial attacks (small, imperceptible changes that can mislead classifiers), raising critical concerns about their reliability and security. Traditional attacks rely on fixed-norm perturbations, which align poorly with human perception. Diffusion-based attacks, in contrast, require pre-trained models; when such models are unavailable, training them demands substantial data, which limits their practical use in data-scarce scenarios. In medical imaging, this is often infeasible due to the limited availability of datasets. Building on recent advancements in learnable prompts, we propose Prompt2Perturb (P2P), a novel language-guided attack method capable of generating meaningful attack examples driven by text instructions. During the prompt learning phase, our approach leverages learnable prompts within the text encoder to create subtle yet impactful perturbations that remain imperceptible while guiding the model towards targeted outcomes. In contrast to current prompt learning-based approaches, P2P directly updates text embeddings, avoiding the need to retrain diffusion models. Further, we leverage the finding that optimizing only the early reverse diffusion steps boosts efficiency while ensuring that the generated adversarial examples incorporate subtle noise, thus preserving ultrasound image quality without introducing noticeable artifacts. We show that our method outperforms state-of-the-art attack techniques across three breast ultrasound datasets in FID and LPIPS. Moreover, the generated images are both more natural in appearance and more effective than those produced by existing adversarial attacks. Our code will be publicly available at https://github.com/yasamin-med/P2P.
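For contrast, the fixed-norm perturbations that the abstract attributes to traditional attacks can be sketched with a one-step FGSM on a toy logistic classifier. This is an illustrative sketch only: the classifier weights, the 4-pixel "image", and the budget `eps` are assumptions chosen for the example and do not come from the paper.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b):
    """Toy logistic classifier (assumption): probability of class 1."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(g):
    return (g > 0) - (g < 0)

def fgsm(x, y, w, b, eps):
    """One-step FGSM: move every pixel by eps in the direction that raises the loss."""
    p = predict(x, w, b)
    # Gradient of binary cross-entropy w.r.t. the input is (p - y) * w.
    grad = [(p - y) * wi for wi in w]
    # Step along the gradient sign, then clip to the valid pixel range [0, 1].
    return [min(1.0, max(0.0, xi + eps * sign(g))) for xi, g in zip(x, grad)]

# Hypothetical 4-pixel "image" with true label 0 (e.g. 'benign').
x = [0.2, 0.6, 0.4, 0.2]
w = [10.0, -10.0, 5.0, 5.0]
b = 0.0
eps = 0.05

x_fgsm = fgsm(x, y=0.0, w=w, b=b, eps=eps)

print("clean P(class 1):", round(predict(x, w, b), 3))
print("FGSM  P(class 1):", round(predict(x_fgsm, w, b), 3))
print("L-inf distance:  ", round(max(abs(a - c) for a, c in zip(x_fgsm, x)), 3))
```

The perturbation budget here is defined purely by an L-infinity bound on pixel values, independent of image content, which is the property the abstract describes as misaligned with human perception.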

Paper Structure

This paper contains 11 sections, 1 equation, 7 figures, 4 tables, 1 algorithm.

Figures (7)

  • Figure 1: Illustration of P2P compared with Diff-PGD in an adversarial attack; note that our method introduces no visible change in image semantics.
  • Figure 2: Overall framework of the proposed method. Image adapted from lin2023sd
  • Figure 3: Visual comparison of different attack methods on a benign image from the BUSI dataset, using DenseNet121 as the classifier. The second row displays the perturbations, calculated as the difference between the original image and the attacked example.
  • Figure 4: Comparison of original and P2P-attacked ultrasound images from the BUS-BRA dataset, using DenseNet121 as the classifier. The top row shows the original images with their diagnostic labels, while the bottom row displays the same images after applying the P2P attack. Green boxes indicate the true labels, while red boxes show the labels predicted by the classifier after the attack.
  • Figure 5: t-SNE visualization of last-layer ResNet34 features on the BUSI dataset for FGSM, PGD, Diff-PGD, and P2P (Ours). Clean examples are shown in blue, and adversarial examples in orange.
  • ...and 2 more figures