CoRPA: Adversarial Image Generation for Chest X-rays Using Concept Vector Perturbations and Generative Models
Amy Rafferty, Rishi Ramaesh, Ajitha Rajan
TL;DR
The paper addresses robustness gaps in AI-assisted radiology by introducing CoRPA, a clinically grounded black-box adversarial attack that perturbs clinical concepts within radiology reports and uses a text-to-image diffusion model to synthesize adversarial chest X-rays. By labeling MIMIC-CXR-JPG with 17 clinical concepts and evaluating seven backbone architectures, the authors show CoRPA reveals vulnerabilities not exposed by standard attacks, particularly for outer-class perturbations that introduce features from a second pathology. The findings emphasize the need for domain-aware robustness testing and potential defenses, such as adversarial training with CoRPA-generated data, to ensure safe deployment of medical AI in high-stakes settings. The approach also provides a foundation for extending clinically focused adversarial evaluation to other medical imaging modalities and tasks.
Abstract
Deep learning models for medical image classification are increasingly deployed in AI-assisted diagnostic tools, aiming to enhance diagnostic accuracy, reduce clinician workloads, and improve patient outcomes. However, their vulnerability to adversarial attacks poses significant risks to patient safety. Current attack methodologies use general techniques such as model querying or pixel value perturbations to generate adversarial examples designed to fool a model. These approaches may not adequately capture the characteristics of clinical errors, which stem from missed or incorrectly identified clinical features. We propose the Concept-based Report Perturbation Attack (CoRPA), a clinically focused black-box adversarial attack framework tailored to the medical imaging domain. CoRPA leverages clinical concepts to generate adversarial radiological reports and images that closely mirror realistic clinical misdiagnosis scenarios. We demonstrate the utility of CoRPA using the MIMIC-CXR-JPG dataset of chest X-rays and radiological reports. Our evaluation reveals that deep learning models exhibiting strong resilience to conventional adversarial attacks are significantly less robust when subjected to CoRPA's clinically focused perturbations. This underscores the importance of addressing domain-specific vulnerabilities in medical AI systems. By introducing a specialized adversarial attack framework, this study provides a foundation for developing robust, real-world-ready AI models in healthcare, supporting their safe and reliable deployment in high-stakes clinical environments.
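The concept-perturbation step summarized above can be illustrated with a minimal sketch. It assumes each report is encoded as a binary vector over 17 clinical concepts and that an outer-class perturbation switches on concepts associated with a second pathology. The concept names, the class-to-concept mapping, and the function names are hypothetical placeholders rather than the authors' implementation, and the report rewriting and text-to-image diffusion stages are not shown.

```python
from typing import Dict, List, Set

# Hypothetical vocabulary: the paper labels 17 clinical concepts, but these
# generic placeholder names are NOT the authors' concept list.
CONCEPTS: List[str] = [f"concept_{i}" for i in range(17)]

# Hypothetical mapping from a pathology class to the indices of the
# concepts that characterise it (illustrative values only).
CLASS_CONCEPTS: Dict[str, Set[int]] = {
    "class_a": {0, 1, 2, 3},
    "class_b": {4, 5, 6},
}


def outer_class_perturbation(
    vector: List[int], second_class: str, n_add: int
) -> List[int]:
    """Return a copy of `vector` with up to `n_add` currently absent
    concepts of `second_class` switched on."""
    perturbed = list(vector)  # leave the caller's vector untouched
    candidates = sorted(
        i for i in CLASS_CONCEPTS[second_class] if perturbed[i] == 0
    )
    for i in candidates[:n_add]:
        perturbed[i] = 1
    return perturbed


def changed_concepts(original: List[int], perturbed: List[int]) -> List[str]:
    """Name the concepts whose presence differs between two vectors."""
    return [
        CONCEPTS[i]
        for i, (a, b) in enumerate(zip(original, perturbed))
        if a != b
    ]


# A report whose concept vector contains two "class_a" concepts.
original = [1 if i in {0, 1} else 0 for i in range(17)]

# Introduce two features from a second pathology ("class_b").
adversarial = outer_class_perturbation(original, "class_b", n_add=2)
```

In a full pipeline, the list returned by `changed_concepts(original, adversarial)` would tell a report-editing step which findings to add before the perturbed report is passed to the image generator.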
