
CoDA: Exploring Chain-of-Distribution Attacks and Post-Hoc Token-Space Repair for Medical Vision-Language Models

Xiang Chen, Fangfang Yang, Chunlei Meng, Chengyin Hu, Ang Li, Yiwei Wei, Jiahuan Long, Jiujiang Guo

Abstract

Medical vision-language models (MVLMs) are increasingly used as perceptual backbones in radiology pipelines and as the visual front end of multimodal assistants, yet their reliability under real clinical workflows remains underexplored. Prior robustness evaluations often assume clean, curated inputs or study isolated corruptions, overlooking routine acquisition, reconstruction, display, and delivery operations that preserve clinical readability while shifting image statistics. To address this gap, we propose CoDA, a chain-of-distribution framework that constructs clinically plausible pipeline shifts by composing acquisition-like shading, reconstruction and display remapping, and delivery and export degradations. Under masked structural-similarity constraints, CoDA jointly optimizes stage compositions and parameters to induce failures while preserving visual plausibility. Across brain MRI, chest X-ray, and abdominal CT, CoDA substantially degrades the zero-shot performance of CLIP-style MVLMs, with chained compositions consistently more damaging than any single stage. We also evaluate multimodal large language models (MLLMs) as technical-authenticity auditors of imaging realism and quality rather than pathology. Proprietary multimodal models show degraded auditing reliability and persistent high-confidence errors on CoDA-shifted samples, while the medical-specific MLLMs we test exhibit clear deficiencies in medical image quality auditing. Finally, we introduce a post-hoc repair strategy based on teacher-guided token-space adaptation with patch-level alignment, which improves accuracy on archived CoDA outputs. Overall, our findings characterize a clinically grounded threat surface for MVLM deployment and show that lightweight alignment improves robustness in deployment.
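
To make the chained-shift idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes three toy stage functions (`shading`, `remap`, `export`), a single-window `masked_ssim`, a random search in place of whatever optimizer CoDA actually uses, and a hypothetical `score_fn` standing in for a model's confidence in the correct class. All names, parameter ranges, and the threshold `tau` are illustrative assumptions.

```python
import numpy as np

def shading(img, strength):
    """Acquisition-like shading: smooth left-to-right multiplicative ramp (toy bias field)."""
    w = img.shape[1]
    ramp = np.linspace(1.0 - strength, 1.0 + strength, w)[None, :]
    return np.clip(img * ramp, 0.0, 1.0)

def remap(img, gamma):
    """Display remapping: simple gamma curve on intensities in [0, 1]."""
    return np.clip(img, 0.0, 1.0) ** gamma

def export(img, levels):
    """Delivery/export degradation: uniform quantization to `levels` gray levels."""
    levels = int(levels)
    return np.round(img * (levels - 1)) / (levels - 1)

def masked_ssim(x, y, mask, c1=0.01 ** 2, c2=0.03 ** 2):
    """Single-window SSIM computed only over pixels where `mask` is True."""
    xs, ys = x[mask], y[mask]
    mx, my = xs.mean(), ys.mean()
    vx, vy = xs.var(), ys.var()
    cov = ((xs - mx) * (ys - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def search_chain(img, mask, score_fn, tau=0.9, n_trials=200, seed=0):
    """Random search over stage subsets and parameters (assumed optimizer).

    Keeps the candidate with the lowest `score_fn` value among those whose
    masked SSIM against the clean image stays at or above `tau`.
    """
    rng = np.random.default_rng(seed)
    best_img, best_score, best_cfg = img, score_fn(img), {}
    for _ in range(n_trials):
        use = rng.random(3) < 0.5  # which stages to include; order stays fixed
        cfg, out = {}, img
        if use[0]:
            cfg["shading"] = float(rng.uniform(0.0, 0.3))
            out = shading(out, cfg["shading"])
        if use[1]:
            cfg["gamma"] = float(rng.uniform(0.6, 1.6))
            out = remap(out, cfg["gamma"])
        if use[2]:
            cfg["levels"] = int(rng.integers(8, 65))
            out = export(out, cfg["levels"])
        if masked_ssim(img, out, mask) < tau:
            continue  # reject candidates that violate the similarity constraint
        s = score_fn(out)
        if s < best_score:
            best_img, best_score, best_cfg = out, s, cfg
    return best_img, best_score, best_cfg
```

In practice `score_fn` would wrap a zero-shot classifier and `mask` would cover the anatomically relevant region; both are left abstract here because the abstract does not specify them.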

Paper Structure (17 sections, 16 equations, 5 figures, 3 tables)

Figures (5)

  • Figure 1: Qualitative visualization of CoDA composition families. Representative clean and adversarial samples for MRI, X-ray, and CT under the CoDA shift space in Eq. \ref{eq:families_compact}.
  • Figure 2: Auditing performance under clean vs. CoDA shifts. We compare proprietary and medical-specific MLLMs across MRI, X-ray, and CT, reporting clean accuracy, overall accuracy, and accuracy on CoDA-shifted inputs.
  • Figure 3: Post-hoc token-space repair. (a) A frozen teacher optionally guides a lightweight student token adapter (rotW) trained on clean data with task, view-consistency, and distillation losses. (b) Accuracy on archived CoDA outputs for the baseline and repaired models, comparing unguided vs. teacher-guided adaptation across MRI, CT, and X-ray.
  • Figure 4: Ablation of optimization iterations. Attack success rate under CoDA as a function of the optimization iteration budget, reported for MRI, X-ray, and CT across BioMed-CLIP, UniMed-CLIP, BMC-CLIP, and Rad-CLIP.
  • Figure 5: Ablation of repair hyperparameters. Robust accuracy under CoDA shifts as a function of $\lambda_{\mathit{cons}}$ and $\lambda_{\mathit{dist}}$, compared against the clean baseline (gray) and the attack baseline (red dashed), reported for MRI/X-ray/CT.
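
Read together, the captions of Figures 3 and 5 suggest that the student adapter is trained on a weighted combination of the task, view-consistency, and distillation losses. One plausible form is sketched below as an assumption; the exact objective is not reproduced in this summary, and the symbols $\mathcal{L}_{\mathit{task}}$, $\mathcal{L}_{\mathit{cons}}$, and $\mathcal{L}_{\mathit{dist}}$ are illustrative names rather than the paper's notation.

```latex
\mathcal{L}_{\mathit{repair}}
  = \mathcal{L}_{\mathit{task}}
  + \lambda_{\mathit{cons}}\,\mathcal{L}_{\mathit{cons}}
  + \lambda_{\mathit{dist}}\,\mathcal{L}_{\mathit{dist}}
```

Under this reading, the unguided variant in Figure 3(b) would correspond to dropping the distillation term ($\lambda_{\mathit{dist}} = 0$), and Figure 5 would sweep the two weights.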