MediCLIP: Adapting CLIP for Few-shot Medical Image Anomaly Detection

Ximiao Zhang, Min Xu, Dehui Qiu, Ruixin Yan, Ning Lang, Xiuzhuang Zhou

TL;DR

MediCLIP tackles few-shot medical image anomaly detection by adapting CLIP through learnable prompts and adapters, augmented with multi-task anomaly synthesis to generate diverse synthetic abnormalities. The method aligns multi-scale visual features with text-derived normal and anomaly prompts to produce pixel-level anomaly maps, enabling detection and localization with limited normal data. Across CheXpert, BrainMRI, and BUSI, MediCLIP achieves roughly a 10% boost in Image-AUROC over strong baselines and approaches full-shot performance on CheXpert with under 1% of training images, also demonstrating zero-shot generalization. This approach offers a cost-efficient, versatile, potentially unified framework for medical anomaly detection and localization in clinical settings.
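
The alignment step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the temperature value, the averaging across scales, and the max-pixel image score are all assumptions made for illustration. It assumes non-zero feature vectors and per-scale maps that have already been resized to the same grid.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def anomaly_map(patch_grid, normal_text, anomaly_text, temperature=0.07):
    """Per-patch anomaly probability for one feature scale.

    patch_grid: H x W grid of patch feature vectors (nested lists).
    normal_text / anomaly_text: text embeddings for the two prompt classes.
    temperature: softmax temperature (hypothetical value, not from the paper).
    """
    result = []
    for row in patch_grid:
        out_row = []
        for feat in row:
            s_normal = cosine(feat, normal_text) / temperature
            s_anomaly = cosine(feat, anomaly_text) / temperature
            # Two-way softmax over (normal, anomaly), written as a sigmoid
            # of the logit difference.
            out_row.append(1.0 / (1.0 + math.exp(s_normal - s_anomaly)))
        result.append(out_row)
    return result

def fuse_scales(maps):
    """Average same-sized anomaly maps from several feature scales."""
    n = len(maps)
    height, width = len(maps[0]), len(maps[0][0])
    return [[sum(m[i][j] for m in maps) / n for j in range(width)]
            for i in range(height)]

def image_score(fused):
    """Image-level score as the maximum pixel score (an assumed reduction)."""
    return max(max(row) for row in fused)
```

A patch whose feature lies closer to the anomaly prompt embedding than to the normal one receives a score above 0.5; the fused map then serves for localization, and its reduction serves for image-level detection.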

Abstract

Precise anomaly detection in medical imaging plays a pivotal role in supporting clinical decision-making. However, previous work relies on large-scale datasets to train anomaly detection models, which increases development cost. This paper is the first to focus on medical image anomaly detection in the few-shot setting, which matters greatly in the medical field, where data collection and annotation are both very expensive. We propose MediCLIP, which adapts the CLIP model to few-shot medical image anomaly detection through self-supervised fine-tuning. Although CLIP, as a vision-language model, demonstrates outstanding zero-/few-shot performance on various downstream tasks, it still falls short in anomaly detection on medical images. To address this, we design a series of medical image anomaly synthesis tasks that simulate common disease patterns in medical imaging, transferring the strong generalization capabilities of CLIP to medical image anomaly detection. When only few-shot normal medical images are provided, MediCLIP achieves state-of-the-art performance in anomaly detection and localization compared to other methods. Extensive experiments on three distinct medical anomaly detection tasks demonstrate the superiority of our approach. The code is available at https://github.com/cnulab/MediCLIP.
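
To make the idea of self-supervised anomaly synthesis concrete, here is a minimal sketch of one generic synthesis task: copying a patch from one location of a normal image and pasting it at another, which yields a synthetic abnormal image together with a pixel-level mask. This is an illustrative assumption, not the specific set of tasks used in the paper; the function name `cut_paste` and its parameters are hypothetical.

```python
import random

def cut_paste(image, patch_h, patch_w, rng=None):
    """Paste a randomly chosen patch of a normal image at a random location.

    image: H x W grid of pixel intensities (nested lists), left unmodified.
    Returns (synthetic_image, mask), where mask is 1 inside the pasted region.
    """
    rng = rng or random.Random()
    h, w = len(image), len(image[0])
    # Top-left corners of the source and destination regions.
    sy, sx = rng.randrange(h - patch_h + 1), rng.randrange(w - patch_w + 1)
    dy, dx = rng.randrange(h - patch_h + 1), rng.randrange(w - patch_w + 1)
    # Copy the patch first so overlapping regions cannot corrupt it.
    patch = [row[sx:sx + patch_w] for row in image[sy:sy + patch_h]]
    out = [row[:] for row in image]
    mask = [[0] * w for _ in range(h)]
    for i in range(patch_h):
        for j in range(patch_w):
            out[dy + i][dx + j] = patch[i][j]
            mask[dy + i][dx + j] = 1
    return out, mask
```

Because the mask is known by construction, a model can be fine-tuned on localization using only normal images, which is what makes this style of training self-supervised.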

Paper Structure

This paper contains 11 sections, 5 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: The overall pipeline of our proposed MediCLIP framework.
  • Figure 2: Synthetic anomaly image examples from diverse tasks.
  • Figure 3: Zero-shot anomaly detection performance of MediCLIP with support set size $k = 16$.
  • Figure 4: Qualitative results of MediCLIP. Each group contains the anomaly image and the predicted anomaly map.