DINO-AD: Unsupervised Anomaly Detection with Frozen DINO-V3 Features

Jiayu Huo, Jingyuan Hong, Liyun Chen

TL;DR

This work tackles unsupervised anomaly detection in medical imaging with DINO-AD, a training-free framework built on frozen DINO-V3 features. It uses embedding similarity matching to select a semantically aligned normal support image, applies foreground-aware K-means clustering to obtain normal feature prototypes, and computes anomaly maps from the cosine similarity between query features and these centroids. The approach reports state-of-the-art performance on the Brain and Liver datasets, with AUROC up to 98.71 and clearer anomaly localization in qualitative comparisons. Because it avoids dataset-specific fine-tuning, the method is well suited to label-efficient clinical deployment and cross-domain application.
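The support-image selection step can be illustrated with a minimal sketch. It assumes each image is summarized by a single global embedding vector and that the support image is the normal image with the highest cosine similarity to the query; the function names and the toy vectors are hypothetical stand-ins, not the authors' implementation or real DINO-V3 outputs.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def select_support(query_emb, normal_embs):
    """Return (index, similarity) of the normal embedding closest to the query."""
    sims = [cosine(query_emb, e) for e in normal_embs]
    best = max(range(len(sims)), key=sims.__getitem__)
    return best, sims[best]

# Toy global embeddings (assumption: stand-ins for frozen-backbone features).
normal_bank = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.6, 0.8, 0.0]]
query = [0.5, 0.9, 0.1]

idx, sim = select_support(query, normal_bank)
print(f"selected support image {idx} with similarity {sim:.3f}")
```

In practice the embeddings would come from the frozen backbone and the search would run over the full set of normal images, typically with an array library rather than plain lists.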

Abstract

Unsupervised anomaly detection (AD) in medical images aims to identify abnormal regions without relying on pixel-level annotations, which is crucial for scalable and label-efficient diagnostic systems. In this paper, we propose a novel anomaly detection framework based on DINO-V3 representations, termed DINO-AD, which leverages self-supervised visual features for precise and interpretable anomaly localization. Specifically, we introduce an embedding similarity matching strategy to select a semantically aligned support image and a foreground-aware K-means clustering module to model the distribution of normal features. Anomaly maps are then computed by comparing the query features with clustered normal embeddings through cosine similarity. Experimental results on both the Brain and Liver datasets demonstrate that our method achieves superior quantitative performance compared with state-of-the-art approaches, achieving AUROC scores of up to 98.71. Qualitative results further confirm that our framework produces clearer and more accurate anomaly localization. Extensive ablation studies validate the effectiveness of each proposed component, highlighting the robustness and generalizability of our approach.
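The clustering and scoring steps described in the abstract can be sketched as follows. This is a simplified illustration under several assumptions that the abstract does not specify: the foreground mask is given as a list of booleans, clustering uses a spherical K-means variant (assignment by cosine similarity, unit-normalized centroids), and the anomaly score of a query patch is one minus its highest cosine similarity to any centroid. All names and the 2-D toy features are hypothetical.

```python
import math
import random

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def kmeans_cosine(patches, k, iters=20, seed=0):
    """Spherical K-means: assign by cosine similarity, re-normalize centroids."""
    rng = random.Random(seed)
    pts = [normalize(p) for p in patches]
    centroids = rng.sample(pts, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in pts:
            j = max(range(k), key=lambda c: dot(p, centroids[c]))
            groups[j].append(p)
        for j, g in enumerate(groups):
            if g:  # keep the previous centroid if a cluster goes empty
                mean = [sum(col) / len(g) for col in zip(*g)]
                centroids[j] = normalize(mean)
    return centroids

def anomaly_scores(query_patches, centroids):
    """Per-patch score: 1 - max cosine similarity to the normal centroids."""
    return [1.0 - max(dot(normalize(q), c) for c in centroids)
            for q in query_patches]

# Toy support-image patch features; the last patch is background.
support_patches = [[1.0, 0.1], [0.9, 0.0], [0.1, 1.0], [0.0, 0.9], [-1.0, -1.0]]
foreground = [True, True, True, True, False]  # assumed mask, source unspecified

# Foreground-aware step: only foreground patches contribute to the prototypes.
fg_patches = [p for p, keep in zip(support_patches, foreground) if keep]
centroids = kmeans_cosine(fg_patches, k=2)

# Two normal-looking query patches and one that matches no prototype.
query_patches = [[1.0, 0.05], [0.05, 1.0], [-1.0, -0.2]]
scores = anomaly_scores(query_patches, centroids)
print([round(s, 3) for s in scores])
```

Reshaping the per-patch scores back to the patch grid and upsampling them to the image resolution would give a pixel-level anomaly map. The number of clusters and the way the foreground mask is obtained are design choices this sketch leaves open.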

Paper Structure (11 sections, 6 equations, 2 figures, 3 tables)

Figures (2)

  • Figure 1: Overview of the DINO-AD framework. The pipeline is training-free, which allows it to generalize across diverse datasets and anomaly types without requiring labeled data or prior model adaptation.
  • Figure 2: Qualitative results of the generated anomaly maps. AnomalyDINO is used as the baseline model.