Semantic Iterative Reconstruction: One-Shot Universal Anomaly Detection

Ning Zhu

Abstract

Unsupervised medical anomaly detection is severely limited by the scarcity of normal training samples. Existing methods typically train dedicated models for each dataset or disease, requiring hundreds of normal images per task and lacking cross-modality generalization. We propose Semantic Iterative Reconstruction (SIR), a framework that enables a single universal model to detect anomalies across diverse medical domains using extremely few normal samples. SIR leverages a pretrained teacher encoder to extract multi-scale deep features and employs a compact up-then-down decoder with multi-loop iterative refinement to enforce robust normality priors in deep feature space. The framework adopts a one-shot universal design: a single model is trained by mixing exactly one normal sample from each of nine heterogeneous datasets, enabling effective anomaly detection on all corresponding test sets without task-specific retraining. Extensive experiments on nine medical benchmarks demonstrate that SIR achieves state-of-the-art performance under all four settings (one-shot universal, full-shot universal, one-shot specialized, and full-shot specialized), consistently outperforming previous methods. SIR offers an efficient and scalable solution for multi-domain clinical anomaly detection. Code is available at https://github.com/jusufzn212427/sir4ad.
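
The one-shot universal setting described above (exactly one normal sample drawn from each of nine datasets, then mixed into a single training set) can be illustrated with a minimal sketch. The function name `build_one_shot_mix`, the `dataset_i` labels, and the file paths are hypothetical placeholders, not the names used in the released code, and uniform random selection is an assumption since the abstract does not say how the single sample is chosen.

```python
import random


def build_one_shot_mix(normal_pools, seed=0):
    """Pick exactly one normal sample per dataset and return the mixed list.

    normal_pools maps a dataset name to a list of normal-sample identifiers.
    Uniform random choice is an assumption made for this sketch.
    """
    rng = random.Random(seed)
    mix = []
    for name in sorted(normal_pools):  # sorted for a reproducible order
        pool = normal_pools[name]
        if not pool:
            raise ValueError(f"dataset {name!r} has no normal samples")
        mix.append((name, rng.choice(pool)))
    return mix


# Hypothetical stand-ins for nine heterogeneous datasets of normal images.
pools = {
    f"dataset_{i}": [f"dataset_{i}/normal_{j}.png" for j in range(100)]
    for i in range(9)
}
train_set = build_one_shot_mix(pools)
print(len(train_set))  # one entry per dataset
```

A single model would then be trained on `train_set` as a whole, in contrast to the specialized settings, where each dataset gets its own model.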

Paper Structure

This paper contains 13 sections, 5 equations, 2 figures, 5 tables.

Figures (2)

  • Figure 1: Overview of the SIR framework. The architecture consists of a frozen pre-trained teacher encoder $E$ and a student decoder $D$. During training, the teacher encoder processes one normal image from each of multiple source domains, extracting multi-scale deep features. The student decoder then performs iterative semantic reconstruction on the most compressed features through recurrent loops. At inference time, for any previously unseen target domain, the same single model receives only a test image $x$, and the anomaly score and pixel-level anomaly map are obtained from the reconstruction discrepancies accumulated across all iterative loops.
  • Figure 2: Qualitative visualization of anomaly maps generated by SIR under the one-shot universal detection setting. For normal samples, all four maps remain uniformly clean across loops. For abnormal samples, the anomaly regions become progressively sharper and more localized as the number of loops increases, with the final fused map providing the clearest boundary and highest contrast.
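
The captions above describe anomaly maps obtained from reconstruction discrepancies accumulated across loops. The following is a minimal sketch of that idea, not the paper's implementation: it assumes the discrepancy is a per-location cosine distance between teacher and reconstructed features, that the per-loop maps are fused by averaging, and that the image-level score is the maximum of the fused map. None of these choices is confirmed by the text shown here. Upsampling the map to input resolution is omitted, and the names `cosine_distance_map` and `fuse_loops` are hypothetical.

```python
import numpy as np


def cosine_distance_map(teacher_feat, student_feat, eps=1e-8):
    """Per-location cosine distance between two (C, H, W) feature maps."""
    dot = (teacher_feat * student_feat).sum(axis=0)
    norm = np.linalg.norm(teacher_feat, axis=0) * np.linalg.norm(student_feat, axis=0)
    return 1.0 - dot / (norm + eps)


def fuse_loops(teacher_feat, loop_recons):
    """Average per-loop discrepancy maps; return (fused map, image-level score).

    Mean fusion and max-pooling for the score are assumptions of this sketch.
    """
    maps = np.stack([cosine_distance_map(teacher_feat, r) for r in loop_recons])
    fused = maps.mean(axis=0)
    return fused, float(fused.max())


# Toy example: three loops whose reconstructions match the teacher features
# everywhere except one location, where the feature vector is sign-flipped.
rng = np.random.default_rng(0)
teacher = rng.normal(size=(8, 4, 4))
recons = []
for _ in range(3):
    r = teacher.copy()
    r[:, 1, 2] = -teacher[:, 1, 2]  # simulated anomalous location
    recons.append(r)

fused_map, score = fuse_loops(teacher, recons)
peak = np.unravel_index(fused_map.argmax(), fused_map.shape)
print(peak, round(score, 3))
```

In this toy case the fused map is near zero wherever the reconstruction matches the teacher features and peaks at the perturbed location, which mirrors the qualitative behavior described for Figure 2.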