Training-Free Zero-Shot Anomaly Detection in 3D Brain MRI with 2D Foundation Models

Tai Le-Gia, Jaehyun Ahn

TL;DR

A fully training-free framework for ZSAD in 3D brain MRI that constructs localized volumetric tokens by aggregating multi-axis slices processed by 2D foundation models. The resulting compact 3D representations are practical to compute on standard GPUs and require no fine-tuning, prompts, or supervision.

Abstract

Zero-shot anomaly detection (ZSAD) has gained increasing attention in medical imaging as a way to identify abnormalities without task-specific supervision, but most advances remain limited to 2D datasets. Extending ZSAD to 3D medical images has proven challenging, with existing methods relying on slice-wise features and vision-language models, which fail to capture volumetric structure. In this paper, we introduce a fully training-free framework for ZSAD in 3D brain MRI that constructs localized volumetric tokens by aggregating multi-axis slices processed by 2D foundation models. These 3D patch tokens restore cubic spatial context and integrate directly with distance-based, batch-level anomaly detection pipelines. The framework provides compact 3D representations that are practical to compute on standard GPUs and require no fine-tuning, prompts, or supervision. Our results show that training-free, batch-based ZSAD can be effectively extended from 2D encoders to full 3D MRI volumes, offering a simple and robust approach for volumetric anomaly detection.
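
The pipeline described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the authors' implementation: `toy_encoder` is a hypothetical stand-in for a frozen 2D foundation model, the cubic tokens are formed by averaging `patch` consecutive slice features along each axis and concatenating the three axis-wise grids, and `batch_scores` is a simplified nearest-neighbor scoring across the volumes in a batch.

```python
import numpy as np

def toy_encoder(slice2d, patch=4):
    """Hypothetical stand-in for a frozen 2D foundation model.
    Returns an (H/patch, W/patch, 2) grid holding each 2D patch's mean and std."""
    h, w = slice2d.shape
    g = slice2d.reshape(h // patch, patch, w // patch, patch)
    return np.stack([g.mean(axis=(1, 3)), g.std(axis=(1, 3))], axis=-1)

def axis_tokens(volume, axis, patch=4):
    """Encode every slice along `axis`, then average groups of `patch`
    consecutive slices so that each token covers a patch^3 cube."""
    vol = np.moveaxis(volume, axis, 0)                       # slicing axis first
    feats = np.stack([toy_encoder(s, patch) for s in vol])   # (D, h, w, C)
    d = feats.shape[0] // patch
    feats = feats[: d * patch].reshape(d, patch, *feats.shape[1:]).mean(axis=1)
    return np.moveaxis(feats, 0, axis)                       # restore axis order

def volume_tokens(volume, patch=4):
    """Concatenate the three axis-wise grids into one 3D token grid."""
    grids = [axis_tokens(volume, a, patch) for a in range(3)]
    return np.concatenate(grids, axis=-1)                    # (X/p, Y/p, Z/p, 3C)

def batch_scores(tokens, k=2):
    """Assumed scoring rule: for each token, take its nearest-neighbor distance
    to every other volume in the batch and average the k smallest of those."""
    flat = [t.reshape(-1, t.shape[-1]) for t in tokens]
    out = []
    for i, fi in enumerate(flat):
        mins = []
        for j, fj in enumerate(flat):
            if j == i:
                continue
            dist = np.linalg.norm(fi[:, None, :] - fj[None, :, :], axis=-1)
            mins.append(dist.min(axis=1))
        nearest = np.sort(np.stack(mins, axis=1), axis=1)[:, :k]
        out.append(nearest.mean(axis=1).reshape(tokens[i].shape[:3]))
    return out

# Synthetic batch: four noise volumes, one with a bright cube (the "anomaly").
rng = np.random.default_rng(0)
batch = [rng.normal(size=(16, 16, 16)) for _ in range(4)]
batch[0][4:8, 4:8, 4:8] += 5.0

tokens = [volume_tokens(v) for v in batch]
scores = batch_scores(tokens)
peak = tuple(int(i) for i in np.unravel_index(scores[0].argmax(), scores[0].shape))
print("token grid shape:", tokens[0].shape)
print("highest-scoring token in volume 0:", peak)
```

In this toy setting the highest-scoring token in the first volume is the one covering the injected cube, because no token in the other volumes resembles it. A real encoder would return high-dimensional patch embeddings rather than two statistics per patch, but the aggregation and scoring steps would have the same shape.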

Paper Structure (39 sections, 1 theorem, 13 equations, 1 figure, 11 tables)

Key Result

Lemma 1

Let $\mathbf{x}^N$ and $\mathbf{x}^A$ be two volumes that differ only within a patch of $p$ slices, and let $\mathcal{A} \subset \mathcal{P}$ denote the anomalous subset with fraction $\alpha = |\mathcal{A}| / p$. Assume the encoder features satisfy: Then the patch-level feature difference satisfies In particular, if $\alpha \Delta_0 \gg (1-\alpha)\varepsilon$, the anomalous patch token is strict
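
The condition $\alpha \Delta_0 \gg (1-\alpha)\varepsilon$ can be made concrete with a small numerical check. The lemma's assumptions and bound are not reproduced in this listing, so the sketch below rests entirely on assumptions of its own: the patch token is taken to be the mean of the $p$ slice features, the anomalous slices are shifted by exactly `delta0` along one shared direction, and every other slice is perturbed by a vector of norm `eps`. Under these assumptions the triangle inequality places the token difference between $\alpha\Delta_0 - (1-\alpha)\varepsilon$ and $\alpha\Delta_0 + (1-\alpha)\varepsilon$.

```python
import numpy as np

rng = np.random.default_rng(0)
p, dim = 8, 16            # slices per patch, feature dimension (illustrative values)
n_anom = 2                # number of anomalous slices in the patch
alpha = n_anom / p
delta0, eps = 1.0, 0.05   # assumed anomalous shift and normal-slice perturbation

# Slice features of the normal volume within one patch.
normal = rng.normal(size=(p, dim))

# Perturbations of norm eps for the slices that stay normal.
noise = rng.normal(size=(p, dim))
shift = eps * noise / np.linalg.norm(noise, axis=1, keepdims=True)

# Anomalous slices move by delta0 along a shared direction (an assumption).
direction = np.zeros(dim)
direction[0] = 1.0
shift[:n_anom] = delta0 * direction

anomalous = normal + shift

# Assumed aggregation: the patch token is the mean of its slice features.
diff = np.linalg.norm(anomalous.mean(axis=0) - normal.mean(axis=0))
lower = alpha * delta0 - (1 - alpha) * eps
upper = alpha * delta0 + (1 - alpha) * eps
print(f"lower={lower:.4f}  diff={diff:.4f}  upper={upper:.4f}")
```

When `alpha * delta0` dominates `(1 - alpha) * eps`, the token difference stays close to `alpha * delta0`, so a few anomalous slices still move the averaged token by a clearly nonzero amount.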

Figures (1)

  • Figure 1: Anomaly segmentation on 3D T2-weighted BraTS scans. Yellow contours show ground-truth boundaries, and blue regions denote predicted anomaly masks. Rows illustrate high-, medium-, and low-Dice cases for CoDeGraph3D and APRIL-GAN baselines.

Theorems & Definitions (2)

  • Lemma 1: Patch-Level Sensitivity to Local Anomalies
  • Proof