IterMask3D: Unsupervised Anomaly Detection and Segmentation with Test-Time Iterative Mask Refinement in 3D Brain MR

Ziyun Liang, Xiaoqing Guo, Wentian Xu, Yasin Ibrahim, Natalie Voets, Pieter M Pretorius, J. Alison Noble, Konstantinos Kamnitsas

TL;DR

This work tackles unsupervised anomaly segmentation in 3D brain MRI by reframing reconstruction-based detection with IterMask3D, an iterative mask-refinement framework guided by high-frequency structural information. The method combines iterative mask shrinking, Fourier-based high-frequency conditioning, and a subject-specific thresholding strategy to balance sensitivity and precision without relying on anomaly priors. Empirical results across synthetic and real artifacts, plus 3D pathology datasets (BraTS, ISLES), show competitive or superior performance to strong baselines, especially in artifact detection and robust segmentation without validation-data thresholds. The approach offers practical value for scan quality control and incidental lesion screening, while revealing avenues for handling domain shifts and intensity variability in clinical MRI.

Abstract

Unsupervised anomaly detection and segmentation methods train a model to learn the training distribution as 'normal'. In the testing phase, they identify patterns that deviate from this normal distribution as 'anomalies'. To learn the 'normal' distribution, prevailing methods corrupt the images and train a model to reconstruct them. During testing, the model attempts to reconstruct corrupted inputs based on the learned 'normal' distribution. Deviations from this distribution lead to high reconstruction errors, which indicate potential anomalies. However, corrupting an input image inevitably causes information loss even in normal regions, leading to suboptimal reconstruction and an increased risk of false positives. To alleviate this, we propose IterMask3D, an iterative spatial mask-refining strategy designed for 3D brain MRI. We iteratively mask areas of the image as corruption and reconstruct them, then shrink the mask based on the reconstruction error. This process progressively unmasks 'normal' areas to the model, and their information helps the model reconstruct the 'normal' patterns still under the mask more accurately, reducing false positives. In addition, to improve reconstruction further, we propose using high-frequency image content as additional structural information to guide the reconstruction of the masked area. Extensive experiments on the detection of both synthetic and real-world imaging artifacts, as well as segmentation of various pathological lesions across multiple MRI sequences, consistently demonstrate the effectiveness of our proposed method. Code is available at https://github.com/ZiyunLiang/IterMask3D.
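The mask-shrinking loop described in the abstract can be illustrated with a minimal sketch on a toy 1D signal. This is not the authors' implementation: `make_toy_model`, its `prior` parameter, the fixed threshold `tau`, and the median-based "reconstruction" are hypothetical stand-ins for the trained 3D network, and the high-frequency guidance is omitted for brevity.

```python
from statistics import median
from typing import Callable, List

Signal = List[float]
Mask = List[bool]  # True = voxel is still masked (hidden from the model)


def make_toy_model(prior: float) -> Callable[[Signal, Mask], Signal]:
    """Hypothetical stand-in for the trained reconstruction network."""

    def reconstruct(x: Signal, mask: Mask) -> Signal:
        visible = [v for v, m in zip(x, mask) if not m]
        # With nothing visible, fall back to an assumed training-set level
        # `prior`; otherwise use the visible context (here simply its median).
        fill = median(visible) if visible else prior
        return [fill if m else v for v, m in zip(x, mask)]

    return reconstruct


def iterative_mask_refinement(
    x: Signal,
    reconstruct: Callable[[Signal, Mask], Signal],
    tau: float,
    max_iters: int = 50,
) -> Mask:
    """Start fully masked, then repeatedly unmask voxels with low error."""
    mask = [True] * len(x)
    for _ in range(max_iters):
        recon = reconstruct(x, mask)
        # A voxel stays masked only if it was masked and its error exceeds tau.
        shrunk = [m and abs(r - v) > tau for v, r, m in zip(x, recon, mask)]
        if shrunk == mask:  # converged: the mask no longer shrinks
            break
        mask = shrunk
    return mask


# Toy "scan": normal intensity around 1.0, with an anomaly at indices 4 and 5.
scan = [1.0, 1.1, 0.9, 0.7, 5.0, 5.2, 1.0, 0.95]
final_mask = iterative_mask_refinement(scan, make_toy_model(prior=1.3), tau=0.5)
print(final_mask)
```

In this toy run, the voxel at index 3 (value 0.7) stays masked after the first pass because the assumed prior of 1.3 is slightly off. It is released in the second pass, once the newly visible context pulls the reconstruction toward 1.0. That is the false-positive reduction the abstract attributes to iterative unmasking.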

Paper Structure

This paper contains 20 sections, 20 equations, 8 figures, 7 tables, 1 algorithm.

Figures (8)

  • Figure 1: Sensitivity–Precision Trade-off in reconstruction-based methods: When corruption (e.g., noise or compression) is lower, the reconstruction error over anomalous regions (the hyper-intense tumor) remains low, often resulting in missed detections. As corruption increases, errors in abnormal regions become more pronounced, improving sensitivity—but at the cost of higher errors in normal regions, which can increase false positives and reduce precision.
  • Figure 2: Overview of the proposed approach. (a) Frequency-based masking: Extract structural information $x_f$ with a high-pass filter to isolate the high-frequency components of the image. (b) Training process: Input image $x$ is masked using random spatial masking, and the model learns to reconstruct the masked area with $x_f$ generated in (a) as an auxiliary input. (c) Iterative mask refinement: Iterative refinement of the spatial mask, where the mask $m_t$ at iteration $t$ gradually shrinks towards the anomalous region. This refinement is guided by the spatially unmasked portions of the image $x_m$ and $x_f$ from (a). (d) Subject-Specific Thresholding: This component illustrates how the threshold $\tau$ is determined for iterative mask refinement on a per-sample basis. The mask is progressively shrunk by 1% of the brain volume at each time step $t$ until it fully disappears. We model $\tau(t)$ using a fitted function, and compute its derivative to identify the point where the threshold begins to change abruptly. This point is selected as the sample-specific threshold $\tau$ for the current sample.
  • Figure 3: Illustration of threshold dynamics with fixed shrinking speed of the mask. The plots show four randomly chosen samples, two from the BraTS FLAIR sequence (left) and two from the BraTS T2 sequence (right). For each sample, the mask is progressively shrunk by 1% of the brain volume per iteration until it fully disappears. The purple points represent the threshold required to produce the current shrunken mask based on the reconstruction error map, with the red curve showing the fitted function. Orange points indicate the corresponding Dice scores between the mask and the ground truth anomaly at each iteration. As the mask approaches the anomaly, the Dice score increases and peaks when the anomaly is optimally exposed. The threshold, in turn, rises gradually and then increases sharply near the anomaly boundary, indicating a distinct shift in reconstruction error and serving as a marker for finding the optimal stopping threshold.
  • Figure 4: Illustration of the datasets used in the format: dataset name (sequence). First row, left to right, ADNI (FLAIR), OASIS (T2), Private Artifact dataset (T2); Second row, left to right, BraTS (FLAIR), BraTS (T2), BraTS (T1ce); Third row, left to right, BraTS (T1), ISLES (FLAIR) (with small lesion), ISLES (FLAIR) (with big lesion); Fourth row, three samples from ADNI (FLAIR) with problematic skull-stripping.
  • Figure 5: Visualization of synthetic anomalies of varying extent added to the image, along with the $AUC$ score of the model's anomaly detection performance.
  • ...and 3 more figures
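The subject-specific thresholding described in the captions of Figures 2(d) and 3 can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' procedure: it uses a single fixed error map instead of recomputing the reconstruction at every step, and it replaces the fitted function and its derivative with finite differences. The names `threshold_curve`, `select_threshold`, and the `slope_factor` parameter are hypothetical.

```python
from statistics import median
from typing import List, Tuple


def threshold_curve(errors: List[float], step_frac: float = 0.01) -> List[float]:
    """tau(t): threshold needed to unmask t * step_frac of all voxels."""
    ranked = sorted(errors)  # ascending reconstruction error
    n = len(ranked)
    n_steps = round(1.0 / step_frac)
    taus = []
    for t in range(1, n_steps):
        # Number of lowest-error voxels released from the mask by step t.
        n_unmasked = min(n - 1, max(1, round(t * n / n_steps)))
        taus.append(ranked[n_unmasked - 1])
    return taus


def select_threshold(taus: List[float], slope_factor: float = 10.0) -> Tuple[int, float]:
    """Return (step t, tau) just before the curve starts rising abruptly.

    Assumption: "abrupt" means a step-to-step increase larger than
    slope_factor times the median increase.
    """
    slopes = [b - a for a, b in zip(taus, taus[1:])]
    typical = median(slopes)
    for i, slope in enumerate(slopes):
        if slope > slope_factor * typical:
            return i + 1, taus[i]  # steps are 1-based
    return len(taus), taus[-1]  # no abrupt change found


# Toy error map: 900 "normal" voxels with small errors, 100 anomalous ones.
errors = [i * 0.1 / 900 for i in range(900)] + [1.0 + j * 0.01 for j in range(100)]
step, tau = select_threshold(threshold_curve(errors))
n_flagged = sum(e > tau for e in errors)
print(step, n_flagged)
```

On this toy error map, the curve rises slowly while only normal voxels are being released and jumps once the mask reaches the anomalous 10% of voxels. The selected threshold therefore separates the two groups without any validation data.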