AnoRefiner: Anomaly-Aware Group-Wise Refinement for Zero-Shot Industrial Anomaly Detection

Dayou Huang, Feng Xue, Xurui Li, Yu Zhou

TL;DR

This work tackles the coarse localization problem in zero-shot industrial anomaly detection by introducing AnoRefiner, a plug-and-play refinement framework. It combines an Anomaly Refinement Decoder (ARD) with a Progressive Group-wise Test-Time Training (PGT) procedure to achieve pixel-level anomaly segmentation without real anomaly labels. ARD uses anomaly score maps, through anomaly-attention and bidirectional refinement, to suppress background and enhance anomalous cues, while PGT simulates production-like group-wise adaptation. Experiments on MVTec AD and VisA show consistent pixel-level gains across multiple ZSAD backbones (up to 5.2% pixel-AP) and robustness to pseudo-normal contamination, which suggests the method is practical for automated industrial inspection.
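To make the idea of using a score map to suppress background concrete, here is a minimal sketch. It is not the paper's Anomaly-Attention Module: it assumes a plain nearest-neighbour upsampling of a patch-level score map followed by element-wise gating of a single-channel feature map. The function names (`upsample_nearest`, `normalize`, `gate_features`) and the min-max normalization are illustrative assumptions.

```python
from typing import List

Grid = List[List[float]]


def upsample_nearest(score_map: Grid, factor: int) -> Grid:
    """Repeat each patch-level score over a factor x factor pixel block."""
    out: Grid = []
    for row in score_map:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def normalize(score_map: Grid) -> Grid:
    """Min-max scale scores to [0, 1]; a constant map becomes all zeros."""
    lo = min(min(row) for row in score_map)
    hi = max(max(row) for row in score_map)
    span = hi - lo
    if span == 0:
        return [[0.0 for _ in row] for row in score_map]
    return [[(value - lo) / span for value in row] for row in score_map]


def gate_features(features: Grid, score_map: Grid, factor: int) -> Grid:
    """Weight each pixel feature by the upsampled, normalized anomaly score.

    Assumption: `features` has shape (H * factor) x (W * factor), where the
    score map has shape H x W. Low-score (background) pixels are damped.
    """
    gate = upsample_nearest(normalize(score_map), factor)
    return [
        [f * g for f, g in zip(feature_row, gate_row)]
        for feature_row, gate_row in zip(features, gate)
    ]
```

The blocky output of `upsample_nearest` also illustrates why patch-level maps look coarse: every pixel in a patch inherits the same score.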

Abstract

Zero-shot industrial anomaly detection (ZSAD) methods typically yield coarse anomaly maps because vision transformers (ViTs) extract only patch-level features. Recent solutions attempt to predict finer anomalies from ZSAD features, but they still miss fine-grained anomalies, mainly because of the gap between randomly synthesized training anomalies and real ones. We observe that anomaly score maps provide complementary spatial cues that are largely absent from ZSAD's image features, a fact previously overlooked. Inspired by this, we propose an anomaly-aware refiner (AnoRefiner) that can be plugged into most ZSAD models to refine patch-level anomaly maps to the pixel level. First, we design an anomaly refinement decoder (ARD) that progressively enhances image features using anomaly score maps, reducing the reliance on synthetic anomaly data. Second, motivated by the mass-production paradigm, we propose a progressive group-wise test-time training (PGT) strategy that trains ARD on each product group and uses it for refinement in the next group, while staying compatible with any ZSAD method. Experiments on the MVTec AD and VisA datasets show that AnoRefiner improves various ZSAD models by up to 5.2% in pixel-AP, a gain that is also visible in many visualizations. The code will be available at https://github.com/HUST-SLOW/AnoRefiner.
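The group-wise schedule described above (train on the current group, refine the next) can be sketched as a simple loop. This is a hedged illustration of the control flow only, not the authors' implementation: `zsad_score`, `train_refiner`, and `refine` are hypothetical callables standing in for the ZSAD backbone, ARD training, and ARD inference. Returning the coarse maps for the first group, where no trained refiner exists yet, is an assumption.

```python
from typing import Any, Callable, List, Optional, Sequence


def split_into_groups(items: Sequence[Any], group_size: int) -> List[List[Any]]:
    """Split the test stream into consecutive groups of at most group_size."""
    return [list(items[i:i + group_size]) for i in range(0, len(items), group_size)]


def progressive_groupwise_refine(
    images: Sequence[Any],
    group_size: int,
    zsad_score: Callable[[Any], Any],                        # hypothetical: coarse map
    train_refiner: Callable[[Optional[Any], list, list], Any],  # hypothetical: ARD training
    refine: Callable[[Any, Any, Any], Any],                  # hypothetical: ARD inference
) -> List[Any]:
    """Refine each group with a refiner trained on the previous groups."""
    refiner_state: Optional[Any] = None
    outputs: List[Any] = []
    for group in split_into_groups(images, group_size):
        coarse = [zsad_score(image) for image in group]
        if refiner_state is None:
            # Assumption: the first group has no trained refiner, so the
            # coarse ZSAD maps are passed through unchanged.
            outputs.extend(coarse)
        else:
            outputs.extend(
                refine(refiner_state, image, score)
                for image, score in zip(group, coarse)
            )
        # Train on the current group so the next group benefits from it.
        refiner_state = train_refiner(refiner_state, group, coarse)
    return outputs
```

Because the loop only touches the backbone through `zsad_score`, any ZSAD method that produces a score map can be dropped in, which matches the plug-in framing of the abstract.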

Paper Structure

This paper contains 22 sections, 4 equations, 10 figures, 14 tables.

Figures (10)

  • Figure 1: Comparison of anomaly maps generated by APRIL-GAN [chen2023april], VCP-CLIP [qu2024vcp], and MuSc [li2024musc]. (a) Pseudo-anomaly images, (b) Coarse anomaly maps from ZSAD methods, (c) Refinement using the DeSTseg [zhang2023destseg] decoder, (d) Refinement using the RealNet [zhang2024realnet] decoder, and (e) Refinement result from our ARD.
  • Figure 2: (a) CLIP-based methods operate on single test images. (b) Inter-image comparison methods leverage the entire test set for analysis. (c) Our AnoRefiner utilizes the current group to refine the anomaly localization in the next group.
  • Figure 3: Overview and intermediate results of our Anomaly Refinement Decoder (ARD). The ARD progressively refines image features using anomaly score maps through two Anomaly-aware Refinement (AR) Blocks and one Bidirectional Perception and Interaction (BI) Block. This process enables fine-grained anomaly localization without requiring real anomaly data.
  • Figure 4: Anomaly-Attention Module architecture and its intermediate feature visualizations.
  • Figure 5: Visualization of feature maps before and after Bidirectional Perception and Interaction operation.
  • ...and 5 more figures