HLGFA: High-Low Resolution Guided Feature Alignment for Unsupervised Anomaly Detection
Han Zhou, Yuxuan Gao, Yinchao Du, Xuezhe Zheng
TL;DR
HLGFA addresses unsupervised industrial anomaly detection by exploiting cross-resolution feature consistency between high- and low-resolution views. It uses a frozen backbone and a structure-detail decoupled guidance mechanism to refine low-resolution features under high-resolution supervision, so that cross-resolution misalignment becomes the anomaly cue. A noise-aware data augmentation strategy improves robustness to nuisance patterns found in industrial settings. On the MVTec AD benchmark, HLGFA reaches 97.9% pixel-level AUROC and 97.5% image-level AUROC and localizes defects robustly without reconstruction or memory banks, highlighting its practicality for real-world inspection and its scalability to diverse defect types.
Abstract
Unsupervised industrial anomaly detection (UAD) is essential for modern manufacturing inspection, where defect samples are scarce yet reliable detection is required. In this paper, we propose HLGFA, a high-low resolution guided feature alignment framework that learns normality by modeling cross-resolution feature consistency between high-resolution and low-resolution representations of normal samples, instead of relying on pixel-level reconstruction. Dual-resolution inputs are processed by a shared frozen backbone to extract multi-level features. The high-resolution representations are decomposed into structure and detail priors, which guide the refinement of low-resolution features through conditional modulation and gated residual correction. During inference, anomalies are identified as regions where cross-resolution alignment breaks down. In addition, a noise-aware data augmentation strategy is introduced to suppress nuisance-induced responses commonly observed in industrial environments. Extensive experiments on standard benchmarks demonstrate the effectiveness of HLGFA: it achieves 97.9% pixel-level AUROC and 97.5% image-level AUROC on the MVTec AD dataset, outperforming representative reconstruction-based and feature-based methods.
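
To make the inference step concrete, the sketch below shows one way a cross-resolution misalignment map could be computed once the two feature streams are available. It is not the authors' implementation. The abstract does not state the distance measure, the fusion rule, or the image-level score, so cosine distance, nearest-neighbour upsampling with averaging across levels, and max pooling are all assumptions, and the function names (`cosine_misalignment_map`, `fuse_levels`, `image_score`) are hypothetical. Backbone feature extraction and the guided refinement module are omitted; the inputs are assumed to be already spatially aligned feature maps of shape (C, H, W).

```python
import numpy as np


def cosine_misalignment_map(feat_high, feat_low, eps=1e-8):
    """Per-location cosine distance between two (C, H, W) feature maps.

    Assumption: cosine distance stands in for whatever alignment measure
    the method actually uses. Values lie in [0, 2]; 0 means aligned.
    """
    if feat_high.shape != feat_low.shape:
        raise ValueError("feature maps must share the same (C, H, W) shape")
    dot = (feat_high * feat_low).sum(axis=0)
    norms = np.linalg.norm(feat_high, axis=0) * np.linalg.norm(feat_low, axis=0)
    return 1.0 - dot / (norms + eps)


def upsample_nearest(score_map, out_hw):
    """Nearest-neighbour resize of an (H, W) map to out_hw."""
    h, w = score_map.shape
    rows = np.arange(out_hw[0]) * h // out_hw[0]
    cols = np.arange(out_hw[1]) * w // out_hw[1]
    return score_map[np.ix_(rows, cols)]


def fuse_levels(score_maps, out_hw):
    """Average per-level maps after resizing (assumed fusion rule)."""
    return np.mean([upsample_nearest(m, out_hw) for m in score_maps], axis=0)


def image_score(anomaly_map):
    """Image-level score as the maximum pixel score (assumed pooling)."""
    return float(anomaly_map.max())
```

With this sketch, identical feature maps give a map of zeros, and flipping the feature vector at a single location drives the score there to roughly 2, which mirrors the idea that anomalies show up where cross-resolution alignment breaks down.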
