Patch-aware Vector Quantized Codebook Learning for Unsupervised Visual Defect Detection
Qisen Cheng, Shuhui Qu, Janghwan Lee
TL;DR
The paper tackles unsupervised visual defect detection by learning a normality memory that encodes typical patterns into a discrete codebook of $K$ codes. PVQAE extends VQ-VAE with patch-aware dynamic code allocation across a resolution set $R$, governed by a Dynamic Routing Module and trained with a multi-loss objective that includes a progressive budget term. A Budget Prior Transformer learns the budget patterns typical of normal images, and these priors constrain reconstruction in potential defect regions. On MVTecAD, BTAD, and MTSD, PVQAE achieves state-of-the-art or competitive performance for image- and pixel-level defect detection, while avoiding excessive memory or computation compared with some baseline methods.
Abstract
Unsupervised visual defect detection is critical in industrial applications, requiring a representation space that captures normal data features while detecting deviations. Achieving a balance between expressiveness and compactness is challenging; an overly expressive space risks inefficiency and mode collapse, impairing detection accuracy. We propose a novel approach using an enhanced VQ-VAE framework optimized for unsupervised defect detection. Our model introduces a patch-aware dynamic code assignment scheme, enabling context-sensitive code allocation to optimize spatial representation. This strategy enhances normal-defect distinction and improves detection accuracy during inference. Experiments on the MVTecAD, BTAD, and MTSD datasets show our method achieves state-of-the-art performance.
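For readers unfamiliar with the vector quantization step that VQ-VAE-style models build on, the following is a minimal sketch of nearest-code lookup against a small codebook. It is generic VQ, not the paper's patch-aware allocation or routing. The names `squared_distance` and `quantize`, the toy 2-D codebook, and the use of quantization error as a deviation signal are all illustrative assumptions rather than details taken from the paper.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    features: Sequence[Vector], codebook: Sequence[Vector]
) -> Tuple[List[int], List[float]]:
    """Map each feature vector to the index of its nearest code.

    Returns the chosen code indices and the squared distance to each
    chosen code (the quantization error).
    """
    indices: List[int] = []
    errors: List[float] = []
    for feature in features:
        dists = [squared_distance(feature, code) for code in codebook]
        # Index of the smallest distance; ties resolve to the lowest index.
        k = min(range(len(codebook)), key=dists.__getitem__)
        indices.append(k)
        errors.append(dists[k])
    return indices, errors


# Hypothetical toy codebook with K = 3 codes in 2-D (assumption for illustration).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

# Two features close to existing codes and one far from all of them.
features = [(0.1, 0.0), (0.9, 0.1), (3.0, 2.0)]

indices, errors = quantize(features, codebook)
print(indices)
print(errors)
```

In this toy setting, the third feature lies far from every code, so its quantization error is much larger than the others. A codebook fitted only to normal data would be expected to quantize unseen patterns poorly in a similar way, which is the general intuition behind codebook-based normality memories.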
