ToCoAD: Two-Stage Contrastive Learning for Industrial Anomaly Detection
Yun Liang, Zhiguang Hu, Junjie Huang, Donglin Di, Anyang Su, Lei Fan
TL;DR
This work tackles the domain gap between pre-trained feature extractors and industrial anomaly data in unsupervised anomaly detection. It introduces ToCoAD, a two-stage training framework: a discriminative network is first trained on synthetic anomalies to coarsely locate defects, and the feature extractor is then jointly fine-tuned via bootstrap contrastive learning with negative feature guidance from that discriminative network, complemented by a memory-bank-based localization mechanism. Empirical results on MVTec AD, VisA, and BTAD demonstrate competitive pixel-level and image-level AUROC, with particular strength when using Perlin-noise–generated anomalies and a SimSiam-based contrastive loss. Ablation analyses confirm the importance of the two-stage design, the choice of anomaly generator, and the role of the coreset-based memory bank in robust industrial anomaly localization.
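To make the SimSiam-based contrastive loss mentioned above concrete, the following is a minimal NumPy sketch of the standard symmetric negative-cosine objective from SimSiam. It is an illustration only, not the ToCoAD loss: the negative feature guidance from the discriminative network is not modeled, and the names `negative_cosine`, `simsiam_loss`, `p1`, `z1`, `p2`, `z2` are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-8):
    # Normalize each row vector to (approximately) unit length.
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def negative_cosine(p, z):
    # p: predictor outputs, z: projector outputs, both shaped (batch, dim).
    # In an autodiff framework z would be detached (stop-gradient);
    # NumPy tracks no gradients, so that step is only noted here.
    p = l2_normalize(p)
    z = l2_normalize(z)
    return -np.mean(np.sum(p * z, axis=1))

def simsiam_loss(p1, z1, p2, z2):
    # Symmetric form: each view's prediction is matched to the other view's projection.
    return 0.5 * negative_cosine(p1, z2) + 0.5 * negative_cosine(p2, z1)

# Hypothetical usage with random features for two augmented views.
rng = np.random.default_rng(0)
z1, z2 = rng.normal(size=(4, 16)), rng.normal(size=(4, 16))
p1, p2 = rng.normal(size=(4, 16)), rng.normal(size=(4, 16))
loss = simsiam_loss(p1, z1, p2, z2)
```

The loss is bounded in [-1, 1] and reaches its minimum when each prediction is aligned with the other view's projection.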
Abstract
Current unsupervised anomaly detection approaches perform well on public datasets but struggle with specific anomaly types due to the domain gap between pre-trained feature extractors and target-specific domains. To tackle this issue, this paper presents a two-stage training strategy, called ToCoAD. In the first stage, a discriminative network is trained by using synthetic anomalies in a self-supervised learning manner. This network is then utilized in the second stage to provide a negative feature guide, aiding in the training of the feature extractor through bootstrap contrastive learning. This approach enables the model to progressively learn the distribution of anomalies specific to industrial datasets, effectively enhancing its generalizability to various types of anomalies. Extensive experiments are conducted to demonstrate the effectiveness of our proposed two-stage training strategy, and our model produces competitive performance, achieving pixel-level AUROC scores of 98.21%, 98.43% and 97.70% on MVTec AD, VisA and BTAD respectively.
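For readers unfamiliar with the pixel-level AUROC metric reported above, the sketch below shows one common way to compute it: all pixels from all test images are pooled, and AUROC is obtained from the rank-sum (Mann-Whitney U) statistic with average ranks for ties. This is an assumption about the evaluation protocol, not the authors' evaluation code, and the function name `pixel_auroc` and its arguments are hypothetical.

```python
import numpy as np

def pixel_auroc(score_maps, gt_masks):
    # score_maps: anomaly scores per pixel; gt_masks: 1 for anomalous pixels, 0 otherwise.
    scores = np.asarray(score_maps, dtype=float).ravel()
    labels = np.asarray(gt_masks).ravel().astype(bool)
    n_pos = int(labels.sum())
    n_neg = labels.size - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs both anomalous and normal pixels")

    # Average rank per distinct score value, so tied pixels share one rank.
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    ends = np.cumsum(counts)          # last 1-based rank in each tie group
    starts = ends - counts + 1        # first 1-based rank in each tie group
    ranks = ((starts + ends) / 2.0)[inverse]

    # Mann-Whitney U of the anomalous pixels, scaled to [0, 1].
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)

# Hypothetical 2x2 score map where anomalous pixels receive the highest scores.
example = pixel_auroc([[0.1, 0.9], [0.2, 0.8]], [[0, 1], [0, 1]])
```

A value of 1.0 means every anomalous pixel scores higher than every normal pixel, while 0.5 corresponds to scores that carry no ranking information.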
