Domain-independent detection of known anomalies
Jonas Bühler, Jonas Fehrenbach, Lucas Steinmann, Christian Nauck, Marios Koulakis
TL;DR
This work addresses industrial anomaly detection under the constraint of sparse, known anomaly types across unseen objects by introducing a domain-generalization-on-sparse-classes task. It creates three cross-domain datasets derived from MVTec AD (hole, cut, color) to benchmark performance and introduces two embedding-based methods, Labeled PatchCore and SEMLP, alongside strong baselines. SEMLP achieves the best average image-level AUROC (87.2%) and often outperforms MIRO and PatchCore, highlighting the effectiveness of per-embedding MLP classifiers over coreset-based distance scoring in this setting. The open datasets and proposed approaches offer a practical pathway for deploying robust anomaly detection in diverse industrial contexts, with future work focusing on threshold-free operation and deeper analysis of failure cases.
Abstract
One persistent obstacle in industrial quality inspection is the detection of anomalies. In real-world use cases, two problems must be addressed: anomalous data is sparse, and the same types of anomalies need to be detected on previously unseen objects. Current anomaly detection approaches can be trained with sparse nominal data, whereas domain generalization approaches enable detecting objects in previously unseen domains. Building on these two observations, we introduce the hybrid task of domain generalization on sparse classes. To provide an accompanying benchmark for this task, we modify the well-established MVTec AD dataset to generate three new datasets. In addition to benchmarking existing methods, we design two embedding-based approaches, Spatial Embedding MLP (SEMLP) and Labeled PatchCore. Overall, SEMLP achieves the best performance, with an average image-level AUROC of 87.2% versus 80.4% for MIRO. The new and openly available datasets allow for further research to improve industrial anomaly detection.
