The MVTec AD 2 Dataset: Advanced Scenarios for Unsupervised Anomaly Detection
Lars Heckler-Kram, Jan-Hendrik Neudeck, Ulla Scheler, Rebecca König, Carsten Steger
TL;DR
MVTec AD 2 introduces eight advanced 2D anomaly-detection scenarios totaling 8,004 high-resolution images to address saturation in existing benchmarks and to enable robustness testing under real-world lighting distribution shifts. The dataset provides defect-free training images and diverse test conditions, including lighting not seen during training. The pixel-precise ground truth of the test set is held on a public evaluation server to ensure fair comparisons. A benchmark of seven state-of-the-art methods shows significant room for improvement: threshold-independent AU-PRO$_{0.05}$ scores generally fall below 30-40%, and are lower still for boundary and small defects, despite occasional gains at larger image sizes. The work emphasizes robustness to distribution shifts and efficiency, and offers a standardized, transparent platform for fair performance assessment and future methodological advances in unsupervised industrial anomaly detection.
Abstract
In recent years, performance on existing anomaly detection benchmarks like MVTec AD and VisA has started to saturate in terms of segmentation AU-PRO, with state-of-the-art models often separated by less than one percentage point. This lack of discriminatory power prevents a meaningful comparison of models and thus hinders progress in the field, especially given the inherently stochastic nature of machine learning results. We present MVTec AD 2, a collection of eight anomaly detection scenarios with more than 8,000 high-resolution images. It comprises challenging and highly relevant industrial inspection use cases that have not been considered in previous datasets, including transparent and overlapping objects, dark-field and back-light illumination, objects with high variance in the normal data, and extremely small defects. We provide comprehensive evaluations of state-of-the-art methods and show that their performance remains below 60% average AU-PRO. Additionally, our dataset provides test scenarios with changed lighting conditions to assess the robustness of methods under real-world distribution shifts. We host a publicly accessible evaluation server that holds the pixel-precise ground truth of the test set (https://benchmark.mvtec.com/). All image data is available at https://www.mvtec.com/company/research/datasets/mvtec-ad-2.
