Fractals as Pre-training Datasets for Anomaly Detection and Localization
C. I. Ugwu, S. Casarin, O. Lanz
TL;DR
This work investigates synthetically generated fractal images as pre-training data for unsupervised anomaly detection (AD) and localization in industrial inspection. Eight AD methods are evaluated on MVTec and VisA using feature extractors pre-trained on fractals and kept frozen, without fine-tuning, and are compared against the same methods with ImageNet pre-trained extractors. The study finds that fractals can achieve competitive performance, particularly for memory-based methods and data-efficient tasks. However, ImageNet remains the overall leader, and localization (AUPRO) generally suffers under fractal pre-training, which points to future work on multi-instance training, data augmentation, and few-shot settings to close the gap. The findings suggest a privacy-preserving, data-efficient route to feature extractors for anomaly detection, in which synthetic datasets supplement or, in some cases, substitute for real data collection while maintaining strong detection capabilities.
Abstract
Anomaly detection is crucial in large-scale industrial manufacturing as it helps detect and localise defective parts. Pre-training feature extractors on large-scale datasets is a popular approach for this task. However, stringent data security and privacy regulations, together with high costs and long acquisition times, hinder the availability and creation of such large datasets. While recent work in anomaly detection primarily focuses on the development of new methods built on such extractors, the importance of the data used for pre-training has not been studied. Therefore, we evaluated the performance of eight state-of-the-art methods pre-trained using dynamically generated fractal images on the widely used benchmark datasets MVTec and VisA. In contrast to existing literature, which predominantly examines the transfer-learning capabilities of fractals, in this study we compare models pre-trained with fractal images against those pre-trained with ImageNet, without subsequent fine-tuning. Although pre-training with ImageNet remains a clear winner, the results of fractals are promising considering that the anomaly detection task requires features capable of discerning even minor visual variations. This opens up the possibility for a new research direction where feature extractors could be trained on synthetically generated abstract datasets, meeting the ever-increasing demand for data in machine learning while circumventing privacy and security concerns.
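
To give a feel for what "dynamically generated fractal images" can mean in practice, the following is a minimal sketch that renders a binary fractal from a randomly sampled iterated function system (IFS) using the chaos game. The abstract does not specify the generator, so the IFS formulation, the function names `sample_ifs` and `render_fractal`, and all parameter values here are assumptions made for illustration, not the authors' pipeline.

```python
import numpy as np

def sample_ifs(rng, n_maps=4, max_scale=0.8):
    """Sample random affine maps x -> A @ x + b (hypothetical helper).

    Each matrix is rescaled so its largest singular value is at most
    max_scale (< 1), which keeps every map contractive.
    """
    maps = []
    for _ in range(n_maps):
        A = rng.uniform(-1.0, 1.0, size=(2, 2))
        spectral_norm = np.linalg.norm(A, 2)
        A = A * (max_scale / max(spectral_norm, max_scale))
        b = rng.uniform(-1.0, 1.0, size=2)
        maps.append((A, b))
    return maps

def render_fractal(maps, rng, n_points=20_000, size=128, burn_in=20):
    """Render the IFS attractor as a binary image via the chaos game."""
    x = np.zeros(2)
    pts = np.empty((n_points, 2))
    choices = rng.integers(len(maps), size=n_points + burn_in)
    for i, k in enumerate(choices):
        A, b = maps[k]
        x = A @ x + b                  # apply one randomly chosen map
        if i >= burn_in:               # skip the first few iterates
            pts[i - burn_in] = x
    # Rescale the visited points to the pixel grid.
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    span = np.maximum(hi - lo, 1e-12)  # guard against a degenerate extent
    ij = ((pts - lo) / span * (size - 1)).astype(int)
    img = np.zeros((size, size), dtype=np.uint8)
    img[ij[:, 1], ij[:, 0]] = 255
    return img

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = render_fractal(sample_ifs(rng), rng)
    print(image.shape, int((image > 0).sum()), "pixels set")
```

Because each image is fully determined by a handful of affine parameters, a data loader could sample a fresh system per class or per batch instead of storing images on disk, which is one plausible reading of "dynamically generated".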
