ASBench: Image Anomalies Synthesis Benchmark for Anomaly Detection

Qunyi Zhang, Songan Zhang, Jinbao Wang, Xiaoning Lei, Guoyang Xie, Guannan Jiang, Zhichao Lu

TL;DR

ASBench tackles the data-scarcity problem in industrial anomaly detection by introducing a comprehensive benchmark dedicated to anomaly synthesis. It decouples synthesis from detection and evaluates methods across five industrial datasets, four detection pipelines, and twelve synthesis strategies using four evaluation dimensions, including cross-dataset generalization and synthesis-detection interactions. The study finds no single universally superior synthesis method, reveals non-linear effects of synthetic data ratios, and shows weak correlations between conventional image-quality metrics and detection performance, while demonstrating notable gains from hybrid synthesis approaches. These findings guide future development toward adaptable, diverse, and jointly optimized anomaly-synthesis methods, with standardized evaluation practices to enable robust cross-domain deployment.

Abstract

Anomaly detection plays a pivotal role in manufacturing quality control, yet its application is constrained by limited abnormal samples and high manual annotation costs. While anomaly synthesis offers a promising solution, existing studies predominantly treat it as an auxiliary component within anomaly detection frameworks and lack a systematic evaluation of anomaly synthesis algorithms. Current research also overlooks crucial factors specific to anomaly synthesis, such as decoupling its impact from detection, quantitative analysis of synthetic data, and adaptability across different scenarios. To address these limitations, we propose ASBench, the first comprehensive benchmarking framework dedicated to evaluating anomaly synthesis methods. Our framework introduces four critical evaluation dimensions: (i) the generalization performance across different datasets and pipelines, (ii) the ratio of synthetic to real data, (iii) the correlation between intrinsic metrics of synthetic images and anomaly detection performance metrics, and (iv) strategies for hybrid anomaly synthesis methods. Through extensive experiments, ASBench not only reveals limitations in current anomaly synthesis methods but also provides actionable insights for future research directions in anomaly synthesis.
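Evaluation dimension (iii) asks how strongly an intrinsic image metric tracks detection performance. A minimal sketch of such a check is given below, assuming a Pearson correlation computed over one score pair per synthesis method. The function name `pearson` and the choice of statistic are illustrative assumptions, not the statistic the paper necessarily uses.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length score lists.

    xs: hypothetical intrinsic image-quality scores, one per synthesis method.
    ys: hypothetical detection scores (e.g. image-level AUROC), same order.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, left unnormalized because the factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

A value near zero would indicate that the image metric says little about downstream detection quality, while values near +1 or -1 would indicate a strong linear relationship.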

Paper Structure

This paper contains 29 sections, 2 equations, 13 figures, 11 tables.

Figures (13)

  • Figure 1: The overall workflow of ASBench. We disentangle the different stages of the anomaly synthesis and anomaly detection pipeline, and discuss the impact of the variables at each stage on anomaly detection in separate subsections of Section IV.
  • Figure 2: Performance comparison of different anomaly synthesis methods across various detection pipelines and all datasets. For datasets, results are computed via weighted averaging, where weights correspond to the number of subclasses per dataset. Axes represent the anomaly synthesis methods, lines correspond to the detection pipelines, and vertices quantify the performance of these synthesis methods across different detection pipelines.
  • Figure 3: Image-level AUROC performance comparison of different anomaly synthesis methods across various detection pipelines and different datasets.
  • Figure 4: Pixel-level AUPR performance comparison of different anomaly synthesis methods across various detection pipelines and different datasets.
  • Figure 5: Heatmaps of image-level AUROC and pixel-level AUPR performance for anomaly synthesis methods. Panels (a) and (b) show performance across different datasets. Panels (c) and (d) show performance across different detection pipelines. AnomalyDiffusion, DRAEM, and DestSeg show a clear advantage over the other synthesis methods.
  • ...and 8 more figures
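
The weighted averaging described in the Figure 2 caption, where each dataset's weight is its number of subclasses, can be sketched as follows. The function name, dataset names, scores, and subclass counts are hypothetical placeholders chosen for illustration; they are not values from the paper.

```python
def weighted_average(scores, subclass_counts):
    """Average per-dataset scores, weighting each by its subclass count.

    scores: dataset name -> metric value (e.g. image-level AUROC).
    subclass_counts: dataset name -> number of subclasses in that dataset.
    """
    total_weight = sum(subclass_counts[name] for name in scores)
    if total_weight == 0:
        raise ValueError("total weight must be positive")
    weighted_sum = sum(scores[name] * subclass_counts[name] for name in scores)
    return weighted_sum / total_weight


# Hypothetical inputs: two datasets with 15 and 5 subclasses.
example_scores = {"dataset_a": 0.90, "dataset_b": 0.80}
example_counts = {"dataset_a": 15, "dataset_b": 5}
example_result = weighted_average(example_scores, example_counts)
```

With these placeholder inputs the larger dataset dominates, so the result lies closer to 0.90 than the unweighted mean of 0.85 does.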