No Label Left Behind: A Unified Surface Defect Detection Model for all Supervision Regimes

Blaž Rolih, Matic Fučka, Danijel Skočaj

TL;DR

SuperSimpleNet addresses the need for robust surface defect detection across unsupervised, weakly supervised, mixed, and fully supervised regimes. It combines latent-space synthetic anomaly generation, a lightweight yet discriminative classification head, and a Segmentation-Detection dual-branch module to enable efficient training and inference across all data annotation scenarios. The approach achieves state-of-the-art or competitive results on four challenging benchmarks (SensumSODF, KSDD2, MVTec AD, VisA) with inference times under 10 ms, demonstrating strong practical potential for industrial deployment. By unifying diverse supervision paradigms, the method reduces labeling effort while maintaining high accuracy, making it well-suited for real-world manufacturing challenges.

Abstract

Surface defect detection is a critical task across numerous industries, aimed at efficiently identifying and localising imperfections or irregularities on manufactured components. While numerous methods have been proposed, many fail to meet industrial demands for high performance, efficiency, and adaptability. Existing approaches are often constrained to specific supervision scenarios and struggle to adapt to the diverse data annotations encountered in real-world manufacturing processes, such as unsupervised, weakly supervised, mixed supervision, and fully supervised settings. To address these challenges, we propose SuperSimpleNet, a highly efficient and adaptable discriminative model built on the foundation of SimpleNet. SuperSimpleNet incorporates a novel synthetic anomaly generation process, an enhanced classification head, and an improved learning procedure, enabling efficient training in all four supervision scenarios, making it the first model capable of fully leveraging all available data annotations. SuperSimpleNet sets a new standard for performance across all scenarios, as demonstrated by its results on four challenging benchmark datasets. Beyond accuracy, it is very fast, achieving an inference time below 10 ms. With its ability to unify diverse supervision paradigms while maintaining outstanding speed and reliability, SuperSimpleNet represents a promising step forward in addressing real-world manufacturing challenges and bridging the gap between academic research and industrial applications. Code: https://github.com/blaz-r/SuperSimpleNet

Paper Structure

This paper contains 22 sections, 9 equations, 13 figures, 14 tables.

Figures (13)

  • Figure 1: Different supervision scenarios within manufacturing processes are illustrated. Images with a green border contain no anomalies, while images with a red border indicate the presence of an anomaly. For some images, the corresponding anomaly segmentation mask is also provided. The labelling effort required increases progressively from left to right. At present, only SuperSimpleNet supports training across all four scenarios.
  • Figure 2: SuperSimpleNet's architecture. Features are first extracted, upscaled, and, in the case of the segmentation branch, also adapted. During training, synthetic anomalies are generated in the latent space and are limited to regions defined by the binarised Perlin mask. The segmentation head predicts an anomaly mask $\mathrm{M}_o$ from the perturbed adapted feature map $\mathcal{PA}$; the classification head then combines $\mathrm{M}_o$ with the perturbed feature map $\mathcal{PF}$ to produce the anomaly score $s$. The predicted map $\mathrm{M}_o$ and the anomaly score $s$ are supervised by the anomaly mask $\mathrm{M}$ and the ground truth anomaly score $y$, respectively, where $y$ is set to 1 if the image contains an anomaly (synthetic or real) and to 0 otherwise. During inference, $\mathrm{M}_o$ and $s$ are produced directly, skipping the anomaly generation phase. The remaining parts of the original SimpleNet are also shown in the image, in green.
  • Figure 3: Synthetic anomaly generation. Synthetic anomaly masks $\mathrm{M}_{synth}$ are generated by removing actual anomalous regions (captured by the ground truth mask $\mathrm{M}_{gt}$) from the Perlin anomaly mask $\mathrm{M}_p$ (obtained by thresholding Perlin noise (Perlin, 1985)). $\mathrm{M}_{synth}$ is then used to restrict the Gaussian noise to specific regions, producing the final noise $\epsilon$, which is later added to the features to create synthetic anomalies. The final anomaly mask $\mathrm{M}$ is constructed from $\mathrm{M}_{synth}$ and $\mathrm{M}_{gt}$ and indicates regions with synthetic and actual anomalies. Since $\mathrm{M}_{gt}$ is empty in the case of weakly supervised and unsupervised learning, $\mathrm{M}_p$ directly becomes $\mathrm{M}_{synth}$ and the final mask $\mathrm{M}$.
  • Figure 4: Detailed architecture of the segmentation-detection module. The design preserves the segmentation head from SimpleNet while introducing a new classification head with a wider kernel. This design allows for better contextual understanding, improving anomaly detection capabilities.
  • Figure 5: Qualitative comparison of anomaly maps produced in a fully supervised setting on SensumSODF and KSDD2. The first two rows display SensumSODF samples (capsule and softgel), while the last row shows KSDD2 examples. Each sample includes the input image, ground truth, and overlaid anomaly maps for each model. The anomaly scores are displayed in the top-right corner of each anomaly map.
  • ...and 8 more figures
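
The mask construction described in the Figure 3 caption can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the threshold of 0.5, the noise standard deviation, and the union rule used to build $\mathrm{M}$ are all assumptions, and a blocky random field stands in for true Perlin noise to keep the example self-contained.

```python
import numpy as np


def stand_in_noise_field(h, w, rng, cell=8):
    """Blocky random field in [0, 1). A stand-in for Perlin noise (assumption).

    h and w must be multiples of `cell`.
    """
    coarse = rng.random((h // cell, w // cell))
    # Repeat every coarse value over a cell x cell block.
    return np.kron(coarse, np.ones((cell, cell)))


def make_synthetic_anomaly(features, noise_field, gt_mask,
                           threshold=0.5, sigma=1.0, rng=None):
    """Perturb a (C, H, W) feature map inside a synthetic anomaly mask.

    noise_field: (H, W) smooth noise, thresholded to obtain M_p.
    gt_mask:     (H, W) ground truth mask M_gt (all zeros when no mask exists).
    Returns the perturbed features, M_synth, the final mask M, and the label y.
    """
    rng = np.random.default_rng() if rng is None else rng
    m_p = (noise_field > threshold).astype(np.float32)   # binarised mask M_p
    m_gt = (gt_mask > 0).astype(np.float32)
    m_synth = m_p * (1.0 - m_gt)                          # drop real anomaly regions
    # Gaussian noise restricted to M_synth; the mask broadcasts over channels.
    eps = rng.normal(0.0, sigma, size=features.shape).astype(np.float32) * m_synth
    perturbed = features + eps
    m = np.clip(m_synth + m_gt, 0.0, 1.0)                 # synthetic and real regions
    y = float(m.max() > 0)                                # 1 if any anomaly is present
    return perturbed, m_synth, m, y


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(16, 32, 32)).astype(np.float32)
    field = stand_in_noise_field(32, 32, rng)
    no_gt = np.zeros((32, 32), dtype=np.float32)          # unsupervised / weak case
    perturbed, m_synth, m, y = make_synthetic_anomaly(feats, field, no_gt, rng=rng)
    print(perturbed.shape, int(m_synth.sum()), y)
```

With an empty ground truth mask, as in the unsupervised and weakly supervised cases, `m_synth` and `m` both reduce to the thresholded mask, matching the last sentence of the caption. Features outside `m_synth` are left unchanged, so real anomalous regions are never overwritten by synthetic noise.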