DDPM-MoCo: Advancing Industrial Surface Defect Generation and Detection with Generative and Contrastive Learning

Yangfan He, Xinyan Wang, Tianyu Shi

TL;DR

DDPM-MoCo tackles data scarcity in industrial surface defect detection by generating high-quality synthetic defects with a denoising diffusion probabilistic model and leveraging unlabeled data through Momentum Contrast. It introduces a dataset-level contrastive loss to strengthen representation learning and demonstrates substantial improvements in defect detection AP on aluminum plate defects, while preserving image realism (low FID, high IS) in generated data. The framework offers a data-efficient, scalable approach for practical metal-processing visual inspection, combining diffusion-based augmentation with unsupervised contrastive learning and targeted supervised fine-tuning. This work holds promise for broader adoption in industrial quality control where labeled defect data are scarce.

Abstract

Industrial detection based on deep learning typically has to solve two problems: (1) obtaining sufficient and effective data samples, and (2) training models efficiently and conveniently. In this paper, we introduce a novel defect-generation method, named DDPM-MoCo, to address these issues. First, we use the Denoising Diffusion Probabilistic Model (DDPM) to generate high-quality defect data samples, overcoming the problem of insufficient sample data for model learning. Second, we use the unsupervised Momentum Contrast model (MoCo) with an enhanced batch contrastive loss function to train the model on unlabeled data, addressing the efficiency and consistency challenges of large-scale negative sample encoding during diffusion model training. The experimental results demonstrate an enhanced visual detection method for identifying defects on metal surfaces that covers the entire process, from generating unlabeled sample data for training the diffusion model to using the same sample data, once labeled, for downstream detection tasks. This study offers valuable practical insights and application potential for visual detection in the metal processing industry.
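To make the contrastive component concrete, the following is a minimal sketch of the standard MoCo ingredients: the InfoNCE loss over one positive key and a set of negative keys, and the momentum update of the key encoder. It does not reproduce the enhanced batch contrastive loss proposed in the paper. The function names, the temperature of 0.07, and the momentum of 0.999 are assumptions taken from common MoCo defaults, and the vectors are toy values.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def l2_normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]

def info_nce(query, positive_key, negative_keys, temperature=0.07):
    """Standard InfoNCE loss: -log softmax of the positive logit.

    Assumption: cosine similarity on L2-normalized embeddings, as in MoCo.
    """
    q = l2_normalize(query)
    logits = [dot(q, l2_normalize(positive_key)) / temperature]
    logits += [dot(q, l2_normalize(k)) / temperature for k in negative_keys]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]

def momentum_update(key_params, query_params, m=0.999):
    """Key encoder follows the query encoder as an exponential moving average."""
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]

# Toy embeddings (hypothetical values, not from the paper).
query = [1.0, 0.0]
negatives = [[0.0, 1.0], [-1.0, 0.0]]
loss_aligned = info_nce(query, [0.9, 0.1], negatives)
loss_misaligned = info_nce(query, [0.0, 1.0], negatives)
```

In this toy setup the loss is small when the positive key points in nearly the same direction as the query and larger when it does not, which is the behavior the contrastive objective relies on.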

Paper Structure (19 sections, 8 equations, 9 figures, 2 tables)

Figures (9)

  • Figure 1: Examples of the three most common defect types on metal surfaces: (a) an aluminum plate with corrosion, (b) a self-made aluminum plate with dents, and (c) a self-made aluminum plate with scratches.
  • Figure 2: DDPM training process. We train diffusion models to generate new images by gradually adding Gaussian noise to simulate damage and create noise maps. Specifically, we feed images with noise at time $t$ into the UNet to predict the noise generated at time $t-1$, and update the UNet parameters based on the differences in Gaussian noise between time $t-1$ and $t$.
  • Figure 3: Data augmentation. We used four diffusion models to learn noise patterns across four image categories (dent, corrosion, scratch, smooth), utilizing limited samples of each defect type obtained from initially augmented images through horizontal flipping and edge padding for few-shot learning. Adjusting seed values, we generated a large quantity of random noise images and fed them into the four trained models to produce additional defect images across various categories.
  • Figure 4: Original images and images generated by the diffusion model of the (d) dented defect, (e) scratch defect, and (f) corrosion defect.
  • Figure 5: PR curve overview under the original loss for Vitb-16.
  • ...and 4 more figures
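
The noising step described in the Figure 2 caption can be sketched as follows. This is a minimal illustration of the standard DDPM forward process and noise-prediction loss, not the authors' implementation: the linear schedule, its endpoints (1e-4 to 0.02), the 1000 steps, and all function names are assumptions, and a zero predictor stands in for the UNet. The seeded generator mirrors the idea in Figure 3 that different seed values yield different random noise.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (values are assumptions)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    out, running = [], 1.0
    for b in betas:
        running *= 1.0 - b
        out.append(running)
    return out

def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t from q(x_t | x_0) in closed form; returns (x_t, eps)."""
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return x_t, eps

def noise_mse(eps_true, eps_pred):
    """Mean squared error between the true and predicted noise."""
    return sum((a - b) ** 2 for a, b in zip(eps_true, eps_pred)) / len(eps_true)

rng = random.Random(0)  # changing the seed changes the sampled noise
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
x0 = [0.5, -0.25, 0.0, 1.0]  # toy "image": four pixels scaled to [-1, 1]
x_t, eps = add_noise(x0, t=500, alpha_bars=alpha_bars, rng=rng)
# A real model would be a UNet taking (x_t, t); a zero predictor stands in here.
loss = noise_mse(eps, [0.0] * len(eps))
```

Because `alpha_bars` shrinks toward zero as `t` grows, late timesteps are almost pure Gaussian noise, which is why sampling can start from random noise images and denoise step by step.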