DDPM-MoCo: Advancing Industrial Surface Defect Generation and Detection with Generative and Contrastive Learning
Yangfan He, Xinyan Wang, Tianyu Shi
TL;DR
DDPM-MoCo tackles data scarcity in industrial surface defect detection by generating high-quality synthetic defects with a denoising diffusion probabilistic model and leveraging unlabeled data through Momentum Contrast. It introduces a dataset-level contrastive loss to strengthen representation learning and demonstrates substantial improvements in defect detection average precision (AP) on aluminum plate defects, while preserving the realism of the generated images (low Fréchet Inception Distance, FID; high Inception Score, IS). The framework offers a data-efficient, scalable approach for practical metal-processing visual inspection, combining diffusion-based augmentation with unsupervised contrastive learning and targeted supervised fine-tuning. This work holds promise for broader adoption in industrial quality control where labeled defect data are scarce.
Abstract
Industrial detection based on deep learning typically has to solve two problems: (1) obtaining sufficient and effective data samples, and (2) using efficient and convenient model training methods. In this paper, we introduce a novel defect-generation method, named DDPM-MoCo, to address these issues. First, we use the Denoising Diffusion Probabilistic Model (DDPM) to generate high-quality defect samples, overcoming the shortage of sample data for model learning. Second, we use the unsupervised Momentum Contrast (MoCo) model with an enhanced batch contrastive loss function to train on unlabeled data, addressing the efficiency and consistency challenges of encoding large numbers of negative samples during diffusion model training. The experimental results demonstrate an enhanced visual detection method for identifying defects on metal surfaces that covers the entire process, from generating unlabeled sample data for training the diffusion model to using the same labeled sample data for downstream detection tasks. This study offers practical insights and application potential for visual detection in the metal processing industry.
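To make the contrastive component concrete, the following is a minimal sketch of the two ingredients of standard MoCo: the momentum (exponential moving average) update of the key encoder and the InfoNCE loss for a single query. It is not the enhanced batch contrastive loss proposed in this paper, whose form is not given in the abstract. The function names, the momentum value 0.999, and the temperature 0.07 are illustrative assumptions, and plain Python lists stand in for encoder parameters and feature vectors.

```python
import math


def momentum_update(key_params, query_params, m=0.999):
    """MoCo-style EMA update: key <- m * key + (1 - m) * query.

    The parameters are flat lists of floats here (an assumption made
    for illustration; real encoders hold tensors).
    """
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def info_nce(query, positive_key, negative_keys, temperature=0.07):
    """InfoNCE loss for one query.

    Equivalent to cross-entropy over the logits [positive, negatives...]
    with the positive at index 0.
    """
    q = l2_normalize(query)
    logits = [dot(q, l2_normalize(positive_key)) / temperature]
    logits += [dot(q, l2_normalize(n)) / temperature for n in negative_keys]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    mx = max(logits)
    log_denominator = mx + math.log(sum(math.exp(l - mx) for l in logits))
    return log_denominator - logits[0]


if __name__ == "__main__":
    # A query aligned with its positive key yields a much smaller loss
    # than a query aligned with a negative key.
    aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
    misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
    print(aligned < misaligned)
```

In MoCo the negative keys come from a queue of previously encoded samples, and the slowly updated key encoder keeps those queued keys consistent with one another. This is the large-scale negative sample encoding that the abstract refers to.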
