Defect Image Sample Generation With Diffusion Prior for Steel Surface Defect Recognition
Yichun Tai, Kun Yang, Tao Peng, Zhenzhen Huang, Zhijiang Zhang
TL;DR
This work tackles the data scarcity problem in steel surface defect recognition by leveraging a diffusion prior via StableSDG. It introduces a two‑stage adaptation pipeline that aligns Stable Diffusion to defect distributions through token embedding optimization and low-rank network adaptation, followed by image‑oriented data generation that starts from partially perturbed real samples. The approach is guided by iterative quality evaluation using FID to produce high‑fidelity defect images, which are then used to train robust defect classifiers. Empirical results on NEU and CCBSD show superior generation fidelity and notable improvements in recognition accuracy compared to multiple baselines, highlighting the practical impact of diffusion priors for industrial data expansion.
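The low-rank network adaptation mentioned above can be illustrated with a minimal sketch. The summary does not give the exact formulation, so this assumes the common LoRA-style update, in which a frozen weight matrix W is augmented by a trainable product B·A of rank r. All names here (`matmul`, `lora_weight`, `rank`, `scale`) are hypothetical and chosen only for illustration.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lora_weight(w, a, b, scale=1.0):
    """Return W + scale * (B @ A). W is read but never modified (frozen)."""
    delta = matmul(b, a)
    return [[wij + scale * dij for wij, dij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]

rng = random.Random(0)
d_out, d_in, rank = 8, 8, 2  # toy sizes; real layers are far larger

# Frozen pretrained weight (random stand-in for a diffusion-model layer).
w = [[rng.gauss(0.0, 1.0) for _ in range(d_in)] for _ in range(d_out)]
# Trainable factors: A is rank x d_in, B is d_out x rank.
# Assumption: B starts at zero so the adapted layer initially equals W.
a = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
b = [[0.0] * rank for _ in range(d_out)]

w_adapted = lora_weight(w, a, b)

trainable_params = rank * (d_in + d_out)  # parameters in A and B
full_params = d_in * d_out                # parameters in W
```

Under this assumed scheme only A and B are updated on the defect images, so the number of trained parameters grows with the rank rather than with the full layer size.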
Abstract
Steel surface defect recognition is an industrial problem of great practical value. Data insufficiency is the major challenge in training a robust defect recognition network. Existing methods enlarge the dataset by generating samples with generative models. However, their generation quality is itself limited by the scarcity of defect image samples. To this end, we propose Stable Surface Defect Generation (StableSDG), which transfers the vast generation distribution embedded in the Stable Diffusion model to steel surface defect image generation. To address the distinct distribution gap between steel surface images and the images generated by the diffusion model, we propose two processes. First, we align the distributions by adapting the parameters of the diffusion model, in both the token embedding space and the network parameter space. Second, in the generation process, we propose image-oriented generation, which starts from real images rather than from pure Gaussian noise. We conduct extensive experiments on steel surface defect datasets, demonstrating state-of-the-art performance in generating high-quality samples and in training recognition models, and showing that both proposed processes contribute significantly to performance.
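The difference between image-oriented generation and generation from pure Gaussian noise can be sketched with the standard forward diffusion step, x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε. This is a minimal illustration, not the paper's implementation: it assumes a DDPM-style linear beta schedule with 1000 steps, works on a flat list of pixel values in [-1, 1] rather than on Stable Diffusion latents, and the names `alpha_bar`, `perturb`, and the chosen intermediate step are hypothetical.

```python
import math
import random

def alpha_bar(t, num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_s) over the first t steps
    of a linear beta schedule (assumed, not taken from the paper)."""
    prod = 1.0
    for s in range(t):
        beta = beta_start + (beta_end - beta_start) * s / (num_steps - 1)
        prod *= 1.0 - beta
    return prod

def perturb(x0, t, rng, num_steps=1000):
    """Sample x_t from q(x_t | x_0): scale the image and add Gaussian noise."""
    ab = alpha_bar(t, num_steps)
    signal, noise = math.sqrt(ab), math.sqrt(1.0 - ab)
    return [signal * x + noise * rng.gauss(0.0, 1.0) for x in x0]

rng = random.Random(0)
x0 = [rng.uniform(-1.0, 1.0) for _ in range(16)]  # toy stand-in for a real defect image

# Image-oriented start: stop the forward process at an intermediate step,
# so part of the real image's structure survives in x_t.
x_partial = perturb(x0, t=400, rng=rng)

# Conventional start: at the final step almost no signal from x0 remains,
# which is close to sampling pure Gaussian noise.
x_noise = perturb(x0, t=1000, rng=rng)
```

In this sketch, denoising would then begin from `x_partial` at the intermediate step instead of from `x_noise` at the final step; the choice of intermediate step controls how much of the real sample is retained.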
