DefectFill: Realistic Defect Generation with Inpainting Diffusion Model for Visual Inspection
Jaewoo Song, Daemin Park, Kanghyun Baek, Sangyub Lee, Jooyoung Choi, Eunji Kim, Sungroh Yoon
TL;DR
DefectFill addresses the scarcity of defect data in visual inspection by fine-tuning an inpainting diffusion model to learn realistic defect concepts from a few reference image-mask pairs. It combines three defect-focused loss terms ($\mathcal{L}_{def}$, $\mathcal{L}_{obj}$, $\mathcal{L}_{attn}$) into a single DefectFill objective for precise, context-aware defect synthesis, and a Low-Fidelity Selection step then selects the higher-quality samples among those generated. On the MVTec AD dataset, the method reports state-of-the-art generation quality (measured by KID and IC-LPIPS), and models trained on the synthesized defects improve on downstream anomaly classification and localization. The approach shows strong realism and transferability, which suits industrial settings where defect data are scarce, though global defects remain challenging.
Abstract
Developing effective visual inspection models remains challenging due to the scarcity of defect data. While image generation models have been used to synthesize defect images, producing highly realistic defects remains difficult. We propose DefectFill, a novel method for realistic defect generation that requires only a few reference defect images. It leverages a fine-tuned inpainting diffusion model, optimized with our custom loss functions incorporating defect, object, and attention terms. It enables precise capture of detailed, localized defect features and their seamless integration into defect-free objects. Additionally, our Low-Fidelity Selection method further enhances the defect sample quality. Experiments show that DefectFill generates high-quality defect images, enabling visual inspection models to achieve state-of-the-art performance on the MVTec AD dataset.
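The abstract describes a loss that incorporates defect, object, and attention terms, but this summary does not say how the terms are combined. The sketch below assumes a plain weighted sum, which is a common way to build such an objective; the `LossWeights` class, the `combined_objective` function, and the default weight values are hypothetical names and numbers chosen for illustration, not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LossWeights:
    """Hypothetical weights; the summary does not state the values used."""
    defect: float = 1.0
    obj: float = 1.0
    attention: float = 1.0


def combined_objective(l_def: float, l_obj: float, l_attn: float,
                       weights: LossWeights = LossWeights()) -> float:
    """Combine the three loss terms, ASSUMING a weighted sum.

    l_def, l_obj, l_attn stand in for already-computed scalar values of
    the defect, object, and attention terms for one training batch.
    """
    return (weights.defect * l_def
            + weights.obj * l_obj
            + weights.attention * l_attn)


if __name__ == "__main__":
    # Placeholder loss values, for demonstration only.
    print(combined_objective(0.5, 0.25, 0.125))
```

In an actual training loop the three inputs would be tensors produced by the fine-tuned inpainting model rather than plain floats; the weighting logic would stay the same under this assumption.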
