Diffusion-based image inpainting with internal learning
Nicolas Cherel, Andrés Almansa, Yann Gousseau, Alasdair Newson
TL;DR
This paper introduces lightweight diffusion-based inpainting models trained on a single image or a few images (internal learning) to overcome the high computational cost of traditional diffusion approaches. By conditioning the reverse diffusion on observed regions and masks, and using a compact UNet that predicts $x_0$, the method achieves competitive realism across textures, line drawings, and SVBRDF with dramatically reduced training and inference time. Key contributions include a detailed framework for patch- and single-image training, a 160k-parameter architecture without attention, and strong empirical results showing state-of-the-art realism in constrained modalities with far lower resource requirements. The practical impact lies in enabling fast, interactive, modality-specific inpainting when large external datasets are unavailable or impractical to use.
Abstract
Diffusion models are now the undisputed state of the art for image generation and image restoration. However, they require large amounts of computational power for training and inference. In this paper, we propose lightweight diffusion models for image inpainting that can be trained on a single image or a few images. We show that our approach competes with large state-of-the-art models in specific cases. We also show that training a model on a single image is particularly relevant for image acquisition modalities that differ from the RGB images of standard learning databases. We show results in three different contexts: texture images, line drawing images, and material BRDFs, for which we achieve state-of-the-art results in terms of realism, with a computational load that is greatly reduced compared to competing methods.
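To make the sampling procedure summarized above more concrete, the following is a minimal sketch of reverse diffusion conditioned on the observed region and the mask, with a network that predicts $x_0$. It is an illustration under stated assumptions, not the authors' implementation: the linear noise schedule, the number of steps, the mask convention (1 = missing pixel), the final compositing of known pixels, and the function names (`make_schedule`, `toy_denoiser`, `reverse_step`, `inpaint`) are all hypothetical. In particular, `toy_denoiser` is only a stand-in for the compact UNet, and the update rule is the standard DDPM posterior written in terms of the predicted $x_0$.

```python
import numpy as np

def make_schedule(num_steps=50, beta_min=1e-4, beta_max=0.02):
    # Linear beta schedule (an assumption; the source does not specify one).
    betas = np.linspace(beta_min, beta_max, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars

def toy_denoiser(x_t, observed, mask, t):
    # Hypothetical stand-in for the compact UNet: returns an estimate of x_0
    # given the noisy image, the observed (masked) image, and the mask.
    # Here it ignores x_t and t and fills the hole with the mean known pixel.
    fill = observed[mask == 0].mean()
    return np.where(mask == 1, fill, observed)

def reverse_step(x_t, x0_hat, t, betas, alphas, alpha_bars, rng):
    # Standard DDPM posterior q(x_{t-1} | x_t, x_0) with x_0 replaced by x0_hat.
    abar_t = alpha_bars[t]
    abar_prev = alpha_bars[t - 1] if t > 0 else 1.0
    coef_x0 = np.sqrt(abar_prev) * betas[t] / (1.0 - abar_t)
    coef_xt = np.sqrt(alphas[t]) * (1.0 - abar_prev) / (1.0 - abar_t)
    mean = coef_x0 * x0_hat + coef_xt * x_t
    if t == 0:
        return mean  # no noise is added at the last step
    var = betas[t] * (1.0 - abar_prev) / (1.0 - abar_t)
    return mean + np.sqrt(var) * rng.standard_normal(x_t.shape)

def inpaint(image, mask, denoiser=toy_denoiser, num_steps=50, seed=0):
    # mask == 1 marks missing pixels, mask == 0 marks observed pixels.
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    observed = image * (1.0 - mask)
    x_t = rng.standard_normal(image.shape)  # start from pure Gaussian noise
    for t in reversed(range(num_steps)):
        x0_hat = denoiser(x_t, observed, mask, t)
        x_t = reverse_step(x_t, x0_hat, t, betas, alphas, alpha_bars, rng)
    # Keep the known pixels and use the generated content only in the hole
    # (this compositing step is an assumption of the sketch).
    return mask * x_t + (1.0 - mask) * observed

if __name__ == "__main__":
    image = np.linspace(0.0, 1.0, 64).reshape(8, 8)
    mask = np.zeros((8, 8))
    mask[2:5, 2:5] = 1.0
    result = inpaint(image, mask)
    print(result.shape)
```

In an actual internal-learning setup, `toy_denoiser` would be replaced by a network trained on the single image (or the few images) at hand; the sampling loop itself would stay the same.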
