Fill in the ____ (a Diffusion-based Image Inpainting Pipeline)

Eyoel Gebre, Krishna Saxena, Timothy Tran

TL;DR

This paper tackles the lack of explicit content control in diffusion-based image inpainting by extending the RePaint pipeline to accept a target image as guidance, enabling targeted content generation within a masked region. It introduces a target-conditioned denoising process, resolves border conflicts via a convex combination controlled by the parameter $\lambda_t$, and enhances boundary realism through resampling and a strategic jumping schedule. The authors conduct experiments across masking strategies and hyperparameters, finding that a heated mask and a scene border improve transitions, while a balanced $\lambda_t$ schedule (notably with $p=0.5$) yields favorable fidelity and creativity. The work provides practical guidelines for hyperparameters (e.g., $T=200$, $j=40$, $r=40$, $\lambda$ scheduling) and outlines future directions toward automation, richer mask design, and larger-scale validation, which collectively advance controllable, target-guided inpainting with diffusion models.
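The convex combination mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `blend_border`, the argument names, and the choice to keep the target estimate outside the border region are all assumptions made for illustration. The only idea taken from the summary is that a weight $\lambda_t \in [0, 1]$ mixes two competing estimates where they conflict.

```python
import numpy as np

def blend_border(x_scene, x_target, border_mask, lam_t):
    """Convex combination of two estimates inside a border region.

    Hypothetical helper for illustration only.
    x_scene, x_target: arrays of the same shape (two competing estimates).
    border_mask: 1.0 where the estimates conflict, 0.0 elsewhere.
    lam_t: weight in [0, 1] given to the scene estimate at timestep t.
    """
    if not 0.0 <= lam_t <= 1.0:
        raise ValueError("lam_t must lie in [0, 1]")
    # Convex combination: the weights lam_t and (1 - lam_t) sum to 1.
    mixed = lam_t * x_scene + (1.0 - lam_t) * x_target
    # Assumption: outside the border the target estimate is kept unchanged.
    return border_mask * mixed + (1.0 - border_mask) * x_target

# Toy 2x2 example: the left column is the border region.
x_scene = np.zeros((2, 2))
x_target = np.ones((2, 2))
border_mask = np.array([[1.0, 0.0], [1.0, 0.0]])
blended = blend_border(x_scene, x_target, border_mask, lam_t=0.25)
```

With `lam_t=1.0` the border takes the scene estimate entirely, and with `lam_t=0.0` it takes the target estimate, so a schedule over $t$ moves the border between the two.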

Abstract

Image inpainting is the process of taking an image and generating lost or intentionally occluded portions. Inpainting has many applications, including restoring damaged pictures, restoring the quality of images degraded by compression, and removing unwanted objects or text. Modern inpainting techniques have shown a remarkable ability to generate sensible completions for images with mask occlusions. In this paper, we provide an overview of the progress of inpainting techniques and identify the current leading approaches, focusing on their strengths and weaknesses. We then address a critical gap in these existing models: the ability to prompt and control what exactly is generated. We additionally justify why we think this is the natural next step for inpainting models, and provide multiple approaches to implementing this functionality. Finally, we evaluate our approaches qualitatively, checking whether they generate high-quality images that correctly inpaint regions with the objects they are instructed to produce.

Paper Structure (20 sections, 5 equations, 6 figures)

Figures (6)

  • Figure 1: Pipeline diagram of RePaint (without sampling or jumping) reproduced from [Lugmayr et al., 2022] for illustrative purposes.
  • Figure 2: Core pipeline without resampling or jumping.
  • Figure 3: Hyperparameter search ($\lambda$, $t$, $j$), where $t$ is the number of noising timesteps in the forward pass and $j$ is both the number of jumps and jump length.
  • Figure 4: Schedule for $\lambda_t$ vs $t$
  • Figure 5: Images generated with $\lambda$ schedule for varying values of $p$, $T=100$, and $r, j=40$.
  • ...and 1 more figure