FaithFill: Faithful Inpainting for Object Completion Using a Single Reference Image
Rupayan Mallick, Amr Abdalla, Sarah Adel Bargal
TL;DR
FaithFill addresses faithful object completion from a single reference image by combining segmentation, NeRF-based view synthesis, and LoRA-finetuned diffusion inpainting. By generating multiple views of the reference object and constraining the inpainting updates, the method preserves shape, texture, color, and background while filling in missing object regions. Evaluations on DreamBooth and a dedicated FaithFill dataset show improvements over state-of-the-art baselines on standard similarity metrics, human judgments, and GPT-based assessments. The work also contributes the FaithFill dataset and shows that data-efficient, faithful editing is possible with diffusion models.
Abstract
We present FaithFill, a diffusion-based inpainting approach to object completion that realistically generates missing object parts. Typically, multiple reference images are needed to achieve such realistic generation; otherwise, the generation does not faithfully preserve shape, texture, color, and background. In this work, we propose a pipeline that uses only a single input reference image, which may vary in lighting, background, object pose, and/or viewpoint. The single reference image is used to generate multiple views of the object to be inpainted. We demonstrate that FaithFill faithfully generates the object's missing parts while preserving the background/scene, using only a single reference image. This is demonstrated through standard similarity metrics, human judgement, and GPT evaluation. Our results are presented on the DreamBooth dataset and on a newly proposed dataset.
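The summary above mentions LoRA finetuning of the diffusion inpainting model but does not say which layers are adapted or what rank is used. As a rough illustration of the general LoRA idea only, and not of FaithFill's actual implementation, the sketch below forms an effective weight W + (alpha / r) * B A from a frozen weight W and two small trainable factors. All function names, matrix sizes, and the value of alpha are assumptions chosen for illustration.

```python
# Minimal pure-Python sketch of a LoRA-style low-rank weight update.
# Assumption: names and shapes here are hypothetical, not from FaithFill.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * (B @ A).

    w: frozen base weight, shape d_out x d_in
    a: trainable factor A, shape r x d_in
    b: trainable factor B, shape d_out x r
    """
    r = len(a)                 # rank of the update
    scale = alpha / r
    delta = matmul(b, a)       # low-rank update, shape d_out x d_in
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]

def lora_param_count(d_out, d_in, r):
    """Number of trainable parameters in the two factors A and B."""
    return r * (d_in + d_out)

# Tiny worked example with rank 1.
w = [[1.0, 0.0], [0.0, 1.0]]   # frozen 2 x 2 weight
a = [[1.0, 2.0]]               # A: 1 x 2
b = [[1.0], [0.0]]             # B: 2 x 1
w_eff = lora_effective_weight(w, a, b, alpha=2.0)
```

Because only A and B are trained, the number of updated parameters grows with r * (d_in + d_out) rather than d_in * d_out, which is one reason a low-rank adapter is a natural fit when only a handful of images (here, views derived from a single reference) are available for finetuning.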
