RealFill: Reference-Driven Generation for Authentic Image Completion
Luming Tang, Nataniel Ruiz, Qinghao Chu, Yuanzhen Li, Aleksander Holynski, David E. Jacobs, Bharath Hariharan, Yael Pritch, Neal Wadhwa, Kfir Aberman, Michael Rubinstein
TL;DR
RealFill addresses authentic image completion: filling the missing regions of a target image with content that is faithful to the real scene rather than merely plausible. It fine-tunes a pretrained inpainting diffusion model on a small set of reference images together with the target, so that the model encodes the scene's content, lighting, and style. It then completes the missing regions by diffusion sampling and applies Correspondence-Based Seed Selection, which ranks the sampled outputs by their correspondences with the reference images. The authors introduce RealBench, a 33-scene dataset with ground truth that covers both inpainting and outpainting, and show that RealFill significantly outperforms prompt-based and reference-based baselines across multiple similarity metrics. The approach yields faithful reconstructions even when the references differ substantially from the target in viewpoint and appearance, which makes it promising for authentic scene restoration in practical photography. Its limitations are slow training, failures when the geometric gap between the references and the target is extreme, and difficulty with fine-grained details such as text or faces; these point to future work on speed and robustness.
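The Correspondence-Based Seed Selection step in the TL;DR can be illustrated with a minimal sketch. It assumes that each sampled completion is scored by the total number of feature matches between its filled-in region and the reference images, and that higher scores are better. The function names, the `keep` parameter, and the toy matcher (images stood in for by sets of string feature labels) are hypothetical and not taken from the paper; a real pipeline would plug a learned image feature matcher into `count_matches`.

```python
from typing import Callable, List, Optional, Sequence, Set

# Toy stand-in for an image: the set of feature labels detected in it.
# (Assumption for illustration only; real inputs would be image arrays.)
Features = Set[str]


def count_shared_features(candidate: Features, reference: Features) -> int:
    """Toy matcher: number of feature labels present in both inputs."""
    return len(candidate & reference)


def rank_by_correspondences(
    candidates: Sequence[Features],
    references: Sequence[Features],
    count_matches: Callable[[Features, Features], int] = count_shared_features,
    keep: Optional[int] = None,
) -> List[int]:
    """Return candidate indices sorted from most to fewest total matches.

    Each candidate's score is the sum of its match counts against every
    reference. Ties are broken by the original sampling order.
    """
    scores = [
        sum(count_matches(cand, ref) for ref in references)
        for cand in candidates
    ]
    order = sorted(range(len(candidates)), key=lambda i: (-scores[i], i))
    return order if keep is None else order[:keep]


# Three hypothetical completions (one per random seed) and two references.
samples = [{"a", "b"}, {"a", "b", "c", "d"}, {"x"}]
refs = [{"a", "b", "c"}, {"c", "d", "e"}]
ranking = rank_by_correspondences(samples, refs)
best = rank_by_correspondences(samples, refs, keep=1)
```

Because the matcher is passed in as a function, the ranking logic stays independent of how correspondences are actually computed.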
Abstract
Recent advances in generative imagery have brought forth outpainting and inpainting models that can produce high-quality, plausible image content in unknown regions. However, the content these models hallucinate is necessarily inauthentic, since they are unaware of the true scene. In this work, we propose RealFill, a novel generative approach for image completion that fills in missing regions of an image with the content that should have been there. RealFill is a generative inpainting model that is personalized using only a few reference images of a scene. These reference images do not have to be aligned with the target image, and can be taken with drastically varying viewpoints, lighting conditions, camera apertures, or image styles. Once personalized, RealFill is able to complete a target image with visually compelling contents that are faithful to the original scene. We evaluate RealFill on a new image completion benchmark that covers a set of diverse and challenging scenarios, and find that it outperforms existing approaches by a large margin. Project page: https://realfill.github.io
