OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting
Yongsheng Yu, Ziyun Zeng, Haitian Zheng, Jiebo Luo
TL;DR
OmniPaint tackles realistic object editing by treating object removal and insertion as interdependent tasks within a single framework. It combines a pre-trained diffusion prior, a three-phase training pipeline, and CycleFlow unpaired refinement to keep edits geometrically and physically consistent while reducing the need for paired data, and it introduces CFD, a reference-free metric for evaluating the results. Reported results show substantial gains over state-of-the-art baselines in both removal and insertion, with CFD scoring context coherence and object hallucination without ground-truth references. The work points toward practical, high-fidelity object-oriented editing with limited paired data, and the framework is presented as extensible to more complex scenes and modalities.
Abstract
Diffusion-based generative models have revolutionized object-oriented image editing, yet their deployment in realistic object removal and insertion remains hampered by challenges such as the intricate interplay of physical effects and insufficient paired training data. In this work, we introduce OmniPaint, a unified framework that re-conceptualizes object removal and insertion as interdependent processes rather than isolated tasks. Leveraging a pre-trained diffusion prior along with a progressive training pipeline comprising initial paired sample optimization and subsequent large-scale unpaired refinement via CycleFlow, OmniPaint achieves precise foreground elimination and seamless object insertion while faithfully preserving scene geometry and intrinsic properties. Furthermore, our novel CFD metric offers a robust, reference-free evaluation of context consistency and object hallucination, establishing a new benchmark for high-fidelity image editing. Project page: https://yeates.github.io/OmniPaint-Page/
