EditP23: 3D Editing via Propagation of Image Prompts to Multi-View
Roi Bar-On, Dana Cohen-Bar, Daniel Cohen-Or
TL;DR
EditP23 tackles mask-free 3D editing by propagating a single 2D edit across a multi-view representation using a pre-trained diffusion backbone. It introduces an edit-aware denoising mechanism guided by an image pair (original and edited view) and employs correlated noise to isolate and propagate the edit while preserving object identity. The approach is training-free and feed-forward, delivering fast edits that maintain 3D consistency and outperform mask-free baselines in both quantitative metrics and user studies. The work demonstrates broad applicability across object categories and edit types, with ablations validating the core design choices and a reconstruction pipeline that turns the edited views into final 3D assets.
Abstract
We present EditP23, a method for mask-free 3D editing that propagates 2D image edits to multi-view representations in a 3D-consistent manner. In contrast to traditional approaches that rely on text-based prompting or explicit spatial masks, EditP23 enables intuitive edits by conditioning on a pair of images: an original view and its user-edited counterpart. These image prompts are used to guide an edit-aware flow in the latent space of a pre-trained multi-view diffusion model, allowing the edit to be coherently propagated across views. Our method operates in a feed-forward manner, without optimization, and preserves the identity of the original object, in both structure and appearance. We demonstrate its effectiveness across a range of object categories and editing scenarios, achieving high fidelity to the source while requiring no manual masks.
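To make the idea of an edit-aware flow more concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a velocity-difference update in which the source and target branches share the same noise sample at every step, so the noise cancels and only the edit direction moves the latent. The abstract does not spell out the update rule, so this form is an assumption. The names `toy_grid`, `toy_velocity`, `propagate_edit`, and `NUM_VIEWS` are hypothetical, and `toy_velocity` is a closed-form stand-in for the pre-trained multi-view diffusion model.

```python
import numpy as np

NUM_VIEWS = 4  # hypothetical number of views in the multi-view grid


def toy_grid(cond_view):
    """Toy multi-view prediction: view k is the conditioning view shifted by k."""
    return np.stack([np.roll(cond_view, k) for k in range(NUM_VIEWS)])


def toy_velocity(z_t, t, cond_view):
    """Stand-in for a pre-trained flow model (assumption): the straight-line
    velocity pointing from the clean grid implied by cond_view toward noise."""
    return (z_t - toy_grid(cond_view)) / t


def propagate_edit(x_src, src_view, edit_view, num_steps=50, seed=0):
    """Move the source grid x_src along the difference of two velocities,
    one conditioned on the edited view and one on the original view."""
    rng = np.random.default_rng(seed)
    timesteps = np.linspace(1.0, 0.0, num_steps + 1)
    z_edit = x_src.copy()
    for t, t_next in zip(timesteps[:-1], timesteps[1:]):
        noise = rng.standard_normal(x_src.shape)
        # Noisy source latent on the straight path between x_src and noise.
        z_src_t = (1.0 - t) * x_src + t * noise
        # Target latent reuses the same noise offset (correlated noise).
        z_tgt_t = z_edit + (z_src_t - x_src)
        # Edit-aware direction: whatever both branches agree on cancels out.
        dv = toy_velocity(z_tgt_t, t, edit_view) - toy_velocity(z_src_t, t, src_view)
        z_edit = z_edit + (t_next - t) * dv  # Euler step from t to t_next
    return z_edit


# Usage: a local change to one view, plus per-view detail the toy model ignores.
rng = np.random.default_rng(1)
src_view = rng.standard_normal(8)
edit_view = src_view.copy()
edit_view[2] += 3.0
x_src = toy_grid(src_view) + 0.1 * rng.standard_normal((NUM_VIEWS, 8))
x_edit = propagate_edit(x_src, src_view, edit_view)
```

In this toy setting the result equals `x_src + toy_grid(edit_view) - toy_grid(src_view)`: the edit appears in every view, while the per-view detail in `x_src` that the toy model never sees is carried over unchanged. A real multi-view diffusion model would not cancel this cleanly, so the sketch only illustrates why shared noise helps isolate the edit.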
