View-Consistent 3D Editing with Gaussian Splatting
Yuxuan Wang, Xuanyu Yi, Zike Wu, Na Zhao, Long Chen, Hanwang Zhang
TL;DR
VcEdit tackles multi-view inconsistency in image-guided 3DGS editing by integrating two consistency modules into an iterative editing pipeline. The Cross-attention Consistency Module inverse-renders the cross-attention maps of the diffusion backbone from all views into a unified 3D map, then re-renders that map to 2D. The Editing Consistency Module calibrates the edited guidance images through a fast 3DGS-based refinement and local blending. An iterative pattern further refines both the 3DGS and the guidance images across cycles, yielding coherent edits across diverse scenes. Qualitative comparisons and quantitative evaluation, including CLIP-based directional similarity and a human user study, show that VcEdit surpasses baselines such as DDS and GSEditor, producing view-consistent, high-fidelity 3D edits with reduced mode collapse.
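To make the inverse-render and re-render step concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function names `inverse_render` and `render`, and the `weights` tensor (standing in for the per-pixel contribution of each Gaussian that a rasterizer would supply), are assumptions made for illustration. The aggregation rule used here is a plain weighted average, which may differ from the rule used in the paper.

```python
import numpy as np

def inverse_render(attn_maps, weights):
    """Aggregate per-view 2D attention into one value per Gaussian.

    attn_maps: (V, P) attention at each of P pixels in V views.
    weights:   (V, P, G) assumed contribution of Gaussian g to pixel p
               in view v (hypothetical stand-in for rasterizer weights).
    Returns a (G,) weighted-average attention per Gaussian.
    """
    num = np.einsum("vp,vpg->g", attn_maps, weights)
    den = weights.sum(axis=(0, 1))
    return num / np.maximum(den, 1e-8)  # guard against unseen Gaussians

def render(gauss_attn, weights):
    """Re-render the unified per-Gaussian attention to every view: (V, P)."""
    return np.einsum("g,vpg->vp", gauss_attn, weights)

# Toy scene: 2 views, 2 pixels per view, 2 Gaussians.
# Each pixel sees exactly one Gaussian (one-hot weights).
weights = np.zeros((2, 2, 2))
weights[0, 0, 0] = 1.0  # view 0, pixel 0 sees Gaussian 0
weights[0, 1, 1] = 1.0  # view 0, pixel 1 sees Gaussian 1
weights[1, 0, 1] = 1.0  # view 1, pixel 0 sees Gaussian 1
weights[1, 1, 0] = 1.0  # view 1, pixel 1 sees Gaussian 0

# The two views disagree about the attention on each Gaussian.
attn = np.array([[0.2, 1.0],
                 [0.6, 0.8]])

unified = inverse_render(attn, weights)   # one value per Gaussian
consistent = render(unified, weights)     # views now agree per Gaussian
```

In this toy case the two views assign different attention to the same Gaussian, and the round trip replaces both values with their average, so every pixel that observes a given Gaussian receives the same attention after re-rendering.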
Abstract
The advent of 3D Gaussian Splatting (3DGS) has revolutionized 3D editing, offering efficient, high-fidelity rendering and enabling precise local manipulations. Currently, diffusion-based 2D editing models are harnessed to modify multi-view rendered images, which then guide the editing of 3DGS models. However, this approach faces a critical issue of multi-view inconsistency, where the guidance images exhibit significant discrepancies across views, leading to mode collapse and visual artifacts in the edited 3DGS. To this end, we introduce View-consistent Editing (VcEdit), a novel framework that seamlessly incorporates 3DGS into image editing processes, ensuring multi-view consistency in edited guidance images and effectively mitigating mode collapse issues. VcEdit employs two innovative consistency modules: the Cross-attention Consistency Module and the Editing Consistency Module, both designed to reduce inconsistencies in edited images. By incorporating these consistency modules into an iterative pattern, VcEdit proficiently resolves the issue of multi-view inconsistency, facilitating high-quality 3DGS editing across a diverse range of scenes. Further video results are available at http://vcedit.github.io.
