GaussianEditor: Swift and Controllable 3D Editing with Gaussian Splatting

Yiwen Chen, Zilong Chen, Chi Zhang, Feng Wang, Xiaofeng Yang, Yikai Wang, Zhongang Cai, Lei Yang, Huaping Liu, Guosheng Lin

TL;DR

GaussianEditor addresses slow, hard-to-control 3D editing by leveraging Gaussian Splatting with Gaussian Semantic Tracing to identify target Gaussians, and introduces Hierarchical Gaussian Splatting (HGS) to stabilize updates under stochastic diffusion guidance. It also provides a dedicated 3D inpainting pipeline for object removal and insertion, enabling edits in 5-10 minutes on a single RTX A6000 GPU. The approach delivers precise, area-restricted edits and superior controllability in face and scene edits, validated by both qualitative and quantitative comparisons against diffusion-guided and NeRF-based methods. The combination of fast GS rendering, dynamic semantic masking, and anchor-based generation control has practical impact for interactive 3D editing in gaming, virtual production, and metaverse workflows.

Abstract

3D editing plays a crucial role in many areas such as gaming and virtual reality. Traditional 3D editing methods, which rely on representations like meshes and point clouds, often fall short in realistically depicting complex scenes. On the other hand, methods based on implicit 3D representations, like Neural Radiance Field (NeRF), render complex scenes effectively but suffer from slow processing speeds and limited control over specific scene areas. In response to these challenges, our paper presents GaussianEditor, an innovative and efficient 3D editing algorithm based on Gaussian Splatting (GS), a novel 3D representation. GaussianEditor enhances precision and control in editing through our proposed Gaussian semantic tracing, which traces the editing target throughout the training process. Additionally, we propose Hierarchical Gaussian splatting (HGS) to achieve stabilized and fine results under stochastic generative guidance from 2D diffusion models. We also develop editing strategies for efficient object removal and integration, a challenging task for existing methods. Our comprehensive experiments demonstrate GaussianEditor's superior control, efficacy, and rapid performance, marking a significant advancement in 3D editing. Project Page: https://buaacyw.github.io/gaussian-editor/
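As a rough illustration of the Gaussian semantic tracing idea mentioned in the abstract, the sketch below keeps a boolean flag per Gaussian, applies gradient updates only to flagged Gaussians, and lets newly densified Gaussians inherit their parent's flag so that the editing target stays tracked as the set of Gaussians grows. This is a minimal NumPy toy under stated assumptions, not the authors' implementation: the function names `masked_update` and `densify`, the flag-inheritance rule, and the plain gradient step are all hypothetical choices made for illustration.

```python
import numpy as np

def masked_update(params, grads, target_mask, lr=0.01):
    """Take a gradient step only on Gaussians flagged as editing targets (assumed rule)."""
    # Broadcast the per-Gaussian boolean flag across the parameter dimensions,
    # so untargeted Gaussians receive a zero step.
    step = lr * grads * target_mask[:, None]
    return params - step

def densify(params, target_mask, parent_idx):
    """Clone the given parents; only flagged parents are cloned, and each child inherits the flag."""
    parent_idx = np.asarray(parent_idx)
    # Keep only parents that belong to the editing target.
    parent_idx = parent_idx[target_mask[parent_idx]]
    children = params[parent_idx].copy()
    new_params = np.concatenate([params, children], axis=0)
    new_mask = np.concatenate([target_mask, target_mask[parent_idx]], axis=0)
    return new_params, new_mask

# Toy scene: 4 Gaussians with 3D positions; Gaussians 1 and 2 form the editing target.
positions = np.zeros((4, 3))
mask = np.array([False, True, True, False])
grads = np.ones((4, 3))

updated = masked_update(positions, grads, mask, lr=0.1)
# Request clones of Gaussians 0 and 1; only Gaussian 1 is a target, so only it is cloned.
new_positions, new_mask = densify(updated, mask, parent_idx=[0, 1])
```

In this toy, the background Gaussians (indices 0 and 3) never move, which mirrors the area-restricted behaviour the abstract describes; a real pipeline would obtain the initial flags from 2D segmentation masks and would update all Gaussian attributes, not just positions.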

Paper Structure

This paper contains 22 sections, 9 equations, 11 figures, 1 table.

Figures (11)

  • Figure 1: Results of GaussianEditor. GaussianEditor offers swift, controllable, and versatile 3D editing. A single editing session only takes 5-10 minutes. Please note our precise editing control, where only the desired parts are modified. Taking the "Make the grass on fire" example from the first row of the figure, other objects in the scene such as the bench and tree remain unaffected.
  • Figure 2: Illustration of Gaussian semantic tracing. Prompt: Turn him into an old lady. The red mask in the images represents the projection of the Gaussians that will be updated and densified. The dynamic change of the masked area during the training process, as driven by the updating of Gaussians, ensures consistent effectiveness throughout the training duration. Despite starting with potentially inaccurate segmentation masks due to 2D segmentation errors, Gaussian semantic tracing still guarantees high-quality editing results.
  • Figure 3: 3D inpainting for object incorporation. GaussianEditor is capable of adding objects at specified locations in a scene, given a 2D inpainting mask and a text prompt from a single view. The whole process takes merely five minutes.
  • Figure 4: 3D inpainting for object removal. Typically, removing the target object based on a Gaussian semantic mask generates artifacts at the interface between the target object and the scene. To address this, we generate a repaired image using a 2D inpainting method and employ Mean Squared Error (MSE) loss for supervision. The whole process takes merely two minutes.
  • Figure 5: Qualitative comparison. Note the level of control we maintain over the editing area (the whole body of the man). Background and other non-target regions are essentially unaffected, in contrast to Instruct-Nerf2Nerf [haque2023instruct], where the entire scene changes. GaussianEditor-DDS and GaussianEditor-iN2N indicate that we use Delta Denoising Score [hertz2023delta] and Instruct-Nerf2Nerf [haque2023instruct], respectively, as guidance for editing.
  • ...and 6 more figures
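
The Figure 4 caption states that object removal is supervised with a Mean Squared Error (MSE) loss against an image repaired by a 2D inpainting method. A minimal sketch of that loss term follows, assuming both images are arrays of the same shape with values in [0, 1]. The function name `mse_loss` and the toy images are hypothetical; the renderer and the inpainting model are outside the scope of this example.

```python
import numpy as np

def mse_loss(rendered, repaired):
    """Mean squared error between a rendered view and its 2D-inpainted reference."""
    rendered = np.asarray(rendered, dtype=np.float64)
    repaired = np.asarray(repaired, dtype=np.float64)
    if rendered.shape != repaired.shape:
        raise ValueError("rendered and repaired images must have the same shape")
    # Average the squared per-pixel, per-channel difference.
    return float(np.mean((rendered - repaired) ** 2))

# Toy 2x2 RGB images standing in for a render and its inpainted target.
rendered = np.full((2, 2, 3), 0.5)
repaired = np.full((2, 2, 3), 0.75)
loss = mse_loss(rendered, repaired)
```

In an actual optimization loop this scalar would be computed with a differentiable framework and backpropagated to the Gaussians near the removed object; the NumPy version only shows the quantity being minimized.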