3DSceneEditor: Controllable 3D Scene Editing with Gaussian Splatting

Ziyang Yan; Lei Li; Yihua Shao; Siyu Chen; Zongkai Wu; Jenq-Neng Hwang; Hao Zhao; Fabio Remondino

TL;DR

3DSceneEditor is a fully 3D, Gaussian-based editing framework for complex scenes that supports text-guided, real-time edits by manipulating Gaussians directly. It combines Mask3D semantic labeling, a CLIP-based open-vocabulary grounding module, and Gaussian-level edits (add, remove, move, recolor, replace) restricted to a region of interest (ROI), formalized as $G_{out} = \mathrm{Edit}(G_{in}, \tau)$. Compared with state-of-the-art diffusion-based and 2D-projection-based methods, it delivers higher editing quality (measured by CTIS and CIIS), faster turnaround (initial edits in 2–5 minutes, secondary edits in under 1 minute), and lower GPU memory usage on indoor ScanNet++ scenes. This 3D-only paradigm advances interactive 3D content creation by exploiting the explicit Gaussian representation for fine-grained, semantically aware modifications.
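To make the formulation $G_{out} = Edit(G_{in}, \tau)$ concrete, the following is a minimal sketch of ROI-restricted edits on an explicit set of Gaussians. It is not the authors' implementation: the `Gaussians` container, the `edit` function, and its `op`, `target`, `offset`, and `color` parameters are hypothetical names introduced here, the instruction $\tau$ is assumed to be already parsed into an operation plus a target instance id, and each Gaussian is reduced to a center, a plain RGB color, and an instance label (real splats also carry covariance, opacity, and spherical-harmonic coefficients).

```python
from dataclasses import dataclass, replace

import numpy as np


@dataclass(frozen=True)
class Gaussians:
    """Simplified, hypothetical stand-in for a labeled Gaussian scene."""
    means: np.ndarray   # (N, 3) Gaussian centers
    colors: np.ndarray  # (N, 3) RGB values in [0, 1]
    labels: np.ndarray  # (N,) instance id per Gaussian


def edit(g, op, target, offset=None, color=None):
    """Return a new scene with `op` applied only to Gaussians of instance `target`."""
    roi = g.labels == target  # boolean ROI mask over all Gaussians
    if op == "remove":
        keep = ~roi
        return Gaussians(g.means[keep], g.colors[keep], g.labels[keep])
    if op == "move":
        means = g.means.copy()  # copy so the input scene is left untouched
        means[roi] += np.asarray(offset, dtype=means.dtype)
        return replace(g, means=means)
    if op == "recolor":
        colors = g.colors.copy()
        colors[roi] = np.asarray(color, dtype=colors.dtype)
        return replace(g, colors=colors)
    raise ValueError(f"unsupported operation: {op!r}")


# Toy scene: four Gaussians, two of which belong to instance 1.
scene = Gaussians(
    means=np.zeros((4, 3)),
    colors=np.full((4, 3), 0.5),
    labels=np.array([0, 1, 1, 2]),
)
moved = edit(scene, "move", target=1, offset=[1.0, 0.0, 0.0])
red = edit(scene, "recolor", target=1, color=[1.0, 0.0, 0.0])
removed = edit(scene, "remove", target=1)
```

Each call returns a new scene rather than mutating its input, so a follow-up edit can start from any earlier result. Addition and replacement are omitted because they would require inserting new Gaussians, which this toy container does not model.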

Abstract

The creation of 3D scenes has traditionally been both labor-intensive and costly, requiring designers to meticulously configure 3D assets and environments. Recent advancements in generative AI, including text-to-3D and image-to-3D methods, have dramatically reduced the complexity and cost of this process. However, current techniques for editing complex 3D scenes continue to rely on typically interactive, multi-step 2D-to-3D projection methods and diffusion-based techniques, which often lack precise control and hamper real-time performance. In this work, we propose 3DSceneEditor, a fully 3D-based paradigm for real-time, precise editing of intricate 3D scenes using Gaussian Splatting. Unlike conventional methods, 3DSceneEditor operates through a streamlined 3D pipeline, enabling direct manipulation of Gaussians for efficient, high-quality edits based on input prompts. The proposed framework (i) integrates a pre-trained instance segmentation model for semantic labeling; (ii) employs a zero-shot grounding approach with CLIP to align target objects with user prompts; and (iii) applies scene modifications, such as object addition, repositioning, recoloring, replacement, and deletion, directly on Gaussians. Extensive experimental results show that 3DSceneEditor achieves superior editing precision and speed compared with current SOTA 3D scene editing approaches, establishing a new benchmark for efficient and interactive 3D scene customization.
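Step (ii) of the framework, aligning a target object with the user prompt, can be illustrated with a generic similarity-based selection. This is a minimal sketch under stated assumptions, not the paper's grounding module: it assumes a text embedding and one embedding per segmented instance are already available in a shared space (for example from a CLIP-style encoder), and it picks the instance with the highest cosine similarity. The function name `ground` and the toy vectors are hypothetical, and no actual CLIP call is made.

```python
import numpy as np


def ground(text_emb, instance_embs):
    """Pick the instance whose embedding is most similar to the prompt embedding.

    text_emb:      (D,) embedding of the user prompt (assumed precomputed)
    instance_embs: (K, D) one embedding per segmented instance (assumed precomputed)
    Returns the index of the best match and the cosine score of every instance.
    """
    text_emb = np.asarray(text_emb, dtype=float)
    instance_embs = np.asarray(instance_embs, dtype=float)
    t = text_emb / np.linalg.norm(text_emb)
    v = instance_embs / np.linalg.norm(instance_embs, axis=1, keepdims=True)
    scores = v @ t  # cosine similarity between the prompt and each instance
    return int(np.argmax(scores)), scores


# Toy 2-D embeddings standing in for real encoder outputs.
instance_embs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
best, scores = ground([0.0, 2.0], instance_embs)
```

The selected index would then identify which instance's Gaussians form the region of interest for the subsequent edit.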
