GSEdit: Efficient Text-Guided Editing of 3D Objects via Gaussian Splatting

Francesco Palandra, Andrea Sanchietti, Daniele Baieri, Emanuele Rodolà

TL;DR

GSEdit introduces a fast, text-guided editing pipeline for 3D objects using Gaussian Splatting, enabling coherent shape and appearance edits with SDS-guided optimization and diffusion-based supervision. The method accepts either meshes or Gaussians produced by DreamGaussian as input, and concludes with mesh extraction and texture refinement to deliver a ready-to-use edited 3D asset. Quantitative and qualitative results show superior CLIP-based alignment and substantial runtime speedups compared to baselines, highlighting practical impact for artists and production pipelines. Limitations stem from the underlying Instruct-Pix2Pix (IP2P) model, including perspective bias and restricted spatial transformations, which suggests future work on view consistency and broader edit capabilities.

Abstract

We present GSEdit, a pipeline for text-guided 3D object editing based on Gaussian Splatting models. Our method enables the editing of the style and appearance of 3D objects without altering their main details, all in a matter of minutes on consumer hardware. We tackle the problem by leveraging Gaussian splatting to represent 3D scenes, and we optimize the model while progressively varying the image supervision by means of a pretrained image-based diffusion model. The input object may be given as a 3D triangular mesh, or directly provided as Gaussians from a generative model such as DreamGaussian. GSEdit ensures consistency across different viewpoints, maintaining the integrity of the original object's information. Compared to previously proposed methods relying on NeRF-like MLP models, GSEdit stands out for its efficiency, making 3D editing tasks much faster. Our editing process is refined via the application of the SDS loss, ensuring that our edits are both precise and accurate. Our comprehensive evaluation demonstrates that GSEdit effectively alters object shape and appearance following the given textual instructions while preserving their coherence and detail.

Paper Structure

This paper contains 17 sections, 11 equations, 7 figures, 1 table.

Figures (7)

  • Figure 1: The GSEdit pipeline. (1) The input is either (a) a mesh or (b) a pretrained Gaussian splatting model. During this stage, 20 different views of the shape are rendered, and if the input is a mesh, the renders are used to encode it in a GS model. (2) The editing phase consists of picking a camera, rendering the scene from that camera, running a step of Instruct-Pix2Pix (Brooks et al., 2023) to apply the edit, and optimizing the SDS loss (Poole et al., 2022). Once the editing is complete, the (3) mesh extraction and (4) texture refinement steps introduced in DreamGaussian (Tang et al., 2023) are carried out.
  • Figure 2: A collection of input/output pairs for our GSEdit pipeline. The results show that our method is flexible enough to handle various settings similarly well, such as changes of object identity (dog $\rightarrow$ wolf, gnome $\rightarrow$ robot) and visual style (realistic duck, golden ice-cream). GSEdit aims to perform the editing while preserving the input's features, such as pose, overall shape, or the hat and lantern of the gnome.
  • Figure 3: Examples of applications of GSEdit with fixed input mesh, varying the editing instruction. The results show the ability of our model to change object identity while keeping original features (cactus), adding simple features (hat), editing "artistic" style (lowpoly), and material (marble).
  • Figure 4: A visualization of how the edits performed by the U-Net of IP2P are exploited to guide the edit.
  • Figure 5: Multi-view renders of three meshes edited with GSEdit. The results show good consistency of the edited shapes with respect to the view direction.
  • ...and 2 more figures
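
The editing phase described in the Figure 1 caption (render a view, query the diffusion model, optimize the SDS loss) can be illustrated with a toy sketch. This is not the authors' implementation: the "render" here is the identity map on a small parameter vector, and `toy_noise_predictor` is a hypothetical stand-in for the IP2P U-Net, chosen as the exact noise predictor for a data distribution concentrated on a single `target`. The weighting `w = 1 - alpha_bar`, the noise-level range, and the names `sds_edit` and `distance` are likewise assumptions made for illustration. What the sketch does show is the generic SDS update, where the gradient with respect to the rendered image is w(t) · (predicted noise − injected noise).

```python
import math
import random


def toy_noise_predictor(x_t, target, alpha_bar):
    """Hypothetical stand-in for the diffusion U-Net: the exact noise
    predictor when all data mass sits on `target`."""
    s, n = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(xt - s * tg) / n for xt, tg in zip(x_t, target)]


def sds_edit(params, target, steps=200, lr=0.05, seed=0):
    """Apply SDS-style updates to `params`; the 'render' is the identity."""
    rng = random.Random(seed)
    params = list(params)
    for _ in range(steps):
        # Sample a noise level (plays the role of the diffusion timestep).
        alpha_bar = rng.uniform(0.2, 0.98)
        eps = [rng.gauss(0.0, 1.0) for _ in params]
        s, n = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
        # Forward diffusion: add noise to the current render.
        x_t = [s * p + n * e for p, e in zip(params, eps)]
        eps_hat = toy_noise_predictor(x_t, target, alpha_bar)
        # Assumed weighting w(t); other choices are possible.
        w = 1.0 - alpha_bar
        # SDS gradient w.r.t. the render: w(t) * (eps_hat - eps).
        # The render Jacobian is the identity in this toy setup.
        grad = [w * (eh - e) for eh, e in zip(eps_hat, eps)]
        params = [p - lr * g for p, g in zip(params, grad)]
    return params


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    start = [0.0, 1.0, -2.0, 3.0]
    target = [1.0, 1.0, 1.0, 1.0]
    edited = sds_edit(start, target)
    print(distance(start, target), distance(edited, target))
```

In this toy setting the injected noise cancels out of the gradient, so every step moves the parameters toward `target`. In a real pipeline the parameters would be the Gaussians, the render would come from a differentiable rasterizer at the picked camera, and the noise prediction would come from a pretrained diffusion model conditioned on the rendered image and the text instruction.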