ReCoGS: Real-time ReColoring for Gaussian Splatting scenes

Lorenzo Rutayisire, Nicola Capodieci, Fabio Pellacini

TL;DR

This work tackles the challenge of editing Gaussian Splatting scenes, specifically recoloring, by enabling pixel-level 2D selections that are unprojected to 3D, depth-estimated, and re-projected into training views for background optimization. The method, ReCoGS, combines a 2D-to-3D selection pipeline with depth estimation via PCVNet and selective updating of spherical harmonics coefficients to achieve real-time recoloring while preserving geometry. Key contributions include a novel interactive editor, a depth-aware unprojection approach, and a background optimization loop that recolors the edited regions without fully re-training the geometry, demonstrated on consumer hardware. The approach offers a practical, diffusion-free alternative for in-place editing of 3D scenes, enabling fine-grained recolor operations with interactive performance and accessible code.

Abstract

Gaussian Splatting has emerged as a leading method for novel view synthesis, offering superior training efficiency and real-time inference compared to NeRF approaches, while still delivering high-quality reconstructions. Beyond view synthesis, this 3D representation has also been explored for editing tasks. Many existing methods leverage 2D diffusion models to generate multi-view datasets for training, but they often suffer from limitations such as view inconsistencies, lack of fine-grained control, and high computational demand. In this work, we focus specifically on the editing task of recoloring. We introduce a user-friendly pipeline that enables precise selection and recoloring of regions within a pre-trained Gaussian Splatting scene. To demonstrate the real-time performance of our method, we also present an interactive tool that allows users to experiment with the pipeline in practice. Code is available at https://github.com/loryruta/recogs.

Paper Structure

This paper contains 19 sections, 8 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: ReCoGS editing pipeline. The user navigates the scene in our editor (a), stops at an arbitrary view (b), and selects the pixels to edit (c); the selected pixels are unprojected to a 3D pointcloud (d), and the edit is applied in the background (e).
  • Figure 2: After unprojecting the 2D selection mask (left), we found it helpful to remove statistical outliers (right), that is, points that are "more isolated" than the others. In our implementation, we compute each point's average distance to $16$ neighbors and use a scale factor of $0.007$ for the standard-deviation threshold.
  • Figure 3: The top image is the PCVNet prediction from libtorch. The bottom image is the prediction from the TensorRT engine with FP16 enabled. Although the latter is noisier, the unprojected pointcloud after outlier filtering is still acceptable.
  • Figure 4: Elapsed time and GPU memory consumption to perform edits on 3 MipNeRF360 scenes: bicycle, kitchen, and stump.
  • Figure 5: The unprojected 3D selection might not render as a continuous surface. The "holes" shown in this figure are caused either by errors in depth prediction or by the fact that, to ensure real-time performance, we depth-test the 3D selection against the scene using the imprecise depth map obtained from the gaussians.
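
The unprojection and outlier-filtering steps described in Figures 1 and 2 can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a pinhole camera with hypothetical intrinsics `fx`, `fy`, `cx`, `cy`, a per-pixel depth map (such as one predicted by a depth network), and the common statistical criterion in which a point is dropped when its mean neighbor distance exceeds the global mean plus a scaled standard deviation. The function names `unproject_selection` and `remove_statistical_outliers` are illustrative only, and the brute-force neighbor search is chosen for clarity rather than speed.

```python
import numpy as np


def unproject_selection(mask, depth, fx, fy, cx, cy):
    """Lift the selected pixels to camera-space 3D points (pinhole model).

    mask  : (H, W) boolean array, True where the user selected a pixel.
    depth : (H, W) array of per-pixel depths along the camera z axis.
    """
    v, u = np.nonzero(mask)              # pixel rows (v) and columns (u)
    z = depth[v, u]
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)   # shape (N, 3)


def remove_statistical_outliers(points, k=16, std_ratio=0.007):
    """Drop points whose mean distance to their k nearest neighbors is
    larger than (global mean + std_ratio * global std) of that quantity.

    Assumed criterion; brute force with O(N^2) memory, fine for a sketch.
    Requires more than k points.
    """
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=2)            # (N, N) pairwise distances
    # After sorting each row, column 0 is the point itself (distance 0).
    knn = np.sort(dist, axis=1)[:, 1:k + 1]
    mean_knn = knn.mean(axis=1)                    # per-point isolation score
    threshold = mean_knn.mean() + std_ratio * mean_knn.std()
    keep = mean_knn <= threshold
    return points[keep], keep
```

The defaults `k=16` and `std_ratio=0.007` mirror the values quoted in the Figure 2 caption. With such a small ratio the threshold sits barely above the average isolation score, so the filter is aggressive and removes any point noticeably more isolated than typical.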