VIRGi: View-dependent Instant Recoloring of 3D Gaussians Splats

Alessio Mazzucchelli, Ivan Ojeda-Martin, Fernando Rivas-Manzaneque, Elena Garces, Adrian Penate-Sanchez, Francesc Moreno-Noguer

TL;DR

VIRGi is a novel approach for rapidly editing the color of scenes modeled by 3DGS. It preserves view-dependent effects such as specular highlights, supports real-time interaction, and provides control over the strength of the view-dependent effects.

Abstract

3D Gaussian Splatting (3DGS) has recently transformed the fields of novel view synthesis and 3D reconstruction due to its ability to accurately model complex 3D scenes and its unprecedented rendering performance. However, a significant challenge persists: the absence of an efficient and photorealistic method for editing the appearance of the scene's content. In this paper we introduce VIRGi, a novel approach for rapidly editing the color of scenes modeled by 3DGS while preserving view-dependent effects such as specular highlights. Key to our method are a novel architecture that separates color into diffuse and view-dependent components, and a multi-view training strategy that integrates image patches from multiple viewpoints. Improving over the conventional single-view batch training, our 3DGS representation provides more accurate reconstruction and serves as a solid representation for the recoloring task. For 3DGS recoloring, we then introduce a rapid scheme requiring only one manually edited image of the scene from the end-user. By fine-tuning the weights of a single MLP, alongside a module for single-shot segmentation of the editable area, the color edits are seamlessly propagated to the entire scene in just two seconds, facilitating real-time interaction and providing control over the strength of the view-dependent effects. An exhaustive validation on diverse datasets demonstrates significant quantitative and qualitative advancements over competitors based on Neural Radiance Fields representations.
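
The multi-view training strategy mentioned above can be illustrated with a minimal sketch of how one batch might mix image patches from several viewpoints instead of drawing from a single image. The function name `sample_multiview_batch`, its parameters, and the uniform random patch placement are all assumptions made for illustration; in particular, this sketch does not try to make the patches observe the same scene point, and it is not the authors' implementation.

```python
import random
from typing import List, Optional, Tuple

# A patch is identified by (view index, top row, left column).
Patch = Tuple[int, int, int]


def sample_multiview_batch(
    num_views: int,
    height: int,
    width: int,
    patch_size: int,
    views_per_batch: int,
    patches_per_view: int,
    seed: Optional[int] = None,
) -> List[Patch]:
    """Pick patch locations from several distinct views for one batch.

    Hypothetical helper: all names and the uniform sampling are assumptions.
    """
    if views_per_batch > num_views:
        raise ValueError("views_per_batch cannot exceed num_views")
    if patch_size > min(height, width):
        raise ValueError("patch_size must fit inside the image")

    rng = random.Random(seed)
    # Distinct views, so a single batch always spans multiple viewpoints.
    views = rng.sample(range(num_views), views_per_batch)

    batch: List[Patch] = []
    for view in views:
        for _ in range(patches_per_view):
            # randint is inclusive on both ends, so the patch stays in bounds.
            top = rng.randint(0, height - patch_size)
            left = rng.randint(0, width - patch_size)
            batch.append((view, top, left))
    return batch


batch = sample_multiview_batch(
    num_views=10, height=64, width=64, patch_size=8,
    views_per_batch=4, patches_per_view=2, seed=0,
)
```

Setting `views_per_batch=1` in this sketch corresponds to the conventional single-view batch that the abstract contrasts against.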

Paper Structure (13 sections, 7 equations, 10 figures, 3 tables)


Figures (10)

  • Figure 1: VIRGi enables -- by editing a single view -- instant recolorings of one or multiple objects in scenes modeled with 3D Gaussian Splats. Our key idea of decomposing the reflectance of the scene into diffuse and specular components allows view-dependent consistent edits. Our separation can also be used to enhance or reduce material specularity.
  • Figure 2: Comparison of recoloring methods with a ground truth recolor. Our method VIRGi is the first one to propose a solution for 3DGS recoloring, outperforming one of the best methods for NeRF recoloring, IReNe [ireneCVPR2023]. We propose several contributions that make our method robust and better than a baseline (Vanilla-VIRGi) obtained by naïvely combining existing methods.
  • Figure 3: Overview of VIRGi. Our method comprises two main steps. In the initial step, we train a 3DGS architecture consisting of a Hashgrid $f$ representing the geometry and two MLPs for color modeling. We separate the observed color into a diffuse term $\textsc{MLP}_{\textrm{diff}}$, which solely relies on hashgrid features $f$, and a specular term $\textsc{MLP}_{\textrm{spec}}$, which incorporates the view direction $\theta$. Our training strategy involves leveraging multiple views of the same scene point in a single batch, as opposed to using a single image per batch, resulting in improved PSNR compared to previous methods. In the second step, given an edited 2D image, we fine-tune the last layer of the diffuse MLP using a soft-segmentation mask $\alpha$ to achieve the target color.
  • Figure 4: Specularity Comparison: Monoview vs. Multiview. Training with a single viewpoint makes specularities hard to learn because sample diversity is limited, whereas multiview training makes specular effects easier to learn.
  • Figure 5: Comparison between VIRGi (upper row per case) and Gaussian Editor [chen2023gaussianeditor] (bottom row per case). Gaussian Editor requires the user to initialize a mask using SAM [Kirillov2023SegmentA] before applying color edits based on provided prompts: "kitchen - make it magenta", "bicycle - make it yellow", and "bonsai - make it green". Gaussian Editor exhibits several drawbacks, such as color bleeding into surrounding areas in the first two scenes and not understanding the scene structure when recoloring the third scene. Notably, Gaussian Editor takes an average of 612 seconds to process these scenes, while VIRGi completes the task in just 2 seconds on average.
  • ...and 5 more figures
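
The diffuse/specular separation described in the Figure 3 caption, together with the soft-segmentation mask $\alpha$ and the control over specular strength mentioned in Figure 1, can be sketched per point as follows. This is a minimal illustration under stated assumptions: the additive combination of the two terms, the linear blend driven by $\alpha$, and the names `shade_point` and `spec_strength` are all hypothetical. The captions do not specify how the terms are combined or how the mask enters the fine-tuning, and the color values here stand in for the outputs of the two MLPs.

```python
from typing import Sequence, Tuple

Color = Tuple[float, float, float]


def clamp01(x: float) -> float:
    """Clamp a channel value to the displayable range [0, 1]."""
    return max(0.0, min(1.0, x))


def shade_point(
    diffuse_orig: Sequence[float],
    diffuse_edit: Sequence[float],
    specular: Sequence[float],
    alpha: float,
    spec_strength: float = 1.0,
) -> Color:
    """Combine diffuse and view-dependent color for one point.

    Assumptions (not from the source): the terms add, alpha linearly blends
    the original and edited diffuse colors, and spec_strength scales the
    view-dependent term.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if spec_strength < 0.0:
        raise ValueError("spec_strength must be non-negative")

    out = []
    for d0, d1, s in zip(diffuse_orig, diffuse_edit, specular):
        # Only the diffuse term is recolored; the specular term is untouched.
        diffuse = (1.0 - alpha) * d0 + alpha * d1
        out.append(clamp01(diffuse + spec_strength * s))
    return (out[0], out[1], out[2])


original = (0.5, 0.25, 0.25)    # stand-in for the diffuse MLP output
edited = (0.0, 0.5, 0.0)        # stand-in for the diffuse output after the edit
highlight = (0.25, 0.25, 0.25)  # stand-in for the specular MLP output

outside_mask = shade_point(original, edited, highlight, alpha=0.0)
matte_edit = shade_point(original, edited, highlight, alpha=1.0, spec_strength=0.0)
glossy_edit = shade_point(original, edited, highlight, alpha=1.0, spec_strength=2.0)
```

In this sketch a point outside the mask keeps its original appearance, while a point inside the mask takes the edited diffuse color and retains its highlight, which can then be attenuated or amplified through `spec_strength`.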