SGSST: Scaling Gaussian Splatting Style Transfer

Bruno Galerne, Jianling Wang, Lara Raad, Jean-Michel Morel

TL;DR

This work introduces SGSST (Scaling Gaussian Splatting Style Transfer), an optimization-based method that applies style transfer to pretrained 3DGS scenes and pioneers 3D scene style transfer at ultra-high image resolutions.

Abstract

Applying style transfer to a full 3D environment is a challenging task that has seen many developments since the advent of neural rendering. 3D Gaussian splatting (3DGS) has recently pushed many limits of neural rendering further in terms of training speed and reconstruction quality. This work introduces SGSST: Scaling Gaussian Splatting Style Transfer, an optimization-based method to apply style transfer to pretrained 3DGS scenes. We demonstrate that a new multiscale loss based on global neural statistics, which we name SOS for Simultaneously Optimized Scales, enables style transfer to ultra-high resolution 3D scenes. Not only does SGSST pioneer 3D scene style transfer at such high image resolutions, it also produces superior visual quality as assessed by thorough qualitative, quantitative and perceptual comparisons.
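As a rough illustration of what a multiscale loss on global statistics can look like, the sketch below averages one style loss per scale, which matches the description of the SOS loss as the mean of the per-scale transfer losses. Everything else is an assumption made for the example: the pixel mean and variance stand in for the VGG neural statistics used by the method, the factor-2 average pooling between scales is a guess, and the function names (`downscale2`, `global_stats`, `style_loss`, `sos_loss`) are hypothetical.

```python
def downscale2(img):
    """Halve the resolution by 2x2 average pooling (assumes even height and width)."""
    h, w = len(img), len(img[0])
    return [
        [(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
         for x in range(0, w, 2)]
        for y in range(0, h, 2)
    ]


def global_stats(img):
    """Toy global statistics: pixel mean and variance.

    Assumption: this is only a stand-in for statistics of VGG features.
    """
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    var = sum((p - mean) ** 2 for p in pixels) / len(pixels)
    return mean, var


def style_loss(img, style):
    """Squared distance between the global statistics of two images."""
    return sum((a - b) ** 2 for a, b in zip(global_stats(img), global_stats(style)))


def sos_loss(rendered, style, n_scales):
    """Mean of the style losses computed at n_scales resolutions (fine to coarse)."""
    losses = []
    for s in range(n_scales):
        if s > 0:  # scale 0 is the full resolution
            rendered, style = downscale2(rendered), downscale2(style)
        losses.append(style_loss(rendered, style))
    return sum(losses) / len(losses), losses


# A checkerboard and a flat gray image differ only at the finest scale.
checker = [[float((x + y) % 2) for x in range(4)] for y in range(4)]
flat = [[0.5] * 4 for _ in range(4)]
total, per_scale = sos_loss(checker, flat, n_scales=3)
print(total, per_scale)
```

Because the statistics are global, the rendered view and the style image do not need to have the same size, which is consistent with the different content and style resolutions reported in the figure captions.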

Paper Structure

This paper contains 37 sections, 7 equations, 29 figures, 2 tables.

Figures (29)

  • Figure 1: Various ultra-high definition style transfers of a Gaussian splatting 3D scene. SGSST transfers a very large set of global style statistics of an image to a 3DGS scene by minimizing a tailored multiscale SOS loss, yielding 3D style transfer of superior quality and at unprecedentedly high resolution (images have size 5187$\times$3361).
  • Figure 2: Overview of SGSST. Starting from a pretrained realistic 3DGS scene [Kerbl et al., SIGGRAPH 2023], we optimize the colors of each Gaussian using the new multiscale SOS loss (involving $n_{\mathrm{s}}=3$ scales in the illustration). Computing the gradient of the loss is feasible for ultra-high resolution (UHR) images thanks to the SPST partition-based implementation [Galerne et al., EGSR 2024]. Multiscale gradient stacking is used at the node of the rendered image to perform only one backpropagation per iteration through the 3DGS rendering pipeline.
  • Figure 3: UHR 3DGS style transfer. SGSST allows for the multiscale style transfer of 3DGS scenes at UHR. From left to right: Style image, one UHR stylized view, three magnified details, and evolution of the SOS loss and each style transfer loss that contributes to it. We first optimize the transfer loss for the coarsest scale (yellow curve) for 10k iterations and then optimize the SOS loss (light blue curve), namely the mean of the four transfer losses, for another 10k iterations. Image sizes are 5187$\times$3361 for content and 4230$\times$3361 for style.
  • Figure 4: Comparison of SGSST (ours, top) with StyleGaussian [Liu et al., arXiv 2024] (middle) and ARF [Zhang et al., ECCV 2022] (bottom). From left to right the content resolutions are 980$\times$545 (train), 979$\times$546 (truck), and 3115$\times$2076 (counter). For the first two examples, the various outputs keep the resolution of the content, but for the HR counter scene, the output sizes are 3115$\times$2076 for SGSST, 1600$\times$1066 for StyleGaussian and 779$\times$519 for ARF (see supp. mat. Figure 28 for ARF results without downscaling). Thanks to its multiscale global VGG statistics, SGSST is the most faithful method regarding style consistency.
  • Figure 5: Influence of optimization parameters: Allowing more 3DGS parameters to be optimized when minimizing the SOS loss does not improve the stylization quality and can dramatically impact the geometry. From left to right: Style image, content, SGSST default (optimization of colors), results when optimizing all spherical harmonics, results when optimizing all parameters.
  • ...and 24 more figures
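
The caption of Figure 2 mentions multiscale gradient stacking at the rendered-image node, so that the rendering pipeline is backpropagated through only once per iteration. The toy sketch below illustrates the idea under heavy simplifications, all of which are assumptions made for the example: the "renderer" is a fixed linear blend of two Gaussian colors, images are 1D lists of pixels, and a quadratic per-scale loss stands in for the style transfer losses. All names (`render`, `render_backward`, `scale_loss_grad`, `stacked_gradient`) are hypothetical.

```python
def render(colors, A):
    """Toy linear 'renderer': pixel i is a fixed blend A[i] of the Gaussian colors."""
    return [sum(a * c for a, c in zip(row, colors)) for row in A]


def render_backward(grad_image, A):
    """Vector-Jacobian product of the toy renderer (one 'backpropagation')."""
    return [sum(A[i][j] * grad_image[i] for i in range(len(A)))
            for j in range(len(A[0]))]


def downscale(image, factor):
    """Average consecutive blocks of `factor` pixels."""
    return [sum(image[k:k + factor]) / factor for k in range(0, len(image), factor)]


def scale_loss_grad(image, target, factor):
    """Gradient w.r.t. the full-resolution image of 0.5 * ||downscale(image) - target||^2."""
    residual = [d - t for d, t in zip(downscale(image, factor), target)]
    return [residual[i // factor] / factor for i in range(len(image))]


def stacked_gradient(image, targets, factors):
    """Average the per-scale image gradients at the image node."""
    grads = [scale_loss_grad(image, t, f) for t, f in zip(targets, factors)]
    return [sum(g[i] for g in grads) / len(grads) for i in range(len(image))]


A = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.25, 0.75]]
colors = [0.2, 0.8]
factors = [1, 2, 4]  # three scales, fine to coarse
targets = [[0.1, 0.4, 0.9, 0.6], [0.3, 0.7], [0.5]]

image = render(colors, A)

# Stacked: one pass through the renderer for all scales.
grad_stacked = render_backward(stacked_gradient(image, targets, factors), A)

# Separate: one pass through the renderer per scale, averaged afterwards.
per_scale = [render_backward(scale_loss_grad(image, t, f), A)
             for t, f in zip(targets, factors)]
grad_separate = [sum(g[j] for g in per_scale) / len(per_scale)
                 for j in range(len(colors))]

print(grad_stacked, grad_separate)
```

Both routes give the same parameter gradient because backpropagation through the renderer is linear in the incoming image gradient. The stacked version calls `render_backward` once instead of once per scale, which is where the saving would come from when the renderer is expensive.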