Reference-based Controllable Scene Stylization with Gaussian Splatting
Yiqun Mei, Jiacong Xu, Vishal M. Patel
TL;DR
This work tackles reference-based 3D scene stylization by adapting a pretrained 3D Gaussian Splatting (3DGS) model to enable real-time stylized view synthesis. It introduces a texture-guided Gaussian control mechanism that adaptively densifies Gaussians in texture-rich regions and a depth-based regularization that preserves the original geometry, complemented by view-consistent supervision via pseudo views and the Template Correspondence Matching loss. The approach yields state-of-the-art stylization quality with high-frequency texture fidelity while maintaining real-time rendering speeds, outperforming NeRF-based and prior 3D stylization methods. This advances practical applications in digital art, filmmaking, and immersive VR by enabling high-quality, interactive 3D appearance editing guided by content-aligned reference images.
Abstract
Reference-based scene stylization, which edits a scene's appearance based on a content-aligned reference image, is an emerging research area. Starting with a pretrained neural radiance field (NeRF), existing methods typically learn a novel appearance that matches the given style. Despite their effectiveness, they inherently suffer from time-consuming volume rendering, and thus are impractical for many real-time applications. In this work, we propose ReGS, which adapts 3D Gaussian Splatting (3DGS) for reference-based stylization to enable real-time stylized view synthesis. Editing the appearance of a pretrained 3DGS is challenging because it uses discrete Gaussians as its 3D representation, which tightly binds appearance to geometry. Simply optimizing the appearance, as prior methods do, is often insufficient for modeling the continuous textures in the given reference image. To address this challenge, we propose a novel texture-guided control mechanism that adaptively adjusts the locally responsible Gaussians into a new geometric arrangement that can express the desired texture details. The process is guided by texture cues for effective appearance editing and regularized by scene depth to preserve the original geometric structure. With these designs, we show that ReGS can produce state-of-the-art stylization results that respect the reference texture while retaining real-time rendering speed for free-view navigation.
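
The two ingredients named above, texture-guided control and depth regularization, can be illustrated with a minimal sketch. The abstract does not state the exact selection criterion or loss form, so everything here is an assumption made for illustration: the sketch uses the magnitude of each Gaussian's color gradient as the texture cue, marks the top fraction of Gaussians for densification, and penalizes the mean absolute difference between the stylized and original depth maps. The function names `select_texture_rich` and `depth_regularization` and the parameter `top_fraction` are hypothetical and do not come from the paper.

```python
from typing import List, Sequence


def select_texture_rich(color_grad_norms: Sequence[float],
                        top_fraction: float = 0.25) -> List[int]:
    """Return indices of the Gaussians with the largest color-gradient norm.

    Assumption: a large color gradient signals that a Gaussian is struggling
    to fit the reference texture and is a candidate for densification.
    """
    if not 0.0 < top_fraction <= 1.0:
        raise ValueError("top_fraction must be in (0, 1]")
    # Always pick at least one candidate when any Gaussians exist.
    count = max(1, int(len(color_grad_norms) * top_fraction))
    ranked = sorted(range(len(color_grad_norms)),
                    key=lambda i: color_grad_norms[i],
                    reverse=True)
    return sorted(ranked[:count])


def depth_regularization(rendered_depth: Sequence[float],
                         original_depth: Sequence[float]) -> float:
    """Mean absolute difference between stylized and original depth values.

    Assumption: an L1-style penalty stands in for the depth-based
    regularization; both inputs are flattened depth maps of equal length.
    """
    if len(rendered_depth) != len(original_depth):
        raise ValueError("depth maps must have the same number of pixels")
    if not rendered_depth:
        return 0.0
    total = sum(abs(r - o) for r, o in zip(rendered_depth, original_depth))
    return total / len(rendered_depth)


if __name__ == "__main__":
    # Four Gaussians; the two with the strongest color gradients are selected.
    print(select_texture_rich([0.1, 0.9, 0.3, 0.7], top_fraction=0.5))
    # Two-pixel depth maps that differ by 2.0 at one pixel.
    print(depth_regularization([1.0, 2.0], [1.0, 4.0]))
```

In a real pipeline the selected Gaussians would be split or cloned and the depth term added to the stylization loss; those steps depend on the 3DGS implementation in use and are left out of this sketch.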
