InstantStyleGaussian: Efficient Art Style Transfer with 3D Gaussian Splatting
Xin-Yi Yu, Jun-Xin Yu, Li-Bo Zhou, Yan Wei, Lin-Lin Ou
TL;DR
InstantStyleGaussian addresses the need for fast, multi-view-consistent 3D style transfer on existing 3D Gaussian Splatting (3DGS) scenes. It combines an image-conditioned diffusion model (InstantStyle) with an improved Iterative Dataset Update strategy: rendered 2D views are edited and the changes are propagated back to the 3DGS representation, with scene structure preserved via edge maps and an NNFM loss. The approach achieves high-quality stylization with significantly reduced editing time (roughly 20 minutes per scene) and better multi-view consistency than prior 3D editing methods, as demonstrated on Tanks & Temples and Mip-NeRF 360. This enables practical applications in content creation for games, VR, and AR. The method does not yet handle geometric edits or object insertion/removal, which are left for future work.
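To make the NNFM (nearest-neighbor feature matching) term concrete, the following is a minimal sketch of the loss as it is commonly defined: each feature vector from the rendered view is matched to its nearest style feature under cosine distance, and the matched distances are averaged. The function names and the plain-list feature format are assumptions for illustration. In practice the features would typically come from a pretrained network such as VGG, and how the authors weight this term is not stated here.

```python
import math

def cosine_distance(a, b):
    # 1 - cosine similarity; the small epsilon guards against zero-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b + 1e-8)

def nnfm_loss(render_feats, style_feats):
    # For every rendered feature, take the distance to its nearest style
    # feature, then average over all rendered features.
    total = 0.0
    for f in render_feats:
        total += min(cosine_distance(f, s) for s in style_feats)
    return total / len(render_feats)
```

A rendered feature that already matches some style feature contributes (close to) zero, so the loss only penalizes features with no good counterpart in the style image.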
Abstract
We present InstantStyleGaussian, a 3D style transfer method based on the 3D Gaussian Splatting (3DGS) scene representation. Given a target style image, it quickly generates a stylized 3DGS scene. Our method operates on pre-reconstructed 3DGS scenes and combines diffusion models with an improved iterative dataset update strategy. It uses a diffusion model to generate images in the target style, adds these images to the training dataset, and uses the updated dataset to iteratively optimize the 3DGS scene. This significantly accelerates style editing while preserving the quality of the generated scenes. Extensive experiments demonstrate that our method produces high-quality stylized scenes while offering significant advantages in style transfer speed and multi-view consistency.
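The loop described above (render, stylize, update the dataset, optimize the scene) can be sketched with toy stand-ins. Everything here is an assumption for illustration: the "scene" is a list holding one scalar appearance value per view, `render` and `stylize` are hypothetical placeholders for the 3DGS rasterizer and the diffusion model, and re-editing every view in each round is a simplification rather than the authors' improved update schedule.

```python
def render(scene, view):
    # Hypothetical renderer: the toy scene stores one scalar per view.
    return scene[view]

def stylize(image, style, strength=0.5):
    # Hypothetical stand-in for diffusion-based editing:
    # move the rendered value part of the way toward the style value.
    return image + strength * (style - image)

def iterative_dataset_update(scene, style, rounds=10, steps_per_round=5, lr=0.3):
    scene = list(scene)  # work on a copy
    # The dataset starts as the unedited renders of the pre-reconstructed scene.
    dataset = [render(scene, v) for v in range(len(scene))]
    for _ in range(rounds):
        # 1. Re-render the current scene and replace dataset images with edited ones.
        for v in range(len(scene)):
            dataset[v] = stylize(render(scene, v), style)
        # 2. Optimize the scene against the updated dataset (squared-error loss).
        for _ in range(steps_per_round):
            for v in range(len(scene)):
                grad = 2.0 * (render(scene, v) - dataset[v])
                scene[v] -= lr * grad
    return scene
```

Because each round edits renders of the current, partially stylized scene, the dataset and the scene move toward the target style together, which is the intuition behind alternating dataset updates with optimization instead of editing all views once up front.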
