ArtNVG: Content-Style Separated Artistic Neighboring-View Gaussian Stylization
Zixiao Gu, Mengtian Li, Ruhua Chen, Zhongxia Ji, Sichen Guo, Zhenye Zhang, Guangnan Ye, Zuo Hu
TL;DR
ArtNVG stylizes 3D Gaussian Splatting (3DGS) scenes from a target style image while preserving content and keeping local colors and textures coherent. It introduces Content-Style Separated Control to decouple the influence of content and style, and Attention-based Neighboring-View Alignment to enforce cross-view consistency during diffusion-driven stylization. The framework builds on 3DGS, the CSGO model, Tile ControlNet, and a diffusion model with neighboring-view alignment, giving fast optimization and high reconstruction quality. On Tanks and Temples scenes with WikiArt styles, it shows better content fidelity, style alignment, and multi-view consistency than StyleGaussian and InstantStyleGaussian, with a total stylization time of around 20 minutes. The approach targets 3D style transfer for film, gaming, and immersive media.
Abstract
As demand from the film and gaming industries for 3D scenes with target styles grows, the importance of advanced 3D stylization techniques increases. However, recent methods often struggle to maintain local consistency in color and texture throughout stylized scenes, which is essential for aesthetic coherence. To solve this problem, this paper introduces ArtNVG, an innovative 3D stylization framework that efficiently generates stylized 3D scenes from reference style images. Built on 3D Gaussian Splatting (3DGS), ArtNVG achieves rapid optimization and rendering while upholding high reconstruction quality. Our framework realizes high-quality 3D stylization by incorporating two pivotal techniques: Content-Style Separated Control and Attention-based Neighboring-View Alignment. Content-Style Separated Control uses the CSGO model and the Tile ControlNet to decouple content and style control, reducing the risk of information leakage. Concurrently, Attention-based Neighboring-View Alignment ensures consistency of local colors and textures across neighboring views, significantly improving visual quality. Extensive experiments validate that ArtNVG surpasses existing methods, delivering superior results in content preservation, style alignment, and local consistency.
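The abstract does not spell out how Attention-based Neighboring-View Alignment is computed, so the following is only an illustrative sketch of one common way to share attention across views, not the authors' implementation. It assumes each view is represented as a list of token feature vectors, that keys and values are the raw tokens (no learned projections), and that a view attends to tokens pooled from itself and its neighbors. The function name `neighbor_view_attention` and its arguments are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def neighbor_view_attention(current_tokens, neighbor_views):
    """Scaled dot-product attention from one view to itself plus its neighbors.

    current_tokens: list of feature vectors (lists of floats) for the current view.
    neighbor_views: list of views, each a list of feature vectors.
    Assumption: keys and values are the tokens themselves (no learned weights).
    """
    d = len(current_tokens[0])

    # Pool keys/values from the current view and every neighboring view,
    # so each query can borrow features from nearby viewpoints.
    pooled = list(current_tokens)
    for view in neighbor_views:
        pooled.extend(view)

    aligned = []
    for q in current_tokens:
        # Similarity of this query to every pooled token, scaled by sqrt(d).
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in pooled]
        weights = softmax(scores)
        # Weighted average of the pooled tokens, one feature dimension at a time.
        aligned.append(
            [sum(w * v[i] for w, v in zip(weights, pooled)) for i in range(d)]
        )
    return aligned
```

Because every output token is a weighted average over features from several views, neighboring views that share similar tokens are pulled toward similar outputs, which is the intuition behind using attention for local color and texture consistency. A real diffusion model would apply learned query, key, and value projections inside its attention layers; those are omitted here.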
