ART3D: 3D Gaussian Splatting for Text-Guided Artistic Scenes Generation
Pengzhi Li, Chengshuai Tang, Qinxuan Huang, Zhiheng Li
TL;DR
ART3D tackles text-guided generation of 3D artistic scenes by combining diffusion-based image synthesis with 3D Gaussian splatting. It introduces an image semantic transfer module that aligns artistic inputs with realistic depth cues, builds a view-consistent point cloud map, and uses a depth consistency module to keep the scene coherent across views. The final 3D scene is rendered via 3D Gaussian splatting; the Gaussians are trained with supervision from projected views, with unreliable inpainted regions excluded from that supervision. Quantitative and qualitative experiments show better content and structural consistency than baselines, highlighting the method's potential for high-quality AI-assisted art.
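Supervision that skips unreliable inpainted regions can be pictured as a masked photometric loss. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the function name `masked_l1_loss`, the choice of an L1 error, and the boolean `valid_mask` convention (True for trusted pixels) are all hypothetical and introduced here for illustration.

```python
import numpy as np

def masked_l1_loss(rendered, target, valid_mask):
    """Mean absolute error over pixels marked as valid.

    rendered, target: (H, W, C) images.
    valid_mask: (H, W) boolean array; False marks pixels to ignore
    (here assumed to be the unreliable inpainted regions).
    """
    rendered = np.asarray(rendered, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    valid = np.asarray(valid_mask, dtype=bool)
    if rendered.shape != target.shape or valid.shape != rendered.shape[:2]:
        raise ValueError("shape mismatch between images and mask")
    if not valid.any():
        # No trusted pixels: this view contributes nothing to the loss.
        return 0.0
    abs_err = np.abs(rendered - target)   # (H, W, C)
    return float(abs_err[valid].mean())   # average over valid pixels only

# Usage: the large error at pixel (0, 0) is masked out and does not count.
rendered = np.zeros((2, 2, 3))
target = np.ones((2, 2, 3))
target[0, 0] = 100.0
mask = np.array([[False, True], [True, True]])
loss = masked_l1_loss(rendered, target, mask)
```

In a real training loop the same masking would be applied inside a differentiable framework so that gradients only flow from trusted pixels.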
Abstract
In this paper, we address existing challenges in 3D artistic scene generation by introducing ART3D, a novel framework that combines diffusion models and 3D Gaussian splatting techniques. Our method bridges the gap between artistic and realistic images through an image semantic transfer algorithm. By leveraging depth information and an initial artistic image, we generate a point cloud map while accounting for the domain difference between the two. Additionally, we propose a depth consistency module to enhance 3D scene consistency. Finally, the points of the resulting 3D scene serve as the initialization for optimizing Gaussian splats. Experimental results demonstrate ART3D's superior performance in both content and structural consistency metrics when compared to existing methods. ART3D advances the field of AI in art creation by providing a solution for generating high-quality 3D artistic scenes.
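To make the step from depth information and an image to a point cloud concrete, here is a minimal sketch of the standard pinhole back-projection. It is an illustration under assumptions rather than the paper's exact procedure: the function name `backproject_depth`, the intrinsics `fx`, `fy`, `cx`, `cy`, and the rule of dropping pixels with non-positive depth are hypothetical choices made for this example.

```python
import numpy as np

def backproject_depth(depth, image, fx, fy, cx, cy):
    """Lift an (H, W) depth map and (H, W, C) image to a colored point cloud.

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy), all in pixels. Returns (N, 3) points in the camera
    frame and the matching (N, C) colors.
    """
    depth = np.asarray(depth, dtype=np.float64)
    image = np.asarray(image)
    h, w = depth.shape
    # u indexes columns (x direction), v indexes rows (y direction).
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    points = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    colors = image.reshape(-1, image.shape[-1])
    keep = points[:, 2] > 0  # drop pixels without a usable depth value
    return points[keep], colors[keep]

# Usage: a 2x2 depth map with one invalid pixel yields three points.
depth = np.array([[0.0, 2.0], [2.0, 2.0]])
image = np.arange(12).reshape(2, 2, 3)
points, colors = backproject_depth(depth, image, fx=1.0, fy=1.0, cx=0.0, cy=0.0)
```

Points obtained this way for each view would still need to be transformed by the camera pose into a shared world frame before they can be merged into one map or used to initialize Gaussians.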
