SplatFont3D: Structure-Aware Text-to-3D Artistic Font Generation with Part-Level Style Control

Ji Gan, Lingxu Chen, Jiaxu Leng, Xinbo Gao

TL;DR

This work tackles 3D artistic font generation under strict semantic and structural constraints and data scarcity. It introduces SplatFont3D, a structure-aware framework built on Glyph2Cloud initialization, 3D Gaussian Splatting (3DGS), and Score Distillation Sampling (SDS) with 2D diffusion priors to synthesize 3D fonts from text prompts. It adds Dynamic Component Assignment for explicit part-level style control, enabling careful partitioning of font components and reducing drift during optimization. Across quantitative and qualitative experiments, SplatFont3D achieves state-of-the-art style–text consistency, visual quality, and rendering efficiency, outperforming NeRF-based and other text-to-3D models, especially in part-level control and complex glyphs.

Abstract

Artistic font generation (AFG) can assist human designers in creating innovative artistic fonts. However, most previous studies focus on 2D artistic fonts in flat design, leaving personalized 3D-AFG largely underexplored. 3D-AFG not only enables applications in immersive 3D environments such as video games and animations, but may also enhance 2D-AFG by rendering 2D fonts from novel views. Moreover, unlike general 3D objects, 3D fonts exhibit precise semantics with strong structural constraints and demand fine-grained part-level style control. To address these challenges, we propose SplatFont3D, a novel structure-aware text-to-3D AFG framework built on 3D Gaussian splatting, which enables the creation of 3D artistic fonts from diverse style text prompts with precise part-level style control. Specifically, we first introduce a Glyph2Cloud module, which progressively enhances both the shapes and styles of 2D glyphs (or components) and produces their corresponding 3D point clouds for Gaussian initialization. The initialized 3D Gaussians are then optimized through interaction with a pretrained 2D diffusion model using score distillation sampling. To enable part-level control, we present a dynamic component assignment strategy that exploits the geometric priors of 3D Gaussians to partition components, while alleviating drift-induced entanglement during 3D Gaussian optimization. SplatFont3D provides more explicit and effective part-level style control than NeRF while also rendering faster. Experiments show that SplatFont3D outperforms existing 3D models for 3D-AFG in style-text consistency, visual quality, and rendering efficiency.
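
For readers unfamiliar with score distillation sampling, the gradient below is the standard form introduced with DreamFusion, given here only as background. The notation is an assumption and is not taken from this paper: \theta stands for the 3D Gaussian parameters, x = g(\theta) for a rendered view, y for the style text prompt, \hat{\epsilon}_\phi for the pretrained diffusion model's noise prediction, and w(t) for a timestep weighting. The exact objective used by SplatFont3D may differ.

```latex
% Standard SDS gradient (DreamFusion form); symbols are assumed, not the paper's.
% x_t is the noised render: x_t = \alpha_t x + \sigma_t \epsilon, with \epsilon \sim \mathcal{N}(0, I).
\nabla_{\theta} \mathcal{L}_{\mathrm{SDS}}(\theta)
  = \mathbb{E}_{t,\,\epsilon}\!\left[
      w(t)\,\bigl(\hat{\epsilon}_{\phi}(x_t;\, y,\, t) - \epsilon\bigr)\,
      \frac{\partial x}{\partial \theta}
    \right],
\qquad x = g(\theta)
```

In this form the diffusion model is never backpropagated through. The residual between predicted and injected noise acts as a per-pixel signal that is pushed back through the renderer to the 3D parameters.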

Paper Structure

This paper contains 35 sections, 9 equations, 8 figures, 3 tables.

Figures (8)

  • Figure 1: Overview of SplatFont3D for structure-aware 3D-AFG. Both "Global Style Generation" and "Part-Level Style Control" generate 3D fonts from scratch, rather than editing existing 3D fonts.
  • Figure 2: Qualitative comparisons of global style generation and part-level style control.
  • Figure 3: Rendering time comparison.
  • Figure 4: Qualitative ablation results.
  • Figure 5: Glyph2Cloud for shape-style tradeoffs: (a) 2D results and (b) the final 3D fonts.
  • ...and 3 more figures