Empowering Vector Graphics with Consistently Arbitrary Viewing and View-dependent Visibility

Yidi Li, Jun Xiao, Zhengda Lu, Yiqun Wang, Haiyong Jiang

TL;DR

The paper tackles the challenge of generating vector graphics that remain coherent across arbitrary viewpoints while correctly representing occlusions. It introduces Dream3DVG, a dual-branch framework combining a 3D Gaussian Splatting (3DGS) guidance branch with a 3DVG optimization branch, augmented by a visibility-aware rendering module that includes Importance Filtering and Antipodal-depth Visibility Voting. A coarse-to-fine guidance strategy enables progressive detail control, and joint optimization with LPIPS and CLIP losses aligns geometric structure with text prompts across multiple views. Experiments on 3D sketches and 3D iconographies demonstrate superior multi-view consistency, occlusion-aware stroke culling, and higher semantic alignment compared to baselines, suggesting practical utility for design and animation workflows.

Abstract

This work presents Dream3DVG, a novel text-to-vector-graphics generation approach that supports viewing from arbitrary viewpoints, progressive detail optimization, and view-dependent occlusion awareness. Our approach is a dual-branch optimization framework consisting of an auxiliary 3D Gaussian Splatting (3DGS) optimization branch and a 3D vector graphics optimization branch. The 3DGS branch bridges the domain gap between text prompts and vector graphics by providing more consistent guidance. Moreover, 3DGS allows for progressive detail control by scheduling classifier-free guidance, so that the vector graphics are guided by coarse shapes in the initial stages and by finer details in later stages. We also improve the handling of view-dependent occlusions by devising a visibility-awareness rendering module. Extensive results on 3D sketches and 3D iconographies demonstrate the superiority of the method across different levels of abstraction, cross-view consistency, and occlusion-aware stroke culling.
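
The idea of scheduling classifier-free guidance (CFG) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the linear schedule shape, and the start and end scales are all assumptions made for illustration, and the abstract does not state which direction the scale moves in. Only the CFG combination itself, eps = eps_uncond + w * (eps_cond - eps_uncond), is the standard formula.

```python
# Hypothetical sketch of a scheduled classifier-free guidance (CFG) scale.
# The linear schedule and all parameter values are illustrative assumptions,
# not details taken from the paper.

def scheduled_cfg_scale(step, total_steps, w_start, w_end):
    """Linearly interpolate the guidance scale from w_start to w_end."""
    if total_steps <= 1:
        return float(w_end)
    # Fraction of optimization completed, clamped to [0, 1].
    t = min(max(step / (total_steps - 1), 0.0), 1.0)
    return (1.0 - t) * w_start + t * w_end


def apply_cfg(eps_uncond, eps_cond, w):
    """Standard CFG combination of unconditional and conditional predictions.

    Inputs are plain lists of floats standing in for noise-prediction tensors.
    """
    return [u + w * (c - u) for u, c in zip(eps_uncond, eps_cond)]


if __name__ == "__main__":
    total = 5
    for step in range(total):
        w = scheduled_cfg_scale(step, total, w_start=1.0, w_end=5.0)
        guided = apply_cfg([0.0, 1.0], [1.0, 3.0], w)
        print(step, w, guided)
```

With `w = 1` the combination reduces to the conditional prediction and with `w = 0` to the unconditional one, so varying `w` over the course of optimization changes how strongly the text prompt shapes each guidance sample.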

Paper Structure

This paper contains 28 sections, 7 equations, 22 figures, 3 tables.

Figures (22)

  • Figure 1: The overall architecture. The method takes as input a text prompt and outputs rendered 2D vector graphics (2DVG). The entire network consists of two branches: a 3DGS optimization branch (top row) to optimize a 3DGS with the text prompt and sample coarse-to-fine guidance images; a 3D vector graphics (3DVG) optimization branch (bottom row) that generates 3DVG and renders 2DVG with reasonable occlusion by a Visibility-awareness Rendering module.
  • Figure 2: Guidance samples during the 3DGS optimization, with prompt "Viking axe, fantasy, weapon". Our sampling method (bottom row) with scheduled CFG can maintain the semantics and generate effectively smoothed samples, compared with samples from 3DGS renderings (top row) and standard sampling from diffusion trajectory with fixed CFG (second row).
  • Figure 3: Illustration of Visibility-awareness Rendering. Note that we render the importance with trained opacity and non-visible curves with fixed low opacity for 2DVG rendering visualization.
  • Figure 4: Qualitative results of 3D sketch. "[*]" is "minimal 2d line drawing, on a white background, black and white" for Diff3DS. All methods are rendered with the same test camera poses. The non-visible curves are rendered with lower opacity in ours.
  • Figure 5: Qualitative results of sketch and iconography.
  • ...and 17 more figures