DAV-GSWT: Diffusion-Active-View Sampling for Data-Efficient Gaussian Splatting Wang Tiles

Rong Fu, Jiekai Wu, Haiyun Wei, Yee Tan Jia, Wenxin Zhang, Yang Li, Xiaowen Ma, Wangyu Wu, Simon Fong

TL;DR

This work tackles data efficiency for large-scale 3D terrain synthesis by combining diffusion priors with active view sampling in Gaussian Splatting Wang Tiles. The DAV-GSWT framework selects informative viewpoints via image- and latent-space uncertainty, refines geometry and textures with diffusion-based priors, stitches tiles with semantic-aware seam optimization, and renders them with adaptive level of detail (LOD). Key contributions include a probabilistic uncertainty-driven view acquisition loop, a diffusion-refinement pipeline for tile boundaries, and a real-time renderer with uncertainty-guided caching that maintains interactivity under tight data budgets. Experiments on synthetic and real terrains show substantial reductions in the number of required views while preserving high visual fidelity and interactive performance, enabling scalable, infinite-terrain production for immersive applications.

Abstract

The emergence of 3D Gaussian Splatting has fundamentally redefined the capabilities of photorealistic neural rendering by enabling high-throughput synthesis of complex environments. While procedural methods like Wang Tiles have recently been integrated to facilitate the generation of expansive landscapes, these systems typically remain constrained by a reliance on densely sampled exemplar reconstructions. We present DAV-GSWT, a data-efficient framework that leverages diffusion priors and active view sampling to synthesize high-fidelity Gaussian Splatting Wang Tiles from minimal input observations. By integrating a hierarchical uncertainty quantification mechanism with generative diffusion models, our approach autonomously identifies the most informative viewpoints while hallucinating missing structural details to ensure seamless tile transitions. Experimental results indicate that our system significantly reduces the required data volume while maintaining the visual integrity and interactive performance necessary for large-scale virtual environments.

Paper Structure

This paper contains 33 sections, 12 equations, 7 figures, 5 tables, 1 algorithm.

Figures (7)

  • Figure 1: Overview of the DAV-GSWT framework for data-efficient Gaussian Splatting and tiling. The pipeline begins with a coarse reconstruction $\mathcal{G}_0$ computed from sparse initial images $\mathcal{I}_{\mathrm{init}}$. During the active cycle, a pre-trained diffusion model generates $M$ stochastic latent samples $z_m(\theta)$ using attention dropout. These samples are evaluated by the uncertainty estimator, which computes a score $u(\theta)$ from image-space LPIPS gradients or the latent 2-Wasserstein divergence $W_2(\mathcal{Z})$. The top-$k$ poses $\Theta^{\ast}$ are selected for physical acquisition to refine the field into $\mathcal{G}_T$. In the synthesis stage, the refined field is partitioned into Wang tiles $\mathcal{T}$, and seam continuity is optimized through an uncertainty-adaptive graph cut that adjusts the semantic weight $\gamma(\bar{u})$ to maintain perceptual and geometric consistency.
  • Figure 2: Active-view uncertainty over a dense candidate viewing sphere. Each point represents a candidate camera pose parameterized by azimuth and elevation, colored by diffusion epistemic uncertainty. White star markers indicate the top-$k$ views selected for physical capture.
  • Figure 3: Iterative reconstruction evolution under DAV-GSWT. Top row shows rendered reconstructions at iterations $T=0,1,3$. Bottom row visualizes the corresponding absolute error maps with respect to the final reconstruction. Error magnitudes are normalized and share a common color scale.
  • Figure 4: Ablation study of uncertainty formulations for active view selection. From left to right and top to bottom: image-gradient-based uncertainty with LPIPS, Wasserstein-2 only, Wasserstein-2 combined with LPIPS, and ground truth. Seam-level LPIPS scores are reported for each variant.
  • Figure 5: Comparison of seam artifacts using color-only graph cuts versus semantic-aware cuts augmented with SAM. Semantic constraints significantly reduce seam density around object boundaries.
  • ...and 2 more figures
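
The selection step described in the captions for Figures 1 and 2 (draw $M$ stochastic latent samples per candidate pose, score each pose by how much the samples disagree, keep the top-$k$ poses) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `dispersion_score`, `select_top_k`, and `toy_sampler` are hypothetical helpers, the sampler is a random stand-in for the pre-trained diffusion model, and the score is the mean pairwise squared Euclidean distance between samples (which equals the squared 2-Wasserstein distance when each sample is treated as a point mass). The paper's actual $u(\theta)$, including the LPIPS-gradient term, is not specified in this excerpt and may differ.

```python
import heapq
import itertools
import random


def dispersion_score(samples):
    """Mean pairwise squared Euclidean distance between latent samples.

    Assumption: a simple proxy for latent-space disagreement, not the
    paper's exact uncertainty score.
    """
    pairs = list(itertools.combinations(samples, 2))
    if not pairs:
        return 0.0
    total = sum(sum((a - b) ** 2 for a, b in zip(x, y)) for x, y in pairs)
    return total / len(pairs)


def select_top_k(candidates, sample_fn, k, num_samples):
    """Score every candidate pose and return the k highest-scoring ones."""
    scores = {
        pose: dispersion_score([sample_fn(pose) for _ in range(num_samples)])
        for pose in candidates
    }
    top = heapq.nlargest(k, candidates, key=scores.__getitem__)
    return top, scores


def toy_sampler(rng, dim=4):
    """Hypothetical stand-in for the diffusion model.

    Latent noise grows with elevation so that some poses look more
    uncertain than others; a real system would query the model instead.
    """
    def sample(pose):
        _azimuth_deg, elevation_deg = pose
        spread = 0.05 + elevation_deg / 90.0
        return [rng.gauss(0.0, spread) for _ in range(dim)]
    return sample


# Candidate poses on a coarse (azimuth, elevation) grid, in degrees.
candidates = [(az, el) for az in range(0, 360, 90) for el in (0, 30, 60)]
rng = random.Random(0)
top_poses, scores = select_top_k(candidates, toy_sampler(rng), k=3, num_samples=8)
print(top_poses)
```

In a full loop, the selected poses would be captured, the reconstruction refined, and the scoring repeated; this sketch covers only a single scoring and selection pass.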