DiffGS: Functional Gaussian Splatting Diffusion

Junsheng Zhou, Weiqi Zhang, Yu-Shen Liu

TL;DR

DiffGS is a general Gaussian generator based on latent diffusion models: a powerful and efficient 3D generative model that can generate an arbitrary number of Gaussian primitives for high-fidelity rendering with rasterization.

Abstract

3D Gaussian Splatting (3DGS) has shown convincing performance in rendering speed and fidelity, yet the generation of Gaussian Splatting remains a challenge due to its discreteness and unstructured nature. In this work, we propose DiffGS, a general Gaussian generator based on latent diffusion models. DiffGS is a powerful and efficient 3D generative model which is capable of generating Gaussian primitives at arbitrary numbers for high-fidelity rendering with rasterization. The key insight is to represent Gaussian Splatting in a disentangled manner via three novel functions to model Gaussian probabilities, colors and transforms. Through the novel disentanglement of 3DGS, we represent the discrete and unstructured 3DGS with continuous Gaussian Splatting functions, where we then train a latent diffusion model with the target of generating these Gaussian Splatting functions both unconditionally and conditionally. Meanwhile, we introduce a discretization algorithm to extract Gaussians at arbitrary numbers from the generated functions via octree-guided sampling and optimization. We explore DiffGS for various tasks, including unconditional generation, conditional generation from text, image, and partial 3DGS, as well as Point-to-Gaussian generation. We believe that DiffGS provides a new direction for flexibly modeling and generating Gaussian Splatting.

Paper Structure

This paper contains 31 sections, 9 equations, 12 figures, 7 tables.

Figures (12)

  • Figure 1: The illustration of DiffGS. We fit 3DGS from multi-view images and then disentangle it into three Gaussian Splatting Functions. We train a Gaussian VAE with a latent diffusion model for generating these functions, followed by a Gaussian extraction algorithm to obtain the final generated Gaussians.
  • Figure 2: The overview of DiffGS. (a) We disentangle the fitted 3DGS into three Gaussian Splatting Functions to model the Gaussian probability, colors and transforms, respectively. We then train a Gaussian VAE with a conditional latent diffusion model for generating these functions. (b) During generation, we first extract Gaussian geometry from the generated GauPF, followed by the GauCF and GauTF to obtain the Gaussian attributes.
  • Figure 3: Gaussian geometry extraction from the generated GauPF. The yellow and green regions indicate the high-probability and low-probability areas as judged by GauPF. (a), (b), and (c) show the progressive octree construction at depths 1, 2, and $L$. (d) We sample proxy Gaussian centers from the octree at the final depth $L$. (e) We optimize the proxy centers toward the exact geometry indicated by GauPF.
  • Figure 4: Visual comparisons with state-of-the-art methods on unconditional generation of ShapeNet Chairs.
  • Figure 5: Visualization of conditional 3DGS generation results on ShapeNet. (a) Text conditional generation. (b) Image conditional generation. (c) Gaussian Splatting completion.
  • ...and 7 more figures
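
The octree-guided extraction described in the Figure 3 caption can be illustrated with a small, self-contained Python sketch. It is not the authors' implementation. Everything named here is an assumption made for illustration: `toy_gaupf` is a hypothetical stand-in for a generated GauPF (it is simply high near a sphere of radius 0.5), the keep/prune rule probes a 3x3x3 lattice per cell against a fixed threshold, and the refinement step is plain finite-difference gradient ascent on the log-probability rather than whatever optimizer the paper uses.

```python
import math
import random


def toy_gaupf(p, radius=0.5, sigma=0.1):
    """Hypothetical stand-in for a generated GauPF: near 1 close to a sphere surface."""
    d = math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2) - radius
    return math.exp(-d * d / (2.0 * sigma * sigma))


def build_octree(prob_fn, depth, threshold=0.5, lo=-1.0, size=2.0):
    """Subdivide the cube [lo, lo + size]^3 `depth` times, pruning unlikely cells.

    Returns the surviving leaf cells as (min_corner, edge_length) pairs.
    """
    cells = [((lo, lo, lo), size)]
    for _ in range(depth):
        kept = []
        for (cx, cy, cz), s in cells:
            h = s / 2.0  # edge length of each of the 8 children
            for i in range(8):
                corner = (cx + h * (i & 1),
                          cy + h * ((i >> 1) & 1),
                          cz + h * ((i >> 2) & 1))
                # Assumed pruning rule: keep the child if any probe on a
                # 3x3x3 lattice spanning it exceeds the probability threshold.
                probes = ((corner[0] + h * a / 2.0,
                           corner[1] + h * b / 2.0,
                           corner[2] + h * c / 2.0)
                          for a in range(3) for b in range(3) for c in range(3))
                if any(prob_fn(q) > threshold for q in probes):
                    kept.append((corner, h))
        cells = kept
    return cells


def sample_proxy_centers(cells, per_cell, rng):
    """Draw `per_cell` uniform samples inside every leaf cell."""
    return [tuple(c + rng.random() * s for c in corner)
            for corner, s in cells for _ in range(per_cell)]


def optimize_centers(points, prob_fn, steps=30, lr=0.005, eps=1e-4):
    """Move each proxy center uphill on log-probability (central differences)."""
    def log_p(q):
        return math.log(max(prob_fn(q), 1e-300))  # floor avoids log(0)

    refined = []
    for p in points:
        p = list(p)
        for _ in range(steps):
            grad = []
            for k in range(3):
                hi, lo = p.copy(), p.copy()
                hi[k] += eps
                lo[k] -= eps
                grad.append((log_p(hi) - log_p(lo)) / (2.0 * eps))
            p = [v + lr * g for v, g in zip(p, grad)]
        refined.append(tuple(p))
    return refined


if __name__ == "__main__":
    rng = random.Random(0)
    leaves = build_octree(toy_gaupf, depth=3)          # steps (a)-(c)
    proxies = sample_proxy_centers(leaves, 2, rng)     # step (d)
    centers = optimize_centers(proxies, toy_gaupf)     # step (e)
    print(len(leaves), "leaf cells,", len(centers), "refined centers")
```

Because the number of samples per leaf cell is a free parameter, a sketch like this can return as many or as few centers as requested, which matches the "arbitrary numbers" property the abstract emphasizes. In the pipeline described in Figure 2(b), the extracted centers would then be passed to GauCF and GauTF to obtain the remaining Gaussian attributes; that step is not modeled here.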