Hyper-3DG: Text-to-3D Gaussian Generation via Hypergraph

Donglin Di, Jiahui Yang, Chaofan Luo, Zhou Xue, Wei Chen, Xun Yang, Yue Gao

TL;DR

Hyper-3DG targets high-order geometry-texture correlations in text-to-3D generation by introducing a Geometry and Texture Hypergraph Refiner (HGRefiner) that refines 3D Gaussians at the patch level via Patch-3DGS hypergraphs. The method begins with a warm-up phase that uses a pre-trained 3D generator and a 2D diffusion model, followed by a high-order refinement stage that partitions the 3D Gaussians into patches, extracts latent 2D features, and updates the 3D representation through hypergraph neural networks guided by the Interval Score Matching loss. Key contributions include the Patch-3DGS-HGNN, 3DGS-Patchify via K-Means, and dual hypergraphs in spatial and latent spaces, which yield more realistic geometry and texture without adding computational overhead to the underlying framework. Comparisons, ablations, and a user study show improved cross-view consistency, texture fidelity, and structural integrity, which suggests practical value for 3D content creation in VR, gaming, and design workflows. The code is publicly available for reproducibility.
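To make the "3DGS-Patchify via K-Means" step concrete, the following is a minimal sketch that groups Gaussian centers into patches with plain K-Means. It is an illustration under stated assumptions, not the authors' implementation: the function name `kmeans_patchify`, the choice to cluster on 3D centers only, and the parameters `k`, `iters`, and `seed` are all hypothetical, and the summary does not say which attributes the paper actually clusters on.

```python
import random


def kmeans_patchify(centers, k, iters=20, seed=0):
    """Group 3D Gaussian centers into k patches with plain K-Means.

    centers: list of (x, y, z) tuples (assumed input; other Gaussian
    attributes such as scale or color are ignored in this sketch).
    Returns (labels, centroids): one patch index per Gaussian and the
    final patch centroids.
    """
    rng = random.Random(seed)
    centroids = rng.sample(centers, k)  # initialise from k sampled points
    labels = [0] * len(centers)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(k),
                key=lambda j: sum((p - c) ** 2 for p, c in zip(pt, centroids[j])),
            )
            for pt in centers
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [pt for pt, lab in zip(centers, labels) if lab == j]
            if members:  # keep the old centroid if a patch is empty
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Two well-separated groups of Gaussian centers end up in two patches.
demo_centers = [
    (0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
    (5.0, 5.0, 5.0), (5.1, 5.0, 5.0), (5.0, 5.1, 5.0),
]
demo_labels, demo_centroids = kmeans_patchify(demo_centers, k=2)
```

Each resulting patch could then serve as one vertex of a patch-level hypergraph, which keeps the graph far smaller than one built over individual Gaussians.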

Abstract

Text-to-3D generation represents an exciting field that has seen rapid advancements, facilitating the transformation of textual descriptions into detailed 3D models. However, current progress often neglects the intricate high-order correlation of geometry and texture within 3D objects, leading to challenges such as over-smoothness, over-saturation and the Janus problem. In this work, we propose a method named "3D Gaussian Generation via Hypergraph (Hyper-3DG)", designed to capture the sophisticated high-order correlations present within 3D objects. Our framework is anchored by a well-established main flow and an essential module, named "Geometry and Texture Hypergraph Refiner (HGRefiner)". This module not only refines the representation of 3D Gaussians but also accelerates the update process of these 3D Gaussians by conducting the Patch-3DGS Hypergraph Learning on both explicit attributes and latent visual features. Our framework allows for the production of finely generated 3D objects within a cohesive optimization, effectively circumventing degradation. Extensive experimentation has shown that our proposed method significantly enhances the quality of 3D generation while incurring no additional computational overhead for the underlying framework. (Project code: https://github.com/yjhboy/Hyper3DG)
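As an illustration of what hypergraph learning over patches can look like, here is a small sketch of one propagation step over a hypergraph. It assumes the commonly used HGNN-style normalisation $D_v^{-1/2} H D_e^{-1} H^\top D_v^{-1/2} X$ with unit hyperedge weights and no learnable parameters; the abstract does not specify the exact layer used in HGRefiner, so the function `hypergraph_smooth` and its inputs are hypothetical.

```python
import math


def hypergraph_smooth(features, hyperedges):
    """One parameter-free hypergraph propagation step.

    features:   list of per-vertex feature lists (e.g. one row per patch).
    hyperedges: list of vertex-index lists; each list is one hyperedge.
    Computes Dv^-1/2 H De^-1 H^T Dv^-1/2 X with unit hyperedge weights.
    Vertices that belong to no hyperedge receive an all-zero output row.
    """
    n, dim = len(features), len(features[0])
    # Vertex degree: number of hyperedges each vertex belongs to.
    degree = [0] * n
    for edge in hyperedges:
        for v in edge:
            degree[v] += 1
    out = [[0.0] * dim for _ in range(n)]
    for edge in hyperedges:
        # Gather: average the degree-normalised member features (De^-1 H^T Dv^-1/2 X).
        edge_feat = [0.0] * dim
        for v in edge:
            scale = 1.0 / math.sqrt(degree[v])
            for d in range(dim):
                edge_feat[d] += scale * features[v][d]
        edge_feat = [value / len(edge) for value in edge_feat]
        # Scatter: send the hyperedge feature back to its members (Dv^-1/2 H ...).
        for v in edge:
            scale = 1.0 / math.sqrt(degree[v])
            for d in range(dim):
                out[v][d] += scale * edge_feat[d]
    return out


# A single hyperedge joining three vertices averages their features.
smoothed = hypergraph_smooth([[1.0], [2.0], [3.0]], [[0, 1, 2]])
# Two overlapping hyperedges: information flows only through the shared vertex 1.
chained = hypergraph_smooth([[1.0], [0.0], [0.0]], [[0, 1], [1, 2]])
```

Unlike an ordinary graph edge, a hyperedge can join any number of vertices at once, which is what lets a single propagation step mix information across a whole group of patches rather than only pairs.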

Paper Structure (26 sections, 3 equations, 14 figures, 2 algorithms)

Figures (14)

  • Figure 1: Examples showcasing text-to-3D content generation with our framework "3D Gaussian Generation via Hypergraph (Hyper-3DG)", which creates high-fidelity 3D objects from text input. Please zoom in for more geometric and textural details.
  • Figure 2: Illustration of the challenges of the Janus problem and incoherence issues. We compare the current state-of-the-art method (denoted as "SOTA") with our proposed approach ("Hyper-3DG"). We zoom in on the depth image (right part) to show the details. The textual prompts are "A DSLR photo of a bald eagle" (left) and "A vase of red flowers" (right).
  • Figure 3: Illustration of the proposed method, 3D Gaussian Generation via Hypergraph (Hyper-3DG). Our method comprises a main flow and a dedicated hypergraph refiner module (Geometry and Texture Hypergraph Refiner). Given a text prompt as input, the "Warm up" stage yields coarse 3D Gaussians using a pre-trained 3D generator and a 2D diffusion model. After $N_0$ steps of initialization, the "HGRefiner" further refines the geometry and texture of the coarse 3D Gaussians at the patch level, with an adjustable, updated hypergraph structure. After $N_1$ steps of high-order refinement, the final finely generated 3D object is obtained.
  • Figure 4: One example from DreamFusion [poole2022dreamfusion], DreamGaussian [tang2023dreamgaussian], GSGEN [chen2023text], LucidDreamer [liang2023luciddreamer], and our proposed method Hyper-3DG (the final two rows depict contrasting perspectives, i.e., thophoric view and the overlook) under the same settings. The images in each column are rendered from an identical perspective. A few methods fail to generate a correct back view, a failure known as the Janus problem. The results demonstrate the superiority of our approach in synthesizing highly realistic content with intricate details. Please zoom in for the finer details.
  • Figure 5: A comparison of experimental results among state-of-the-art methods and our approach under identical settings.
  • ...and 9 more figures