Generative Human Geometry Distribution

Xiangjun Tang, Biao Zhang, Peter Wonka

TL;DR

This work introduces Generative Human Geometry Distribution, a framework that models distributions of human geometries by encoding each geometry as a compact 2D feature map and using the SMPL domain as the learning space. It replaces the Gaussian source distribution used in prior work with an SMPL-based one and employs a two-stage pipeline: the first stage compresses geometry distributions into latent feature maps using a diffusion flow model, and the second stage trains another flow model over those maps, with both stages conditioned on SMPL. The method enables pose-conditioned random avatar generation and avatar-consistent novel pose synthesis, with a reported 57% improvement in geometry quality over state-of-the-art methods, while remaining robust to pose variation and conditioning mismatches. This distribution-over-distribution approach reduces the memory and computation barriers of single-geometry representations and has practical implications for scalable, high-fidelity 3D human synthesis and animation.

Abstract

Realistic human geometry generation is an important yet challenging task, requiring both the preservation of fine clothing details and the accurate modeling of clothing-body interactions. To tackle this challenge, we build upon geometry distributions, a recently proposed representation that can model a single human geometry with high fidelity using a flow matching model. However, extending a single-geometry distribution to a dataset is non-trivial and inefficient for large-scale learning. To address this, we propose a new geometry distribution model built on two key techniques: (1) encoding distributions as 2D feature maps rather than network parameters, and (2) using SMPL models as the domain instead of a Gaussian and refining the associated flow velocity field. We then design a generative framework that adopts a two-stage training paradigm analogous to state-of-the-art image and 3D generative models. In the first stage, we compress geometry distributions into a latent space using a diffusion flow model; in the second stage, we train another flow model on this latent space. We validate our approach on two key tasks: pose-conditioned random avatar generation and avatar-consistent novel pose synthesis. Experimental results demonstrate that our method outperforms existing state-of-the-art methods, achieving a 57% improvement in geometry quality.
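
The abstract describes flow matching with an SMPL-based source domain in place of a Gaussian, but this page gives no formulas. As a rough illustration only, the sketch below uses the common linear-path form of flow matching, where a source point x0 is interpolated toward a target surface point x1 and the regression target is the constant velocity x1 - x0. All function names are hypothetical, and `template_points` is an assumed stand-in for points sampled on an SMPL body; no actual SMPL library is used, and the paper's refined velocity field is not reproduced here.

```python
import numpy as np


def sample_source(template_points, n, rng, noise_std=0.0):
    """Draw n source points from a template point set.

    Assumption: `template_points` stands in for points sampled on an
    SMPL body surface; a Gaussian source would use rng.normal instead.
    """
    idx = rng.integers(0, len(template_points), size=n)
    x0 = template_points[idx]
    if noise_std > 0:
        x0 = x0 + rng.normal(scale=noise_std, size=x0.shape)
    return x0


def flow_matching_pair(x0, x1, t):
    """Linear-path flow matching: return the interpolated point and target velocity."""
    t = t[:, None]                    # broadcast per-sample time over xyz
    x_t = (1.0 - t) * x0 + t * x1     # point on the straight path at time t
    v_target = x1 - x0                # velocity a network would regress
    return x_t, v_target


def euler_integrate(velocity_fn, x0, steps=10):
    """Transport source points to t = 1 by Euler steps on a velocity field."""
    x = x0.copy()
    dt = 1.0 / steps
    for k in range(steps):
        t = np.full(len(x), k * dt)
        x = x + dt * velocity_fn(x, t)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    template = rng.normal(size=(100, 3))    # hypothetical body-surface samples
    x0 = sample_source(template, 16, rng)
    x1 = x0 + 0.05 * rng.normal(size=x0.shape)   # hypothetical clothed-surface targets
    x_t, v = flow_matching_pair(x0, x1, rng.uniform(size=16))
    print(x_t.shape, v.shape)
```

One intuition for a body-shaped source, under these assumptions: source points that already lie near the target surface lead to short displacements x1 - x0, whereas a Gaussian source has to be transported across the whole shape.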

Paper Structure

This paper contains 25 sections, 7 equations, 13 figures, 5 tables.

Figures (13)

  • Figure 1: Geometry distribution.
  • Figure 2: (a) Denoising process of a human geometry distribution. (b) Random avatar generation for a given pose. (c) Novel pose generation of a given avatar. Results are rendered from point clouds.
  • Figure 3: Aggregated samples $(\mathbf{x}_0',\mathbf{x}_1)$.
  • Figure 4: Overview of our method. (a) We encode a geometry into a feature map, which is decompressed with a SMPL vertex map. The decompressed feature serves as a condition for our denoising network. (b) The human generation task is formulated as the conditional generation of feature maps, guided by the SMPL vertex map, optionally incorporating additional conditioning inputs.
  • Figure 5: Single-geometry fitting visualization.
  • ...and 8 more figures