GEM3D: GEnerative Medial Abstractions for 3D Shape Synthesis

Dmitry Petrov, Pradyumn Goyal, Vikas Thamizharasan, Vladimir G. Kim, Matheus Gadelha, Melinos Averkiou, Siddhartha Chaudhuri, Evangelos Kalogerakis

TL;DR

GEM3D is a topology-aware 3D shape generator that first synthesizes a skeletal medial abstraction via diffusion and then reconstructs the surface with a skeleton-guided neural implicit. The model couples two diffusion stages (one for the medial points and one for the latent codes attached to them), followed by a neural enveloping surface decoder that yields contiguous surfaces. On ShapeNet and Thingi10K, it reports higher surface fidelity and better topology preservation than prior methods across category-conditioned generation, point-cloud reconstruction, and skeleton-guided synthesis, with competitive or faster performance. By building in explicit structural priors through the medial axis and directional envelopes, GEM3D offers more interpretable control and more robust handling of complex topologies. This is of practical use to artists and engineers who need topology-faithful, diverse 3D assets or better reconstruction from sparse data.

Abstract

We introduce GEM3D -- a new deep, topology-aware generative model of 3D shapes. The key ingredient of our method is a neural skeleton-based representation encoding information on both shape topology and geometry. Through a denoising diffusion probabilistic model, our method first generates skeleton-based representations following the Medial Axis Transform (MAT), then generates surfaces through a skeleton-driven neural implicit formulation. The neural implicit takes into account the topological and geometric information stored in the generated skeleton representations to yield surfaces that are more topologically and geometrically accurate compared to previous neural field formulations. We discuss applications of our method in shape synthesis and point cloud reconstruction tasks, and evaluate our method both qualitatively and quantitatively. We demonstrate significantly more faithful surface reconstruction and diverse shape generation results compared to the state-of-the-art, also involving challenging scenarios of reconstructing and synthesizing structurally complex, high-genus shape surfaces from Thingi10K and ShapeNet.
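
To make the Medial Axis Transform mentioned in the abstract concrete, the following is a minimal sketch of the classical MAT view of a shape: a set of medial points, each with the radius of its inscribed ball, so that the shape is the union of those balls. It is not GEM3D's learned representation; the function name `union_of_balls_field` and the toy capsule are assumptions made purely for illustration.

```python
import numpy as np

def union_of_balls_field(queries, centers, radii):
    """Implicit field of a union of medial balls (hypothetical helper).

    queries: (Q, 3) points to evaluate
    centers: (M, 3) medial points
    radii:   (M,)   radius of the ball at each medial point
    Returns a (Q,) array: negative inside the union, positive outside.
    """
    # Pairwise distances from every query to every medial point: shape (Q, M)
    dist = np.linalg.norm(queries[:, None, :] - centers[None, :, :], axis=-1)
    # Signed distance to each ball's surface; the smallest value decides
    return np.min(dist - radii[None, :], axis=1)

# Toy capsule: 21 medial points on the x-axis segment [-1, 1], constant radius 0.25
xs = np.linspace(-1.0, 1.0, 21)
centers = np.stack([xs, np.zeros_like(xs), np.zeros_like(xs)], axis=1)
radii = np.full(len(xs), 0.25)

queries = np.array([
    [0.0, 0.0, 0.0],  # on the medial axis
    [0.0, 0.5, 0.0],  # off to the side of the capsule
    [1.2, 0.0, 0.0],  # inside the rounded end cap
])
field = union_of_balls_field(queries, centers, radii)
inside = field < 0
print(field, inside)
```

The zero level set of this field is the boundary of the union of balls. With constant-radius balls the result is a lumpy approximation of the true surface, which is one motivation for replacing fixed balls with richer, learned surface descriptions around each medial point.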

Paper Structure

This paper contains 41 sections, 7 equations, 9 figures, 10 tables, 1 algorithm.

Figures (9)

  • Figure 1: GEM3D generative architecture: starting with Gaussian noise in $\mathbb R^3$, our first diffusion stage generates a point-based medial (skeletal) shape representation conditioned on a shape category embedding. Conditioned on this representation, our second diffusion stage generates latent codes capturing shape information around the medial points. In the last stage, our surface decoder decodes the medial latent codes and points to local neural implicit surface representations, which are then aggregated to create an output 3D shape.
  • Figure 2: A shape and its medial axis (purple), shown with (a) medial balls (yellow) vs. (b) enveloping primitives.
  • Figure 3: (Top) Using the closest medial point for queries yields wrong ball reconstructions. (Bottom) Using the closest envelope yields the right result. Surface reconstruction is shown in green for both cases.
  • Figure 4: Category-conditioned shape generation on ShapeNet. We show generated shapes from GEM3D for six categories: chair, lamp, airplane, table, bench, and watercraft (from top to bottom). Odd rows are skeletons generated by our model; even rows are surfaces sampled from them.
  • Figure 5: Point cloud reconstruction on ShapeNet. GEM3D yields better reconstructions, especially for thin and tubular parts, and better connectivity.
  • ...and 4 more figures
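
The failure case described in the Figure 3 caption can be reproduced with a toy example. The sketch below uses plain spheres as a stand-in for the paper's enveloping primitives and interprets "closest envelope" as the envelope whose surface is nearest to the query; both choices, along with all function names, are assumptions for illustration rather than the authors' formulation.

```python
import numpy as np

def assign_by_closest_center(query, centers):
    """Index of the medial point nearest to the query (hypothetical helper)."""
    return int(np.argmin(np.linalg.norm(centers - query, axis=1)))

def assign_by_closest_envelope(query, centers, radii):
    """Index of the envelope whose surface is nearest to the query.

    Envelopes are modeled as spheres here, which is a simplification.
    """
    signed = np.linalg.norm(centers - query, axis=1) - radii
    return int(np.argmin(np.abs(signed)))

def signed_distance_to_ball(query, center, radius):
    """Negative inside the ball, positive outside."""
    return float(np.linalg.norm(query - center) - radius)

# A thick body (radius 1.0) with a thin stem (radius 0.1) touching it
centers = np.array([[0.0, 0.0, 0.0], [1.1, 0.0, 0.0]])
radii = np.array([1.0, 0.1])

# A query just inside the thick body, but closer to the stem's medial point
query = np.array([0.9, 0.3, 0.0])

i_center = assign_by_closest_center(query, centers)
i_env = assign_by_closest_envelope(query, centers, radii)
sd_center = signed_distance_to_ball(query, centers[i_center], radii[i_center])
sd_env = signed_distance_to_ball(query, centers[i_env], radii[i_env])

print("closest center   -> ball", i_center, "signed distance", round(sd_center, 3))
print("closest envelope -> ball", i_env, "signed distance", round(sd_env, 3))
```

In this setup the closest-center rule assigns the query to the thin stem and classifies it as outside, even though it lies inside the thick body. The closest-envelope rule picks the thick body and gets the sign right. The mismatch arises whenever neighboring medial points have very different radii.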