TetraDiffusion: Tetrahedral Diffusion Models for 3D Shape Generation

Nikolai Kalischek, Torben Peters, Jan D. Wegner, Konrad Schindler

TL;DR

TetraDiffusion introduces a first-of-its-kind 3D denoising diffusion model that operates directly on a deformable tetrahedral grid to generate high-resolution, textured meshes. Tetrahedral-specific convolution operators and a differentiable marching tetrahedra pipeline give the method fast sampling and scalable memory use, and it supports color attributes for textured assets. The approach shows superior geometric detail and competitive texture quality against strong 3D mesh and diffusion baselines, and it supports conditioning, interpolation, and test-time guidance. In practice, this allows rapid, scalable creation of detailed 3D assets on consumer hardware, with potential extensions to conditional generation and scene-level synthesis.
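
To make the marching tetrahedra step mentioned above more concrete, here is a minimal sketch of the generic technique for a single tetrahedron: wherever the signed distance values at the two ends of an edge have opposite signs, the surface crosses that edge at the linearly interpolated zero point. This is not the authors' implementation. The function name `tet_surface_points` and the per-vertex signed distance input are assumptions made for illustration, and the sketch only finds the crossing points, without building triangles.

```python
import numpy as np

# The six edges of a tetrahedron, as pairs of local vertex indices.
TET_EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]


def tet_surface_points(vertices, sdf):
    """Zero crossings of a signed distance field on the edges of one tetrahedron.

    Hypothetical helper for illustration only.
    vertices: (4, 3) vertex positions.
    sdf:      (4,) signed distance values at those vertices.
    Returns an (n, 3) array with n in {0, 3, 4}.
    """
    vertices = np.asarray(vertices, dtype=float)
    sdf = np.asarray(sdf, dtype=float)
    points = []
    for i, j in TET_EDGES:
        # An edge is crossed only if its endpoints lie on opposite sides.
        if (sdf[i] < 0) != (sdf[j] < 0):
            # Fraction along the edge at which the interpolated SDF is zero.
            t = sdf[i] / (sdf[i] - sdf[j])
            points.append(vertices[i] + t * (vertices[j] - vertices[i]))
    return np.array(points, dtype=float).reshape(-1, 3)


# Usage: one vertex inside the surface, three outside.
unit_tet = [[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
crossings = tet_surface_points(unit_tet, [-1.0, 1.0, 1.0, 1.0])
```

For a fixed sign pattern, each crossing point is a smooth function of the vertex positions and signed distance values, which is the property that lets gradients flow through this kind of mesh extraction.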

Abstract

Probabilistic denoising diffusion models (DDMs) have set a new standard for 2D image generation. Extending DDMs for 3D content creation is an active field of research. Here, we propose TetraDiffusion, a diffusion model that operates on a tetrahedral partitioning of 3D space to enable efficient, high-resolution 3D shape generation. Our model introduces operators for convolution and transpose convolution that act directly on the tetrahedral partition, and seamlessly includes additional attributes such as color. Remarkably, TetraDiffusion enables rapid sampling of detailed 3D objects in nearly real-time with unprecedented resolution. It's also adaptable for generating 3D shapes conditioned on 2D images. Compared to existing 3D mesh diffusion techniques, our method is up to 200 times faster in inference speed, works on standard consumer hardware, and delivers superior results.
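
As background for the diffusion formulation the abstract refers to, the following sketch applies the standard DDPM forward (noising) step, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, to a per-vertex feature array. The feature layout (one row per grid vertex, with channels for signed distance, displacement and color), the linear noise schedule, and all names in the code are assumptions for illustration. The paper's actual parameterization and schedule may differ.

```python
import numpy as np


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule commonly used with DDPMs (assumed here)."""
    return np.linspace(beta_start, beta_end, num_steps)


def q_sample(x0, t, alpha_bar, rng):
    """Draw x_t ~ q(x_t | x_0) in closed form and return it with the noise used."""
    noise = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise
    return xt, noise


betas = linear_beta_schedule(1000)
# Cumulative product of (1 - beta_s) up to each step t.
alpha_bar = np.cumprod(1.0 - betas)

# Hypothetical grid features: 5 vertices x 7 channels
# (1 signed distance + 3 displacement + 3 RGB), scaled to [-1, 1].
rng = np.random.default_rng(0)
grid_features = rng.uniform(-1.0, 1.0, size=(5, 7))
noisy_features, eps = q_sample(grid_features, 500, alpha_bar, rng)
```

A denoising network would then be trained to predict `eps` (or the clean features) from `noisy_features` and the step index. In the setting described above, that network would be built from the tetrahedral convolution operators rather than ordinary image convolutions.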

Paper Structure

This paper contains 31 sections, 10 equations, 22 figures, 3 tables, 1 algorithm.

Figures (22)

  • Figure 1: Details of generated meshes in high resolution. TetraDiffusion is able to output highly realistic shapes in a matter of seconds.
  • Figure 2: Generation time per sample vs. mesh quality for different methods. Clip-FID is averaged over the ShapeNet classes airplane, car and motorbike.
  • Figure 2: Quantitative comparison of generative texture quality, evaluated via renderings of colored meshes. We use our high resolution model.
  • Figure 2: Generated inner structure. Our ground truth preprocessing retains the inner structures of the original meshes, and TetraDiffusion is able to reproduce such structures.
  • Figure 3: Reverse tetrahedral diffusion sequence. Starting from a noisy tetrahedral grid, the model recovers a textured mesh in a few seconds. Note the shape of the initial grid: our method allows more targeted grid pruning than prior art.
  • ...and 17 more figures