GALA: Geometry-Aware Local Adaptive Grids for Detailed 3D Generation

Dingdong Yang, Yizhi Wang, Konrad Schindler, Ali Mahdavi Amiri, Hao Zhang

Abstract

We propose GALA, a novel representation of 3D shapes that (i) excels at capturing and reproducing complex geometry and surface details, (ii) is computationally efficient, and (iii) lends itself to 3D generative modelling with modern, diffusion-based schemes. The key idea of GALA is to exploit both the global sparsity of surfaces within a 3D volume and their local surface properties. Sparsity is promoted by covering only the 3D object boundaries, not empty space, with an ensemble of tree root voxels. Each voxel contains an octree to further limit storage and compute to regions that contain surfaces. Adaptivity is achieved by fitting one local and geometry-aware coordinate frame in each non-empty leaf node. Adjusting the orientation of the local grid, as well as the anisotropic scales of its axes, to the local surface shape greatly increases the amount of detail that can be stored in a given amount of memory, which in turn allows for quantization without loss of quality. With our optimized C++/CUDA implementation, GALA can be fitted to an object in less than 10 seconds. Moreover, the representation can efficiently be flattened and manipulated with transformer networks. We provide a cascaded generation pipeline capable of generating 3D shapes with great geometric detail.
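The abstract's remark about quantization can be made concrete with a small sketch. The abstract does not say which quantization scheme is used, so the uniform min/max 8-bit scheme below is an assumption, and the names `quantize_uint8` and `dequantize_uint8` are hypothetical. It shows only the general idea: each stored grid value becomes one byte, and the reconstruction error is bounded by half a quantization step.

```python
def quantize_uint8(values):
    """Uniformly map floats to integer codes in [0, 255] (assumed scheme)."""
    lo, hi = min(values), max(values)
    # Fall back to a step of 1.0 when all values are equal, to avoid dividing by zero.
    step = (hi - lo) / 255.0 or 1.0
    codes = [round((v - lo) / step) for v in values]
    return codes, lo, step


def dequantize_uint8(codes, lo, step):
    """Invert quantize_uint8 up to rounding error."""
    return [lo + c * step for c in codes]


# Made-up grid values for illustration, not data from the paper.
values = [-0.31, -0.05, 0.0, 0.12, 0.4]
codes, lo, step = quantize_uint8(values)
restored = dequantize_uint8(codes, lo, step)
max_err = max(abs(a - b) for a, b in zip(values, restored))
```

Under this assumed scheme, a tighter value range gives a smaller step and therefore a smaller error for the same 8 bits. That fits the abstract's argument that adapting each local grid to the surface makes quantization less costly.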

Paper Structure

This paper contains 36 sections, 4 equations, 24 figures, 5 tables, 2 algorithms.

Figures (24)

  • Figure 1: Given a watertight mesh (a), our representation, GALA (geometry-aware local adaptive grids), distributes a set of root node voxels (coral) to cover the mesh surfaces. An octree subdivision is applied to each root, with a subset shown in (c). In each non-empty octree leaf node (green), a local grid (red dots) is oriented and anisotropically scaled to adapt to and tightly bound the local surface geometry. Only 277K parameters with 8-bit quantization yield an accurate representation (e).
  • Figure 2: GALA enables diverse and detailed conditional 3D shape generation, including Airplanes, Lamps, Tables and Chairs. Best viewed on screen with high magnification.
  • Figure 3: In our representation, GALA, tree root nodes are voxels initialized over mesh surfaces (gray line), each with location $\mathbf{p}\in\mathbb{R}^3$ and scale $s\in\mathbb{R}$. Descendant node voxels are derived recursively down to depth $d$; child voxels overlap and are expanded by a ratio $\alpha\in\mathbb{R}$. A local adaptive grid of resolution $m\in\mathbb{N}^{+}$ is extracted only at non-empty leaf node subdivisions (light green), with location $\mathbf{p}_g\in\mathbb{R}^3$, orientation quaternion $\mathbf{q}\in\mathbb{R}^4$, scales $\mathbf{s}_g\in\mathbb{R}^3$, and values $V\in\mathbb{R}^{m^3}$. $N_o,\alpha,m,d$ are hyperparameters. In contrast, Mosaic-SDF (Yariv et al., 2023) (left) employs mosaic patches, each fully occupied by a single-level, axis-aligned, and isotropic grid.
  • Figure 4: Illustration of local adaptive grid extraction in 2D. Within each subdivision (blue square), (a) an OBBTree (Gottschalk et al., 1996) determines the orientation of the local bounding box according to the convex hull of the subdivided local geometry. (b) Unlike OBBTree, we determine the orientation of the local adaptive grid using PCA on bounded normal vectors ($\overrightarrow{n}$). (c) Moreover, we rescale the grid anisotropically to better capture the local geometry, as informed by the histogram (d) of sampled points (small dark gray dots) on triangle meshes projected along each axis of the grid. The two dot colors (•/•) mark grid samples with negative/positive signs.
  • Figure 5: GALA generates shapes in a cascaded manner by (1) Root voxel diffusion; (2) Local adaptive grid configuration diffusion; (3) Local adaptive grid value prediction by regression.
  • ...and 19 more figures
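
The local grid fitting described in the Figure 4 caption can be sketched in 2D as follows. This is a simplified illustration under stated assumptions, not the authors' C++/CUDA implementation. The function name `fit_local_frame_2d` is hypothetical. PCA is applied to the uncentered second moments of the normals, so that $\overrightarrow{n}$ and $-\overrightarrow{n}$ count as the same direction. The per-axis scale is taken from the min/max extent of the projected points, where the paper uses a histogram.

```python
import math


def fit_local_frame_2d(points, normals):
    """Fit an oriented, anisotropically scaled 2D frame (illustrative sketch)."""
    # Uncentered second moments of the normals (assumption: sign-invariant PCA).
    sxx = sum(nx * nx for nx, ny in normals)
    syy = sum(ny * ny for nx, ny in normals)
    sxy = sum(nx * ny for nx, ny in normals)
    # Closed-form principal direction of a symmetric 2x2 matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    u = (math.cos(theta), math.sin(theta))   # dominant normal direction
    v = (-math.sin(theta), math.cos(theta))  # perpendicular (tangent) direction
    mids, half_extents = [], []
    for ax in (u, v):
        # Project the sampled surface points onto this axis.
        proj = [px * ax[0] + py * ax[1] for px, py in points]
        lo, hi = min(proj), max(proj)
        mids.append(0.5 * (lo + hi))
        # Simplification: min/max extent stands in for the histogram-based scale.
        half_extents.append(0.5 * (hi - lo))
    center = (mids[0] * u[0] + mids[1] * v[0], mids[0] * u[1] + mids[1] * v[1])
    return {"theta": theta, "axes": (u, v), "center": center,
            "half_extents": tuple(half_extents)}


# Made-up example: a straight surface segment tilted by 30 degrees.
c, s = math.cos(math.radians(30)), math.sin(math.radians(30))
points = [(t * c, t * s) for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]
normal = (-s, c)
frame = fit_local_frame_2d(points, [normal] * len(points))
```

For this flat segment the fitted frame has one axis along the surface normal with near-zero extent and one along the surface with half-extent 1. This is the anisotropy the caption refers to: the grid samples are concentrated where the surface actually lies.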