Deep Generative Models on 3D Representations: A Survey
Zifan Shi, Sida Peng, Yinghao Xu, Andreas Geiger, Yiyi Liao, Yujun Shen
TL;DR
This survey comprehensively maps 3D generative modeling by organizing work around 3D representations (voxels, point clouds, meshes, neural fields, depth maps) and supervision signals (2D vs. 3D). It contrasts the major generative-model families (GANs, VAEs, normalizing flows, diffusion models) and reviews how each representation pairs with these models. The paper details approaches that learn from 3D data, methods that learn from 2D data, and a spectrum of applications ranging from shape editing to 3D reconstruction and representation learning. It also discusses four persistent challenges (universality, controllability, efficiency, and stability) and outlines future directions to accelerate progress in 3D generation and rendering. Overall, the work serves as a foundational reference for researchers seeking to understand and advance 3D generative modeling across representations and supervision regimes.
Abstract
Generative models aim to learn the distribution of observed data by generating new instances. With the advent of neural networks, deep generative models, including variational autoencoders (VAEs), generative adversarial networks (GANs), and diffusion models (DMs), have progressed remarkably in synthesizing 2D images. Recently, researchers started to shift focus from 2D to 3D space, considering that 3D data is more closely aligned with our physical world and holds immense practical potential. However, unlike 2D images, which possess an inherent and efficient representation (i.e., a pixel grid), representing 3D data poses significantly greater challenges. Ideally, a robust 3D representation should be capable of accurately modeling complex shapes and appearances while being highly efficient in handling high-resolution data with high processing speeds and low memory requirements. Regrettably, existing 3D representations, such as point clouds, meshes, and neural fields, often fail to satisfy all of these requirements simultaneously. In this survey, we thoroughly review the ongoing developments of 3D generative models, including methods that employ 2D and 3D supervision. Our analysis centers on generative models, with a particular focus on the representations utilized in this context. We believe our survey will help the community to track the field's evolution and to spark innovative ideas to propel progress towards solving this challenging task.
