Progress and Prospects in 3D Generative AI: A Technical Overview including 3D human

Song Bai; Jie Li

TL;DR

This paper surveys recent progress in AI-generated 3D content, organized around single 3D objects, 3D human models, 3D scenes, and human motion synthesis, with emphasis on 2023 papers. It argues that diffusion-based 2D-to-3D pipelines (via NeRF, 3DGS, and related representations) combined with control mechanisms (ControlNet, DreamBooth, LoRA) and anthropometric priors like SMPL(-X) are driving rapid gains in fidelity and consistency. It highlights high-profile methods (e.g., One-2-3-45++, Direct2.5, RichDreamer, SceneDreamer, Story2Motion) and datasets (ObjaverseXL) that enable high-quality, view-consistent outputs, at times with substantial compute. It also discusses persistent challenges in scene fidelity, background handling, evaluation metrics, and the need for scalable 3D scene datasets, while underscoring broad applicability to gaming, education, advertising, and AR/VR.

Abstract

While AI-generated text and 2D images continue to expand their territory, 3D generation has emerged as a trend that cannot be ignored. Since 2023, a large number of research papers have appeared in the domain of 3D generation. This growth encompasses not only the creation of 3D objects but also the rapid development of 3D character and motion generation. Several key factors contribute to this progress. The enhanced fidelity of Stable Diffusion, control methods that ensure multi-view consistency, and realistic human models such as SMPL-X together enable 3D models with remarkable consistency and near-realistic appearance. Advances in neural 3D representation and rendering, such as Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS), have improved the efficiency and realism of neurally rendered models. Furthermore, the multimodal capabilities of large language models have enabled language inputs to be translated into human motion outputs. This paper aims to provide a comprehensive overview and summary of the relevant papers, most of them published during the latter half of 2023. It begins with AI-generated 3D object models, followed by generated 3D human models and then generated 3D human motions, and closes with a summary and a vision for the future.
