AGG: Amortized Generative 3D Gaussians for Single Image to 3D
Dejia Xu, Ye Yuan, Morteza Mardani, Sifei Liu, Jiaming Song, Zhangyang Wang, Arash Vahdat
TL;DR
This work tackles single-image-to-3D generation with AGG, an amortized framework that directly predicts 3D Gaussian representations without per-object optimization. A coarse hybrid generator first predicts Gaussian locations and texture with separate transformers; a Gaussian super-resolution module then densifies the result in latent space while integrating RGB cues. Training is stabilized by using a fixed number of Gaussians, canonical initialization, and a warmup phase with pseudo labels, which enables zero-shot object generation under rendering-based supervision. Experiments on OmniObject3D show competitive qualitative and quantitative performance with inference that is orders of magnitude faster than optimization-based 3D Gaussian methods and diffusion-based baselines, highlighting AGG’s practicality for real-time single-image-to-3D content creation.
Abstract
Given the growing need for automatic 3D content creation pipelines, various 3D representations have been studied to generate 3D objects from a single image. Due to their superior rendering efficiency, 3D Gaussian splatting-based models have recently excelled in both 3D reconstruction and generation. 3D Gaussian splatting approaches for image-to-3D generation are often optimization-based, requiring many computationally expensive score-distillation steps. To overcome these challenges, we introduce an Amortized Generative 3D Gaussian framework (AGG) that instantly produces 3D Gaussians from a single image, eliminating the need for per-instance optimization. Utilizing an intermediate hybrid representation, AGG decomposes the generation of 3D Gaussian locations and other appearance attributes so that they can be optimized jointly. Moreover, we propose a cascaded pipeline that first generates a coarse representation of the 3D data and later upsamples it with a 3D Gaussian super-resolution module. Our method is evaluated against existing optimization-based 3D Gaussian frameworks and sampling-based pipelines utilizing other 3D representations, where AGG showcases competitive generation abilities both qualitatively and quantitatively while being several orders of magnitude faster. Project page: https://ir1d.github.io/AGG/
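To make the data flow of the cascaded pipeline described above concrete, the following is a minimal toy sketch and not the authors' implementation. It only illustrates two structural ideas from the abstract: a coarse stage that emits a fixed number of Gaussians with separately predicted locations and colors, and a second stage that densifies them by a fixed factor. All names (`coarse_stage`, `super_resolve`, `generate`) and parameters (`num_coarse`, `factor`, `radius`) are hypothetical. Random projections stand in for the transformers, and simple position jitter stands in for the learned latent-space super-resolution module.

```python
import math
import random


def coarse_stage(image_feat, num_gaussians, rng):
    """Toy stand-in for a coarse generator: emits a FIXED number of Gaussians.

    Assumption: random projections of the image feature replace the learned
    location and texture branches, which are computed separately here.
    """
    gaussians = []
    for _ in range(num_gaussians):
        # Location branch: squash each coordinate into (-1, 1).
        xyz = tuple(
            math.tanh(sum(f * rng.gauss(0.0, 0.1) for f in image_feat))
            for _ in range(3)
        )
        # Appearance branch: squash each color channel into (0, 1).
        rgb = tuple(
            1.0 / (1.0 + math.exp(-sum(f * rng.gauss(0.0, 0.1) for f in image_feat)))
            for _ in range(3)
        )
        gaussians.append((xyz, rgb))
    return gaussians


def super_resolve(coarse, factor, rng, radius=0.05):
    """Toy densification: each coarse Gaussian spawns `factor` children.

    Assumption: children are jittered copies of the parent; a learned
    module would predict these offsets and refined attributes instead.
    """
    fine = []
    for xyz, rgb in coarse:
        for _ in range(factor):
            child_xyz = tuple(c + rng.uniform(-radius, radius) for c in xyz)
            fine.append((child_xyz, rgb))
    return fine


def generate(image_feat, num_coarse=64, factor=4, seed=0):
    """Single feed-forward pass: coarse prediction, then upsampling."""
    rng = random.Random(seed)
    coarse = coarse_stage(image_feat, num_coarse, rng)
    return super_resolve(coarse, factor, rng)
```

In this sketch, `generate([0.5] * 8, num_coarse=16, factor=4)` returns 64 `(xyz, rgb)` pairs in one pass with no per-object optimization loop, which is the property the amortized formulation is designed to provide.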
