Generative Densification: Learning to Densify Gaussians for High-Fidelity Generalizable 3D Reconstruction
Seungtae Nam, Xiangyu Sun, Gyeongjin Kang, Younggeun Lee, Seungjun Oh, Eunbyung Park
TL;DR
Generative Densification (GD) addresses the difficulty that generalized feed-forward Gaussian models have in representing high-frequency details in 3D reconstruction. It densifies coarse Gaussians by up-sampling learned feature representations in a single forward pass. GD selects the most informative Gaussians using view-space gradients, propagates the refinement across $L$ layers with up-sampling, learnable masking, and a Gaussian head, and augments local features with global adaptive normalization via serialized attention. Integrated into LaRa (object-level) and MVSplat (scene-level), GD achieves state-of-the-art or competitive results on Gobjaverse and RE10K with comparable or smaller model sizes, and demonstrates robust cross-dataset generalization. Ablation studies support the value of gradient-guided selection and learnable masking for efficiency, and qualitative analyses show that the generated fine Gaussians capture finer geometric details.
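As a rough illustration of the gradient-guided selection step, the sketch below ranks Gaussians by the norm of their view-space gradients and keeps the K largest. It assumes that "most informative" means "largest gradient norm"; the function name `select_top_k`, the `(gx, gy)` input layout, and the value of `k` are hypothetical and are not taken from the authors' code.

```python
import math

def select_top_k(view_space_grads, k):
    """Return indices of the k Gaussians with the largest gradient norm.

    view_space_grads: list of (gx, gy) tuples, one per Gaussian
    (an assumed layout, chosen only for this illustration).
    """
    norms = [math.hypot(gx, gy) for gx, gy in view_space_grads]
    # Sort indices by descending norm; sorted() is stable, so ties keep
    # the lower index first.
    order = sorted(range(len(norms)), key=lambda i: -norms[i])
    return order[:k]

# Four toy Gaussians; the second and fourth have the largest gradients.
grads = [(0.1, 0.0), (0.5, 0.5), (0.0, 0.02), (0.3, 0.4)]
selected = select_top_k(grads, k=2)
print(selected)  # [1, 3]
```

Only the selected Gaussians would be passed on for densification, which keeps the number of up-sampled features bounded regardless of how many coarse Gaussians the backbone produces.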
Abstract
Generalized feed-forward Gaussian models have achieved significant progress in sparse-view 3D reconstruction by leveraging prior knowledge from large multi-view datasets. However, these models often struggle to represent high-frequency details due to the limited number of Gaussians. While the densification strategy used in per-scene 3D Gaussian splatting (3D-GS) optimization can be adapted to the feed-forward models, it may not be ideally suited for generalized scenarios. In this paper, we propose Generative Densification, an efficient and generalizable method to densify Gaussians generated by feed-forward models. Unlike the 3D-GS densification strategy, which iteratively splits and clones raw Gaussian parameters, our method up-samples feature representations from the feed-forward models and generates their corresponding fine Gaussians in a single forward pass, leveraging the embedded prior knowledge for enhanced generalization. Experimental results on both object-level and scene-level reconstruction tasks demonstrate that our method outperforms state-of-the-art approaches with comparable or smaller model sizes, achieving notable improvements in representing fine details.
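The contrast with iterative split-and-clone densification can be sketched as a single pass over features. The toy code below is only a structural illustration under stated assumptions: `upsample`, `gaussian_head`, `needs_refinement`, and `densify` are hypothetical stand-ins for learned modules, the feature layout is invented, and a fixed threshold replaces the learned mask. It assumes that features flagged by the mask are refined in the next layer while the rest are decoded into fine Gaussians immediately.

```python
def upsample(feature, factor):
    # Toy up-sampler: each child is the parent feature plus a small offset.
    return [[v + 0.01 * c for v in feature] for c in range(factor)]

def gaussian_head(feature):
    # Toy decoder from a feature vector to Gaussian parameters.
    return {"position": feature[:3], "opacity": min(1.0, abs(feature[3]))}

def needs_refinement(feature):
    # Fixed threshold standing in for a learned mask (assumption).
    return feature[0] > 0.005

def densify(features, num_layers, factor):
    """One forward pass: up-sample, mask, and decode across num_layers."""
    fine_gaussians = []
    current = features
    for layer in range(num_layers):
        children = [c for f in current for c in upsample(f, factor)]
        if layer == num_layers - 1:
            # Last layer: decode everything that is left.
            fine_gaussians += [gaussian_head(c) for c in children]
            break
        # Decode children that the mask does not flag; refine the others.
        fine_gaussians += [gaussian_head(c) for c in children
                           if not needs_refinement(c)]
        current = [c for c in children if needs_refinement(c)]
    return fine_gaussians

coarse_features = [[0.0, 0.0, 0.0, 0.5]]
fine = densify(coarse_features, num_layers=2, factor=2)
print(len(fine))  # 3
```

In this toy run, one coarse feature yields two children in the first layer; one is decoded directly and the other is up-sampled again, giving three fine Gaussians without any per-scene optimization loop.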
