Generative Densification: Learning to Densify Gaussians for High-Fidelity Generalizable 3D Reconstruction

Seungtae Nam, Xiangyu Sun, Gyeongjin Kang, Younggeun Lee, Seungjun Oh, Eunbyung Park

TL;DR

Generative Densification (GD) helps generalized feed-forward Gaussian models represent high-frequency details in 3D reconstruction. Instead of working on raw Gaussian parameters, it up-samples learned feature representations and generates fine Gaussians from coarse ones in a single forward pass. GD selects the most informative Gaussians using view-space gradients, refines them across $L$ layers using up-sampling, learnable masking, and a Gaussian head, and augments local features with global adaptive normalization via serialized attention. Integrated into LaRa (object-level) and MVSplat (scene-level), GD achieves state-of-the-art or competitive results on Gobjaverse and RE10K with substantially fewer parameters, and demonstrates robust cross-dataset generalization. Ablation studies confirm the value of gradient-guided selection and learnable masking for efficiency, and qualitative analyses show that the generated fine Gaussians capture finer geometric details.

Abstract

Generalized feed-forward Gaussian models have achieved significant progress in sparse-view 3D reconstruction by leveraging prior knowledge from large multi-view datasets. However, these models often struggle to represent high-frequency details due to the limited number of Gaussians. While the densification strategy used in per-scene 3D Gaussian splatting (3D-GS) optimization can be adapted to the feed-forward models, it may not be ideally suited for generalized scenarios. In this paper, we propose Generative Densification, an efficient and generalizable method to densify Gaussians generated by feed-forward models. Unlike the 3D-GS densification strategy, which iteratively splits and clones raw Gaussian parameters, our method up-samples feature representations from the feed-forward models and generates their corresponding fine Gaussians in a single forward pass, leveraging the embedded prior knowledge for enhanced generalization. Experimental results on both object-level and scene-level reconstruction tasks demonstrate that our method outperforms state-of-the-art approaches with comparable or smaller model sizes, achieving notable improvements in representing fine details.

Paper Structure

This paper contains 37 sections, 10 equations, 11 figures, 7 tables.

Figures (11)

  • Figure 1: Our method selectively densifies (a) coarse Gaussians from generalized feed-forward models. (c) The top $K$ Gaussians with large view-space positional gradients are selected, and (d-e) their fine Gaussians are generated in each densification layer. (g) The final Gaussians are obtained by combining (b) the remaining (non-selected) Gaussians with (f) the union of each layer's output Gaussians.
  • Figure 2: Generative Densification overview. We selectively densify the top $K$ Gaussians with large view-space positional gradients.
  • Figure 3: Key components in Generative Densification Module.
  • Figure 4: Overview of the Generative Densification pipelines for object-level (top) and scene-level (bottom) reconstruction tasks.
  • Figure 5: Qualitative comparisons of our object-level model trained for 50 epochs against the original LaRa. The zoomed-in parts within the red boxes are shown on the right side of the second and third columns, focusing on the comparison of fine detail reconstruction. The two images in the rightmost column present the Gaussians input to and output from our generative densification module, respectively.
  • ...and 6 more figures
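
The data flow in the Figure 1 caption (select the top $K$ Gaussians, generate fine Gaussians layer by layer, then merge with the non-selected ones) can be illustrated with a minimal, non-authoritative sketch. Everything here is a toy stand-in: `Gaussian` holds only a 1-D position, `toy_layer` replaces the learned up-sampling, masking, and Gaussian head with a fixed split, and the names `select_top_k`, `toy_layer`, and `densify_sketch` are hypothetical, not taken from the paper or its code. Keeping whatever the last layer still carries forward is also an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Gaussian:
    """Toy Gaussian with a 1-D position only (real ones also carry scale, rotation, opacity, color)."""
    position: float


Layer = Callable[[List[Gaussian]], Tuple[List[Gaussian], List[Gaussian]]]


def select_top_k(
    gaussians: Sequence[Gaussian], grad_norms: Sequence[float], k: int
) -> Tuple[List[Gaussian], List[Gaussian]]:
    """Split Gaussians into (selected, remaining) by largest gradient norm."""
    if len(gaussians) != len(grad_norms):
        raise ValueError("one gradient norm per Gaussian is required")
    k = max(0, min(k, len(gaussians)))
    order = sorted(range(len(gaussians)), key=lambda i: grad_norms[i], reverse=True)
    chosen = set(order[:k])
    selected = [g for i, g in enumerate(gaussians) if i in chosen]
    remaining = [g for i, g in enumerate(gaussians) if i not in chosen]
    return selected, remaining


def toy_layer(inputs: List[Gaussian], offset: float = 0.25) -> Tuple[List[Gaussian], List[Gaussian]]:
    """Hypothetical stand-in for one densification layer.

    Each input spawns two children: one is emitted as a layer output,
    the other is carried to the next layer. A learned model would decide this.
    """
    emitted = [Gaussian(g.position - offset) for g in inputs]
    carried = [Gaussian(g.position + offset) for g in inputs]
    return emitted, carried


def densify_sketch(
    gaussians: Sequence[Gaussian],
    grad_norms: Sequence[float],
    k: int,
    layers: Sequence[Layer],
) -> List[Gaussian]:
    """Combine the non-selected Gaussians with the union of every layer's outputs."""
    selected, remaining = select_top_k(gaussians, grad_norms, k)
    layer_outputs: List[Gaussian] = []
    current = selected
    for layer in layers:
        emitted, current = layer(current)
        layer_outputs.extend(emitted)
    # Assumption of this sketch: whatever is still carried after the last layer is kept too.
    layer_outputs.extend(current)
    return remaining + layer_outputs


coarse = [Gaussian(0.0), Gaussian(1.0), Gaussian(2.0), Gaussian(3.0)]
grads = [0.1, 0.9, 0.2, 0.8]  # made-up view-space gradient norms
final = densify_sketch(coarse, grads, k=2, layers=[toy_layer, toy_layer])
print(len(final))  # 8: 2 non-selected + 2 per layer + 2 carried at the end
```

Only the selected Gaussians grow in number; the non-selected ones pass through untouched, which is the selective part of the scheme that the caption describes.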