
EasySplat: View-Adaptive Learning makes 3D Gaussian Splatting Easy

Ao Gao, Luosong Guo, Tao Chen, Zhao Wang, Ying Tai, Jian Yang, Zhenyu Zhang

TL;DR

This paper tackles the initialization and densification bottlenecks in 3D Gaussian Splatting by introducing EasySplat, which replaces SfM-based initialization with view-adaptive pointmap priors and adds an adaptive KNN-based densification step to better populate Gaussian primitives. The approach first groups images into pairs according to image similarity, uses DUSt3R priors to estimate pairwise geometry, and aligns the pairs globally to obtain robust camera poses and point clouds; the 3D Gaussians are then initialized from this global structure. The KNN-based densification strategy subdivides large Gaussians based on the shapes of their local neighbors, enabling dense coverage in textureless or underrepresented regions and improving NVS quality and training efficiency. Extensive experiments on Tanks&Temples and CO3DV2 demonstrate state-of-the-art novel view synthesis and pose accuracy, with ablations confirming the effectiveness of both the view-adaptive initialization and the KNN-based densification. Overall, EasySplat provides a practical, scalable pathway to high-quality 3DGS in dense-view scenarios, with potential extension to unified sparse-dense settings.
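As a rough illustration of the pair-grouping idea, the sketch below links each image to its most similar views. It is not the authors' implementation: the summary does not say how image similarity is computed, so the per-image descriptor vectors, the cosine-similarity measure, and the names `build_view_pairs` and `top_k` are all assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_view_pairs(descriptors, top_k=2):
    """Pair every image with its top_k most similar images.

    descriptors: one global feature vector per image (hypothetical input;
    any image-retrieval embedding could play this role).
    Returns a sorted list of unordered index pairs (i, j) with i < j.
    """
    n = len(descriptors)
    pairs = set()
    for i in range(n):
        scored = [
            (cosine_similarity(descriptors[i], descriptors[j]), j)
            for j in range(n)
            if j != i
        ]
        # Highest similarity first; ties are broken by the lower image index.
        scored.sort(key=lambda t: (-t[0], t[1]))
        for _, j in scored[:top_k]:
            pairs.add((min(i, j), max(i, j)))
    return sorted(pairs)
```

In a pipeline of this kind, each selected pair would then be passed to a pairwise geometry estimator (DUSt3R in the paper), so limiting pairs to similar views keeps the number of pairwise inferences well below the full set of all image combinations.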

Abstract

3D Gaussian Splatting (3DGS) techniques have achieved satisfactory 3D scene representation. Despite their impressive performance, they face challenges due to the limitations of structure-from-motion (SfM) methods in acquiring accurate scene initialization, or the inefficiency of the densification strategy. In this paper, we introduce EasySplat, a novel framework for high-quality 3DGS modeling. Instead of using SfM for scene initialization, we employ a novel method that unleashes the power of large-scale pointmap approaches. Specifically, we propose an efficient grouping strategy based on view similarity and use robust pointmap priors to obtain high-quality point clouds and camera poses for 3D scene initialization. After obtaining a reliable scene structure, we propose a novel densification approach that adaptively splits Gaussian primitives based on the average shape of neighboring Gaussian ellipsoids, using a K-nearest-neighbor (KNN) scheme. In this way, the proposed method addresses the limitations of both initialization and optimization, leading to efficient and accurate 3DGS modeling. Extensive experiments demonstrate that EasySplat outperforms the current state-of-the-art (SOTA) in novel view synthesis.
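The KNN-based densification idea can be sketched as a selection rule: a Gaussian is flagged for splitting when it is much larger than the average of its nearest neighbors. The abstract does not give the exact criterion, so the size measure (largest scale axis), the threshold `ratio`, and the function names below are assumptions for illustration only; the sketch covers the selection step and leaves out how the child Gaussians are created.

```python
import math


def knn_indices(centers, i, k):
    """Indices of the k Gaussians whose centers are closest to center i."""
    dists = [
        (math.dist(centers[i], centers[j]), j)
        for j in range(len(centers))
        if j != i
    ]
    dists.sort()
    return [j for _, j in dists[:k]]


def select_gaussians_to_split(centers, scales, k=3, ratio=1.5):
    """Flag Gaussians that are large relative to their local neighborhood.

    centers: list of (x, y, z) Gaussian means.
    scales:  list of (sx, sy, sz) per-axis extents of each ellipsoid.
    ratio:   hypothetical threshold; a Gaussian is selected when its largest
             axis exceeds ratio times the mean largest axis of its k neighbors.
    """
    to_split = []
    for i in range(len(centers)):
        neighbors = knn_indices(centers, i, k)
        if not neighbors:
            continue
        avg_neighbor_size = sum(max(scales[j]) for j in neighbors) / len(neighbors)
        if max(scales[i]) > ratio * avg_neighbor_size:
            to_split.append(i)
    return to_split
```

A brute-force neighbor search like this is quadratic in the number of Gaussians; a practical system would likely use a spatial index or a GPU KNN routine, which this sketch deliberately leaves out.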

Paper Structure (10 sections, 7 equations, 5 figures, 4 tables)

This paper contains 10 sections, 7 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Comparison with existing methods. (a) Compared with other SOTA methods, our method achieves the best performance in rendering quality. (b) In contrast to the regular densification used in the vanilla 3DGS, our KNN-based densification effectively grows points in areas where the initial point cloud is insufficient, leading to more accurate and detailed results.
  • Figure 2: Overview of the proposed EasySplat. Given $N$ images, we first construct image pairs based on view similarity to estimate paired point clouds, followed by global alignment to estimate camera poses and point clouds. During training, we use a KNN-based adaptive splitting strategy to control the density of Gaussians while optimizing camera poses.
  • Figure 3: KNN-based Densification. After the KNN-based splitting, the large Gaussians are decomposed into smaller Gaussians, leading to significant improvements on smaller targets, such as the car depicted in the figure.
  • Figure 4: Qualitative comparison for novel view synthesis on Tanks&Temples. Our approach produces higher-quality and more detailed images than the baselines.
  • Figure 5: Qualitative comparison for novel view synthesis and camera pose estimation on CO3DV2.