EG-Gaussian: Epipolar Geometry and Graph Network Enhanced 3D Gaussian Splatting

Beizhen Zhao, Yifan Zhou, Zijian Wang, Hao Wang

TL;DR

This work proposes EG-Gaussian, a framework that uses epipolar geometry and graph networks for 3D scene reconstruction. It includes a graph learning module that refines 3DGS spatial features by incorporating both the spatial coordinates of neighboring points and the angular relationships among them.

Abstract

In this paper, we explore an open research problem concerning the reconstruction of 3D scenes from images. Recent methods have adopted 3D Gaussian Splatting (3DGS) to produce 3D scenes because of its efficient training process. However, these methods may generate incomplete 3D scenes or blurred multi-view renderings. This is because of (1) inaccurate 3DGS point initialization and (2) the tendency of 3DGS to flatten 3D Gaussians under sparse-view input. To address these issues, we propose a novel framework, EG-Gaussian, which utilizes epipolar geometry and graph networks for 3D scene reconstruction. First, we integrate epipolar geometry into the 3DGS initialization phase to improve the construction of the initial 3DGS points. Then, we design a graph learning module to refine 3DGS spatial features, in which we incorporate both spatial coordinates and angular relationships among neighboring points. Experiments on indoor and outdoor benchmark datasets demonstrate that our approach significantly improves reconstruction accuracy compared to 3DGS-based methods.
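
The abstract does not spell out how epipolar geometry enters the initialization, so the following is only a minimal sketch of the underlying constraint, not the authors' procedure. It assumes two pinhole cameras with known intrinsics and relative pose, builds the fundamental matrix, and measures how far a candidate match lies from its epipolar line. All function names (`skew`, `fundamental_matrix`, `epipolar_distance`, `project`) and the camera parameters are hypothetical.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v == np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def fundamental_matrix(K1, K2, R, t):
    """F for camera 1 at the origin and camera 2 with x_cam2 = R @ x_cam1 + t."""
    E = skew(t) @ R  # essential matrix
    return np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)

def epipolar_distance(F, x1, x2):
    """Pixel distance from x2 (image 2) to the epipolar line of x1 (image 1)."""
    line = F @ np.append(x1, 1.0)  # line coefficients (a, b, c) in image 2
    return abs(line @ np.append(x2, 1.0)) / np.hypot(line[0], line[1])

def project(K, R, t, X):
    """Pinhole projection of a 3D point X to pixel coordinates."""
    x = K @ (R @ X + t)
    return x[:2] / x[2]

# Assumed (illustrative) camera setup: shared intrinsics, small rotation about y.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
a = 0.1
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([0.2, 0.0, 0.0])

X = np.array([0.1, -0.2, 3.0])                 # a 3D point in camera-1 coordinates
x1 = project(K, np.eye(3), np.zeros(3), X)     # its pixel in view 1
x2 = project(K, R, t, X)                       # its pixel in view 2
F = fundamental_matrix(K, K, R, t)
d_true = epipolar_distance(F, x1, x2)                          # consistent match
d_off = epipolar_distance(F, x1, x2 + np.array([0.0, 5.0]))    # perturbed match
```

A true correspondence lies on its epipolar line, so `d_true` is numerically zero, while the perturbed match does not. A threshold on this distance is one plausible way to reject inconsistent matches before triangulating initial points.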

Paper Structure

This paper contains 25 sections, 14 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Overview of EG-Gaussian. We begin by initializing the 3D points using epipolar geometry to improve their accuracy. Next, we voxelize the 3D points, associating each voxel with multiple Gaussians. A graph-based module is then employed to model the spatial correlations between the nodes. Finally, a composite loss function is computed, incorporating pixel-based loss, NCC loss, and Laplacian pyramid loss, which are evaluated through the rendered RGB images and the original images.
  • Figure 2: Epipolar enhanced initialization. In scenarios with sparse viewpoint constraints, the localization of 3D points may exhibit ambiguity. To mitigate this issue, we leverage epipolar geometry to establish correspondences between different camera perspectives, thereby enabling the generation of more accurate and robust 3D points.
  • Figure 3: Graph aggregation and spatial encoding. After voxelization, we design a graph module to capture spatial structure information. Using the 3D coordinates, we construct the neighborhood of each node, embedding the angular relationships among neighboring nodes to extract spatial structural details. A self-attention mechanism is then employed to further aggregate and refine the spatial features.
  • Figure 4: Computational flow of our multi-head attention mechanism. The standard self-attention mechanism is extended by incorporating angular embedding information, which is fused into the attention computation to derive the final scores. More details can be found in the supplementary material.
  • Figure 5: Qualitative comparison of EG-Gaussian and Octree-GS. Visual differences are highlighted with yellow insets for better clarity. Our approach consistently outperforms Octree-GS, which achieved the second-best performance on the Mip-NeRF360 dataset, demonstrating clear advantages in challenging scenarios such as thin geometries and fine-scale details. Best viewed in color.
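
The captions for Figures 3 and 4 describe neighborhood construction from 3D coordinates, an angular embedding, and attention scores that fuse both. The sketch below illustrates that general idea under strong assumptions and is not the paper's module: it uses a single head instead of multiple heads, brute-force k-nearest neighbors, random projection weights, and a hypothetical angular term (the mean cosine between one edge and the other edges at the same node) added to the dot-product scores. The names `knn_indices`, `angular_bias`, `graph_attention`, and `angle_weight` are all made up for illustration.

```python
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest neighbors of each point (brute force, self excluded)."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    return np.argsort(d2, axis=1)[:, :k]

def angular_bias(points, nbrs):
    """Assumed angular term: mean cosine between each edge and the node's other edges."""
    edges = points[nbrs] - points[:, None, :]                  # (N, k, 3)
    edges = edges / (np.linalg.norm(edges, axis=-1, keepdims=True) + 1e-12)
    cos = edges @ edges.transpose(0, 2, 1)                     # (N, k, k) pairwise cosines
    k = nbrs.shape[1]
    return (cos.sum(-1) - 1.0) / max(k - 1, 1)                 # drop the self-cosine of 1

def graph_attention(points, feats, k=4, angle_weight=1.0, seed=0):
    """Single-head attention over kNN neighborhoods with an additive angular term."""
    rng = np.random.default_rng(seed)
    n, c = feats.shape
    # Random projections stand in for learned weights.
    Wq, Wk, Wv = (rng.standard_normal((c, c)) / np.sqrt(c) for _ in range(3))
    nbrs = knn_indices(points, k)
    q = feats @ Wq                                             # (N, C)
    keys = (feats @ Wk)[nbrs]                                  # (N, k, C)
    vals = (feats @ Wv)[nbrs]                                  # (N, k, C)
    scores = np.einsum('nc,nkc->nk', q, keys) / np.sqrt(c)
    scores = scores + angle_weight * angular_bias(points, nbrs)
    scores = scores - scores.max(axis=1, keepdims=True)        # stable softmax
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=1, keepdims=True)
    return np.einsum('nk,nkc->nc', attn, vals), attn

# Toy usage: 20 random nodes with 8-dimensional features.
rng = np.random.default_rng(1)
pts = rng.standard_normal((20, 3))
fts = rng.standard_normal((20, 8))
out, attn = graph_attention(pts, fts, k=4)
```

Adding the angular term before the softmax keeps each attention row a valid distribution over the node's neighbors. How the paper actually encodes angles and fuses them is described in its supplementary material and may differ substantially from this additive form.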