VA-GS: Enhancing the Geometric Representation of Gaussian Splatting via View Alignment
Qing Li, Huifang Feng, Xun Gong, Yu-Shen Liu
TL;DR
This work addresses geometry reconstruction in 3D Gaussian Splatting by introducing VA-GS, a view-aligned framework that augments standard image-based supervision with edge cues, visibility-aware multi-view alignment, normal-based constraints, and cross-view deep feature consistency. The method jointly optimizes five losses (edge-aware image reconstruction, normal alignment, normal smoothing, photometric multi-view alignment, and feature alignment) in a single objective that enforces cross-view geometric consistency while reducing sensitivity to illumination changes. Experiments on DTU, Tanks and Temples (TNT), and Mip-NeRF 360 show state-of-the-art performance in both surface reconstruction and novel view synthesis. The approach yields more accurate meshes and more photorealistic renderings under challenging lighting and near object boundaries, which is relevant to 3D modeling and AR/VR applications.
Abstract
3D Gaussian Splatting has recently emerged as an efficient solution for high-quality and real-time novel view synthesis. However, its capability for accurate surface reconstruction remains underexplored. Due to the discrete and unstructured nature of Gaussians, supervision based solely on image rendering loss often leads to inaccurate geometry and inconsistent multi-view alignment. In this work, we propose a novel method that enhances the geometric representation of 3D Gaussians through view alignment (VA). Specifically, we incorporate edge-aware image cues into the rendering loss to improve surface boundary delineation. To enforce geometric consistency across views, we introduce a visibility-aware photometric alignment loss that models occlusions and encourages accurate spatial relationships among Gaussians. To further mitigate ambiguities caused by lighting variations, we incorporate normal-based constraints to refine the spatial orientation of Gaussians and improve local surface estimation. Additionally, we leverage deep image feature embeddings to enforce cross-view consistency, enhancing the robustness of the learned geometry under varying viewpoints and illumination. Extensive experiments on standard benchmarks demonstrate that our method achieves state-of-the-art performance in both surface reconstruction and novel view synthesis. The source code is available at https://github.com/LeoQLi/VA-GS.
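To make the idea of edge-aware supervision and a combined objective more concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not give the loss formulas, so the gradient-based pixel weighting in `edge_aware_l1`, the `edge_gain` parameter, and the weighted sum in `total_objective` are all illustrative assumptions chosen because they are common ways to realize such terms. The actual definitions are in the paper and the linked repository.

```python
from typing import Dict, List

Image = List[List[float]]  # grayscale image as rows of floats in [0, 1]


def gradient_magnitude(img: Image) -> Image:
    """Forward-difference gradient magnitude (|dx| + |dy|); zero past the border."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = img[y][x + 1] - img[y][x] if x + 1 < w else 0.0
            dy = img[y + 1][x] - img[y][x] if y + 1 < h else 0.0
            out[y][x] = abs(dx) + abs(dy)
    return out


def edge_aware_l1(rendered: Image, target: Image, edge_gain: float = 1.0) -> float:
    """Weighted mean L1 error; pixels on strong target edges count more.

    Assumption: the weight 1 + edge_gain * |grad(target)| is a hypothetical
    stand-in for the paper's edge-aware term, not its actual formula.
    """
    edges = gradient_magnitude(target)
    total, weight_sum = 0.0, 0.0
    for y in range(len(target)):
        for x in range(len(target[0])):
            w = 1.0 + edge_gain * edges[y][x]
            total += w * abs(rendered[y][x] - target[y][x])
            weight_sum += w
    return total / weight_sum


def total_objective(terms: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of named loss terms (weights here are hypothetical)."""
    return sum(weights[name] * value for name, value in terms.items())
```

In this sketch, the same rendering error costs more when it falls on an image edge than in a flat region, which is one simple way to push the optimization toward sharper surface boundaries. The five terms named in the TL;DR would each be computed separately and passed to `total_objective` with weights chosen by the user.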
