CrossView-GS: Cross-view Gaussian Splatting For Large-scale Scene Reconstruction
Chenhao Zhang, Yuanping Cao, Lei Zhang
TL;DR
CrossView-GS addresses large-scale scene reconstruction from cross-view data with 3D Gaussian Splatting. It builds branch models from different view sets to serve as priors, initializes a cross-view model from distant-view data, and applies gradient-aware regularization guided by pseudo-labels from the branches. It then fuses complementary information through a unique Gaussian supplementation step and fine-tunes the resulting model. Results on aerial-ground, pure aerial, and pure ground cross-view datasets show consistent improvements over state-of-the-art methods in novel view synthesis, with notable gains on aerial views and significant efficiency advantages. The approach offers a practical, scalable path to high-fidelity cross-view reconstruction, with applications in VR, smart cities, and GIS; handling dynamic objects remains a limitation.
Abstract
3D Gaussian Splatting (3DGS) leverages densely distributed Gaussian primitives for high-quality scene representation and reconstruction. While existing 3DGS methods perform well in scenes with minor view variation, the large view changes in cross-view data make optimization difficult for these methods. To address this issue, we propose a novel cross-view Gaussian Splatting method for large-scale scene reconstruction based on multi-branch construction and fusion. Our method reconstructs a separate model from each set of views, and these independent branches establish baseline Gaussian distributions that provide reliable priors for cross-view reconstruction during initialization and densification. Specifically, a gradient-aware regularization strategy is introduced to mitigate smoothing issues caused by significant view disparities. Additionally, a unique Gaussian supplementation strategy is utilized to incorporate complementary information from the multiple branches into the cross-view model. Extensive experiments on benchmark datasets demonstrate that our method achieves superior performance in novel view synthesis compared to state-of-the-art methods.
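To make the supplementation idea concrete, the following is a minimal illustrative sketch and not the procedure used in this paper. It assumes that a branch Gaussian counts as "unique" when no Gaussian center of the cross-view model lies within a fixed radius of it. The function name `supplement_unique_gaussians`, the `radius` parameter, and the distance-based criterion are all hypothetical choices made for illustration; only Gaussian centers are handled, and the other per-Gaussian attributes (covariance, opacity, color) would have to be carried along in the same way.

```python
import numpy as np


def supplement_unique_gaussians(cross_xyz, branch_xyz, radius):
    """Append branch centers that the cross-view model does not already cover.

    ASSUMPTION: "unique" means farther than `radius` from every cross-view
    center. This criterion is a stand-in, not taken from the paper.

    cross_xyz:  (C, 3) array of cross-view Gaussian centers, C >= 1
    branch_xyz: (B, 3) array of branch Gaussian centers
    Returns (merged_xyz, unique_mask), where unique_mask has shape (B,).
    """
    cross_xyz = np.asarray(cross_xyz, dtype=float)
    branch_xyz = np.asarray(branch_xyz, dtype=float)

    # Pairwise squared distances between branch and cross-view centers: (B, C).
    # Brute force for clarity; a spatial index would be needed at scene scale.
    diff = branch_xyz[:, None, :] - cross_xyz[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)

    # A branch center is kept only if its nearest cross-view center is too far.
    unique_mask = sq_dist.min(axis=1) > radius ** 2

    merged_xyz = np.concatenate([cross_xyz, branch_xyz[unique_mask]], axis=0)
    return merged_xyz, unique_mask


if __name__ == "__main__":
    cross = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
    branch = np.array([[0.01, 0.0, 0.0], [5.0, 0.0, 0.0]])
    merged, mask = supplement_unique_gaussians(cross, branch, radius=0.1)
    print(mask)          # [False  True]
    print(merged.shape)  # (3, 3)
```

In this toy case the first branch center nearly coincides with an existing cross-view center and is skipped, while the second lies in an uncovered region and is appended. The brute-force distance matrix grows as B × C, so a real large-scale scene would call for a k-d tree or voxel hashing instead.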
