G3Splat: Geometrically Consistent Generalizable Gaussian Splatting
Mehdi Hosseinzadeh, Shin-Fang Chng, Yi Xu, Simon Lucey, Ian Reid, Ravi Garg
TL;DR
G3Splat tackles geometric inconsistencies in generalizable Gaussian splatting under self-supervision by introducing explicit geometric priors. It aligns each Gaussian's orientation with the local surface normal and its position with the corresponding pixel ray, and it integrates with both DUSt3R and VGGT backbones to produce pixel-aligned, geometrically coherent splats. Trained on RealEstate10K and tested zero-shot on ScanNet and ACID, it reports state-of-the-art geometry, pose estimation, and novel-view synthesis, with depth maps and meshes reconstructed from the rendered expected depth D_exp. The work includes comprehensive ablations and provides code and pretrained models to support replication and further research on geometrically consistent 3D scene recovery from unposed views.
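The summary mentions rendering an expected depth D_exp without defining it. As an illustration only, the sketch below uses the common alpha-compositing definition for a single ray: each Gaussian contributes its depth weighted by its opacity times the transmittance left by the Gaussians in front of it. This definition, the optional normalization by accumulated opacity, and the name `expected_depth` are assumptions made here, not details taken from the paper or its code.

```python
from typing import Sequence


def expected_depth(
    depths: Sequence[float],
    alphas: Sequence[float],
    normalize: bool = True,
) -> float:
    """Alpha-composite per-Gaussian depths along one ray (hypothetical helper).

    depths: depth of each Gaussian hit by the ray.
    alphas: effective opacity of each Gaussian at this pixel, in [0, 1].
    normalize: divide by the accumulated opacity (an assumed convention).
    """
    if len(depths) != len(alphas):
        raise ValueError("depths and alphas must have the same length")

    transmittance = 1.0   # fraction of light not yet absorbed
    weighted_depth = 0.0  # sum of w_i * d_i
    weight_sum = 0.0      # sum of w_i (accumulated opacity)

    # Composite front to back, so sort by depth (nearest first).
    for depth, alpha in sorted(zip(depths, alphas)):
        weight = transmittance * alpha
        weighted_depth += weight * depth
        weight_sum += weight
        transmittance *= 1.0 - alpha

    if not normalize:
        return weighted_depth
    # A ray that hits nothing has no defined depth; return 0.0 by convention.
    return weighted_depth / weight_sum if weight_sum > 0.0 else 0.0
```

With two half-transparent Gaussians at depths 1 and 2, the weights are 0.5 and 0.25, so the normalized result lies between the two depths and closer to the nearer one. A fully opaque Gaussian in front hides everything behind it, and the result equals its depth.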
Abstract
3D Gaussians have recently emerged as an effective scene representation for real-time splatting and accurate novel-view synthesis, motivating several works to adapt multi-view structure prediction networks to regress per-pixel 3D Gaussians from images. However, most prior work extends these networks to predict additional Gaussian parameters -- orientation, scale, opacity, and appearance -- while relying almost exclusively on view-synthesis supervision. We show that a view-synthesis loss alone is insufficient to recover geometrically meaningful splats in this setting. We analyze and address the ambiguities of learning 3D Gaussian splats under self-supervision for pose-free generalizable splatting, and introduce G3Splat, which enforces geometric priors to obtain geometrically consistent 3D scene representations. Trained on RE10K, our approach achieves state-of-the-art performance in (i) geometrically consistent reconstruction, (ii) relative pose estimation, and (iii) novel-view synthesis. We further demonstrate strong zero-shot generalization on ScanNet, substantially outperforming prior work in both geometry recovery and relative pose estimation. Code and pretrained models are released on our project page (https://m80hz.github.io/g3splat/).
