Self-Calibrating Gaussian Splatting for Large Field of View Reconstruction
Youming Deng, Wenqi Xian, Guandao Yang, Leonidas Guibas, Gordon Wetzstein, Steve Marschner, Paul Debevec
TL;DR
This paper tackles the challenge of reconstructing wide-field scenes from uncalibrated, highly distorted fisheye imagery. It introduces Self-Calibrating Gaussian Splatting, a differentiable, end-to-end pipeline that jointly optimizes camera intrinsics, extrinsics, lens distortion, and 3D Gaussian scene representations. A hybrid distortion field (combining invertible residual networks with an explicit control grid), together with cubemap-based resampling, enables accurate, artifact-free reconstruction across the full field of view (FOV), without pre-calibration or restrictive projection models. The approach achieves state-of-the-art performance on synthetic and real-world data, reconstructs scenes robustly from few input views, and applies to a wide range of wide-angle lenses, making wide-FOV novel view synthesis more practical for applications in robotics, virtual reality, and autonomous systems.
Abstract
In this paper, we present a self-calibrating framework that jointly optimizes camera parameters, lens distortion, and 3D Gaussian representations, enabling accurate and efficient scene reconstruction. In particular, our technique enables high-quality scene reconstruction from large field-of-view (FOV) imagery taken with wide-angle lenses, allowing the scene to be modeled from a smaller number of images. Our approach introduces a novel method for modeling complex lens distortions using a hybrid network that combines invertible residual networks with explicit grids. This design effectively regularizes the optimization process, achieving greater accuracy than conventional camera models. Additionally, we propose a cubemap-based resampling strategy to support large FOV images without sacrificing resolution or introducing distortion artifacts. Our method is compatible with the fast rasterization of Gaussian Splatting, adaptable to a wide variety of camera lens distortions, and demonstrates state-of-the-art performance on both synthetic and real-world datasets.
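To make the "invertible residual network" ingredient concrete, the sketch below shows one generic invertible residual block acting on 2D image-plane coordinates. It is not the authors' architecture and leaves out the explicit control grid; the names `make_residual`, `distort`, and `undistort`, the hidden width, and the Lipschitz bound of 0.5 are all assumptions chosen for illustration. It relies on the standard property that a map x -> x + g(x) is invertible when g has a Lipschitz constant below 1, with the inverse recovered by fixed-point iteration.

```python
import numpy as np


def make_residual(rng, hidden=16, lipschitz=0.5):
    """Build a small residual function g: R^2 -> R^2 with Lipschitz constant <= lipschitz.

    Hypothetical helper for illustration only. tanh is 1-Lipschitz, so scaling the
    two weight matrices by their spectral norms bounds the Lipschitz constant of g.
    """
    w1 = rng.standard_normal((2, hidden))
    w2 = rng.standard_normal((hidden, 2))
    w1 /= np.linalg.norm(w1, 2)              # spectral norm of w1 becomes 1
    w2 *= lipschitz / np.linalg.norm(w2, 2)  # spectral norm of w2 becomes `lipschitz`

    def g(x):
        return np.tanh(x @ w1) @ w2

    return g


def distort(g, x):
    """Forward map of the residual block: y = x + g(x)."""
    return x + g(x)


def undistort(g, y, iters=50):
    """Invert y = x + g(x) by iterating x <- y - g(x).

    The iteration is a contraction because g has Lipschitz constant < 1,
    so it converges to the unique preimage of y.
    """
    x = y.copy()
    for _ in range(iters):
        x = y - g(x)
    return x


rng = np.random.default_rng(0)
g = make_residual(rng)
x = rng.uniform(-1.0, 1.0, size=(100, 2))  # toy normalized image-plane coordinates
y = distort(g, x)
x_rec = undistort(g, y)
print("max round-trip error:", np.abs(x_rec - x).max())
```

In a self-calibrating setting the weights of such a block would be learned jointly with the scene rather than drawn at random; the random weights here only demonstrate that the forward and inverse mappings stay consistent.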
