Towards Understanding 3D Vision: the Role of Gaussian Curvature
Sherlon Almeida da Silva, Davi Geiger, Luiz Velho, Moacir Antonelli Ponti
TL;DR
This work investigates Gaussian curvature (GC), $K=\kappa_1\kappa_2$, as an intrinsic, observer-invariant descriptor of 3D surface geometry and examines its role in depth reconstruction. It introduces a sparse-prior formulation $P(K)=e^{-\mathcal{L}(K)}$, with $\mathcal{L}(K)=-\ln h(K)$ derived from GC histograms $h(K)$, and a practical surrogate $\mathcal{L}(K)=\alpha\sqrt{|\kappa_1\kappa_2|}$. GC sparsity is then linked to a new Low Gaussian Curvature (LGC) metric that gauges reconstruction quality without supervision. Empirical analyses on Middlebury and on controlled synthetic 3D scenes show that GC is sparsely distributed on real-world surfaces and that modern state-of-the-art (SOTA) stereo methods tend to minimize GC (improving LGC) while preserving 3D structure; however, which modules of these methods implement this prior remains implicit. The findings suggest that GC-based priors and the LGC metric can improve the interpretability, regularization, and evaluation of stereo and monocular depth methods, guiding future 3D vision systems toward geometry-aware design.
Abstract
Recent advances in computer vision have predominantly relied on data-driven approaches that leverage deep learning and large-scale datasets. Deep neural networks have achieved remarkable success in tasks such as stereo matching and monocular depth reconstruction. However, these methods lack explicit models of 3D geometry that can be directly analyzed, transferred across modalities, or systematically modified for controlled experimentation. We investigate the role of Gaussian curvature in 3D surface modeling. Gaussian curvature is invariant under changes of observer or coordinate system, and we demonstrate on the Middlebury stereo dataset that it also offers a sparse and compact description of 3D surfaces. Furthermore, we show a strong correlation between the performance ranking of top state-of-the-art stereo and monocular methods and low total absolute Gaussian curvature. We propose that this property can serve as a geometric prior to improve future 3D reconstruction algorithms.
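
To make the two loss forms in the TL;DR concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the surface is given as a height map $z(x,y)$ on a regular grid (a graph surface, ignoring camera intrinsics), for which $K=(z_{xx}z_{yy}-z_{xy}^2)/(1+z_x^2+z_y^2)^2$, and it estimates derivatives by finite differences. The function names, the `spacing` and `bins` parameters, and the use of a per-image histogram for $h(K)$ are all illustrative assumptions.

```python
import numpy as np


def gaussian_curvature(z, spacing=1.0):
    """Gaussian curvature K of the graph surface (x, y, z(x, y)).

    Assumption: z is a 2D height map on a regular grid with the given spacing.
    Derivatives are finite-difference estimates, so values near the image
    border are less accurate than in the interior.
    """
    zy, zx = np.gradient(z, spacing)      # axis 0 is y (rows), axis 1 is x (columns)
    zxy, zxx = np.gradient(zx, spacing)   # d(zx)/dy, d(zx)/dx
    zyy, _ = np.gradient(zy, spacing)     # d(zy)/dy
    return (zxx * zyy - zxy ** 2) / (1.0 + zx ** 2 + zy ** 2) ** 2


def surrogate_loss(K, alpha=1.0):
    """Surrogate L(K) = alpha * sqrt(|kappa_1 * kappa_2|) = alpha * sqrt(|K|)."""
    return alpha * np.sqrt(np.abs(K))


def histogram_loss(K, bins=64, eps=1e-12):
    """Per-pixel L(K) = -ln h(K), with h a normalized histogram of K.

    Assumption: h is estimated from the same curvature map it is applied to;
    eps guards against log(0) and is not part of the formulation.
    """
    counts, edges = np.histogram(K, bins=bins)
    h = counts / counts.sum()
    idx = np.digitize(K, edges[1:-1])     # bin index in 0 .. bins-1 for every pixel
    return -np.log(h[idx] + eps)
```

Averaging `surrogate_loss(K)` over all pixels gives one possible scalar summary of how much curvature a reconstruction contains; whether that matches the exact definition of the LGC metric is not stated in this section.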
