Back to the Future Cyclopean Stereo: a human perception approach combining deep and geometric constraints
Sherlon Almeida da Silva, Davi Geiger, Luiz Velho, Moacir Antonelli Ponti
TL;DR
Back to the Future Cyclopean Stereo (B2FS) tackles the need for interpretable stereo by coupling a cyclopean geometry, expressed in XD space, with deep visual features. It introduces two geometric constraints (GC1 and GC2) and a DP-based fusion with monocular priors to fill occluded and textureless regions; the result is further refined by a Fully Convolutional Regression Network guided by a Hybrid Attention Transformer. The method achieves competitive depth accuracy and superior structural detail on Middlebury at 256×256, particularly at depth discontinuities and in low-resolution scenarios. By blending explicit 3D geometry with learning-based cues, B2FS demonstrates a path toward more robust, explainable stereo systems with potential impact on virtual reality, robotics, and autonomous navigation.
Abstract
We innovate in stereo vision by explicitly providing analytical 3D surface models, as viewed by a cyclopean eye model, that incorporate depth discontinuities and occlusions. This geometrical foundation, combined with learned stereo features, allows our system to benefit from the strengths of both approaches. We also invoke a prior monocular model of surfaces to fill in occluded or textureless regions where data matching is not sufficient. Our results are already on par with state-of-the-art purely data-driven methods and are of much better visual quality, which emphasizes the importance of the 3D geometrical model in capturing critical visual information. Such qualitative improvements may find application in virtual reality, for a better human experience, and in robotics, for reducing critical errors. Our approach aims to demonstrate that understanding and modeling the geometrical properties of 3D surfaces is beneficial to computer vision research.
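The abstract does not spell out the cyclopean coordinates it relies on. As an illustration only, the sketch below uses one common convention for rectified stereo pairs: a surface point seen at column l in the left image and column r in the right image is assigned the cyclopean position x = (l + r) / 2 and the disparity d = l - r. The function names and the sign and scale of d are assumptions made for this example and may differ from the paper's own XD-space definition.

```python
from typing import Tuple


def to_cyclopean(l: float, r: float) -> Tuple[float, float]:
    """Map matched left/right column positions to cyclopean (x, d).

    Assumed convention (not taken from the paper):
      x = midpoint of the two image positions,
      d = l - r, so closer surfaces have larger disparity.
    """
    x = (l + r) / 2.0
    d = l - r
    return x, d


def from_cyclopean(x: float, d: float) -> Tuple[float, float]:
    """Invert to_cyclopean: recover the left and right column positions."""
    l = x + d / 2.0
    r = x - d / 2.0
    return l, r


if __name__ == "__main__":
    x, d = to_cyclopean(120.0, 112.0)
    print(x, d)                  # 116.0 8.0
    print(from_cyclopean(x, d))  # (120.0, 112.0)
```

Under this convention the two views are treated symmetrically, so an occlusion in either eye shows up as a constraint on how d can change along x rather than as a special case of one reference image.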
