A Revisit of the Normalized Eight-Point Algorithm and A Self-Supervised Deep Solution
Bin Fan, Yuchao Dai, Yongduek Seo, Mingyi He
TL;DR
This work revisits the normalized eight-point algorithm and demonstrates that perfect conditioning $\,\kappa(\hat{A})=1$ is unattainable with linear normalizations, motivating a data-driven approach. It introduces a self-supervised CNN that outputs the normalization matrices $T$ and $T'$ (with a parametric affine form $T_L$ controlled by $\alpha_1$, $\alpha_2$, and $\theta$) to improve the conditioning of the coefficient matrix used in DLT for fundamental matrix estimation, while enforcing the singularity constraint. The network is permutation-invariant and trained via a self-supervised symmetry epipolar loss that backpropagates through the SVD, enabling training without ground-truth labels. Experimental results on KITTI, TUM, and Cambridge datasets show per-sample improvements over Hartley’s normalization, with good generalization and reasonable integration with RANSAC, indicating practical benefits for robust two-view geometry pipelines. The approach offers an interpretable, data-driven alternative to hand-crafted normalization that can enhance initialization and conditioning in multi-view geometry tasks.
Abstract
The normalized eight-point algorithm has been widely viewed as the cornerstone in two-view geometry computation, where the seminal Hartley's normalization has greatly improved the performance of the direct linear transformation algorithm. A natural question is, whether there exists and how to find other normalization methods that may further improve the performance as per each input sample. In this paper, we provide a novel perspective and propose two contributions to this fundamental problem: 1) we revisit the normalized eight-point algorithm and make a theoretical contribution by presenting the existence of different and better normalization algorithms; 2) we introduce a deep convolutional neural network with a self-supervised learning strategy for normalization. Given eight pairs of correspondences, our network directly predicts the normalization matrices, thus learning to normalize each input sample. Our learning-based normalization module can be integrated with both traditional (e.g., RANSAC) and deep learning frameworks (affording good interpretability) with minimal effort. Extensive experiments on both synthetic and real images demonstrate the effectiveness of our proposed approach.
