NeSS-ST: Detecting Good and Stable Keypoints with a Neural Stability Score and the Shi-Tomasi Detector

Konstantin Pakulev, Alexander Vakhitov, Gonzalo Ferrer

TL;DR

This work combines the hand-crafted Shi-Tomasi detector, a purpose-built keypoint quality metric called the stability score (SS), and a neural network. It builds on the principled and localized keypoints provided by the Shi-Tomasi detector and trains the neural network to select good feature points via the stability score.

Abstract

Learning a feature point detector presents a challenge both due to the ambiguity of the definition of a keypoint and, correspondingly, the need for specially prepared ground truth labels for such points. In our work, we address both of these issues by utilizing a combination of a hand-crafted Shi-Tomasi detector, a specially designed metric that assesses the quality of keypoints, the stability score (SS), and a neural network. We build on the principled and localized keypoints provided by the Shi-Tomasi detector and train the neural network to select good feature points via the stability score. The neural network incorporates the knowledge from the training targets in the form of the neural stability score (NeSS). Therefore, our method is named NeSS-ST since it combines the Shi-Tomasi detector and the properties of the neural stability score. It only requires sets of images for training, without dataset pre-labeling or the need for reconstructed correspondence labels. We evaluate NeSS-ST on HPatches, ScanNet, MegaDepth and IMC-PT, demonstrating state-of-the-art performance and good generalization on downstream tasks.

Paper Structure (35 sections, 8 equations, 18 figures, 13 tables)

Figures (18)

  • Figure 1: The method applies the Shi-Tomasi detector to an input image to get the Shi-Tomasi score $\mathbf{S}$. Next, a binary mask of extrema $\tilde{\mathbf{S}}$ is obtained via non-maximum suppression of $\mathbf{S}$. Simultaneously, the method uses the neural network to regress the neural stability score $\hat{\mathbf{\Lambda}}$. A set of best feature points $\{\mathbf{k}_i\}^n_{i=1}$ is selected from the combined score map $\hat{\mathbf{\Lambda}}^{\mathit{final}}$ that is a multiplicative combination of $\tilde{\mathbf{S}}$ with the negative exponential of $\hat{\mathbf{\Lambda}}$. Obtained keypoints are localized using $\mathbf{S}$ and are provided with corresponding $\{\mathbf{s}_i\}^n_{i=1}$ and $\{\hat{\mathbf{\lambda}}_i\}^n_{i=1}$.
  • Figure 2: Evaluation on HPatches [balntas2017hpatches] with 2048 keypoints and full-resolution images. We report homography estimation accuracy [detone2018superpoint, wang2020learning] in %.
  • Figure 3: Evaluation on HPatches [balntas2017hpatches] with 2048 keypoints and full-resolution images. We report MMA [mikolajczyk2005performance, dusmanu2019d2].
  • Figure 4: For each selected point $\mathbf{k}_i$ we calculate the ground truth stability score $\mathbf{\lambda}_i$. Firstly, we generate a set of deformed patches $\{\mathbf{P}_j\}^m_{j=1}$ and run the Shi-Tomasi detector on patches to obtain a set of score patches $\{\mathbf{S}^{\mathit{patch}}_j\}^{m}_{j=1}$. For each $\mathbf{S}^{\mathit{patch}}_j$ we extract the location of its maximum score $\hat{\mathbf{l}}_j$ getting a set $\{\hat{\mathbf{l}}_j\}^m_{j=1}$. By transforming the elements of the set with $\{\mathcal{H}^{-1}_j\}^{m}_{j=1}$, we estimate $\mathbf{\Sigma}_i$ and calculate $\mathbf{\lambda}_i$.
  • Figure 5: NeSS-ST models trained with different values of $t_\mathit{Shi}$ (0.0, 0.003, 0.005, 0.006 and 0.007) evaluated on the validation set of IMC-PT [jin2021image] with 2048 keypoints and full-resolution images. We report mAA [yi2018learning, jin2021image] up to a 10-degree threshold for rotation and translation.
  • ...and 13 more figures
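
The selection step described in the Figure 1 caption can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the authors' implementation: the function names, the 3x3 structure-tensor window, the NMS radius, and the `S > 0` filter are all assumptions, and the neural network is replaced by an array `nss` that stands in for the regressed score map $\hat{\mathbf{\Lambda}}$. The final sub-pixel localization using $\mathbf{S}$ is omitted.

```python
import numpy as np


def shi_tomasi_score(img, win=1):
    """Shi-Tomasi score S: smaller eigenvalue of the local structure tensor.

    `win` (assumed) is the half-size of the square summation window.
    """
    gy, gx = np.gradient(img.astype(np.float64))
    h, w = img.shape

    def box_sum(a):
        # Sum over a (2*win+1)^2 neighbourhood with zero padding.
        p = np.pad(a, win)
        out = np.zeros_like(a)
        for dy in range(2 * win + 1):
            for dx in range(2 * win + 1):
                out += p[dy:dy + h, dx:dx + w]
        return out

    a, b, c = box_sum(gx * gx), box_sum(gx * gy), box_sum(gy * gy)
    # Closed-form smaller eigenvalue of [[a, b], [b, c]].
    return 0.5 * ((a + c) - np.sqrt((a - c) ** 2 + 4.0 * b ** 2))


def nms_mask(score, radius=1):
    """Binary mask of local maxima of `score` (the extrema mask S-tilde)."""
    h, w = score.shape
    p = np.pad(score, radius, constant_values=-np.inf)
    local_max = np.full(score.shape, -np.inf)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            local_max = np.maximum(local_max, p[dy:dy + h, dx:dx + w])
    # Assumption: ignore flat regions where the score is not positive.
    return (score >= local_max) & (score > 0)


def select_keypoints(S, nss, n, radius=1):
    """Pick up to n keypoints from mask(S) * exp(-nss).

    `nss` is a placeholder for the network output; a lower value
    means a more stable point, so exp(-nss) ranks it higher.
    """
    final = nms_mask(S, radius) * np.exp(-nss)
    order = np.argsort(final, axis=None)[::-1][:n]
    order = order[final.ravel()[order] > 0]  # drop non-extrema
    ys, xs = np.unravel_index(order, final.shape)
    keypoints = np.stack([xs, ys], axis=1)  # (x, y) pixel coordinates
    return keypoints, S[ys, xs], nss[ys, xs]
```

In this sketch, `select_keypoints(shi_tomasi_score(img), nss, n=2048)` would return the selected pixel locations together with their Shi-Tomasi scores and stability scores, mirroring the $\{\mathbf{k}_i\}$, $\{\mathbf{s}_i\}$ and $\{\hat{\mathbf{\lambda}}_i\}$ outputs named in the caption. Note that the Shi-Tomasi map only proposes candidate locations through the mask; the ranking among candidates comes entirely from the stability score.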