GMM-IKRS: Gaussian Mixture Models for Interpretable Keypoint Refinement and Scoring

Emanuele Santellani, Martin Zach, Christian Sormann, Mattia Rossi, Andreas Kuhn, Friedrich Fraundorfer

TL;DR

This work proposes a framework that can refine, and at the same time characterize with an interpretable score, the keypoints extracted by any method, and demonstrates that, when applied to popular keypoint detectors, it consistently improves the repeatability of keypoints as well as their performance in homography and two/multiple-view pose recovery tasks.

Abstract

Keypoint extraction underlies many computer vision applications, from localization to 3D reconstruction. Keypoints come with a score that allows them to be ranked by quality. While learned keypoints often exhibit better properties than handcrafted ones, their scores are not easily interpretable, making it virtually impossible to compare the quality of individual keypoints across methods. We propose a framework that can refine the keypoints extracted by any method and, at the same time, characterize them with an interpretable score. Our approach leverages a modified robust Gaussian Mixture Model fit designed to both reject non-robust keypoints and refine the remaining ones. Our score comprises two components: one relates to the probability of extracting the same keypoint in an image captured from another viewpoint, the other to the localization accuracy of the keypoint. These two interpretable components permit a comparison of individual keypoints extracted by different methods. Through extensive experiments we demonstrate that, when applied to popular keypoint detectors, our framework consistently improves the repeatability of keypoints as well as their performance in homography and two/multiple-view pose recovery tasks.

Paper Structure (14 sections, 12 equations, 7 figures, 4 tables)

Figures (7)

  • Figure 1: Visualization of the input keypoints and their refined positions. The keypoints from the original image are represented as + in red, while the red circles represent back-projected keypoints detected in the warped images. The Gaussian fit at each keypoint cluster is represented as a set of concentric circles whose spread encodes the variance. The refined keypoints are the centers of the Gaussians, with robustness and deviation represented by the number next to the Gaussian and by its spread, respectively.
  • Figure 2: Sketch of the proposed refinement and scoring framework. A set of image warping augmentations are applied to the input image. The chosen keypoint detector is applied to all the generated images, and detections in the warped images are projected back to the input image. The local maxima in the estimated density are used as initialization for a GMM fit. After convergence, each Gaussian component represents a refined keypoint characterized by the robustness and the deviation scores. This procedure adds additional robust keypoints not detected in the original image.
  • Figure 3: Keypoint clusters and their two scores: robustness and deviation. Robustness measures the likelihood of detecting the keypoint again, while deviation measures its localization accuracy. A desirable keypoint has high robustness and low deviation, as in the bottom-left square.
  • Figure 4: Qualitative comparison between the best clusters found by different methods in the first 5 scenes of HPatches. For each scene (one per row), we select the lowest deviation cluster (best localization accuracy) among all the ones with robustness = 21.0 (which represents keypoints that have been detected in all warps).
  • Figure 5: Outlier weighting functions.
  • ...and 2 more figures
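
The refinement step sketched in Figure 2 (back-projected detections grouped into clusters, one Gaussian per cluster, two scores per Gaussian) can be illustrated with a small NumPy example. This is only a minimal sketch under stated assumptions, not the authors' implementation: it uses plain EM with isotropic 2D Gaussians and omits the paper's modified robust fit and outlier weighting. The function name `fit_isotropic_gmm`, its parameters, and the synthetic data are hypothetical. Treating robustness as the responsibility mass of a component and deviation as its standard deviation is likewise an assumption made for illustration.

```python
import numpy as np

def fit_isotropic_gmm(points, init_means, init_sigma=1.0, n_iter=50, min_sigma=1e-3):
    """Plain EM for a mixture of isotropic 2D Gaussians.

    Simplified stand-in (assumption): no outlier handling, no robust weighting.
    Returns (means, sigmas, mass), where mass[k] is the summed responsibility
    of component k, i.e. a soft count of the detections it explains.
    """
    points = np.asarray(points, dtype=float)            # (N, 2)
    means = np.asarray(init_means, dtype=float).copy()  # (K, 2)
    k = len(means)
    sigmas = np.full(k, float(init_sigma))
    weights = np.full(k, 1.0 / k)
    mass = np.zeros(k)
    for _ in range(n_iter):
        # E-step: responsibilities computed from log-densities for stability.
        sq_dist = ((points[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        log_prob = (np.log(weights)
                    - np.log(2.0 * np.pi * sigmas ** 2)
                    - sq_dist / (2.0 * sigmas ** 2))
        log_prob -= log_prob.max(axis=1, keepdims=True)
        resp = np.exp(log_prob)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: update means, isotropic standard deviations, mixing weights.
        mass = resp.sum(axis=0) + 1e-12                  # guard against empty components
        means = (resp.T @ points) / mass[:, None]
        sq_dist = ((points[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        # In 2D, the isotropic variance is the mean squared distance divided by 2.
        sigmas = np.sqrt((resp * sq_dist).sum(axis=0) / (2.0 * mass))
        sigmas = np.maximum(sigmas, min_sigma)
        weights = mass / mass.sum()
    return means, sigmas, mass

# Synthetic back-projected detections (hypothetical data):
# cluster A is detected 21 times and tightly localized,
# cluster B is detected 10 times and loosely localized.
rng = np.random.default_rng(0)
cluster_a = np.array([10.0, 10.0]) + rng.normal(scale=0.5, size=(21, 2))
cluster_b = np.array([50.0, 50.0]) + rng.normal(scale=2.0, size=(10, 2))
detections = np.vstack([cluster_a, cluster_b])

# In Figure 2 the initial means come from local maxima of the estimated density;
# here they are simply given as rough guesses.
initial_means = np.array([[11.0, 9.0], [49.0, 51.0]])

means, deviation, robustness = fit_isotropic_gmm(detections, initial_means)
for m, d, r in zip(means, deviation, robustness):
    print(f"refined keypoint ({m[0]:.2f}, {m[1]:.2f})  robustness={r:.1f}  deviation={d:.2f}")
```

In this toy setting the first component plays the role of a desirable keypoint in the sense of Figure 3: it collects more detections (higher robustness) and has a tighter spread (lower deviation) than the second.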